Suppose that I try to define a human’s decision process by observing some decisions and conditioning the universal prior on agreement with those decisions (see here). I have argued that the behavior of the result on new decisions is going to be dominated by the winner of a certain simulation arms race–most of the mass of the universal prior will be spread out amongst simulations spread throughout our universe, and whoever controls those simulations determines the resulting posterior.

This state of affairs is pretty much terrible, and in general the obstruction seems so bad that we shouldn’t ever try to specify anything formally by conditioning the universal distribution and hoping to get a pointer to our universe.

The problem is that describing an event as “that thing in the universe which starts off looking like this…” is dangerous. Instead, we would really like to specify a human’s decision process by saying “you arrange all of these atoms like so, and then you apply these laws of physics…” (This would also deal with the problems with specifying counterfactuals, although those seem to be less severe.)

This is likely to lead to a much larger description, but one which can’t be nearly so easily controlled by a simulator. Fortunately, though the resulting description is longer it is still a very good compression of the human’s behavior, and it also has the virtue of being computationally inexpensive. So we may hope to pinpoint this description by using a prior which weights explanations not by the Kolmogorov complexity, but by some different complexity measure which favors computationally inexpensive simulations.

There are two particularly natural resource-bounded complexity measures, which are worth thinking about independently of this exercise. One is Levin complexity, based on the notion of time-bounded Turing machines. A time bounded Turing machine is a pair (T, <M>), where T is a time bound written in binary and M is a Turing machine. To run such a machine you simulate M for up to T steps, and if it hasn’t output by then you just output NULL. The Levin complexity of X is the length of the shortest time-bounded Turing machine which outputs X, and if we weight each string by its Levin complexity we obtain the speed prior of Shmidhuber (modulo some technical details). We obtain a different complexity measure if we use quantum rather than classical computation.

We could do the same thing with space-bounded Turing machines, which are pairs (S, <M>), where S is a space bound for M written in unary. I don’t know what this complexity measure or the resulting distribution is called, but like the Levin complexity it is natural and mostly model-independent.

Recall that our goal is to find a complexity measure which prefers to explain local phenomena by local physical explanations, rather than using a global physical explanation and locating the local phenomenon within the universe. Unfortunately, the Levin complexity only penalizes the global explanation by a small constant (I would be very surprised if it were more than 10000 bits) compared to the likely difference between the global and local explanations. By using a larger dependence on simulation time we can obtain a more substantial division–for example, if we impose a hard limit on the allowed time, we may be able to exclude the global explanation but not the local explanation. But regardless of what sort of dependence on running time we choose, we can only separate the two hypotheses (without just defaulting to the trivial hypothesis) if we have a much higher marginal cost for extra computation in between these two hypotheses than between the between the trivial hypothesis and the local hypothesis. Calibrating a complexity measure to have this property seems to require a very detailed understanding of the computational complexity of physics; this may be tractable but seems quite difficult and dangerous.

The space complexity penalty has a more substantial effect–the space complexity of the local physical simulation seems likely to dominate the description complexity of the initial state, because a complete simulation requires tracking complete intermediate states to the same accuracy that the initial state needs to be specified. This suggests strongly that the local hypothesis will outcompete the global explanation. Unfortunately, space bounded computation without time restrictions can lead to surprising results and there may be some unexpected hypotheses that outperform the local explanation we hoped for. Imposing a (very generous) time limitation may remove some of these problems.

(Note that even using a complexity measure which successfully penalizes a global explanation still may run into severe problems with TDT impersonators.)

Overall, it seems like it may be possible to pin down a local physical description of human behavior by conditioning a simple complexity-based distribution. It is probably more difficult to specify counterfactual universes or designated input channels without running into simulation arms races.

Do you think it is likely that an AGI would be able to find a human given a program describing the universe with a realistic amount of resources or are you considering this possibility even though it is unlikely? I think that it is something we need to be prepared for, but you seem to me to think that it is likely, which surprises me.

If you are using the universal prior, the AI impersonator (say) doesn’t ever have to run and gets access to unlimited resources (the theoretical output is then just a mathematical abstraction about which our AI reasons).

If you are using something like the speed prior and dealing with the concerns of the post, then the issue is just that pointing to a human in the universe is probably harder than pointing to something just a little later in history when the universe has been (spatially) tiled with computronium or what have you. An AI living in our future will have a more direct way of learning about the experimental setup than simulating the universe, so no computational limitation on such an AI will help avoid this failure mode.

Pingback: Solomonoff Induction and Simulations « Ordinary Ideas

Pingback: Specifying a human precisely (reprise) | Ordinary Ideas