Consider a fixed goal-seeking agent A, which is told its own code and that its objective function is U = { T if A(<A>,<U>) halts after T steps, 0 otherwise }. Alternatively, consider a pair of agents A, B, running similar AIs, each of which is told its own code as well as its own utility function U = { -1 if you don’t halt, 0 if you halt but your opponent halts after at least as many steps, +1 otherwise }. What would you do as A, in either situation? (That is, what happens if A is an appropriate wrapper around an emulation of your brain, giving it access to arbitrarily powerful computational aids?)
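To make the two payoff rules concrete, here is a minimal sketch that takes them literally. The names `solo_utility` and `duel_utility`, and the convention of writing "never halts" as `None`, are assumptions made for illustration; no real computation could observe non-halting, so this only tabulates payoffs for hypothetical outcomes.

```python
from typing import Optional

# Hypothetical encoding: a halting time is a non-negative int, and None
# stands for "never halts" (which no actual simulation could detect).
HaltTime = Optional[int]


def solo_utility(t: HaltTime) -> int:
    """First setup: U = T if the agent halts after T steps, 0 otherwise."""
    return 0 if t is None else t


def duel_utility(mine: HaltTime, theirs: HaltTime) -> int:
    """Second setup, read literally: -1 if I never halt; 0 if I halt and my
    opponent halts after at least as many steps; +1 otherwise."""
    if mine is None:
        return -1
    if theirs is not None and theirs >= mine:
        return 0
    return 1
```

Under this reading, the solo agent wants to halt as late as possible without running forever, and each dueling agent wants to halt strictly later than its opponent.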

From the meta perspective, B is just going to do what I do, so objectively max(U) = 0.

From my perspective (as A), B could be some simple infinite loop, since I don’t have access to B’s code.

I can guess that B is a copy of me, and I know that if B is running a copy of my source code then it will halt when I do, so max(U) = 0 and I should just halt. I could also prove that if I choose any strategy that has me halting at some time T, then a simple strategy (B runs until A halts, then halts itself) can beat me. Since I know nothing about my opponent, facing such a strategy is likely.
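The two cases in this argument can be checked with a small sketch. Everything here is illustrative: `duel_utility` is a literal reading of the payoff rule from the post, and `outlast` is a hypothetical stand-in for "B runs until A halts", which assumes B can somehow observe A's halting time.

```python
def duel_utility(mine: int, theirs: int) -> int:
    """Payoff for an agent that halts at step `mine` against an opponent that
    halts at step `theirs` (both assumed to halt, so the -1 case never arises)."""
    return 0 if theirs >= mine else 1


def outlast(opponent_halt_time: int) -> int:
    """Hypothetical B: keep running until A has halted, then take one more step."""
    return opponent_halt_time + 1


results = []
for t in (0, 10, 10**6):
    against_copy = duel_utility(t, t)              # B is a copy of A
    a_vs_outlast = duel_utility(t, outlast(t))     # A's payoff against the outlasting B
    b_vs_a = duel_utility(outlast(t), t)           # B's payoff in that same match
    results.append((t, against_copy, a_vs_outlast, b_vs_a))

for row in results:
    print(row)
```

Whatever T is chosen, A scores 0 against both a copy and the outlasting opponent, while the outlasting opponent scores +1.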

Looking at a sub-problem: what if I’m AIXI? If I’m AIXI, I think I would conclude the same thing (B is a simple program that can beat me), compute max(U) = 0, and halt.

So if B is like me, I can’t beat it; if B is not like me, there are simple programs that can beat me. On the other hand, it would be good to know the proportion of distinct constant programs that halt at time T relative to those that don’t, as T grows. If there’s a peak, pick that T; if not, halt. It seems like this proportion is probably strictly increasing as well. So far, it seems likely that the best I can do is U = 0, which is fine, since that’s objectively the best I can do with respect to the problem setup.