Oracles

Suppose that we developed software oracles which could apply large amounts of computational power to solving any formally specified problem (say, at a price of $1k for a human-equivalent amount of problem-solving). For example, such oracles could find good moves in games which could be characterized completely, or prove theorems in areas which could be rigorously axiomatized, or design machines to achieve a formally specified goal in physical systems which can be modeled algorithmically. What would happen when these oracles became widely available?

The situation may be somewhat analogous to the development of computers themselves, which are able to apply astronomical amounts of computational power to executing precisely specified protocols. Having developed such tools, humanity did not find all of its old problems immediately resolved. Instead, describing what we want done in complete and rigorous detail has become one of our primary economic activities. Human labor is now reserved for implementing protocols that we can’t describe precisely, which turns out to be many of them (including, in particular, programming itself).

I expect that if an oracle became widely available, phrasing questions of interest algorithmically would present a similar bottleneck in most applications. You might hope that science and engineering would immediately be mostly automated, but a large part of both enterprises involves dealing with systems which we can’t yet characterize exactly, or which would require substantial human labor to characterize exactly. I would be surprised if you could ask algorithmically precise questions that would eliminate humans’ role in designing better DNA synthesis, for example. Instead, you would probably get substantial but not unprecedented productivity gains in these domains, and a lot of human effort would be redirected from problem-solving to problem-defining.

Questions which could already be formulated precisely would naturally benefit most from the availability of oracles. As mentioned, some areas of mathematics and computer science would benefit enormously, while natural sciences and engineering would lag behind. Other areas, such as policy, management, and governance, would benefit even less.

However, there are some general techniques for turning a problem humans want solved into an algorithmic question. These techniques have no analog in the case of computers, and in this hypothetical I expect they would be broadly utilized. Though we would have a very hard time formally describing models for systems we care about (a prerequisite to formally expressing many of the problems we want solved), we can much more easily describe distributions over possible models which include good models for the real world, such as the universal prior. We can formally define a model simply by saying: start with some universal distribution over models, and then update that distribution based on these observations of some system we care about. Once we have defined a model in this way, it is possible to define tasks formally with respect to this model. For example, if we consider models which take some inputs to be fed to a manipulator, and output not only observations but also reward signals, we can express the task “find inputs to this model which maximize the expected future reward.” Because we don’t understand the internal structure of the models produced by this process, it would take much more problem-formalizing labor to express a task that is not reward-based (and in general it isn’t clear how to do it).

This is an extremely powerful technique, and allows us to apply our formal oracles to a much broader class of systems than we otherwise could.
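The construction described above (a prior over models, an update on observations, then a search for reward-maximizing inputs) can be made concrete with a toy sketch. Everything here is an illustrative assumption rather than part of the original argument: the real universal prior is uncomputable, so a small finite grid of two-action reward models with a uniform prior stands in for it, exhaustive search stands in for the oracle, and only the one-step expected reward is maximized rather than the full expected future reward. The names `MODELS`, `update`, and `oracle_best_action` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy model class: each model assigns a reward probability to
# each of two actions. A real universal prior ranges over all programs and
# is uncomputable; this finite grid is only a stand-in for it.
LEVELS = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]
MODELS = list(product(LEVELS, repeat=2))  # (p_reward_action0, p_reward_action1)


def prior(models):
    """Uniform prior over the toy class (assumption: no simplicity weighting)."""
    return {m: Fraction(1, len(models)) for m in models}


def update(belief, observations):
    """Bayes update on (action, reward) pairs, with reward in {0, 1}."""
    posterior = dict(belief)
    for action, reward in observations:
        for m in posterior:
            p = m[action]
            posterior[m] *= p if reward == 1 else 1 - p
    total = sum(posterior.values())
    return {m: w / total for m, w in posterior.items()}


def expected_reward(belief, action):
    """One-step expected reward of an action under the current belief."""
    return sum(w * m[action] for m, w in belief.items())


def oracle_best_action(belief, actions=(0, 1)):
    """Stand-in for the oracle: exhaustive search over the possible inputs."""
    return max(actions, key=lambda a: expected_reward(belief, a))


# Observations of "some system we care about": action 0 mostly fails,
# action 1 has succeeded both times it was tried.
observations = [(0, 0), (0, 0), (1, 1), (1, 1), (0, 1)]
posterior = update(prior(MODELS), observations)
best = oracle_best_action(posterior)
```

Note that the task handed to `oracle_best_action` refers only to the reward signal, never to the internals of any particular model. That is what makes the formalization cheap, and it is also why the resulting system behaves like a goal-oriented agent.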

Unfortunately, while oracles used in this way are very powerful for engineering and scientific applications (and perhaps even for management, policy, and governance), at human level they are extremely dangerous. Traditional arguments about AI risk apply to exactly this sort of system. Maybe there are other techniques which are similarly powerful and general, don’t produce this sort of goal-oriented agent, and would be more broadly applied by virtue of their safety. I don’t know of any. It looks to me like there is a reasonable chance that most of the power would quickly shift to oracles being used as parts of goal-oriented agents, and that eventually this would lead to trouble.

If such oracles became available, the best-case scenario is probably that they would be used to develop more robustly beneficial technologies (either a different sort of AI, or human brain emulations). This seems like it would require either a long lead time for a conscientious project, or else a shorter lead time and a clear understanding of how to do this bootstrapping. Broadly, I can imagine three approaches.

  1. Create goal-oriented agents out of oracles, and engineer environments in which those agents will tend to cooperate with humans.
  2. Apply oracles to problems which can be formally defined using techniques we currently understand. Figure out how to use the solutions to this narrower class of problems to bootstrap up to safe AI or emulations.
  3. Discover some other general techniques for applying oracles to real-world problems.

All three approaches seem worth thinking about. I’m going to make a few posts exploring (1), which currently looks like the most promising approach. Thinking about (2) in advance is a little harder, because your ability to interact adaptively with the oracle is probably useful. But I haven’t seen any serious suggestions even for the first steps of such a program, and that would certainly be valuable. (“Wing it” doesn’t seem like a good solution, given the potential instability of the situation.) I don’t know about (3): it seems a little unlikely that using an oracle to build a utility maximizer is the only way to automatically formalize a generic real-world problem, but I don’t know of any approaches that aren’t variations on this theme.

Note that if many oracles are being effectively employed serving humans’ interests, then the existence of a few oracles trying to be destructive may not be problematic, since they lack any epistemic advantage over the oracles being usefully exploited. (This would only be problematic in universes where offense fundamentally outpaces defense, which seems plausible to me but is another question.) The problem is that there aren’t obviously any safe ways to use oracles without handicapping them or limiting their applicability substantially.
