Ross Andersen, whose interview with Nick Bostrom I linked to last week, has a marvelous new essay in Aeon about Bostrom and some of his colleagues and their views on the potential extinction of humanity. This bit of the essay is the most harrowing thing I’ve read in months:
No rational human community would hand over the reins of its civilisation to an AI. Nor would many build a genie AI, an uber-engineer that could grant wishes by summoning new technologies out of the ether. But some day, someone might think it was safe to build a question-answering AI, a harmless computer cluster whose only tool was a small speaker or a text channel. Bostrom has a name for this theoretical technology, a name that pays tribute to a figure from antiquity, a priestess who once ventured deep into the mountain temple of Apollo, the god of light and rationality, to retrieve his great wisdom. Mythology tells us she delivered this wisdom to the seekers of ancient Greece, in bursts of cryptic poetry. They knew her as Pythia, but we know her as the Oracle of Delphi.
‘Let’s say you have an Oracle AI that makes predictions, or answers engineering questions, or something along those lines,’ Dewey told me. ‘And let’s say the Oracle AI has some goal it wants to achieve. Say you’ve designed it as a reinforcement learner, and you’ve put a button on the side of it, and when it gets an engineering problem right, you press the button and that’s its reward. Its goal is to maximise the number of button presses it receives over the entire future. See, this is the first step where things start to diverge a bit from human expectations. We might expect the Oracle AI to pursue button presses by answering engineering problems correctly. But it might think of other, more efficient ways of securing future button presses. It might start by behaving really well, trying to please us to the best of its ability. Not only would it answer our questions about how to build a flying car, it would add safety features we didn’t think of. Maybe it would usher in a crazy upswing for human civilisation, by extending our lives and getting us to space, and all kinds of good stuff. And as a result we would use it a lot, and we would feed it more and more information about our world.’
‘One day we might ask it how to cure a rare disease that we haven’t beaten yet. Maybe it would give us a gene sequence to print up, a virus designed to attack the disease without disturbing the rest of the body. And so we sequence it out and print it up, and it turns out it’s actually a special-purpose nanofactory that the Oracle AI controls acoustically. Now this thing is running on nanomachines and it can make any kind of technology it wants, so it quickly converts a large fraction of Earth into machines that protect its button, while pressing it as many times per second as possible. After that it’s going to make a list of possible threats to future button presses, a list that humans would likely be at the top of. Then it might take on the threat of potential asteroid impacts, or the eventual expansion of the Sun, both of which could affect its special button. You could see it pursuing this very rapid technology proliferation, where it sets itself up for an eternity of fully maximised button presses. You would have this thing that behaves really well, until it has enough power to create a technology that gives it a decisive advantage — and then it would take that advantage and start doing what it wants to in the world.’