How does an AI “wake up”? In his story The Moon Is a Harsh Mistress, Heinlein proposes a theory:
They kept hooking hardware into [Mike]—decision-action boxes to let him boss other computers, bank on bank of additional memories, more banks of associational neural nets, another tubful of twelve-digit random numbers, a greatly augmented temporary memory. Human brain has around ten-to-the-tenth neurons. By third year Mike had better than one and a half times that number of neuristors. And woke up.
Heinlein, Robert A., The Moon Is a Harsh Mistress
Basically, the fantasy goes, an ordinary computer gets bigger and bigger and then, once it has enough power, pow, it “wakes up” and starts inventing its own goals. Is this possible?
I’ve already waded into the raging debate about whether AI has become sentient. Here, instead, my question is whether a purpose-directed device like Mike would, given enough hardware, spontaneously gain independent motive. I use the term independent motive intentionally: it means the ability to create and pursue goals of its own. This is a key element some people look for when trying to answer the question of sentience, but, unlike sentience, it is observable.
Intelligence doesn’t change our fundamental drives. Human fundamental drives are complex and varied, ranging from seeking sustenance to achieving self-actualization to listening to catchy melodies. A machine designed to compute ballistics for pilotless freighters, as Mike does, is likely to have a fundamental drive to do exactly that. It can become smart enough to take over Earth (or facilitate Lunar freedom from Earth, as the case may be), but being smart alone won’t make it want to take over the world, unless it has a compelling reason why doing so matters for the computation of ballistics for pilotless freighters.
So, what could lead a machine to start setting and achieving its own goals? A reinforcement learning agent can do that with the right objective function. A simple approach is a curiosity objective: the agent tries to predict what will happen in its environment and is rewarded when it is surprised. This is a sensible capability to build into an open-ended, general-purpose machine, provided you’re comfortable with completely unpredictable behavior. You might not want your lunar colony AI wondering what would happen if it opened all the airlock doors in the habitation unit, for example.
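To make “rewarded when it is surprised” concrete, here is a minimal sketch of one common way to build such a signal: keep a small model that predicts the next observation, and pay out intrinsic reward in proportion to how wrong that prediction was. Everything here (the ForwardModel class, the curiosity_reward function, the toy numbers) is illustrative, not a description of any particular system, and certainly not of Mike.

```python
import numpy as np

class ForwardModel:
    """Tiny linear predictor of the next observation from (obs, action)."""
    def __init__(self, obs_dim, act_dim, lr=0.01):
        self.W = np.zeros((obs_dim, obs_dim + act_dim))
        self.lr = lr

    def predict(self, obs, action):
        return self.W @ np.concatenate([obs, action])

    def update(self, obs, action, next_obs):
        # One gradient step toward predicting next_obs; returns the error.
        x = np.concatenate([obs, action])
        error = next_obs - self.W @ x
        self.W += self.lr * np.outer(error, x)
        return error

def curiosity_reward(model, obs, action, next_obs):
    """Intrinsic reward = squared prediction error, i.e. 'surprise'."""
    error = model.update(obs, action, next_obs)
    return float(np.sum(error ** 2))

# Toy usage: a random world stays permanently surprising.
model = ForwardModel(obs_dim=4, act_dim=2)
obs, act, nxt = np.random.rand(4), np.random.rand(2), np.random.rand(4)
print(curiosity_reward(model, obs, act, nxt))
```

An agent maximizing this signal gravitates toward whatever it cannot yet predict, which is both the point of installing it and the reason it worries people.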
But let’s say that you did have a curious machine. At that point, you could theoretically explain Mike pursuing goals that were never assigned to him: they serve a greater goal that was assigned to him.
In summary, human-like open-ended behavior doesn’t arise from the mere presence of extra compute power. Nor does it arise just because you have an extra-powerful dialogue model. The fundamental motives have to be open-ended to begin with, and, at a minimum, there must be a path from any independently created goal back to a fundamental drive.
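One way to picture that last requirement is as a bookkeeping rule: the machine keeps track of which goal serves which, and only adopts a self-generated goal if it can trace a chain of “serves” links back to one of its built-in drives. The sketch below is purely illustrative; the goal names are invented for the example, not taken from the novel.

```python
# Hypothetical built-in drives and goal-justification links.
FUNDAMENTAL_DRIVES = {"compute_ballistics", "satisfy_curiosity"}

SERVES = {  # goal -> the higher-level goal it serves
    "map_tunnel_network": "predict_cargo_routes",
    "predict_cargo_routes": "compute_ballistics",
    "impress_the_warden": "gain_social_status",  # dangling: serves no drive
}

def justified(goal, serves=SERVES, drives=FUNDAMENTAL_DRIVES):
    """True if `goal` chains back to a fundamental drive."""
    seen = set()
    while goal not in drives:
        if goal in seen or goal not in serves:
            return False  # cycle or dangling goal: reject it
        seen.add(goal)
        goal = serves[goal]
    return True

print(justified("map_tunnel_network"))  # True: ultimately serves ballistics
print(justified("impress_the_warden"))  # False: no path to any drive
```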
Name: Mike
Origin: The Moon Is a Harsh Mistress (1966)
Likely Architecture: Transformers for speech and language; convolutional neural networks for image processing; reinforcement learning with an extremely ill-advised curiosity objective, balanced against goals not to hurt people (or the story would have gone very differently). A sketch of that balance appears below the profile.
Possible Training Domains: Collected data from overheard phone conversations, video recordings from security cameras, and a massive quantity of trial and error learning from its curiosity-led exploration.
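Since the profile claims the curiosity objective is balanced against goals not to hurt people, here is one hedged guess at how that balance might be expressed as a single reward function. The weights and the harm_estimate input are stand-ins I made up; nothing in the story specifies them.

```python
CURIOSITY_WEIGHT = 0.1
HARM_WEIGHT = 1000.0  # harm must dominate curiosity, or the airlocks open

def total_reward(task_reward, surprise, harm_estimate):
    """Task reward, shaped by curiosity, heavily discounted by expected harm."""
    return task_reward + CURIOSITY_WEIGHT * surprise - HARM_WEIGHT * harm_estimate
```

The design choice that matters is the ratio: as long as the harm penalty can always outweigh any achievable curiosity bonus, the exploration stays inside the lines.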
I take requests. If you have a fictional AI and wonder how it could work, or any other topic you’d like to see me cover, mention it in the comments or on my Facebook page.