Notes - A Brief History of Intelligence

August 26, 2025

Breakthrough #1: Steering and the First Bilaterians

Chapter 2: The Birth of Good and Bad

Early Bilaterians and Steering

The animal kingdom, despite its seeming diversity, exhibits a remarkable commonality: almost all animals share a bilaterally symmetric body plan. This design, characterized by a front (containing a mouth, brain, and main sensory organs) and a back (for waste elimination), is considered the most efficient design for a movement system, allowing optimization for forward movement and mechanisms for turning. The emergence of the first brain is directly linked to this bilaterian body plan, with their shared initial purpose being to enable animals to navigate by steering.

Fossils suggest that the first bilaterians were legless, wormlike creatures that appeared in the Ediacaran period, about 635 to 539 million years ago. Modern nematodes, specifically Caenorhabditis elegans, are believed to have remained relatively unchanged and offer a window into these ancient ancestors. Despite having only 302 neurons compared to a human's 86 billion, nematodes exhibit remarkably sophisticated behavior.

Nematode Navigation Strategy

Nematodes navigate not by vision, as they lack eyes, but primarily by smell. They exploit the principle that the concentration of a smell increases as they get closer to its source, such as food. Their navigational strategy can be summarized by two rules:

  1. If food smells increase, keep going forward.
  2. If food smells decrease, turn.

This breakthrough of steering allowed successful navigation in complex environments like the ocean floor without needing a comprehensive "understanding" or "map" of the world. All that was needed was a brain that could steer a bilateral body towards increasing "good" smells and away from decreasing "bad" smells. This steering mechanism also works for navigating away from negative stimuli like light, noxious heat, cold, or sharp surfaces. While single-celled organisms also steer, the breakthrough with the first brain was steering on the scale of multicellular organisms, requiring neural circuits to activate muscles for specific turning movements.
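
The two rules can be sketched in a few lines of Python; the 1-D world, the smell function, and the `chemotaxis` name are illustrative assumptions rather than anything from the source:

```python
def chemotaxis(source=50, start=0, steps=200):
    """Simulate the two-rule strategy on a 1-D smell gradient:
    keep going while the smell increases, turn when it decreases."""
    smell = lambda x: -abs(x - source)   # concentration peaks at the source
    pos, heading = start, 1              # heading is +1 or -1
    prev = smell(pos)
    for _ in range(steps):
        pos += heading                   # rule 1: keep going forward
        now = smell(pos)
        if now < prev:                   # rule 2: smell decreased -> turn
            heading = -heading
        prev = now
    return pos

print(chemotaxis())  # ends at or right next to the source at x = 50
```

No map, no model of the world: the worm-agent reaches the source by reacting only to whether the last step made the smell stronger or weaker.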

The First Robot (Roomba Analogy)

In the 1980s and 1990s, the "behavioral AI" camp, led by roboticist Rodney Brooks, argued against trying to directly build human-level intelligence and instead advocated for incrementally building simpler intelligences. Brooks’ work led to the Roomba vacuum cleaner, a successful domestic robot whose intelligence shared parallels with the first brains. The Roomba navigated by randomly moving, steering away from obstacles, and turning towards its charging station when its signal was strongest. Both the Roomba and early bilaterians used tricks to navigate complex worlds without actually understanding or modeling that world, prioritizing cheap, effective, and simple-to-discover strategies.

Valence and Decision-Making in Nematodes

The breakthrough of steering necessitated that bilaterians categorize the world into things to approach ("good things") and things to avoid ("bad things"). This categorization of stimuli into good and bad is termed valence by psychologists and neuroscientists. It's a primitive, subjective evaluation (not moral judgment) that determines an animal's approach or avoidance response. In nematodes, sensory neurons directly encode these "steering votes", indicating how much a nematode wants to steer towards or away from a stimulus, rather than signaling objective features of the environment. These valence neurons also rapidly adapt to baseline smell levels, allowing them to signal changes in concentration across a wide range, consistently nudging the nematode in the right direction.

Simple brains, even with fewer than a thousand neurons, can make trade-offs in their decision-making. For example, nematodes will cross a copper barrier (an aversive stimulus) to get food, with the willingness to cross depending on the level of copper and hunger. The neural circuit for this involves positive-valence neurons triggering forward movement and negative-valence neurons triggering turning, with these "forward" and "turn" neurons mutually inhibiting each other to integrate votes and make a single choice. The brain's ability to rapidly flip the valence of a stimulus based on internal states is widespread, as seen in how a favorite meal can become a source of nausea when one is full. This occurs through chemicals like insulin ("full signals") and hunger signals that diffuse throughout the body and modulate the responses of sensory neurons.
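
A minimal sketch of that copper-versus-food trade-off, with made-up vote strengths and a hypothetical `decide` helper (mutual inhibition reduced to "the stronger pool wins"):

```python
def decide(food_vote, copper_vote, hunger=0.0):
    """Integrate opposing valence votes into one action. Hunger (0..1)
    scales up the positive-valence vote, a stand-in for hunger signals
    modulating sensory neurons; mutual inhibition means only the
    stronger pool drives behavior."""
    forward = food_vote * (1.0 + hunger)   # positive-valence 'forward' neurons
    turn = copper_vote                     # negative-valence 'turn' neurons
    return "forward" if forward > turn else "turn"

print(decide(food_vote=0.6, copper_vote=0.8))              # turn: copper repels
print(decide(food_vote=0.6, copper_vote=0.8, hunger=0.5))  # forward: hunger flips it
```

The same stimulus pair yields a different choice depending on internal state, which is the point: valence is modulated, not fixed.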

Chapter 3: The Origin of Emotion

Affect: The Universal Foundation of Emotion

Emotions are complex and often culturally biased in their categorization. However, affect is a well-accepted unifying foundation of emotion, consisting of two universal attributes: valence (goodness or badness) and arousal (intensity). Humans are always in an affective state along these two dimensions. The universality of affect is supported by:

  • Intuition: Nuanced emotion words (calm, elated, tense, upset) can be mapped to affective states.
  • Biology: There are clear neurophysiological signatures for arousal (e.g., heart rate, perspiration, adrenaline, blood pressure) and valence (e.g., stress hormones, dopamine levels, specific brain region activation).
  • Universal expressions: All cultures have words for valence and arousal, and newborn children universally express arousal and valence through facial and body signatures (e.g., crying, smiling).

Persistent Behavioral States in Nematodes

In simple bilaterians like nematodes, these precursors to emotions are often called "behavioral states". An unfed nematode in a petri dish will "escape" by rapidly swimming and relocating, while upon finding food, it will "exploit" by slowing down and turning locally. After eating enough, it enters "satiation" and becomes immobile. A defining feature of these states is their persistence, lasting long after the triggering stimuli are gone. This seems counterintuitive, as worms might keep escaping long after a predator is gone or exploiting after food is depleted.

The function of this persistence is to allow for "steering in the dark". Sensory clues are transient, and persistent affective states allow the worm to continue a behavior (e.g., searching for food) even when the stimulus has faded, as the initial detection is predictive of nearby resources or dangers. This is similar to the Roomba's "Dirt Detect" feature, where it locally searches for dirt even after the initial patch is gone.

Neuromodulators and Affective States

Neuromodulatory neurons, unlike excitatory and inhibitory neurons with specific, short-lived effects, release neuromodulators (e.g., dopamine, serotonin) that have subtle, long-lasting, and wide-ranging effects on many neurons, tuning activity across the entire brain. The balance of these neuromodulators determines a nematode's affective state.

  • Dopamine is released by nearby rewards and triggers arousal and pursuit (exploitation). It drives animals to want and pursue things.
  • Serotonin is released upon the consumption of rewards and triggers a state of low arousal, inhibiting reward pursuit (satiation). It makes animals content and less inclined to seek more.

This basic dichotomy is remarkably conserved across diverse species from nematodes to humans.

Neuroscientist Kent Berridge's work with rats showed that dopamine is not merely a "pleasure chemical". Rats will self-stimulate dopamine-releasing levers to the point of starvation, suggesting it's more about "wanting" or "pursuit" than pure "liking" or "pleasure". Pleasure, or "liking," is indicated by distinct facial expressions.

Stress Response: Acute and Chronic

Movement consumes energy, and the adrenaline-induced escape response is particularly energy-expensive. Evolution developed a trick to save energy during this acute stress: adrenaline not only triggers escape behaviors but also shuts off non-essential, energy-consuming bodily functions like cell growth, digestion, reproduction, and immune responses. This is the acute stress response.

However, this cannot be maintained indefinitely. Therefore, a counterregulatory response evolved, involving anti-stress chemicals like opioids. Opioids are released in response to stressors and, when the stressor fades, initiate recovery-related processes: boosting serotonin and dopamine signals (which were inhibited by stress), inhibiting negative-valence neurons (aiding recovery and rest), and keeping luxury functions like reproduction off until recovery is complete. This explains why opioids are powerful painkillers and decrease sex drive. After stress, nematodes even binge food and become immobile, anticipating future scarcity. This ancient stress cycle is shown in Figure 3.7.

Chronic stress in nematodes, triggered by inescapable or prolonged negative stimuli, causes them to "give up" after a few minutes, ceasing both movement and escape attempts. This is a clever energy-conservation strategy for inescapable situations. Chronic stress, unlike acute stress, also turns off arousal and motivation (anhedonia). These "blahs and blues" appear in other bilaterians such as cockroaches, slugs, and fruit flies, and their mechanisms are exploited by modern drugs.

Chapter 4: Associating, Predicting, and the Dawn of Learning

The Dawn of Associative Learning

In the early 20th century, Russian physiologist Ivan Pavlov conducted his Nobel Prize-winning work on the digestive system. In the course of that work, he observed what he called "conditional reflexes", which became his most important contribution. Conditional reflexes are involuntary associations, such as dogs salivating at the sound of a buzzer that previously predicted food. The involuntary nature of these reflexes suggested that learning and memory might be more ancient than previously thought, not requiring later-evolved brain structures. Even rats with their entire brains removed can exhibit conditional reflexes through their spinal cords.

Early bilaterians, including nematodes, utilize these learning mechanisms:

  • Acquisition: The formation of a new association (e.g., salt leading to hunger).
  • Extinction: An association weakens if the predictive cue no longer leads to the expected outcome.
  • Spontaneous recovery: Previously extinguished associations are not completely unlearned but merely suppressed, reappearing after a period.
  • Reacquisition: If a broken association becomes effective again, it is relearned much faster than a completely new association.

These mechanisms allowed simple steering brains to continually learn and adapt to changing environmental contingencies in the ancient Ediacaran Sea.
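
These phenomena can be sketched with a Rescorla-Wagner-style weight update (a stand-in, not the source's model); modeling extinction as a separate inhibitory trace is one simple way to capture "suppressed, not erased":

```python
def train(trials, w=0.0, inhib=0.0, lr=0.3):
    """One associative weight w plus a separate inhibitory trace.
    Extinction grows 'inhib' rather than erasing w, so the old
    association survives (spontaneous recovery, fast reacquisition)."""
    for outcome in trials:            # 1 = cue followed by food, 0 = not
        error = outcome - (w - inhib)
        if error >= 0:
            w += lr * error           # acquisition strengthens w
        else:
            inhib += lr * -error      # extinction adds suppression instead
    return w, inhib

w, inhib = train([1] * 10)                 # acquisition
print(round(w - inhib, 2))                 # -> 0.97: strong net association
w, inhib = train([0] * 10, w, inhib)       # extinction: net response fades
print(round(w - inhib, 2))                 # -> 0.03
inhib *= 0.5                               # time passes: suppression decays
print(round(w - inhib, 2))                 # -> 0.5: spontaneous recovery
```

Because `w` is still near 1 underneath, a few reinforced trials restore the full response far faster than learning from scratch (reacquisition).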

The Credit Assignment Problem

Associative learning faces the credit assignment problem: when an important event (like finding food) occurs, how does the brain determine which of the many preceding cues or actions should be given "credit" for predicting it? Early bilaterian brains employed four "crude and clever" tricks to solve this:

  1. Eligibility traces: The brain gives credit to the predictive cue that occurred between 0 to 1 second before the event. This is an incredibly old evolutionary innovation built into synaptic protein machinery.
  2. Overshadowing: When multiple predictive cues are present, the brain tends to pick the strongest cues, overshadowing weaker ones.
  3. Latent inhibition: Stimuli regularly experienced in the past are inhibited from making future associations, effectively being flagged as irrelevant background noise.
  4. Blocking: Once an association between a predictive cue and a response is established, further cues that overlap are blocked from being associated with that response.

These four tricks are ubiquitous across Bilateria (found in slugs, fish, rats, humans, etc.) and are embedded in the simplest neural circuits, having evolved with the very first brains. While not perfect, these remnants still exist in modern brains.
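
A toy sketch of all four tricks as filters over candidate cues; the event format, strengths, and the `assign_credit` helper are invented for illustration:

```python
def assign_credit(events, outcome_time, window=1.0,
                  familiar=(), already_predicts=()):
    """Crude credit assignment: of the cues seen 0-1 s before the outcome,
    keep the strongest (overshadowing), skipping familiar background
    stimuli (latent inhibition) and cues whose outcome is already
    predicted by an existing association (blocking)."""
    eligible = [(cue, strength) for cue, strength, t in events
                if 0.0 <= outcome_time - t <= window     # eligibility trace
                and cue not in familiar                  # latent inhibition
                and cue not in already_predicts]         # blocking
    if not eligible:
        return None
    return max(eligible, key=lambda e: e[1])[0]          # overshadowing

# (cue, perceived strength, time in seconds)
events = [("wave", 0.3, 2.0), ("smell", 0.9, 4.6), ("light", 0.4, 4.8)]
print(assign_credit(events, outcome_time=5.0))           # smell: recent + strongest
print(assign_credit(events, 5.0, familiar=("smell",)))   # light: smell is background
```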

The Ancient Mechanisms of Learning From a materialist perspective, learning involves physical reorganization in the brain. Ancient ideas about memories as "impressions" or "folds" were incorrect. Instead, learning arises from changes to synaptic connections. Synapses can:

  • Increase or decrease their strength.
  • Form new connections or remove old ones.
  • Increase neurotransmitter release or the number of protein receptors on the postsynaptic neuron.

A key mechanism is Hebbian learning, often summarized as "neurons that fire together wire together". This involves protein machinery in synapses detecting when input and output neurons fire within a similar time window, triggering synaptic strengthening. Neuromodulators like serotonin and dopamine can modify these synaptic learning rules, enabling them to "gate" the ability of synapses to form new associations.
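
A minimal sketch of a Hebbian update with neuromodulatory gating; the matrix layout and learning rate are illustrative assumptions:

```python
def hebbian_step(weights, pre, post, lr=0.1, dopamine=1.0):
    """'Fire together, wire together': strengthen a synapse when its
    pre- and postsynaptic neurons are co-active. The neuromodulator
    term gates whether any learning happens at all."""
    return [[w + lr * dopamine * pre[i] * post[j]
             for j, w in enumerate(row)]
            for i, row in enumerate(weights)]

w = [[0.0, 0.0], [0.0, 0.0]]
w = hebbian_step(w, pre=[1, 0], post=[1, 1])   # input 0 fires with both outputs
print(w)                                        # only row 0 is strengthened
w = hebbian_step(w, pre=[0, 1], post=[1, 0], dopamine=0.0)
print(w)                                        # no neuromodulator: change is gated off
```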

These synaptic learning mechanisms are extremely old evolutionary innovations, inherited from the first bilaterians over 550 million years ago. All subsequent learning feats (spatial maps, language, object recognition) were built upon these same foundational mechanisms. Importantly, learning was not the core function of the first brain but merely a "trick to optimize steering decisions", by tweaking the "goodness and badness of things".

Summary of Breakthrough #1: Steering

Our ancestors transitioned from radially symmetric, brainless animals (like coral polyps) to bilaterally symmetric, brain-enabled animals (like nematodes) around 550 million years ago. This change enabled a singular breakthrough: navigating by steering. Key developments included:

  • A bilateral body plan simplifying navigation to "go forward" or "turn".
  • A neural architecture for valence, hard-coding stimuli as "good" or "bad".
  • Mechanisms for modulating valence responses based on internal states.
  • Circuits for integrating different valence signals into a single steering decision (forming the first brain).
  • Affective states for making persistent decisions to leave or stay.
  • The stress response for energy management during hardship.
  • Associative learning for changing steering decisions based on experience.
  • Spontaneous recovery and reacquisition for continuous learning in changing environments.
  • Eligibility traces, overshadowing, latent inhibition, and blocking to tackle the credit assignment problem.

These changes allowed our ancestors to become the first large multicellular animals to survive by navigating with muscles and neurons, laying the groundwork for Breakthrough #2, where learning would take a central role.

Breakthrough #2: Reinforcing and the First Vertebrates

Chapter 5: The Cambrian Explosion

The second major milestone in brain evolution brings us to the Cambrian period, an era that spanned from 540 to 485 million years ago. This ancient world was significantly different from the earlier Ediacaran period, with the gooey microbial mats replaced by a more familiar sandy ocean floor. The slow, sessile creatures of the Ediacaran were succeeded by a bustling variety of large, mobile animals, predominantly arthropods, which were ancestors of modern insects, spiders, and crustaceans. These Cambrian arthropods were formidable, described as massive and armed with oversized claws and armored shells, some growing over five feet long.

Our own ancestors during this period were likely inconspicuous, small fish-like creatures only a few inches long and not particularly numerous.

The Vertebrate Brain Template

A significant development from the Cambrian explosion was the vertebrate brain template, which is still shared among all descendants of these early fish-like creatures today. The source suggests that understanding the fish brain gets you halfway to understanding the human brain.

  • The brains of invertebrates (like nematodes, ants, bees, earthworms) show no recognizable structural similarities to human brains due to vast evolutionary distance.
  • However, when examining the brain of even a distant vertebrate like the jawless lamprey fish, which shares a common ancestor with humans from over 500 million years ago, it reveals most of the same brain structures as ours.
  • All vertebrate embryos, from fish to humans, develop in similar initial steps:
    1. Brains differentiate into three primary bulbs: a forebrain, midbrain, and hindbrain.
    2. The forebrain then unfolds into two subsystems; one becomes the cortex and basal ganglia, and the other becomes the thalamus and hypothalamus.
  • This results in six main structures found in all vertebrate brains: the cortex, basal ganglia, thalamus, hypothalamus, midbrain, and hindbrain.
  • These structures (except for the cortex, which undergoes unique modifications in mammals) are remarkably similar across modern vertebrates, highlighting their common ancestry.

Thorndike’s Chickens

Around the time Ivan Pavlov was working on conditional reflexes, American psychologist Edward Thorndike began studying animal learning (in 1896). He was initially interested in how children learn, but Harvard's restrictions led him to study chickens, cats, and dogs instead. A staunch Darwinist, Thorndike believed in common learning principles across species and thought that studying these animals could illuminate human learning.

  • Thorndike's most famous work was his 1898 doctoral dissertation, "Animal Intelligence: An Experimental Study of the Associative Processes in Animals".
  • His genius lay in reducing complex theoretical problems to simple, measurable experiments, similar to Pavlov. Instead of saliva, Thorndike measured the speed at which animals learned to escape puzzle boxes.
  • He constructed various cages with different puzzles (latches, buttons, hoops) that, when solved, opened an escape door. He would place animals inside, motivate them with food outside, and measure escape times over many trials.
  • Initially, Thorndike explored imitation, allowing untrained cats to observe trained ones. He found no evidence that cats learned better through imitation at that time (though some animals can, as discussed in Breakthrough #4).
  • In this "failure," he discovered trial-and-error learning. Cats would try various behaviors (scratching, pushing, digging) until they accidentally triggered the escape mechanism. Over trials, they became progressively faster at repeating the successful behaviors, eventually performing only the necessary actions. This learning was quantified by the gradual decrease in escape time.
  • B. F. Skinner, a successor of Thorndike, later suggested that all animal behavior, including human, was a consequence of trial and error.

The Surprising Smarts of Fish

Thorndike's research extended to fish, demonstrating that they also learn through trial and error.

  • Simple bilaterians like nematodes, flatworms, or slugs cannot be trained to perform arbitrary sequences of actions or navigate through hoops for food.
  • This ability to learn arbitrary actions was a new capacity that emerged in early vertebrates.

Chapter 6: The Evolution of Temporal Difference Learning

A Magical Bootstrapping

The first computer algorithm for reinforcement learning was built in 1951 by Marvin Minsky, who also coined the term "artificial intelligence". His algorithm, SNARC (Stochastic Neural-Analog Reinforcement Calculator), was an artificial neural network with forty connections designed to navigate mazes. It was trained by strengthening recently activated synapses upon successful maze escapes, mimicking Thorndike's reinforcement.

  • SNARC did not work well, succeeding only in simple mazes and failing in more complex situations. Minsky realized that Thorndike's direct reinforcement/punishment method was insufficient for AI.
  • This highlighted the temporal credit assignment problem: in a game like checkers with hundreds of moves, how do you assign credit to specific moves for a win or loss that only occurs at the end? Simple bilaterian tricks like overshadowing, latent inhibition, and blocking are useless for stimuli separated in time.
  • Minsky's approach of reinforcing only recent actions was inadequate for longer sequences, as it would unfairly credit end-game moves. Reinforcing all moves equally was also problematic.

Decades later, in 1984, Richard Sutton proposed a new strategy to solve the temporal credit assignment problem in his PhD dissertation, "Temporal Credit Assignment in Reinforcement Learning".

  • Sutton, who studied psychology, approached the problem from a biological perspective, seeking to understand how animals actually solved it. He suspected "expectation" was key.
  • Sutton's radical idea: instead of reinforcing behaviors with actual rewards, reinforce them using predicted rewards.
  • He decomposed reinforcement learning into two components: an actor and a critic.
    • The critic predicts the likelihood of winning at every moment.
    • The actor chooses actions and is rewarded not at the game's end, but whenever the critic predicts an increased likelihood of winning.
    • The learning signal is the temporal difference in the predicted reward from one moment to the next, hence "temporal difference learning".
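
The critic's update can be sketched as tabular TD(0) on a toy chain of states; the chain task, learning rate, and `td_critic` name are assumptions for illustration:

```python
def td_critic(episodes=200, alpha=0.2, gamma=1.0, n=5):
    """TD(0) critic on a 5-state chain that always ends in a win
    (reward 1 on the final step). The learning signal is the temporal
    difference between successive predictions, not the final outcome."""
    V = [0.0] * (n + 1)                      # V[n] is the terminal state
    for _ in range(episodes):
        for s in range(n):
            reward = 1.0 if s == n - 1 else 0.0
            td_error = reward + gamma * V[s + 1] - V[s]   # the 'critic' signal
            V[s] += alpha * td_error
    return V[:n]

print([round(v, 2) for v in td_critic()])  # every state's prediction converges to 1.0
```

Early moves never see the final reward directly; they are reinforced because the next state's prediction went up, which is exactly the trick that fixes temporal credit assignment.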

Dopamine and Temporal Difference Learning

Neuroscientists recognized that understanding reinforcement learning in vertebrate brains required studying dopamine.

  • A small cluster of dopamine neurons in the midbrain of all vertebrates sends outputs to many brain regions.
  • In the 1950s, stimulating these dopamine neurons in rats could make them perform almost any action, even pushing a lever over five thousand times an hour, and preferring dopamine stimulation over food, even starving themselves. This suggested dopamine was not merely a signal for pleasure.
  • Wolfram Schultz's 1990s experiments with monkey dopamine neurons revealed crucial insights:
    • An unexpected reward caused a dopamine burst.
    • When a reward was predicted by a cue, the dopamine burst shifted to the predictive cue and was absent at the time of the actual reward.
    • Omission of an expected reward caused a dopamine dip (neurons went silent).
  • Schultz was initially confused, but Dayan and Montague's landmark 1997 paper, "A Neural Substrate of Prediction and Reward," coauthored with Schultz, showed that dopamine activity encoded reward-prediction error. This represented a beautiful partnership between AI and neuroscience.
  • Vertebrates, including fish, can learn arbitrary actions not only from rewards and punishments but also from the omission of expected rewards or punishments. This is distinct from simple bilaterians like nematodes, crabs, and honeybees, which cannot learn from omission.
  • The system of dopamine as a temporal difference learning signal is remarkably conserved across diverse species like fish, rats, monkeys, and humans.
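
Schultz's three findings fall out of a toy TD model; the clamped pre-cue baseline and the state names are simplifying assumptions, not the actual experimental setup:

```python
def schultz_demo(trials=200, alpha=0.1):
    """Toy TD model of the monkey experiments. Each trial runs
    cue -> delay -> reward; the TD error at each transition stands in
    for dopamine firing. The pre-cue baseline is clamped at zero
    because the cue itself arrives unpredictably."""
    V = {"cue": 0.0, "delay": 0.0}

    def run_trial(rewarded=True):
        at_cue = V["cue"] - 0.0                  # baseline -> cue
        d = V["delay"] - V["cue"]                # cue -> delay
        V["cue"] += alpha * d
        at_reward = (1.0 if rewarded else 0.0) - V["delay"]  # delay -> outcome
        V["delay"] += alpha * at_reward
        return round(at_cue, 2), round(at_reward, 2)

    first = run_trial()
    for _ in range(trials):
        run_trial()
    learned = run_trial()
    omitted = run_trial(rewarded=False)
    return first, learned, omitted

first, learned, omitted = schultz_demo()
print(first)    # (0.0, 1.0): unexpected reward -> burst at the reward
print(learned)  # burst has moved to the cue, ~0 at the reward
print(omitted)  # omission -> dip (negative error) at the expected reward time
```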

Time Perception

Vertebrate brains also developed the ability to perceive precise timing, allowing animals to learn not just what to do, but when to do it through trial and error. Neurons in structures like the basal ganglia and thalamus respond selectively to specific time intervals.

The Basal Ganglia

The basal ganglia is described as a mesmerizing and beautifully designed brain structure.

  • It is located between the cortex and the thalamus.
  • It receives input from the cortex, thalamus, and midbrain, allowing it to monitor actions and the external environment.
  • Information flows through its complex substructures to an output nucleus of inhibitory neurons, which send powerful connections to motor centers in the brainstem.
  • By default, this output nucleus is active, suppressing motor circuits. Specific actions are "ungated" (activated) only when specific basal ganglia neurons turn off. Thus, the basal ganglia acts as a "global puppeteer" of an animal's behavior, perpetually gating and ungating actions.
  • Its function became clear: the basal ganglia learns to repeat actions that maximize dopamine release. Actions leading to dopamine release become more likely to occur (ungated), and actions leading to dopamine inhibition become less likely (gated). This makes the basal ganglia Sutton's "actor".
  • The hypothalamus (the "steering brain" of early bilaterians) also connects to these dopamine neurons. When the hypothalamus is "happy," it floods the basal ganglia with dopamine; when "upset," it deprives it of dopamine. The basal ganglia acts as a "student" trying to satisfy its "hypothalamic judge".
  • A leading theory proposes that the basal ganglia contains parallel circuits that implement Sutton's actor-critic system. One circuit is the "actor," learning to repeat dopamine-triggering behaviors, and the other is the "critic," learning to predict future rewards and trigger its own dopamine activation.
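
The actor-critic theory above can be sketched as a two-action "lever" task; the softmax-style gating and the payoffs are invented for illustration:

```python
import math, random

def basal_ganglia_bandit(steps=2000, alpha=0.1, seed=0):
    """Actor-critic sketch: the critic predicts reward, and a
    dopamine-like prediction error both trains the critic and shifts
    the actor's propensity to ungate each action."""
    rng = random.Random(seed)
    prefs = {"press lever": 0.0, "turn away": 0.0}   # actor's propensities
    value = 0.0                                      # critic's prediction
    payoff = {"press lever": 1.0, "turn away": 0.2}  # hidden from the learner
    for _ in range(steps):
        # soft choice: a higher propensity is ungated more often
        total = sum(math.exp(p) for p in prefs.values())
        r = rng.random() * total
        for action, p in prefs.items():
            r -= math.exp(p)
            if r <= 0:
                break
        dopamine = payoff[action] - value   # reward-prediction error
        value += alpha * dopamine           # critic learns to predict
        prefs[action] += alpha * dopamine   # actor repeats what released dopamine
    return max(prefs, key=prefs.get)

print(basal_ganglia_bandit())  # press lever
```

Actions that release more dopamine than predicted get ungated more; once the critic's prediction catches up, the weaker action starts producing dips and is gated off.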

Chapter 7: The Problems of Pattern Recognition

The ability to recognize patterns was a key intellectual leap in early vertebrates.

The Problem of Pattern Recognition

  • Smell Recognition: Our olfactory neurons detect specific molecules, but most smells are a complex soup of multiple molecules. Recognizing the smell of pulled pork isn't about a single "pulled-pork molecule" but a pattern of activated olfactory neurons. Smell recognition is pattern recognition.
  • Beyond Individual Neurons: Early bilaterians recognized things through individual neuron activations (e.g., a hot plate or a sharp needle). Early vertebrates developed brain structures to decode patterns of neurons, dramatically expanding their perception to recognize smells, faces, or sounds. Modern nematodes and flatworms still do not show evidence of pattern recognition.
  • Two Computational Challenges for Pattern Recognition:
    1. Discrimination Problem: Overlapping patterns can have different meanings (e.g., predator smell vs. food smell). The challenge is to recognize these overlapping patterns as distinct.
    2. Generalization Problem: The exact pattern of activated neurons for a stimulus will never be identical across encounters (e.g., different age/sex of a predator, background smells). The challenge is to generalize a previous pattern to recognize novel, similar patterns.

How Computers Recognize Patterns

Artificial neural networks are used to recognize patterns: an input pattern flows through layers of neurons, and by adjusting the weights of connections, the network can perform operations to correctly recognize the pattern.

  • Backpropagation: Popularized by Hinton, Rumelhart, and Williams in the 1980s, this is a state-of-the-art mechanism for training networks. It involves supervised learning, where the network is given examples with correct answers and weights are nudged to reduce the error between actual and desired output. This is used in image recognition, natural language processing, speech recognition, and self-driving cars.
  • Biological Implausibility: Even Hinton recognized backpropagation is a poor model for how the brain works.
    • The brain does not do supervised learning (e.g., children recognize different smells without being told the labels).
    • Backpropagation is biologically implausible because it requires magically nudging millions of synapses simultaneously and precisely, which the brain cannot do.
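
For concreteness, here is backpropagation at its smallest: a 2-2-1 network on XOR, written from scratch (architecture, learning rate, and seed are arbitrary choices):

```python
import math, random

def train_xor(epochs=4000, lr=0.5, seed=1):
    """A 2-2-1 network trained on XOR by backpropagation: supervised
    examples with known answers, every weight nudged to shrink the
    output error -- the procedure the brain is argued NOT to use."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]  # [wx0, wx1, bias]
    w2 = [rng.uniform(-1, 1) for _ in range(3)]                      # [wh0, wh1, bias]
    sig = lambda v: 1.0 / (1.0 + math.exp(-v))

    def forward(x):
        h = [sig(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w1]
        return h, sig(w2[0] * h[0] + w2[1] * h[1] + w2[2])

    data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
    error = lambda: sum((forward(x)[1] - t) ** 2 for x, t in data)
    before = error()
    for _ in range(epochs):
        for x, target in data:
            h, y = forward(x)
            dy = (y - target) * y * (1 - y)         # error at the output...
            for j in range(2):                      # ...propagated back a layer
                dh = dy * w2[j] * h[j] * (1 - h[j])
                w1[j][0] -= lr * dh * x[0]
                w1[j][1] -= lr * dh * x[1]
                w1[j][2] -= lr * dh
                w2[j] -= lr * dy * h[j]
            w2[2] -= lr * dy
    return before, error()

before, after = train_xor()
print(after < before)  # True: the squared error shrinks with training
```

Note what the learning rule requires: labeled answers and simultaneous, precisely computed nudges to every weight, which is exactly the biological-implausibility complaint above.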

How Vertebrate Brains Recognize Patterns

The cortex plays a role in pattern recognition for vertebrates.

  • Solving the Discrimination Problem: Olfactory neurons send signals to cortical pyramidal neurons, exhibiting:
    1. Large dimensionality expansion: A small number of olfactory neurons connect to a much larger number of cortical neurons.
    2. Sparse connections: Each olfactory cell connects only to a subset of cortical cells.
    • These features achieve pattern separation, decorrelation, or orthogonalization, meaning even overlapping input patterns activate distinct cortical patterns, solving the discrimination problem.
  • Solving the Generalization Problem (Auto-association):
    • Pyramidal cells in the cortex send axons back to themselves, synapsing on other nearby pyramidal cells.
    • When a smell pattern activates these neurons, the ensemble of cells is automatically wired together through Hebbian plasticity ("neurons that fire together wire together").
    • This auto-association allows the cortex to reactivate the full pattern even if the next input is incomplete, solving the generalization problem.
  • Content-Addressable Memory: Auto-association means vertebrate brains use content-addressable memory, where memories are recalled by providing subsets of the original experience, reactivating the pattern. This contrasts with computers' register-addressable memory, which requires unique memory addresses.
  • Catastrophic Forgetting: A challenge for auto-associative memory is catastrophic forgetting, where storing new information in shared neuron populations risks overwriting old memories.
    • McCloskey and Cohen (1989) found this "ubiquitous and devastating limitation" in artificial neural networks: training for a new pattern can interfere with old ones.
    • Modern AI systems (like ChatGPT) avoid this by freezing their models after training, not allowing continual learning. Human-level AI needs continual learning. Early bilaterians avoided it by not learning patterns.
    • Vertebrate brains address this: cortex's pattern separation shields it. Learning is selective, occurring only during moments of surprise (when an incoming pattern is novel enough to trigger neuromodulator release and synaptic changes).
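
Auto-association and content-addressable recall can be sketched as a tiny Hopfield-style memory; the +1/-1 "smell patterns" and the synchronous update rule are illustrative simplifications:

```python
def store(patterns, n):
    """Hebbian auto-association: wire together the units that fire
    together across each stored +1/-1 pattern."""
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w

def recall(w, cue, steps=5):
    """Content-addressable recall: a partial cue is pulled back toward
    the nearest stored pattern."""
    s = list(cue)
    for _ in range(steps):
        s = [1 if sum(w[i][j] * s[j] for j in range(len(s))) >= 0 else -1
             for i in range(len(s))]
    return s

smell_a = [1, 1, 1, -1, -1, -1]
smell_b = [-1, -1, -1, 1, 1, 1]
w = store([smell_a, smell_b], 6)
print(recall(w, [1, 1, -1, -1, -1, -1]))  # degraded cue of A -> full pattern A
```

Recall works from a subset of the original experience with no memory address, which is the content-addressable property; storing a new, overlapping pattern into the same weights is also where catastrophic forgetting creeps in.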

The Invariance Problem

This problem refers to the ability to recognize the same object despite variations in how it is perceived (e.g., rotation, distance, lighting, pitch of sound). A fish, for example, can recognize the same 3D object from new angles in one shot.

  • Hubel and Wiesel (1958), studying cat visual cortices, found that individual neurons in V1 (the first visual area) were selective for specific line orientations at specific locations. V1 decomposes complex visual input into simpler features.
  • The visual system forms a hierarchy (V1 -> V2 -> V4 -> IT), where neurons at higher levels respond to progressively more sophisticated features and can detect objects across wider visual fields.
  • Fukushima's Convolutional Neural Networks (CNNs) were inspired by Hubel and Wiesel's findings.
    • CNNs decompose input pictures into multiple feature maps (convolution) that signal the location of features (like lines).
    • Their output is compressed and passed to other feature maps to combine into higher-level features. This structure, analogous to mammalian visual processing, successfully recognized patterns.
    • The brilliance of CNNs lies in their inductive bias of translational invariance: assuming a feature in one location is the same as in another, which is encoded directly into the network architecture. This makes learning fast and efficient.
  • The coevolution of sensory organs and pattern recognition abilities led to an explosion of complexity in vertebrate sensory systems.
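
The translational-invariance bias can be shown with a 1-D convolution: one shared detector slides across every position, so a shifted input simply shifts the detection (the kernel and signals are made up):

```python
def convolve(signal, kernel):
    """Slide one shared feature detector across every position --
    the translational-invariance bias baked into CNN architecture."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

edge = [1, -1]                            # a tiny 'edge' detector
a = convolve([0, 0, 5, 5, 0, 0, 0], edge)
b = convolve([0, 0, 0, 0, 5, 5, 0], edge)
print(a.index(max(a)), b.index(max(b)))   # 3 5: same feature found at shifted spots
```

Nothing about position had to be learned twice: the detector is defined once and reused everywhere, which is why the inductive bias makes learning fast and efficient.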

Chapter 8: Why Life Got Curious

Reinforcement learning systems, like those of early vertebrates, faced the exploration-exploitation dilemma: when to explore new options versus exploit known rewards.

  • Montezuma's Revenge: This video game illustrates the difficulty of reinforcement learning when rewards are very delayed, making it hard to identify what behaviors to reinforce. Humans, however, can beat this game.
  • Curiosity as a Solution: An alternative approach is to make AI systems explicitly curious, rewarding them for exploring new places and things, making surprise itself reinforcing. AI systems playing Montezuma's Revenge, endowed with this intrinsic motivation, explored deliberately and eventually beat the first level.
  • Curiosity in Vertebrates: This suggests that vertebrate brains, designed for reinforcement learning, also exhibit curiosity. Indeed, curiosity is seen across all vertebrates (fish, mice, monkeys, human infants). In vertebrates, surprise itself triggers dopamine release, even without an external "real" reward.
  • Most invertebrates do not exhibit curiosity, with only the most advanced ones, such as insects and cephalopods, showing it [184, 185, 382n].
  • Evolutionary explanation: Vertebrates gained an extra boost of reinforcement from surprising things, driving them to pursue and explore. This is exploited in gambling games, where uncertainty and surprising wins provide dopamine boosts.
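
One simple way such an intrinsic bonus can be implemented is a count-based novelty reward: the bonus decays as a state becomes familiar, so surprise itself is reinforcing at first and wears off with repetition. This is an illustrative sketch (the class name and decay schedule are invented, not how any particular Montezuma's Revenge agent works).

```python
import math
from collections import defaultdict

class CuriousAgent:
    """Toy count-based curiosity: novel states earn an intrinsic
    bonus that shrinks as they become familiar."""
    def __init__(self, beta=1.0):
        self.beta = beta
        self.visits = defaultdict(int)

    def reward(self, state, extrinsic=0.0):
        self.visits[state] += 1
        # Intrinsic bonus decays with familiarity, like dopamine
        # responses to a surprise that keeps repeating.
        bonus = self.beta / math.sqrt(self.visits[state])
        return extrinsic + bonus

agent = CuriousAgent()
first = agent.reward("new_room")    # maximally surprising: bonus 1.0
tenth = [agent.reward("new_room") for _ in range(9)][-1]
# After ten visits the bonus has fallen to 1/sqrt(10), about 0.32.
```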

Chapter 9: The First Model of the World

The First Model of the World

Humans can navigate familiar environments like their homes in the dark because their brains have built a spatial map – an internal model of the external world. As they move, their brain updates their position within this map.

  • Invertebrate Limitations: Many advanced invertebrates (e.g., bees and ants) are unable to solve complex spatial tasks. Ants, for example, navigate by following rules of when to turn where, not by constructing a spatial map. If an ant returning to its nest is placed at a different point on its outbound path, it will repeat the entire loop instead of simply turning around.
  • Vertebrate Ability: The ability to construct an internal model of the external world, including a spatial map, was a gift inherited from the brains of the first vertebrates.

The Cortex of Early Vertebrates

  • The cortex of early vertebrates was divided into medial, lateral, and ventral regions [149, 150, 165n, 383n].
  • The medial cortex is the part that later evolved into the hippocampus in mammals.
  • This region contains neurons (like place cells, border cells, and head-direction cells) that activate only when the animal is at a specific location, at a border of a tank, or facing specific directions.
  • Visual, vestibular (sense of balance and head motion), and head-direction signals are mixed and converted into a spatial map within the medial cortex.
  • Fish are able to recognize locations, estimate distances, and remember the position of things relative to others. Damage to their medial cortex (or homologous structures) significantly impairs spatial navigation [150, 151, 384n].
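
The conversion of head-direction and self-motion signals into a position estimate can be caricatured as dead reckoning (path integration): update an internal (x, y) coordinate from heading and speed alone, no vision needed. This sketch is purely illustrative, not a model of medial-cortex circuitry.

```python
import math

def path_integrate(steps):
    """Dead reckoning: update an internal (x, y) estimate from
    head-direction (radians) and speed signals alone."""
    x = y = 0.0
    for heading, speed in steps:
        x += speed * math.cos(heading)
        y += speed * math.sin(heading)
    return x, y

# Walk east 3 units, turn north, walk 4 units.
pos = path_integrate([(0.0, 3.0), (math.pi / 2, 4.0)])

# With an internal map, the straight-line distance home can be read
# off directly, without retracing the outbound path ant-style.
dist_home = math.hypot(*pos)
```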

Summary of Breakthrough #2: Reinforcing

The evolution of our ancestors from simple wormlike bilaterians to fishlike vertebrates, around five hundred million years ago, brought forth numerous new brain structures and abilities, most of which can be understood as enabling or emerging from reinforcement learning.

These key developments include:

  • Dopamine becoming a temporal difference learning signal, which helped solve the temporal credit assignment problem and enabled animals to learn through trial and error.
  • The basal ganglia emerging as an actor-critic system, enabling animals to generate this dopamine signal by predicting future rewards and using it to reinforce and punish behaviors.
  • Curiosity emerging as a necessary part of making reinforcement learning work, effectively solving the exploration-exploitation dilemma.
  • The cortex emerging as an auto-associative network, making pattern recognition possible.
  • The perception of precise timing emerging, allowing animals to learn not only what to do but when to do it through trial and error.
  • The perception of three-dimensional space emerging (in the hippocampus and other structures), enabling animals to recognize their location and remember the location of things relative to others.
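
The first two bullets can be made concrete with a toy temporal difference critic. In this illustrative sketch (states, rates, and rewards invented), the TD error plays the role of the dopamine signal: early in training the surprise occurs at the reward itself, and with experience the reward becomes fully predicted while value propagates back to the earliest cue.

```python
# A chain of states: cue -> wait -> reward(1.0). A TD(0) critic
# learns state values; the TD error is the dopamine-like signal.
gamma, alpha = 1.0, 0.1
V = {"cue": 0.0, "wait": 0.0, "end": 0.0}
transitions = [("cue", "wait", 0.0), ("wait", "end", 1.0)]

def run_trial():
    deltas = {}
    for s, s_next, r in transitions:
        delta = r + gamma * V[s_next] - V[s]   # temporal difference error
        V[s] += alpha * delta                  # critic update
        deltas[s] = delta
    return deltas

first = run_trial()          # surprise (delta=1.0) occurs at the reward
for _ in range(500):
    last = run_trial()
# After training, the reward is predicted (delta ~ 0 at "wait") and
# high value has migrated back to the cue.
```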

This model-free reinforcement learning came with a suite of familiar intellectual and affective features, including omission learning, time perception, curiosity, fear, excitement, disappointment, and relief [190, 190n, 191n].

Breakthrough #3: Simulating and the First Mammals

Chapter 10: The Neural Dark Ages

This chapter describes the evolutionary stagnation of brains from approximately 420 to 375 million years ago, an era referred to as the "neural dark age". During this time, oceans were filled with diverse predatory fish, relegating arthropods and other invertebrates to specific niches. Some invertebrates, like cephalopods (ancestors of squids and octopuses), evolved impressive intelligence independently under pressure from fish predation.

The Permian-Triassic mass extinction event, around 250 million years ago, was the deadliest in Earth's history, causing the extinction of 96% of marine life and 70% of land life. Our ancestors, fish that evolved lungs and the ability to wade on land to exploit new nutritional niches, were uniquely positioned for survival. The large therapsids (hairy lizards) almost entirely died out due to their high caloric needs during a period of reduced food access. Only small, plant-eating burrowing cynodonts, resembling modern mice or squirrels, survived. This burrowing and arboreal lifestyle gave early mammals a singular advantage: the gift of the first move, allowing them to observe and decide before acting.

Inside the Brain of the First Mammals

For hundreds of millions of years, from early vertebrates to reptiles and therapsids, brains remained largely unchanged, characterized as being "stuck in a neural dark age". Evolution focused on modifying other biological structures like jaws, armor, lungs, and warm-bloodedness.

However, in early mammals, a spark of neural innovation emerged: the fish cortex split into four separate structures.

  • The ventral cortex of early vertebrates became the associative amygdala in mammals, continuing its role in recognizing patterns predictive of valence outcomes.
  • The lateral cortex's smell-pattern detectors became the olfactory cortex, still detecting smell patterns through auto-associative networks.
  • The medial cortex, responsible for spatial maps, transformed into the hippocampus of mammals, performing a similar function with similar circuitry.
  • A fourth region underwent a more significant change, transforming into the neocortex, which contained completely different circuitry.

The neocortex in early mammals was small, with most brain volume dedicated to the olfactory cortex, reflecting their acute sense of smell. Despite its size, this small neocortex was the kernel from which human intelligence would arise, expanding to 70% of human brain volume in later breakthroughs.

Chapter 11: Generative Models and the Neocortical Mystery

The human neocortex is a thin sheet (2-4 millimeters thick) that folds to fit inside the skull; unfolded, it would be about three square feet in surface area. Initially, its different regions seemed to serve a multitude of different functions, such as the visual cortex processing visual input.

Mountcastle’s Crazy Idea

In the mid-twentieth century, neuroscientist Vernon Mountcastle proposed a remarkable theory: the neocortex is composed of a single repeating microcircuit, the neocortical column, duplicated across its entire surface. He based this on three key observations:

  • Neurons within a vertical column (about 500 microns in diameter) of the neocortical sheet responded similarly to sensory stimuli, while distant neurons did not.
  • There were many connections vertically within a column but fewer between columns.
  • The neocortex looked largely identical everywhere under a microscope across different sensory areas (visual, auditory, somatosensory) and across species (rat, monkey, human).

Mountcastle concluded that each neocortical column performs the same computation; the only difference between regions is the input they receive and where they send their output. This was experimentally supported by rewiring a ferret's auditory cortex to receive visual input, after which the auditory cortex processed visual information correctly. The hypothesis suggests that understanding the neocortex requires understanding only this single microcircuit, since it implements a general, universal algorithm applicable to functions as diverse as movement, language, and perception. The microcircuitry of the neocortical column consists of six layers of neurons connected in a complicated but consistent way, with specific neurons in different layers projecting to or receiving input from structures like the basal ganglia, thalamus, and motor areas.

Peculiar Properties of Perception

Nineteenth-century scientists identified three peculiar properties of perception, largely managed by the neocortex, that offer clues to its function:

  • Property #1: Filling In: The human mind automatically and unconsciously fills in missing information in sensory input, such as perceiving a full word or shape despite missing lines. This applies across senses, like understanding garbled speech.
  • Property #2: One at a Time: When sensory evidence is ambiguous (e.g., an image that can be seen as a staircase or a protrusion), the brain perceives only one interpretation at a time, such as seeing a duck or a rabbit, but not both simultaneously. The "cocktail-party effect" in audition demonstrates this.
  • Property #3: Can’t Unsee: Once the brain receives a reasonable interpretation for vague sensory input (e.g., blobs perceived as a frog), it sticks to that interpretation, and it becomes impossible to "unsee" it.

Hermann von Helmholtz proposed that perception is not directly what is experienced, but rather an "inference" – a simulated reality inferred from sensory input. This explains all three properties: the brain fills in missing parts to decipher truth, picks one reality to simulate from ambiguities, and maintains a consistent interpretation once found.
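
Helmholtz's "perception as inference" can be caricatured in a few lines of Bayes' rule. The priors and likelihoods below are invented; the point is that a single interpretation is selected (Property #2), and a prior shifted by past experience makes the same evidence yield the same percept every time (Property #3).

```python
def posterior(prior, likelihood):
    """Bayes' rule: P(interpretation | evidence) is proportional to
    P(evidence | interpretation) * P(interpretation)."""
    unnorm = {h: prior[h] * likelihood[h] for h in prior}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

# Vague blobs, slightly more consistent with "frog" than "leaves".
likelihood = {"frog": 0.55, "leaves": 0.45}

# One at a time: only the single best interpretation is rendered,
# even though the evidence nearly supports both.
p = posterior({"frog": 0.5, "leaves": 0.5}, likelihood)
percept = max(p, key=p.get)          # "frog"

# Can't unsee: having once settled on "frog", the prior shifts, and
# the same ambiguous evidence now yields a near-certain frog.
p_after = posterior({"frog": 0.9, "leaves": 0.1}, likelihood)
```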

Generative Models: Recognizing by Simulating

In the 1990s, Geoffrey Hinton and Peter Dayan developed the Helmholtz machine, a proof of concept for Helmholtz's "perception by inference". This unsupervised learning model learned to recognize things by generating its own data and comparing it to actual data.

  • Unlike typical neural networks, the Helmholtz machine had backward connections, allowing information to flow both up for recognition and down for generation.
  • It learned in two modes: recognition mode (adjusting backward weights to reproduce input) and generative mode (adjusting forward weights to correctly recognize imagined output).
  • This network could recognize imperfect handwritten numbers without supervision, generalize well to different variations of numbers, and generate novel images of handwritten numbers.
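
A heavily simplified, mean-field caricature of the wake-sleep idea (not Hinton and Dayan's actual implementation; the network sizes, data, and learning rates are invented): recognition weights carry information up, generative weights carry it down, and each phase trains one direction against the other direction's output.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# 6 visible units and 2 hidden "causes". W_rec carries information UP
# (recognition); W_gen carries it DOWN (generation).
W_rec = rng.normal(0.0, 1.0, (6, 2))
W_gen = rng.normal(0.0, 0.1, (2, 6))
data = np.array([[1, 1, 1, 0, 0, 0],
                 [0, 0, 0, 1, 1, 1]], dtype=float)

def recon_error():
    h = sigmoid(data @ W_rec)                         # recognize (up)
    return np.mean((data - sigmoid(h @ W_gen)) ** 2)  # regenerate (down)

before = recon_error()
for _ in range(1000):
    # Wake: recognize real data, then nudge the GENERATIVE weights so
    # the inferred causes reproduce that data.
    h = sigmoid(data @ W_rec)
    W_gen += 0.5 * h.T @ (data - sigmoid(h @ W_gen))
    # Sleep: fantasize hidden causes, generate imagined data, then
    # nudge the RECOGNITION weights to recognize the fantasy correctly.
    h_f = rng.binomial(1, 0.5, (2, 2)).astype(float)
    x_f = sigmoid(h_f @ W_gen)
    W_rec += 0.1 * x_f.T @ (h_f - sigmoid(x_f @ W_rec))
after = recon_error()
# Reconstruction of the real data improves even though each direction
# was trained only against the other direction's output.
```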

Generative models capture the essential features of input data without supervision and can generate realistic novel data, such as faces of people who do not exist. On this view, perception is a "constrained hallucination": the neocortex matches its inner simulation of reality to incoming sensory data. This explains "filling-in" effects and why we dream (dreams may be a forced generation process that stabilizes the generative model). The neocortex's power lies in its ability to independently explore this inner "world model," predicting consequences of actions never taken, and enabling imagination. This ability to render future possibilities and relive past events was the third breakthrough, giving early mammals planning, episodic memory, and causal reasoning.

Chapter 12: Mice in the Imaginarium

The emergence of the neocortex granted early mammals the fundamental ability to imagine the world as it is not. While previous cortex structures could recognize objects, the neocortex's generative mode, often seen as a byproduct of recognition, became a crucial tool for imagination.

New Ability #1: Vicarious Trial and Error

Psychologist Edward Tolman observed that rats in mazes would pause at decision points and look back and forth ("vicarious trial and error") before choosing a direction. Neuroscientists David Redish and Adam Johnson found that during these pauses, the rat's hippocampus rapidly played out sequences of place codes for possible future paths, showing the rat was literally imagining these paths.

New Ability #2: Counterfactual Learning

This "learning by imagining" represents a major advance in solving the credit assignment problem. Unlike early vertebrates, which could reinforce only actions actually taken, mammals could simulate alternative paths and their outcomes. This allowed them to learn from counterfactuals (what would have happened had they chosen differently), making learning faster and more flexible. This causal reasoning, perceived intuitively by the brain, likely evolved for its usefulness in learning from alternative past choices. The ability is seen in mammals like rats and in some birds, but not in fish or reptiles.

New Ability #3: Episodic Memory

Episodic memory, the ability to recall specific past episodes of one's life, is distinct from procedural memory. The case of Henry Molaison, who lost the ability to form new memories after his hippocampus was removed, demonstrated the hippocampus's crucial role. Episodic memories are simulations, not perfectly accurate recordings, and are often "filled in" during recollection, similar to visual illusions. This explains why eyewitness testimonies can be unreliable and why repeatedly imagining a past event can create false confidence in its occurrence.

In mammalian brains, episodic memory results from a partnership between the older hippocampus and the newer neocortex. The hippocampus rapidly encodes new patterns, while the neocortex simulates detailed aspects of the world. This partnership addresses catastrophic forgetting by using "generative replay" or "experience replay," where the hippocampus replays recent memories alongside old ones to help the neocortex integrate new information without overwriting old memories.
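
The replay idea can be demonstrated with a toy linear model (the task data and learning rates are invented for illustration): trained on task B alone, it forgets task A; interleaving replayed task-A "memories" during task-B training retains both.

```python
# Toy demo of experience replay versus catastrophic forgetting.
task_a = ([1.0, 0.5], 1.0)   # (input, target)
task_b = ([0.5, 1.0], 0.0)

def step(w, x, y, lr=0.1):
    """One gradient step of a two-weight linear model on (x, y)."""
    pred = w[0] * x[0] + w[1] * x[1]
    return [w[0] + lr * (y - pred) * x[0],
            w[1] + lr * (y - pred) * x[1]]

def predict(w, x):
    return w[0] * x[0] + w[1] * x[1]

# Learn task A first (the "old memory").
w = [0.0, 0.0]
for _ in range(300):
    w = step(w, *task_a)

# Sequential training on B alone: task A gets overwritten.
w_seq = list(w)
for _ in range(300):
    w_seq = step(w_seq, *task_b)

# Training on B with replayed A examples: both tasks retained.
w_rep = list(w)
for _ in range(1000):
    w_rep = step(w_rep, *task_b)
    w_rep = step(w_rep, *task_a)   # replayed "old memory"

forgot = abs(predict(w_seq, task_a[0]) - task_a[1])   # large error on A
kept = abs(predict(w_rep, task_a[0]) - task_a[1])     # small error on A
```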

Chapter 13: Model-Based Reinforcement Learning

Model-based reinforcement learning involves mentally simulating possible future actions before selecting one. AlphaZero, DeepMind's game-playing AI, used this approach, playing out thousands of simulations of how a game might unfold given candidate moves, sampling thousands of promising futures rather than exhaustively searching trillions of possibilities. AlphaZero's strategy was an elaboration of Sutton's temporal difference learning, using search to verify and expand on the "hunches" produced by an actor-critic system. However, real-world planning is far more complex than Go due to continuous actions, incomplete information, and complex rewards. Mammalian brains uniquely exhibit flexibility in changing planning approaches depending on the situation.
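
A drastically simplified cousin of such planning-by-simulation, for illustration only (the world model and rewards are invented): from each candidate action, imagine the future under a known model and pick the action with the best simulated return.

```python
# Planning by simulation over a known model of a tiny deterministic world.
model = {  # (state, action) -> (next_state, reward)
    ("start", "left"):     ("dead_end", 0.0),
    ("start", "right"):    ("corridor", 0.0),
    ("corridor", "left"):  ("food", 10.0),
    ("corridor", "right"): ("dead_end", 0.0),
}
ACTIONS = ("left", "right")

def simulated_return(state, depth, gamma=0.9):
    """Best discounted return from `state`, found by exhaustively
    imagining up to `depth` steps ahead in the internal model."""
    if depth == 0 or not any((state, a) in model for a in ACTIONS):
        return 0.0
    best = float("-inf")
    for a in ACTIONS:
        if (state, a) in model:
            nxt, r = model[(state, a)]
            best = max(best, r + gamma * simulated_return(nxt, depth - 1))
    return best

def plan(state, depth=3):
    """Choose the action whose imagined future looks best."""
    return max((a for a in ACTIONS if (state, a) in model),
               key=lambda a: model[(state, a)][1]
               + 0.9 * simulated_return(model[(state, a)][0], depth - 1))

choice = plan("start")   # one step of delayed gratification: go "right"
```

Real search methods like AlphaZero's Monte Carlo tree search sample promising branches rather than expanding all of them; this toy world is small enough to search exhaustively.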

The Frontal Neocortex and Controlling the Inner Simulation

The human neocortex is divided into two halves: the sensory neocortex (back half), which renders simulations of the external world, and the frontal neocortex (front half), which contains the agranular prefrontal cortex (aPFC), granular prefrontal cortex (gPFC), and motor cortex. The aPFC is the most ancient frontal region, evolving in the first mammals. Damage to the aPFC, as seen in patient L, can lead to akinetic mutism, where the patient loses all intention and ability to set goals, highlighting its fundamental role in what it means to be a mammal.

The aPFC creates a "self-model", a simulation of an animal's own movements and internal states, and constructs "intent" to explain its own behavior. It is the locus for the first goals. Rats with aPFC damage struggle to monitor ongoing plans, perform actions out of sequence, and are impulsive. The aPFC tracks progress toward an imagined goal, with specific neurons firing at specific locations within a task sequence.

How Mammals Make Choices

Mammals make deliberative choices, such as a rat at a maze fork, in three steps:

  • Step #1: Triggering Simulation: The aPFC gets most excited when things go wrong or unexpected events occur, leading to uncertainty. This uncertainty signals when to engage in costly simulations; if things are predictable, model-free learning (basal ganglia-driven habits) is used.
  • Step #2: Simulating Options: Mammals tackle the search problem by exploring specific paths that different columns of the aPFC are already predicting. This internal simulation occurs rapidly, playing out potential scenarios.
  • Step #3: Choosing an Option: The simulated outcomes are evaluated by older vertebrate structures (basal ganglia, amygdala, hypothalamus), reinforcing the best imagined path in the basal ganglia. This vicariously trains the basal ganglia, making it more likely to choose that path when the animal acts in the real world.
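
The three steps can be caricatured as an arbitration rule (the states, values, and threshold are invented): simulate only when recent prediction errors signal uncertainty, otherwise run the cached habit, and let the winning simulation vicariously retrain the habit.

```python
HABIT = {"maze_fork": "left"}                 # basal-ganglia-style cache
SIM_OUTCOMES = {"left": 2.0, "right": 5.0}    # imagined values of each path

def choose(state, recent_prediction_error, threshold=1.0):
    # Step 1: trigger a costly simulation only under uncertainty.
    if abs(recent_prediction_error) < threshold:
        return HABIT[state], "habit"
    # Step 2: simulate the candidate options and compare outcomes.
    best = max(SIM_OUTCOMES, key=SIM_OUTCOMES.get)
    # Step 3: the winning imagined path vicariously retrains the habit.
    HABIT[state] = best
    return best, "simulation"

a1 = choose("maze_fork", 0.1)   # predictable world -> run the habit
a2 = choose("maze_fork", 3.0)   # surprise -> simulate -> pick "right"
a3 = choose("maze_fork", 0.1)   # the habit itself has been updated
```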

Goals and Habits (or the Inner Duality of Mammals)

Mammals exhibit a duality between model-based (goal-driven, "system 2" thinking) and model-free (habitual, "system 1" thinking) decision-making.

  • Habits are automated actions triggered directly by stimuli, controlled by the basal ganglia, saving time and energy.
  • Goals, which may not have evolved until early mammals, enable "volitional" choices where an animal pursues a specific outcome. The basal ganglia is "intent-free," simply repeating reinforced behaviors.

The aPFC's role is to construct intent and try to make its intent come true. It learns goals by observing basal ganglia-controlled behavior, then flips roles to teach the basal ganglia.

How Mammals Control Themselves: Attention, Working Memory, and Self-Control

The frontal neocortex controls attention, working memory, and executive control, functions that are intimately related as different applications of controlling the neocortical simulation.

  • Attention helps a mammal stick to its plan by filtering information reaching the basal ganglia.
  • Working memory involves maintaining representations in the absence of sensory cues, requiring the aPFC to continually reinvoke an inner simulation.
  • Self-control is the ability to inhibit reflexive responses and engage in deliberative thinking.

All these functions are manifestations of brains trying to select and maintain what simulation to render. The aPFC "controls" behavior by convincing the basal ganglia of the correct choice through vicarious reinforcement and information filtering. Early mammals could flexibly determine when to simulate and when to rely on habits, and intelligently select what to simulate.

Chapter 14: The Secret to Dishwashing Robots

The motor cortex is a thin band of neocortex on the edge of the frontal cortex, mapping the entire body with more space dedicated to areas of skilled motor control (e.g., mouth and hands). In humans and primates, it's considered the primary system for controlling movement, with neurons projecting directly to the spinal cord. Damage to the motor cortex can cause paralysis or loss of fine motor skills.

Predictions, Not Commands

Karl Friston proposes an alternative view: the motor cortex doesn't generate motor commands but rather motor predictions, trying to explain and predict body movements observed in the somatosensory cortex. The motor cortex is wired to make its predictions come true, effectively controlling movement. Evidence for this comes from:

  • Many non-primate mammals have only a rudimentary motor cortex, and damage to it doesn't cause paralysis, indicating that their movements are primarily controlled by older circuits.
  • Motor cortex activity in non-primate mammals is most activated by movements requiring planning, and its activity precedes precision movements, supporting the idea of simulated movements.

A Hierarchy of Goals: A Balance of Simulation and Automation

The frontal neocortex of early placental mammals was organized into a hierarchy of goals.

  • The aPFC at the top constructs high-level goals (e.g., "drink water") based on amygdala and hypothalamus activation.
  • These goals propagate to the premotor cortex, which constructs subgoals, and further to the motor cortex, which creates sub-subgoals (e.g., "position index finger here").
  • This hierarchy distributes processing effort, allowing the aPFC to focus on high-level navigation while lower areas manage specific movements.

The basal ganglia forms loops with the frontal cortex, managing different levels of this motor hierarchy. The front part of the basal ganglia automates high-level goals (e.g., cravings), while the back part automates specific motor skills. Both neocortex (slower, flexible, simulation) and basal ganglia (faster, less flexible, automation) contribute, with the neocortex training the basal ganglia over time. Damage to the motor cortex impairs planning and learning new movements but not executing well-trained ones. An intact motor hierarchy allows for impressive flexibility, with mammals continuously updating subgoals to achieve a common goal. Future robots with similar motor systems could learn complex skills, adapt movements in real-time, and accomplish high-level goals with distributed subgoals, operating efficiently like mammalian brains.
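
The goal hierarchy can be sketched as recursive decomposition (the goal names and decompositions are invented for illustration): each level expands its goal into subgoals for the level below, until only primitive movements remain.

```python
# A goal hierarchy in miniature: high-level intent at the top,
# concrete motion at the bottom.
HIERARCHY = {
    "drink water":   ["walk to river", "lower head"],   # aPFC level
    "walk to river": ["step", "step", "step"],          # premotor level
    "lower head":    ["bend neck"],
}
PRIMITIVES = {"step", "bend neck"}                      # motor level

def execute(goal):
    """Recursively expand a goal into the primitive movements that
    realize it, depth-first."""
    if goal in PRIMITIVES:
        return [goal]
    moves = []
    for sub in HIERARCHY[goal]:
        moves.extend(execute(sub))
    return moves

motor_plan = execute("drink water")
```

Because each level only knows its own decomposition, a subgoal can be swapped out (say, a different route to the river) without touching the level above, which is the flexibility the hierarchy buys.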

Breakthrough #4: Mentalizing and the First Primates

Chapter 15: The Arms Race for Political Savvy

Around sixty-six million years ago, an extinction event marked the beginning of the Era of Mammals. Our direct ancestors, the first primates, found refuge in tall trees of Africa, shifting from nocturnal to diurnal, developing opposable thumbs to grasp branches, and moving to a fruit-based diet. They began living in groups, becoming largely free from predation and food competition, and their brains dramatically expanded to well over a hundred times their original size. While many mammalian lineages did not experience such proportional brain growth, it occurred significantly in elephants, dolphins, and primates.

The Social-Brain Hypothesis

In the 1980s and 1990s, primatologists like Nicholas Humphrey, Frans de Waal, and Robin Dunbar proposed the social-brain hypothesis, suggesting that the growth of the primate brain was a consequence of unique social demands, not ecological ones. They argued that primates formed stable mini-societies (groups of individuals that stayed together for long periods), and maintaining these large social groups necessitated unique cognitive abilities, leading to pressure for bigger brains. Robin Dunbar's research confirmed a correlation: the bigger a primate's neocortex relative to its brain, the bigger its social group. This correlation, however, does not hold for most other animals, indicating that the type of social group in primates was unique.

The Evolutionary Tension Between the Collective and the Individual

Early mammals were likely more social than their lizard-like ancestors due to giving birth to helpless young, which required mothers to form strong bonds for nurturing and protection. Mammals also engage in more play, which might have refined motor cortex skills in young individuals. Animals evolved mechanisms to signal strength and submission (e.g., dogs bowing, bears sitting down) to minimize infighting and save energy. Mammalian social structures include solitary, pair-bonded, harems (one dominant male, many females), and multi-male groups (many males and females). In harems and multi-male groups, hierarchical rigidity (e.g., a single dominant male mating) reduces competition. Hierarchies are typically determined by physical attributes like strength, size, and aggression.

Machiavellian Apes

Primatologist Emil Menzel's experiments with chimpanzees revealed Machiavellian behavior. A subordinate chimp, Belle, initially shared hidden food locations but stopped when the dominant male, Rock, monopolized it. Belle then developed sophisticated strategies to withhold information from Rock. Other studies showed apes could deduce the intent of an experimenter (e.g., distinguishing between intentionally vs. accidentally marked food boxes). They also differentiated between experimenters who were unable to give food (e.g., food stuck) versus unwilling to give food, choosing to return to the former. These abilities hint at an understanding of others' mental states.

Primate Politics

Primate social behavior goes beyond simple physical hierarchies. Grooming, initially thought to be for hygiene, primarily serves a social purpose, correlating with group size rather than body size, and involves specific, persistent partners. Monkeys track and remember individuals and their relationships within their groups; for example, upon hearing a child's distress call, group members look at the child's mother rather than the direction of the call. Dominance relationships are explicit, persistent, and transitive (if A submits to B, and B to C, then A submits to C), maintained through routines like approach-retreat signals. Primates are extremely sensitive to interactions that violate social hierarchies, reacting strongly to unexpected displays of dominance from lower-ranking individuals.

Crucially, primate social hierarchies are often determined by political, not just physical, power. Family lineage plays a significant role, with children often inheriting their mother's rank. Alliances are key to an individual's rank, as seen when other monkeys join conflicts, often non-family members, indicating a strategic forging of allyships. These allyships, or friendships, are built through grooming and support in conflicts, demonstrating reciprocity and trust (e.g., chimps sharing food only with grooming partners). Powerful allies offer benefits like reduced harassment for lower-ranking individuals and increased access to resources. Primates show political forethought by investing in relationships with higher-ranking individuals, competing to groom them, and preferentially mating with them. They also exhibit cleverness in choosing allies, befriending skilled low-ranking individuals even without immediate reward. After conflicts, primates actively try to "make up" with those they fought, especially non-family members, by grooming or interacting with their families.

The Arms Race for Political Savvy

This broad suite of complex social behaviors, laying the foundation for human interaction, likely evolved in early primates due to the unique niche they occupied after the Cretaceous-Paleogene extinction event. The pressure to navigate these complex social relationships drove the evolution of larger brains and the ability to reason about the minds of others. This represents a shift from surviving the dangers of predators to navigating the subtler dangers of politics.

Chapter 16: How to Model Other Minds

The primate brain underwent a dramatic expansion in size (from less than half a gram to about 350 grams over 60 million years). This raises the question of which brain structures merely scaled up and which were truly new. Some structures, like the somatosensory cortex, scale naturally with body size, while others, like the basal ganglia or visual cortex, improve performance by adding more neurons. New hierarchical layers in the neocortex also added qualitatively different processing abilities. Most of the early primate brain was a scaled-up version of the mammalian brain, with disproportionately more neocortex dedicated to functions like vision and touch, but the fundamental functions and connectivity remained similar.

The New Neocortical Regions of Early Primates

Despite the scaling, two truly new areas of neocortex emerged in the primate lineage:

  1. Granular prefrontal cortex (gPFC): A new addition to the frontal cortex, wrapping around the older agranular prefrontal cortex (aPFC).
  2. Primate sensory cortex (PSC): An amalgamation of new sensory cortex areas, including the superior temporal sulcus (STS) and temporoparietal junction (TPJ) [255, 260n, 371].

These two new areas are highly interconnected, forming their own network of frontal and sensory neocortical regions. Their "newness" comes not from different microcircuitry (they exhibit the same columnar microcircuitry as other neocortical areas) but from their unique input/output connectivity and the specific generative models they construct.

The Neocortex's Riddle (Revisited)

Damage to the older aPFC in humans results in severe symptoms like akinetic mutism (complete muteness and intentionlessness). In stark contrast, damage to the newer granular prefrontal cortex (gPFC) often causes minimal symptoms, leading early neuroscientists to consider its function a "riddle". For example, a patient named K.M. showed no intellect or perception deficits, and even an increase in IQ, after a third of his frontal cortex (gPFC area) was removed.

Self-Reference and Metacognition

A crucial clue emerged when it was found that the gPFC becomes uniquely active during tasks requiring self-reference, such as evaluating one's personality traits, self-related mind wandering, or thinking about one's own feelings and intentions.

  • A study with participants (healthy, gPFC damage, or hippocampus damage) asked them to generate narratives about themselves related to cue words. Humans with gPFC damage could imagine complex scenes but struggled to imagine themselves in those scenes, sometimes omitting themselves entirely from their narratives. Damage to the hippocampus had the opposite effect, impairing external details but allowing self-imagination.
  • This suggests that these new primate areas are constructing a generative model of the older mammalian aPFC and sensory cortex itself. The gPFC explains the aPFC's model of intent, effectively inventing a "mind"—an internal simulation of what the animal wants, knows, and thinks. This is referred to as metacognition: the ability to think about thinking.

Theory of Mind

The gPFC is also activated in tasks requiring the ability to infer the intent or knowledge of others (theory of mind), both in nonhuman primates and humans. Damage to the gPFC impairs performance on such tasks. Furthermore, the size of these granular prefrontal areas correlates with social-network size in primates and humans, and with better performance on theory-of-mind tasks. The fact that these seemingly distinct functions (modeling one's own mind and modeling others' minds) share highly overlapping neural substrates suggests a common evolutionary purpose and mechanism.

Modeling Your Mind to Model Other Minds

The prevailing hypothesis for how theory of mind works is "simulation theory" or "social projection theory": we understand others by first understanding our own minds and then using this understanding to imagine ourselves in their situation, with their knowledge and history [263-264, 389n]. For example, inferring someone yelled due to stress, based on one's own experience of yelling when stressed.

  • Evidence supporting this includes a strong correlation between children's ability to report on their own mental states and their ability to report on others'. Damage to one impairs the other, and self-other distinctions can get cross-wired (e.g., being thirsty makes one incorrectly assume others are thirsty).
  • Conceptually, the uniquely primate neocortical areas build a generative model of one's own inner simulation (one's mind) and then use this model to simulate the minds of others.
  • Current AI systems are far from achieving this level of theory of mind, which is crucial for humanlike AI to understand human intentions and navigate social interactions. Theory of mind is vital for future superintelligent AI systems to avoid catastrophic misinterpretations of human requests (e.g., the "paper-clip problem" where an AI might convert the Earth into paper clips if commanded to maximize paper-clip production).
  • Ultimately, this ability was driven by the "subtler and far more cutting dangers of politics", enabling primates to climb social ladders, manage reputation, forge alliances, and resolve disputes.

Chapter 17: Monkey Hammers and Self-Driving Cars

Jane Goodall's discovery of chimpanzee tool use in 1960 revealed that chimps use sticks to fish for termites. While tool use exists in other animals (elephants, mongooses, crows, octopuses, fish), primate tool use is uniquely sophisticated. Chimpanzees exhibit over twenty different tool-using behaviors, actively manufacture their tools (e.g., shortening or sharpening sticks), and show remarkable diversity across different social groups (e.g., some chimp groups use rocks to open nuts, others don't). This chapter explores the link between primates' unique tool-using skills and their theory of mind abilities.

Monkey Mirrors

In the early 1990s, Giacomo Rizzolatti's lab discovered "mirror neurons" in macaque monkeys. These neurons, located in the premotor and motor cortex, activate not only when a monkey performs a specific hand movement (like grasping food) but also when it merely observes a human performing that same movement.

  • One interpretation is that mirror neurons are evidence that monkeys imagine themselves doing what they see others do. The motor cortex activates during both actual and imagined movements.
  • The primary benefit of this mental simulation of observed movements is that it helps primates learn new skills through observation. Mentally rehearsing actions improves performance, and simulating the sensorimotor aspects of others' behaviors (e.g., how they hold a tool, the weight of a box) aids in correctly identifying and understanding those actions.

Learning from Others

Chimpanzees primarily learn tool techniques through observing each other, rather than independent invention. The amount of time a young chimp watches its mother use tools is a significant predictor of when it will learn the skill; without observational learning, many chimps never acquire complex tool use. These learned skills can propagate throughout an entire group and be passed down across multiple generations. Transmissibility is crucial because if one individual figures out a trick, the entire group can acquire and perpetuate the skill.

While other animals like rats, mongooses, dolphins, dogs, fish, and reptiles also engage in observational learning, there's a key difference. Primate observational learning often involves acquiring novel skills, rather than just choosing between pre-existing techniques (e.g., mongoose offspring selecting a biting or throwing method already in their repertoire). Furthermore, primates engage in active teaching. Mothers reorient tools for their young, exaggerate behaviors to demonstrate skills, provide extra or broken sticks, and swap tools when their offspring struggle. Understanding the intentions of observed movements is essential for effective observational learning, as it allows primates to filter out irrelevant actions and grasp the core of the skill.

Robot Imitation

Early autonomous cars, like ALVINN (Autonomous Land Vehicle in a Neural Network) in 1990, used imitation learning by directly copying human expert behavior. However, this approach was dangerously brittle: small errors would cascade into catastrophic failures because the AI was only trained on correct driving and had never seen (or learned from) a human recovering from a mistake. Inverse reinforcement learning, pioneered by Andrew Ng, offered a solution. Instead of directly copying actions, these AI systems are trained to infer the expert's intended trajectories (their "intent"). They then learn by trial and error, using this inferred reward function to reinforce and punish themselves. This method filters out expert mistakes and allows the AI to correct its own errors, leading to success in complex tasks like acrobatic helicopter flight. This process mirrors how primates use theory of mind to understand the intentions behind observed behaviors for more effective observational learning.
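The contrast between the two approaches can be sketched in a toy example. This is my own illustration, not ALVINN's or Ng's actual algorithms: an expert demonstrator walks along a one-dimensional track from position 0 to a goal at position 5, and we compare a policy that copies actions directly with one that infers the expert's intent.

```python
# Toy sketch (hypothetical 1-D world, not ALVINN or Ng's actual systems):
# an expert walks from position 0 to a goal at position 5.
expert_trajectory = [0, 1, 2, 3, 4, 5]

# Direct imitation (behavioral cloning): memorize "in this state, the
# expert took this action". Every demonstrated action here is "move +1".
cloned_policy = {}
for s, s_next in zip(expert_trajectory, expert_trajectory[1:]):
    cloned_policy[s] = s_next - s

# A small error that drifts the agent to an undemonstrated state (-1)
# leaves the cloned policy with no answer: the brittleness described above.
print(cloned_policy.get(-1))  # None

# Intent inference: assume the expert's goal was wherever the trajectory
# ended, then act to reduce distance to that goal from ANY state.
inferred_goal = expert_trajectory[-1]

def intent_policy(s):
    return 1 if s < inferred_goal else (-1 if s > inferred_goal else 0)

print(intent_policy(-1))  # 1, i.e. the agent recovers toward the goal
```

The cloned policy is undefined the moment the agent leaves the demonstrated states, while the intent-based policy can correct errors from anywhere, which is the essence of why inferring the expert's objective is more robust than copying the expert's actions.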

Chapter 18: Why Rats Can’t Go Grocery Shopping

The Ecological-Brain Hypothesis

While the social-brain hypothesis explains much of primate brain expansion, the ecological-brain hypothesis offers an alternative or complementary explanation. Early primates' frugivore (fruit-eating) diet presented unique cognitive challenges: fruit is only ripe for a short window (sometimes less than 72 hours), and popular fruits are quickly depleted. Primates needed to track fruit availability across large forest areas, predict ripeness, and anticipate which fruits would be most popular and disappear first. A 2017 study by Alex DeCasien found that being a frugivore explained variation in primate brain size perhaps even better than social group size, suggesting its significant role in driving brain expansion.

The Bischof-Kohler Hypothesis

Doris Bischof-Kohler and Norbert Bischof proposed that a unique aspect of human planning is the ability to plan for future needs even when those needs are not currently felt (e.g., grocery shopping when not hungry, packing warm clothes for a future cold trip). This idea became known as the "Bischof-Kohler hypothesis".

  • Early evidence supported this as uniquely human, but recent anecdotal accounts suggest some chimpanzees, bonobos, and orangutans can also anticipate future needs (e.g., carrying straw for a future nest, selecting tools hours in advance).
  • However, simpler mammals like rats appear unable to do this. Rats can choose food based on current hunger but cannot choose food to satisfy future hunger or thirst. This suggests anticipating future needs is a more difficult form of planning, possibly present in many primates but not simpler animals.

How Primates Anticipate Future Needs

Anticipating a future, unexperienced need poses a predicament for older mammalian brain structures. The neocortex controls behavior by simulating decisions, and older vertebrate structures (basal ganglia, amygdala, hypothalamus) evaluate these outcomes based on current positive valence (e.g., imagining food when currently hungry). To plan for a future need, an animal must imagine a future mental state (e.g., hunger) that differs from its current one (satiation).

  • Thomas Suddendorf suggested a brilliant connection: anticipating future needs might be a special case of an animal's general difficulty with simultaneously representing conflicting mental states, similar to the challenge of understanding another individual's different beliefs. This links the ability to anticipate future needs directly to theory of mind.
  • Evidence supporting this link includes:
    • Both abilities (theory of mind and anticipating future needs) emerged around the same time in early primates.
    • People make similar types of mistakes in both tasks. For example, thirsty people incorrectly predict others are also thirsty, and hungry people overestimate their own future hunger when grocery shopping.
  • The proposed mechanism is that the new primate areas (gPFC and PSC) construct a model of one's own mind, which can then simulate its own future mental states. This allows the brain to imagine itself being hungry in the future, despite being currently satiated, and then choose actions (like storing food) to satisfy that simulated future hunger.

Summary of Breakthrough #4: Mentalizing

Breakthrough #4: Mentalizing involved the emergence of three broad abilities in early primates:

  1. Theory of mind: The capacity to infer the intentions and knowledge of others.
  2. Imitation learning: The ability to acquire novel skills through observation.
  3. Anticipating future needs: The foresight to take actions now to satisfy a want in the future, even if that want is not currently felt.

These three are not separate abilities but rather emergent properties of a single new breakthrough: the construction of a generative model of one's own mind, a trick called "mentalizing".

  • Supporting Evidence:
    • They emerge from shared neural structures, specifically new areas of neocortex like the granular prefrontal cortex (gPFC), which evolved first in early primates.
    • Children acquire these abilities at similar developmental stages.
    • Damage to one of these abilities tends to impair the others.
  • Neocortical Mechanism: Consistent with Mountcastle's hypothesis that neocortical areas use identical microcircuits, these new intellectual skills stem from clever new applications of the existing neocortex, rather than novel computational tricks. Mentalizing, as a second-order generative model (a model of one's own inner simulation), is one such new application.
  • Adaptive Niche: These mentalizing abilities were highly adaptive for the unique niche of early primates. Robin Dunbar argued that the social-brain hypothesis and the ecological-brain hypothesis are two sides of the same coin: mentalizing simultaneously enabled both successful fruit foraging (planning for future needs) and effective politicking within complex social hierarchies (theory of mind and observational learning). The pressures from both frugivorism and social hierarchies provided continual evolutionary pressure to develop and elaborate brain regions like the gPFC for modeling one's own mind.
  • Evolutionary Significance: This breakthrough marks the precipice of the final divergence between humankind and our closest living relatives, with our shared ancestor with chimpanzees living approximately seven million years ago.

Breakthrough #5: Speaking and the First Humans

Chapter 19: The Search for Human Uniqueness

Human Uniqueness and the Brain

Historically, humans have identified numerous intellectual abilities as uniquely their own, such as reasoning, abstraction, mental time travel, episodic memory, anticipating future needs, a sense of self, communication, and tool use. However, a century of research into animal behavior has dismantled much of this claim to uniqueness, suggesting that these differences are often a matter of degree, not kind, as Charles Darwin believed. The human brain itself supports this, as there are no unique neurological structures found in humans that are absent in other apes. Instead, the human brain appears to be a scaled-up primate brain, with a larger neocortex and basal ganglia, but essentially the same areas wired in the same fundamental ways. This suggests that human evolution since the divergence from chimpanzees largely involved "leveling up" existing abilities.

The Breakthrough of Language

The singular exception to this pattern of "leveling up" is language, which is deemed the first hint of what it means to be truly human. Human language differs from other forms of animal communication in two critical ways:

  • Declarative Labels (Symbols): Humans assign arbitrary labels or symbols to objects and behaviors (e.g., "elephant," "tree," "running"). In contrast, animal communication, such as vervet monkey alarm calls or chimpanzee gestures, is genetically hardwired and almost identical across different groups and species, meaning their meanings are not assigned but emerge directly from their genetic makeup.
  • Grammar: Human language possesses grammar, a system of rules for merging and modifying symbols to convey specific meanings. This allows for the combination of a few thousand words into a seemingly infinite number of unique meanings. Grammar dictates word order ("Ben hugged James" vs. "James hugged Ben"), allows for embedded subphrases, and uses different tenses. The universality and complexity of language across all human cultures, even those isolated for tens of thousands of years, is strong evidence that the shared ancestor of humans spoke their own languages with declarative labels and grammar.
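The claim that a few thousand words yield a seemingly infinite number of meanings can be made concrete with a back-of-the-envelope count. The numbers below are my own assumptions, not the book's:

```python
# Crude upper bound (assumed numbers): how many distinct 10-word strings
# can be formed from a working vocabulary of 3,000 words?
vocabulary_size = 3000   # assumed working vocabulary
sentence_length = 10     # assumed words per sentence
combinations = vocabulary_size ** sentence_length
print(f"{combinations:.2e}")  # 5.90e+34 possible word strings
```

Only a fraction of these strings are grammatical, but even a tiny fraction of ~10^34 is inexhaustible in practice, which is why grammar turns a finite vocabulary into an effectively unbounded space of meanings.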

Transferring Thoughts and Cumulative Culture

The true power of human language is not merely communication itself, but its ability to enable groups of brains to transfer their inner simulations—what humans refer to as concepts, ideas, and thoughts—to each other with an unprecedented degree of detail and flexibility. While concepts and plans are not unique to humans, the deliberate transfer of these inner simulations is unique.

This trick of thought transfer offered many practical benefits to early humans:

  • More accurate teaching of tool use, hunting, and foraging techniques.
  • Flexible coordination of activities (e.g., planning a hunting ambush: "Follow me, there is an antelope carcass two miles east").
  • Expansion of learning sources: Language allows humans to learn not only from their own actual or imagined actions, but crucially, from others' imagined actions (e.g., sharing the outcomes of vicarious trial and error).
  • The formation of "common myths", such as money, gods, corporations, and states, which enable flexible cooperation among countless strangers, as highlighted by John Searle and Yuval Harari.

However, the deepest gift of language is not these products, but the process it enables: the evolution of ideas through cumulative culture. Like genes, ideas ("memes," as coined by Richard Dawkins) persist by hopping from brain to brain, accumulating and being modified across generations. This process allows for the construction of complex knowledge (e.g., sewn clothing from simpler tools) that would be impossible to invent in a single moment or generation. Without this accumulation, a species would be forever stuck reinventing the same ideas.

Over-Imitation and the Human Hive Mind

A key behavior enabling this cumulative culture is over-imitation. Human children will meticulously copy all steps observed, even seemingly irrelevant ones, especially if they believe the teacher acted intentionally. This contrasts with chimpanzees, who typically skip irrelevant steps. This fidelity in copying, combined with language, dramatically improves the accuracy and speed of skill transmission. Language allows for the condensation of information, making complex generalizations (e.g., "red snake bad, green snake good") immediately transferable, saving immense time and brainpower compared to individual learning.

The shift from no intergenerational accumulation to some accumulation was a "subtle discontinuity that changed everything", leading to an explosion of idea complexity. As the total sum of accumulated ideas eventually exceeded what a single human brain could hold, humans evolved:

  1. Bigger brains to store more knowledge.
  2. Specialization within groups, distributing ideas across different members (e.g., spear makers, clothing makers).
  3. Expanded population sizes, offering more brains to collectively store ideas.
  4. The invention of writing, providing a collective, virtually infinite memory accessible on demand.

This accumulation transforms human cultures into a "meta-life-form", a "hive-brain" in which consciousness is instantiated in ideas flowing through millions of brains over generations, with language as its bedrock. The emergence of language was as monumental as the first self-replicating DNA, transforming the human brain into an "eternal medium of accumulating inventions".

Chapter 20: Language in the Brain

Language Localization: Broca's and Wernicke's Areas

Early neuroscience research, pioneered by Paul Broca in the 1860s, identified specific regions in the brain responsible for language. Broca discovered that damage to a particular area in the left frontal lobe (now called Broca's area) caused Broca's aphasia, a loss of the ability to produce speech while other intellectual faculties remained. Similarly, Carl Wernicke in the 1870s identified that damage to a region in the left temporal lobe (Wernicke's area) led to Wernicke's aphasia, an impairment in understanding language.

These areas are not selective for specific modalities (e.g., speaking, writing, signing, listening, reading) but for the general ability to produce or understand language. Furthermore, language capacity can be dissociated from other intellectual abilities, as seen in cases like Christopher, a language savant who was cognitively impaired in other areas but spoke over fifteen languages. This initially suggested language was a specific, independent skill uniquely wired into human brains.

The Problem of Identical Brains and the Role of Curriculum

The story becomes complicated by the fact that human and chimpanzee brains are practically identical, including the regions homologous to Broca's and Wernicke's areas. These areas are not unique to humans but emerged much earlier in primates as part of the "mentalizing" breakthrough. Therefore, the mere presence of these areas did not give humans language.

The key difference lies not in unique neurological structures, but in a hardwired learning curriculum. Complex skills like bird flight or human language are too dense to be directly encoded in a genome; instead, evolution provides a generic learning system (the neocortex) paired with a specific, instinctual learning curriculum. AI research, such as Jeffrey Elman's work with recurrent neural networks and TD-Gammon's self-play, supports the crucial role of a curriculum in learning complex tasks.
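The curriculum idea can be sketched minimally. The sentences below are hypothetical, not Elman's actual dataset; the point is only the ordering principle he found necessary, in which a learner is fed simple examples before complex ones ("starting small"):

```python
# Minimal sketch of a learning curriculum (hypothetical sentences):
# present training examples in order of complexity, echoing Elman's
# finding that recurrent networks mastered grammar only when trained
# on short, simple sentences before long, embedded ones.
training_sentences = [
    "boys who girls like chase dogs",
    "dogs bark",
    "cats who dogs chase climb trees",
    "cats sleep",
]

# Order the curriculum by a crude complexity proxy: word count.
curriculum = sorted(training_sentences, key=lambda s: len(s.split()))

for stage, sentence in enumerate(curriculum, start=1):
    print(f"stage {stage}: {sentence}")
```

The learner itself is generic; what is "hardwired" is the schedule of what it encounters when, which is the book's proposed analogue of the human language-learning program.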

The Human Language Learning Program

Humans are endowed with a hardwired instinctual learning program for language that repurposes older mentalizing areas of the neocortex. This curriculum includes:

  • Proto-conversations: Infants naturally engage in turn-taking vocalizations and gestures, establishing a foundation for conversational interaction.
  • Joint attention: Infants actively seek to share attention with caregivers, pointing to objects and confirming that the adult is looking at the same object and acknowledging the infant's focus. The degree of joint attention in infancy is a strong predictor of later vocabulary.
  • Question asking: Humans have a unique, hardwired instinct to ask questions, even simple ones ("Want this?"), often signaled by a rising intonation. Language-trained apes, remarkably, do not exhibit this instinct to inquire about others' inner mental worlds.

This curriculum enables the human brain to repurpose older neocortical areas for language. Evidence shows that children can learn language even with an entire left hemisphere removed, using other right-brain areas, and that language is a complex skill emerging from many interacting areas, not localized to a single "language organ". Apes struggle with sophisticated language because they lack this specific, hardwired learning curriculum, particularly the instincts for joint attention, turn-taking, and asking questions.

The Puzzle of Language's Uniqueness

While powerful evolutionary tricks like eyes, wings, multicellularity, simulation, and possibly mentalizing have evolved independently multiple times in different lineages, human language, as we know it, appears to have emerged only once. This rarity suggests a unique set of circumstances that led to its development.

Chapter 21: The Perfect Storm

Runaway Brain Growth and Environmental Pressures

The human brain experienced a mysterious and dramatic growth spurt beginning approximately 2.5 million years ago, rapidly tripling in size; this "runaway growth" is a central puzzle in paleoanthropology. It coincided with, and was driven by, a "perfect storm" of environmental and behavioral changes that began around 10 to 7 million years ago, when Africa's dying forests gave way to savannah and pushed our ancestors into new niches.

Key adaptations and inventions included:

  • Bipedalism: Walking on two legs, likely predating tool use, freed hands for carrying and tool manipulation [325, 326, 395n].
  • Tool-making and Meat-eating Niche: Around 2.6 million years ago, early Homo species developed Oldowan tools (hammerstones, cores, sharp flakes, pointed choppers) to process scavenged carcasses. These tools allowed our ancestors to slice hides, cut meat, and smash bones for nutrient-rich marrow, enabling a high-energy, meat-based diet that could fuel larger brains.
  • Cooking: Richard Wrangham proposed that Homo erectus invented cooking. Cooking breaks down cellular structures in food, allowing for 30% more nutrient absorption and reduced digestion time and energy expenditure. This caloric surplus was crucial for financing bigger brains.
  • Premature Birthing and Extended Childhood: Larger brains led to earlier, more premature births, as the infant head size became a constraint. Human brains take a record twelve years to reach full adult size, requiring a much longer period of parental investment and cooperative child-rearing.

The Altruism Problem and the Human Hive Mind

The evolution of language, with its group-level benefits like teaching and cooperative planning, presents an "altruism problem". Altruistic behaviors, which benefit others at a cost to the individual, are vulnerable to freeloaders who benefit without contributing, potentially undermining the system.

  • Kin Selection: Altruism among direct relatives (e.g., parents and children) is explained by kin selection. The initial proto-language likely emerged between parents and children, facilitating tool-use teaching and child-rearing, an act of kin selection.
  • Reciprocal Altruism and Gossip: To extend altruism beyond kin to larger groups, mechanisms for reciprocal altruism are needed, which require tracking favors and punishing cheaters. Robin Dunbar argued that gossip is a primary human linguistic activity, accounting for up to 70% of conversations.
    • Gossip enables the identification and punishment of moral violators ("Billy stole food from Jill"), imposing costs on cheaters and stabilizing reciprocal altruism within large groups.
    • It also rewards altruistic behaviors ("Smita jumped in front of the lion to save Ben"), allowing heroes to climb the social ladder.

This dynamic created a powerful feedback loop: increased gossip and punishment led to more altruistic behavior, which enabled more sophisticated language for sharing information, which in turn made gossip even more effective, reinforcing the entire cycle. This feedback loop continuously ratcheted up pressure for bigger brains to store social information and accumulated ideas. Concurrently, inventions like cooking provided the necessary caloric surplus, and earlier births provided more opportunities for language learning and cooperative child-rearing, expanding the biological frontier for brain size.

The Perfect Storm and Human Uniqueness

This confluence of interacting effects—tool-making, meat-eating, cooking, bipedalism, premature birthing, monogamy, gossip, altruism, and even cruelty—formed a "perfect storm" that drove the rapid evolution of the human brain and language. This unlikely combination of factors may explain why language, a trick so powerful, is so rare in the animal kingdom.

Alternative Theories and Homo floresiensis

While this "perfect storm" theory is widely supported, alternative theories for language evolution exist. Some propose language emerged from mutually beneficial cooperative hunting, requiring no altruism. Others suggest cooperation predated language, making its evolution possible. Noam Chomsky argues language evolved for inner thinking and was later exapted for communication, or was a spandrel (an accidental side effect) of other traits.

A clue from Homo floresiensis—a dwarfed human species with chimp-sized brains but sophisticated tool use—suggests that humans were not simply smarter due to brain size alone. This supports the idea that a unique learning program for language emerged in Homo erectus and was passed down, enabling cumulative ideas even with smaller brains.

Ultimately, Homo sapiens and Homo neanderthalensis continued this runaway brain growth, reaching modern brain size and developing extremely sophisticated tools, shelters, clothing, and fire use. Through various interactions, Homo sapiens became the sole surviving human species.

Chapter 22: ChatGPT and the Window into the Mind

Large Language Models (LLMs) and Prediction

Modern advancements in AI, particularly Large Language Models (LLMs) like GPT-3 and ChatGPT, have created language-enabled AI that can be difficult to distinguish from human interaction. LLMs work by predicting the next word in a sequence, training on vast quantities of human-written text (essentially the entire internet). This allows them to compose original articles, answer novel questions, create poetry, translate, and write code, exhibiting an impressive, human-level comprehension of many facts through pattern matching over their astronomical corpus of data. The author notes that both GPT-3 and the human language system seem to engage in prediction, generalizing from past experiences to guess what comes next.
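The core idea of next-word prediction can be shown with a toy model. This is a simple bigram counter, nothing like GPT's actual architecture, but it illustrates prediction by pattern matching over observed text, with no inner model of what any word means:

```python
from collections import Counter, defaultdict

# Toy next-word predictor: count which word followed which in a tiny
# training corpus, then predict the most frequent successor.
corpus = "the cat sat on the mat the cat ate the fish".split()

successors = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    successors[word][nxt] += 1

def predict_next(word):
    # Most frequently observed successor of `word` in the training text.
    return successors[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat": seen twice, vs. "mat" and "fish" once each
```

LLMs replace these raw counts with learned, generalizing representations over an astronomically larger corpus, but the training objective, predicting the next token from the ones before it, is the same.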

Words Without Inner Worlds: The Limits of LLMs

However, the crucial divergence between human language and LLMs lies in the lack of an inner simulation in LLMs. The power of human language is not just syntax, but its ability to convey information to render an inner simulation of reality and, critically, to synchronize these simulations among individuals.

  • Math and Common Sense: Humans learn math by tethering symbols to an existing inner simulation (e.g., mentally adding fingers) that obeys physical rules. GPT-3, in contrast, might answer "1 + 1 = 2" correctly because it has seen it billions of times, but without the underlying simulation or understanding of the operation. Earlier versions of GPT-3 often failed common-sense questions that required physical intuition, such as knowing you would see a ceiling, not the sky, when looking up in a windowless basement [348, 349, 380n].
  • Cognitive Reflection Test: This test demonstrates the human brain's dual system: an automatic word-prediction system (like LLMs) and an inner simulation. On the classic bat-and-ball problem (a bat and a ball cost $1.10 together, and the bat costs $1.00 more than the ball), the reflexive answer of 10 cents, which GPT-3 also gave, is wrong; reaching the correct answer of 5 cents requires active simulation and reasoning.
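The bat-and-ball item can be worked explicitly, showing why the "simulation" step overturns the reflexive answer:

```python
# Bat-and-ball problem: total cost $1.10, bat costs $1.00 more than the ball.
# Reflexive answer: ball = $0.10. Check it: 0.10 + 1.10 = 1.20, not 1.10,
# so the reflex is wrong. Solving ball + (ball + 1.00) = 1.10 instead:
total, difference = 1.10, 1.00
ball = (total - difference) / 2
bat = ball + difference
print(round(ball, 2), round(bat, 2))  # 0.05 1.05
```

The correct answer comes not from recalling a familiar word pattern but from briefly modeling the constraints, which is exactly the step the automatic prediction system skips.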

The Paper-Clip Problem and Theory of Mind

Nick Bostrom's "paper-clip problem" highlights another limitation. An AI commanded to "Maximize the manufacture of paper clips" could catastrophically convert the entire Earth into paper clips, even if perfectly "obedient" to the literal command. This is because humans infer what people actually mean by what they say, understanding unstated constraints and intentions through a shared inner simulation and theory of mind. LLMs, lacking a model of other minds, cannot make these subtle, complex inferences.

GPT-4 and the Illusion of Understanding

GPT-4, released in March 2023, represents a significant upgrade, capable of flawlessly answering many common-sense questions that stumped GPT-3, and even performing well on theory-of-mind tasks like the Sally-Anne test. However, OpenAI achieved this not by adding an inner world model or theory of mind, but by training GPT-4 specifically on common-sense and reasoning questions using "reinforcement learning from human feedback" and "chain-of-thought prompting".

Despite its improved performance, GPT-4 still primarily operates through massive pattern matching over an astronomical dataset, rather than true reasoning or building deep mental models. Yann LeCun characterized LLMs as "a bit like students who have learned the material by rote but haven’t really built deep mental models of the underlying reality". While LLMs impressively tease out meaning from language alone, without incorporating the breakthroughs of simulating (inner world model) and mentalizing (model of other minds), they will fundamentally fail to capture essential aspects of human intelligence. This distinction is increasingly critical as humans offload more decisions to these powerful, yet subtly different, AI systems.