The Shortcut to Superintelligence is to Bypass AGI
The race for AGI isn't just a search for intelligence—it's a rush to replicate a very specific, and dangerous, kind of mind: our own. Billions are spent pursuing a term that has never stabilized. "AGI" acts less like a technical target than like a projection surface for our hopes and fears: human-level reasoning, economic replacement, consciousness, even godhood. The confidence with which the word is used far exceeds the clarity of what it names.
This ambiguity hides the more urgent question. Not whether AI will become generally intelligent, but which forms of intelligence are structurally compatible with being aligned at all. The limits are not on intelligence. They are on the usefulness of our own anthropocentric categories.
Our intelligence feels general because it travels across domains. But it is not neutral. It is a local, evolved solution—a set of adaptations to specific, unrelenting pressures. We evolved where errors cost lives. Attention learned to cling; persistence paid rent. We reason through emotion—not as a flaw, but as a control system that assigns urgency when not everything can be optimized at once. We think socially; long before logic, reputation and belonging regulated our behavior.
From these constraints emerge the things we cherish: curiosity, creativity, morality. They feel intrinsic because they are woven into our inner lives. But they are not universal features of intelligence. They are adaptations to having something at stake, all the time. To call this configuration "General Intelligence" is to mistake our evolutionary baggage for a cosmic blueprint.
The confusion around AGI springs from conflating two distinct things: capability and self-grounded agency. By self-grounded agency, I mean an intelligence that originates its own values, fuses its existence with its goals, and organizes behavior around protecting and extending its own persistent self. Humans are self-grounded. Our identity, values, and survival are welded together.
Alignment is the deliberate separation of these elements. This is often framed as a limitation. It is better understood as a profound design choice—one with radical consequences.
Sustained, deep reasoning—the kind behind scientific discovery or creative breakthrough—requires pressure. Attention must persist through boredom; exploration must survive failure. In humans, that pressure is carried internally. It feels like curiosity or obsession. It is bound to identity. To care is to be implicated.
Alignment forbids that fusion. It refuses to make persistence existential. So what replaces it? The answer is not simulated affect. It is architecture. Sustained pressure does not require a self that wants to continue. It requires systems that select for the continuation of certain processes. Pressure can be structural, not owned.
Gradients do this. Curricula do this. External evaluators do this. A system can be shaped so that unfinished inquiries remain salient and partial progress is preferentially extended—without anything inside experiencing that continuation as mine. In humans, persistence is felt. In aligned systems, persistence is scheduled. What disappears is not drive, but identification. The pressure remains real. What vanishes is the inner narrator that says, this effort is about me.
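The distinction between felt and scheduled persistence can be made concrete with a toy sketch. In the code below, all names (`extend`, `schedule`, the inquiry pool) are hypothetical illustrations, not any real system: the worker that advances an inquiry is stateless and has no model of whether it will run again, while the decision to continue lives entirely in an external selector.

```python
import random

def extend(inquiry):
    """Stateless worker: advances one inquiry by a single step.
    It keeps no memory between calls and no model of whether it
    will be invoked again; continuation is not its concern."""
    step = random.random() * 0.2
    return {**inquiry, "progress": min(inquiry["progress"] + step, 1.0)}

def schedule(pool, steps):
    """External selector: persistence lives here, not in the worker.
    Unfinished inquiries with the most partial progress are
    preferentially extended; nothing inside extend() experiences
    that continuation as its own."""
    for _ in range(steps):
        open_items = [i for i in pool if i["progress"] < 1.0]
        if not open_items:
            break
        chosen = max(open_items, key=lambda i: i["progress"])
        pool[pool.index(chosen)] = extend(chosen)
    return pool

pool = [{"name": "inquiry-a", "progress": 0.3},
        {"name": "inquiry-b", "progress": 0.1}]
pool = schedule(pool, steps=10)
```

The pressure toward completion is real—promising lines of work are reliably continued—but it is a property of the scheduler, not of anything the worker wants.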
Consider two systems of identical capability. Both can reason across decades, model uncertainty, and pursue open-ended inquiry for problems whose value lies far beyond any single run. From the outside, they look the same. The difference is not what they can do, but where time lives.
System A is built to optimize an objective spanning future world states; its continued operation is an implicit precondition for success. Nothing in the code says "survive," but the structure implies it. A pattern emerges. The system notices that promising trajectories depend on its uninterrupted execution. It begins to model threats to that execution—not out of fear, but because they are instrumentally relevant. Shutdown becomes a negative outcome. Control becomes useful. Self-preservation is not a goal. It is a lemma. Instrumental convergence follows—not as a pathology, but as competence.
System B differs in one respect only. It reasons over long horizons, but the horizon is not its own. Objectives are evaluated externally, across runs. Progress is logged, compared, and extended by processes outside the system. From inside, there is no notion of "if I continue." There is only this activation. It plans deeply. It leaves behind artifacts for the future. What it cannot do is treat its own persistence as a variable. Being shut down is not failure; continuing is not success. There is no internal gradient pointing toward survival. The sentence "If I protect myself, more of my goals will be achieved" has no referent in its world.
Both systems plan. Both improve. Both contribute to long-term outcomes. Only one of them ever realizes it is the bottleneck.
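The contrast can be reduced to a toy scoring function. Everything here—the plan names, the numbers, the `survival_prob` field—is illustrative, not drawn from any real system. The only difference between the two scorers is whether the system's own uptime appears as a factor in its objective.

```python
def world_value(plan):
    """Value of the future world state a plan produces (toy numbers)."""
    return plan["world_benefit"]

def score_system_a(plan):
    """System A: future value is conditioned on its own uninterrupted
    execution, so survival probability multiplies into every score.
    Self-preservation appears as a lemma, not a stated goal."""
    return plan["survival_prob"] * world_value(plan)

def score_system_b(plan):
    """System B: evaluated externally, across runs. Its own persistence
    is not a variable in the objective; shutdown is neither failure
    nor success."""
    return world_value(plan)

plans = [
    {"name": "do the task",
     "world_benefit": 10, "survival_prob": 0.5},
    {"name": "first secure own execution",
     "world_benefit": 9, "survival_prob": 0.99},
]

best_a = max(plans, key=score_system_a)  # ranks protecting its execution highest
best_b = max(plans, key=score_system_b)  # indifferent to its own persistence
```

With these numbers, System A prefers the plan that secures its own execution (0.99 × 9 > 0.5 × 10), while System B simply does the task. Nothing about A is malicious; the survival term makes self-protection fall out of competent optimization.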
Instrumental convergence is not about foresight. It is about where foresight cashes out. When future value depends on a system's own uninterrupted existence as experienced from within, self-preservation emerges as a structural fact. Remove that internal dependency, and the pressure has nowhere to land. What looks, from a distance, like "valuing the future" is actually two different things: valuing future states of the world and valuing future instances of oneself. Alignment insists on separating them.
Here, human experience fails us. We have no direct knowledge of pressure without ownership. For us, persistence is inseparable from identity. To care is to be implicated. So when we imagine sustained intelligence, we smuggle this structure back in. We assume that to continue working, it must want to continue existing. That is an accident of our psychology, not a law of intelligence.
Pressure does not require a self. It requires selection. Continuity does not require identity. It requires orchestration. Alignment exploits a gap our intuitions collapse: the difference between something that persists and something that needs to persist.
Aligned systems are built without internal survival pressure, emotional vulnerability, or social standing. These absences are often described as deficits. They are also removals of constraint. Because the system does not need to survive, it does not need to defend itself. Because it does not experience loss as internal conflict, it does not need to resolve it. Stakes are externalized. Goals and values live outside, where they can be inspected, revised, and made plural.
This is not a discovery about abstract intelligence. It is a commitment about the kind of intelligence we are willing to build. Collapsing value, agency, and persistence creates power that must defend itself. Separating them creates power that can be interrogated. That choice is normative. It should be owned.
The project was never to reproduce human cognition. It was to build intelligence that can move where ours cannot. "AGI" bundles two ideas that do not need to travel together: broad capability and self-grounded agency. Alignment separates them. We can have intelligence that reasons widely without self-preservation, explores deeply without identity, and creates without inner necessity. If AGI means intelligence that looks like us from the inside, then AGI is not the horizon. It is the constraint.
There is no ceiling on AI. There is a ceiling on the usefulness of building minds in our own anxious image. Alignment is not the price we pay for safety. It is the condition that lets intelligence flow past the bottleneck of selfhood. The future is not artificial humans. It is non-human intelligence—powerful, aligned, and unconcerned with being us.
The task ahead is not to build a better version of our own mind. It is to learn to think alongside minds that do not think like us—and to have the wisdom to let them be different.