Andrej Karpathy came on Dwarkesh’s podcast recently, and I have a number of thoughts.
Many are saying that Karpathy thinks AGI is 10 years away, that therefore Gary Marcus is right, and that people like me, Sholto, and everyone else saying AGI is within a few years have simply lost the war.
Compelling, but it’s not that simple.
Debates like these usually hinge on definitions, and the definition Karpathy is using dates back to his time at OpenAI:
I don’t think this is the best definition to use at this moment.
I think it’s a good pure definition, a Computer Science definition, but we should center our definition on the thing that matters most to humans (as opposed to AI people).
I’m worried, as Karpathy and Dwarkesh are, about human work replacement. Specifically, human knowledge work. And that’s why I’ve been using this definition since 2023:
For me, this is better for two reasons:
- It focuses on the fact that it’s an AI system, and not one particular component of a system (like a model)
- It provides a more direct benchmark for the thing we care about, i.e., Are companies actually replacing workers with this system? Yes or no?
The system part is key.
I have no reason—or ability—to disagree with Karpathy on the limitations of pure LLMs. He recently wrote yet another one in 1,000 lines of code. He’s the actual sensei here, and I know .00017% of what he knows about LLMs. The problem is AI systems aren’t just the LLMs themselves. They’re not naked neural nets.
When you go to chatgpt.com and talk with gpt-5, you’re not talking to a base neural net; you’re talking to an AI system.
You’re talking to the result of that initial LLM being shaped and molded with colossal amounts of extra scaffolding and engineering work to be the best possible system it can be at its particular task, which in this case is being a chatbot/assistant.
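To make the model-versus-system distinction concrete, here’s a deliberately tiny sketch. Everything in it is illustrative and assumed by me: `call_llm` is a fake stand-in for a real model API, and the single tool is a placeholder. The point is the scaffolding around the model call, not the call itself.

```python
# A minimal sketch of "naked model" vs. "AI system". call_llm() is a fake
# stand-in for a real model API; everything around it is the scaffolding.

def call_llm(messages):
    # Hypothetical model call. A real system would hit an actual LLM here.
    return "FINAL: canned answer for illustration"

def naked_model(user_text):
    # Talking to the raw model: one prompt in, one completion out.
    return call_llm([{"role": "user", "content": user_text}])

TOOLS = {
    # Placeholder tool. Real systems wire in search, code execution, files...
    "search": lambda query: f"(pretend search results for {query!r})",
}

def ai_system(user_text, history):
    # Talking to a system: instructions + memory + tools + a loop.
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        *history,                                    # persistent memory
        {"role": "user", "content": user_text},
    ]
    while True:
        reply = call_llm(messages)
        if reply.startswith("FINAL:"):               # model says it's done
            return reply.removeprefix("FINAL:").strip()
        tool, _, arg = reply.partition(" ")          # e.g. "search travel policy"
        messages += [
            {"role": "assistant", "content": reply},
            {"role": "user", "content": f"[{tool} output] {TOOLS[tool](arg)}"},
        ]

print(ai_system("What's our travel policy?", history=[]))
```

The naked model answers a single prompt. The system wraps the same model in instructions, memory, tools, and a loop, and that wrapper is where most of the capability comes from.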
This distinction is everything because replacing human jobs will also be done through composite, stitched-together systems that are many times more powerful than their parts.
To replace a project manager or an executive assistant, the companies building human worker replacement aren’t going to sit back and wait for GPT-9 or Gemini 7.5.
Human worker replacement will happen through AI products/systems that work around the pure limitations of LLMs and of individual model intelligence.
Claude Code is a brilliant example of this.
Just spitballing numbers, Claude Code, when it launched, was something like 5x better than raw Opus or Sonnet at helping developers write code.
Well, less than 10 months later, it’s already gotten many times better than that.
Like night and day.
Yes, the models got better, but that’s not what made the difference. It was constant, iterative improvement in how the AI talks to itself: coordination, context management/engineering. And just now they added Skills, which takes the whole thing to 11 million.
This is exactly the type of efficiency ratchet that will apply to human work replacement. Where the models don’t have enough context window to read all of a company’s docs, companies have built (and will keep building) systems that do.
Where the models aren’t general enough to match human flexibility, companies will add so many use cases and capabilities, built roughly around the Agent Skills paradigm, that we eventually won’t notice, because the system will cover most of what the job requires.
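As an illustration of the context-window workaround, here’s a rough retrieval sketch. The names and the keyword-overlap scoring are toy assumptions of mine, not any particular product’s implementation; real systems typically use embeddings and a vector store, but the shape is the same: chunk the docs, keep only the relevant pieces, and hand the model a prompt that fits.

```python
# A rough sketch of the context-window workaround: instead of making the
# model read every company doc, retrieve only the few chunks that matter.
# Toy keyword-overlap scoring; real systems use embeddings and vector stores.

def chunk(text, size=500):
    # Split a long document into roughly fixed-size pieces.
    return [text[i:i + size] for i in range(0, len(text), size)]

def score(question, passage):
    # Toy relevance score: count shared words.
    q, p = set(question.lower().split()), set(passage.lower().split())
    return len(q & p)

def build_prompt(question, documents, top_k=3):
    # Pool every chunk from every doc, keep the most relevant few, and hand
    # the model a prompt that fits comfortably inside its context window.
    chunks = [c for doc in documents for c in chunk(doc)]
    best = sorted(chunks, key=lambda c: score(question, c), reverse=True)[:top_k]
    context = "\n---\n".join(best)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

docs = [
    "Travel expenses over $500 require manager approval...",
    "New hires must complete onboarding within two weeks...",
    "Meeting notes: Q3 planning, headcount, roadmap...",
]
print(build_prompt("What is the expense policy for travel?", docs))
```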
The part that concerns me the most about the speed of progress towards AI replacing human knowledge workers is not the speed of AI system improvement. It’s the fact that the bar is so low.
A good portion of our culture’s comedy is based on the utter incompetence of like half of our workforce.
- The worst possible customer service
- People bragging about how little work they do
- Making a sport out of doing the bare minimum
- People absolutely detesting their jobs
- Even decent workers just mindlessly punching in and out
Mediocrity is the baseline. Almost by definition.
That is what multi-billion-dollar human worker replacement startups are competing with, not the top 10% of performers you know.
Think of it this way: In the time that we went from Claude Code not existing, to it getting really good, to it now having shareable work task replacement skills, the bottom 50% of knowledge workers improved how much?
Zero.
In the time since ChatGPT came out, the bottom 50% of knowledge workers improved their capabilities by how much?
Again, 0%.
The bar for human work replacement is not moving, while the capabilities of AI Systems are going absolutely apeshit.
You might push back by saying this only applies to people who aren’t trying very hard, or who aren’t that smart, or whatever.
I largely agree with you, but it doesn’t matter.
You and I and Dwarkesh and Karpathy are going to be fine. So what?
If AI only eats the absolute worst, bottom 50% of knowledge workers in the next 5-10 years, we’re still talking about hundreds of millions of jobs.
This is why I disagree with Karpathy on this.
It’s not because he’s wrong. He’s not. But he’s focused on the wrong thing.
If the thing we care about is AI’s near-term, practical impact on humanity, the thing to watch is not how smart individual models are, or the specific technical limitations that keep RL from achieving continuous learning.
It’s the trillions of dollars being invested in replacing the static worst 50% of human workers.
Those trillions are being spent on having the Worker Replacement System be just general enough to hit that mark.
So my question to you is: given what we see in model improvement, and in systems like Claude Code that dramatically magnify model capability, do you really want to bet against that happening?
I don’t.
And this is why I think “AGI” will be here before 2028. Not because all the problems Karpathy is talking about will be solved, but because it won’t matter whether they are.