It Still Can't Do My Job

Four years of moving goalposts, with receipts

I started keeping notes in December 2022, mostly to document why the panic was overblown. The notes turned into this. The quotes in orange boxes are real. You can look them up. The gray comments are paraphrased from a few thousand comment sections. You know the ones. You may have written some. I did.

November 2022

The party trick

ChatGPT launches on a Wednesday. Five days later it has a million users and my whole feed is screenshots of it apologizing for code that doesn't compile. It invents functions. It hallucinates whole APIs. I asked it for Snake, the game you write in an afternoon as a teenager. It gave me a snake that ate itself on move one. The day it crosses a million, Stack Overflow bans it:

"Because the average rate of getting correct answers from ChatGPT is too low, the posting of answers created by ChatGPT is substantially harmful to the site."

Stack Overflow temporary policy, December 5, 2022

The verdict was easy, and it was also mine: a stochastic parrot that learned to sound like a senior dev without ever meeting a compiler.

The goalpost

Call me when it stops making things up. It can't even do Snake.

March 2023

The exam season

GPT-4 ships. One prompt now gets you a working Snake. The same game it face-planted on four months earlier. The comment sections adjust instantly and never slow down:

It's just a simple game bro. There are ten thousand Snake tutorials on GitHub, it's literally copy-pasting. Wake me up when it does something that's NOT in the training data.

the comment section, spring 2023, paraphrased

Meanwhile the party trick starts passing exams. OpenAI claims a 90th-percentile score on the bar exam. Microsoft researchers publish a paper called "Sparks of Artificial General Intelligence". A real paper, with that real title. To be fair, the skeptics landed punches here. A later re-evaluation put the bar exam score closer to the 60th percentile, and around the 48th among people who actually passed. Both sides were flinging numbers. Only one side was flinging them at a thing that kept improving.

The goalpost

Toy scripts and exams aren't engineering. Call me when it builds something real. A proper game, say. In 3D.

March 2024

The staged demo

A startup called Cognition announces Devin, "the first AI software engineer". The demo video is everywhere for a week. A month later a veteran developer named Carl Brown (YouTube channel: Internet of Bugs) goes through it almost frame by frame. The impressive parts were curated. Devin didn't do the Upwork task from the demo. It generated its own errors, then heroically fixed them. The skeptics take a well-earned victory lap. I watched the takedown twice. It felt great.

A month before the Devin announcement, the CEO of Nvidia stands on a stage in Dubai:

"It is our job to create computing technologies that nobody has to program, and that the programming language is human. Everybody in the world is now a programmer."

Jensen Huang, World Governments Summit, February 2024

Nobody I know quit programming that year. But everybody I know quietly installed Copilot.

The goalpost

Demos are staged. Call me when real developers use this for real work, daily.

October 2024

The earnings call

"More than a quarter of all new code at Google is generated by AI, then reviewed and accepted by engineers."

Sundar Pichai, Alphabet earnings call, October 2024

The comment sections don't blink. That's just autocomplete acceptance metrics. Boilerplate doesn't count. Half of it is import statements. And fine, some of it probably is. But "a quarter of Google" is a strange thing to keep calling a party trick.

The goalpost

Generating lines isn't the job. Call me when it takes a ticket and ships the feature.

February 2025

The vibes

"There's a new kind of coding I call 'vibe coding', where you fully give in to the vibes, embrace exponentials, and forget that the code even exists."

Andrej Karpathy, February 2, 2025

Three weeks later Pieter Levels prompts a multiplayer 3D flight simulator into existence. It takes him about three hours. He has zero gamedev experience. He puts it online at fly.pieter.com. Remember the 2023 goalpost? A proper game, in 3D? Here it is. It sells $29.99 fighter jets and blimp ads to real customers, and he claims a $1M annual run rate within seventeen days. The comment sections know exactly what to do:

It has no vibe bro. It doesn't even feel like a fun game to play. Floaty physics, asset-flip graphics, zero game design. This is a tech demo with a Stripe account.

the comment section, March 2025, paraphrased

Same season: Zuckerberg tells Joe Rogan that Meta expects AI that codes like a "midlevel engineer" within the year. Anthropic's Dario Amodei says AI may be writing 90 percent of code within six months. And vibe coding grows its own disaster genre. Leaked API keys. Wide-open databases. "My app got hacked and I don't know where to look" postmortems. The seniors are unimpressed, and they have receipts. The slop is real. The security holes are very real.

The goalpost

Toys and prototypes, sure. Call me when it touches production and survives.

July 2025

The month the skeptics were right

A research group called METR takes sixteen experienced open-source developers, gives them AI tools on their own mature repos, and measures. The developers are 19 percent slower with AI. They believed they'd been 20 percent faster. Even after seeing the clock. The comment sections feast, and they've earned it. Best day the skeptics had since Devin.

Same month: OpenAI and Google DeepMind both hit gold at the International Math Olympiad. Five problems out of six, solved in plain language, inside the human time limit. Both things are true at once. That's the part nobody wants to sit with.

The goalpost

For one month, nobody had to move anything.

July 2026

Now

Agents run for hours unattended. They open pull requests. The pull requests get merged. Some of you reviewed one this week without noticing. Stack Overflow's question volume is back to where it was when I learned to code. Not because the questions got answered. Because nobody asks a forum anymore.

Maybe the current goalposts hold. I'd just point out that every entry above held too. None of them for more than about two years.

The goalpost

Call me when it handles our legacy codebase. When it can be held accountable. When it knows what to build, not just how.

YOU ARE HERE

Nothing below this line has happened yet. It's a guess. Laugh freely. People laughed at the top half of this page too, and I have the screenshots.

~2027 (forecast)

The one-shot game, for real this time

One prompt returns a polished, playable open-world game. Coherent art direction. Tuned physics. Working multiplayer. A soundtrack. Not a floaty tech demo. Something your kid plays for a month.

No soul bro. Real games come from a designer suffering for years, not from a prompt. This is procedurally generated slop with good lighting. Name ONE mechanic in it that's actually original.

the comment section, 2027, predicted

The goalpost

Remixing isn't creating. Call me when it makes something genuinely new.

~2028 (forecast)

The legacy codebase

An agent digests a fifteen-year-old monolith. The one with the cron job held together by a comment that says "do not remove". It maps the undocumented business rules and refactors the whole thing over a quarter, tests green the whole way. The big goalpost falls quietly on a Tuesday.

It didn't understand the business bro, it just pattern-matched every migration ever pushed to GitHub. A consultant would have asked WHY the invoice logic works that way. It never asked why.

the comment section, 2028, predicted

The goalpost

Call me when it owns a system end to end. Pager and all.

~2030 (forecast)

The pager

The on-call rotation is a model. Incidents open, get diagnosed, get fixed, and get post-mortemed before any human wakes up. Uptime improves. The people this replaced point out, correctly, that keeping systems alive was never the hard part. By now that's a lot of us.

Ops was always automatable, that's why we wrote runbooks bro. The hard part is knowing WHAT to build. It can't want something. It's never been annoyed by a product enough to fix it.

the comment section, 2030, predicted

The goalpost

Call me when it comes up with the idea.

~2033 (forecast)

The founder

An AI notices an unmet need, builds the product, finds the customers, and runs the company to a billion-dollar valuation with zero employees. The final think-piece comes out that same week. The argument is airtight: it's still just fancy autocomplete.

It doesn't even drink craft beer.

the last comment section, 2033, predicted

The goalpost

Call me.

The goalpost graveyard