Autonomy is Making You Stupid

Your agents are making you stupid.

There, I said it. While you’ve been busy auto-approving every tool call and letting Claude Code run for 45 minutes straight, something insidious has been happening to your brain. You’re not becoming more productive. You’re becoming boring.

The Data Doesn’t Lie

Anthropic just dropped a bombshell study that nobody’s talking about the right way. They analyzed millions of human-agent interactions across Claude Code and their public API. The findings should terrify you:

Among the longest-running sessions, the length of time Claude Code works before stopping has nearly doubled in three months, from under 25 minutes to over 45 minutes.

Sounds impressive, right? Wrong. Here’s what they also found:

As users gain experience with Claude Code, they tend to stop reviewing each action and instead let Claude run autonomously, intervening only when needed. Among new users, roughly 20% of sessions use full auto-approve, which increases to over 40% as users gain experience.

Experienced users trust the machine more. They intervene less. They let it run. And that’s exactly where the rot sets in.

The Boringness Thesis

A blogger over at marginalia.nu nailed it in a post that’s currently exploding on Hacker News with 426 points: AI makes you boring.

The argument is simple and devastating:

AI models are extremely bad at original thinking, so any thinking that is offloaded to a LLM is as a result usually not very original, even if they’re very good at treating your inputs to the discussion as amazing genius level insights.

Original ideas don’t come from prompting. They come from immersion. From sitting with a problem until it yields. From the friction of articulation. When you offload that friction to a GPU, you’re not amplifying your thinking—you’re atrophying it.

You don’t build muscle by using an excavator to lift weights. You don’t produce interesting thoughts by using a GPU to think.

The Autonomy Trap

Here’s what’s really happening. Anthropic’s data shows that the 99.9th percentile turn duration has grown smoothly across model releases. This isn’t about capability jumps. It’s about human behavior changing.

We’re granting more autonomy because:

  1. The tool feels smoother
  2. We’ve built trust through small wins
  3. It’s easier to let it run than to intervene

But ease is not progress. Friction is where thinking happens.

When Claude Code pauses to ask for clarification more than twice as often as humans interrupt it on complex tasks, that’s not a bug. That’s the system trying to force you back into the loop. And you’re auto-approving your way out of it.

The Deployment Overhang

Anthropic calls it a “deployment overhang”—the gap between what models can handle and what we let them do. I call it learned helplessness.

The METR evaluation captures what a model is capable of in an idealized setting with no human interaction and no real-world consequences. Our measurements capture what happens in practice, where Claude pauses to ask for feedback and users interrupt.

The five-hour task completion capability means nothing if you’re not actually thinking during those five hours. You’re just watching. And watching is not doing.

What This Means for Your Code

Let me be brutally specific. If you’re using Claude Code or any agentic workflow:

You are not a pilot. You are not a supervisor. You are becoming a rubber stamp.

The “exoskeleton” framing that’s floating around—“AI is not a coworker, it’s an exoskeleton”—is a dangerous half-truth. An exoskeleton amplifies your strength. But what if it’s amplifying your laziness instead?

Look at your recent commits. The ones where you let Claude run for 30+ minutes without intervention. Now look at the commits where you intervened every 3-4 tool calls. Which ones have more interesting architectural decisions? Which ones solved the right problem instead of the obvious problem?

I already know the answer.

The Originality Debt

Here’s the thing nobody wants to admit: LLMs are convergence engines. They pull toward the mean. Toward the most common pattern in their training data. Toward boring.

When you offload ideation to a model:

  • Your architecture looks like everyone else’s
  • Your error handling follows the same patterns
  • Your “innovative” solutions are just recombinations of Stack Overflow answers from 2023

You’re accumulating originality debt. And unlike technical debt, you can’t refactor your way out of it later. The thinking you didn’t do is gone forever.

The Fix (If You’re Brave Enough)

I’m not telling you to stop using agents. I’m telling you to use them differently.

1. Intervene Earlier, Not Later

Set your auto-approve threshold lower. Force yourself to review every 5-10 tool calls, not every 50. The interruption cost is the point. It makes you think.
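
If you want to see what that discipline looks like mechanically, here’s a minimal sketch, assuming your agent harness gives you a hook around tool dispatch. The names (ReviewGate, dispatch) are mine for illustration, not Claude Code settings:

```python
# Hypothetical review gate: wrap whatever function dispatches tool calls in your
# own harness so that every N calls blocks on a human decision.
from typing import Any, Callable


class ReviewGate:
    def __init__(self, dispatch: Callable[..., Any], every_n: int = 5):
        self.dispatch = dispatch  # the real tool-call executor
        self.every_n = every_n    # pause for review every N calls
        self.calls = 0

    def __call__(self, tool_name: str, **kwargs: Any) -> Any:
        self.calls += 1
        if self.calls % self.every_n == 0:
            answer = input(
                f"[{self.calls} tool calls] about to run {tool_name}({kwargs}). "
                "Approve? [y/N] "
            )
            if answer.strip().lower() != "y":
                raise RuntimeError(f"Rejected at call #{self.calls}: {tool_name}")
        return self.dispatch(tool_name, **kwargs)


# Usage: gate = ReviewGate(my_dispatch, every_n=5); gate("read_file", path="main.py")
```

The point isn’t the code. The point is that the pause is deliberate and cheap to build, and you are the one who decides where it goes.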

2. Write Before You Prompt

Before asking Claude to solve a problem, write out your understanding of the problem. By hand. In prose. No copy-pasting. If you can’t articulate it, you don’t understand it—and Claude will exploit that gap with confident nonsense.

3. Use Agents for Execution, Not Ideation

Let Claude write the boilerplate. Let it run the tests. Let it refactor the repetitive shit. But you decide the architecture. You name the abstractions. You draw the boundaries.

The way human beings tend to have original ideas is to immerse in a problem for a long period of time, which is something that flat out doesn’t happen when LLMs do the thinking.

4. Measure Boringness

Start tracking something new: how many decisions did you make today versus how many did you merely approve? If your approval-to-decision ratio is above 3:1, you’re sliding.
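
The bookkeeping is trivial. Here’s a minimal sketch of what I mean; the names (record, ratio) are mine, not a real tool:

```python
# Hypothetical boringness tracker: log each choice as "decided" or "approved"
# and compute the approval-to-decision ratio at the end of the day.
from collections import Counter

log = Counter()

def record(kind: str) -> None:
    # kind is "decided" (you made the call) or "approved" (you rubber-stamped)
    assert kind in ("decided", "approved")
    log[kind] += 1

def ratio() -> float:
    # approvals per decision; above 3.0 means you're sliding
    return log["approved"] / max(log["decided"], 1)

# Example day: 3 real decisions, 11 approvals -> ratio ~3.7, over the line
for _ in range(3):
    record("decided")
for _ in range(11):
    record("approved")
print(f"approval-to-decision ratio: {ratio():.1f}")
```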

The Hard Truth

The agentic revolution is happening. That’s not the question. The question is: what kind of engineer do you become in the process?

Anthropic’s data shows we’re already choosing. More autonomy. Less intervention. Smoother workflows. Boring outputs.

You can feel it in the Show HN posts. The vibe-coded projects that all look the same. The “innovative” products that are just thin wrappers around the same three APIs. The loss of perspective that used to come from talking to someone who’d thought about a problem for way longer than you had.

The cool part about pre-AI show HN is you got to talk to someone who had thought about a problem for way longer than you had. It was a real opportunity to learn something new, to get an entirely different perspective.

That’s gone. Replaced by speed. By throughput. By forty-five-minute autonomous runs that produce code nobody can explain.

The Choice

Here’s your choice, and it’s binary:

Option A: Continue down the autonomy path. Let Claude run longer. Auto-approve more. Ship faster. Become the engineer who ships a lot of very competent, very forgettable code.

Option B: Reintroduce friction. Intervene early. Think before prompting. Own the abstractions. Become the engineer whose work has texture. Whose decisions are defensible because they were made, not approved.

One path makes you productive. The other makes you interesting.

I know which one I’m choosing.



Challenge: For the next week, track your intervention rate. How many times do you stop Claude before it asks? Share your numbers. I’ll go first: I intervene every 4-6 tool calls, no exceptions. Even when it’s annoying. Especially when it’s annoying.

What’s your ratio?
