Your AGENTS.md file is making your coding agents dumber.
There, I said it. That repository context file you spent hours crafting? The one with detailed architecture diagrams, contribution guidelines, and deployment procedures? It’s actively hurting your agent’s performance.
And you’re paying 20% more in tokens for the privilege.
The Paper Nobody Wants to Talk About
On February 12th, 2026, a paper dropped on arXiv that should have sent shockwaves through the AI engineering community. Instead, it’s been quietly debated in Hacker News threads while vendors continue pushing the exact opposite advice.
The paper: “Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents?”
The findings are brutal:
- LLM-generated AGENTS.md files DECREASE performance by 3%
- Human-written AGENTS.md files only improve performance by 4%
- Token costs increase by 20%+ across all tasks
- Context pollution from unnecessary requirements makes tasks harder
Let that sink in. The industry-standard practice of adding AGENTS.md, CLAUDE.md, or .cursor/rules files to your repository is, in most cases, making your agents worse at their jobs.
The Context Pollution Problem
Here’s what’s actually happening when you feed your agent a 2,000-line AGENTS.md file:
```
Task: Fix the authentication bug in login.ts
```
The paper authors put it diplomatically: “unnecessary requirements from context files make tasks harder.”
Translation: Your agent is drowning in noise while trying to find the signal.
The 4% Lie
Defenders of AGENTS.md files will point to the 4% improvement from human-written files and say “See? It works!”
This is the most dangerous statistic in the paper.
Yes, human-written context files provide a 4% boost. But here’s what they don’t tell you:
- That 4% comes with a 20%+ token cost increase: the cost premium is five times the size of the gain
- The 4% average hides massive variance - some tasks improve 15%, others degrade 10%
- Most AGENTS.md files aren’t human-written - they’re LLM-generated, which decreases performance by 3%
A senior engineer on Hacker News put it perfectly:
“The 4% gain is ‘yuuuge’ in hard projects, but only if your AGENTS.md is actually good. Most aren’t. Most are outdated documentation dumps that confuse the agent more than they help.”
When AGENTS.md Files Actually Work
The paper reveals something crucial: AGENTS.md files only help when they describe minimal requirements.
Good AGENTS.md (150 lines): constraints first, nothing else. A sketch of the shape (contents illustrative):
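```
# Project Constraints
- TypeScript strict mode; do not add new runtime dependencies
- All database access goes through src/db/client.ts
- Run `npm test` and `npm run lint` before finishing any task
```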
Bad AGENTS.md (2,000 lines): an everything-dump that opens like a brochure (again illustrative):
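```
# Welcome to Our Amazing Project!
## Our History and Mission
## Team Culture and Values
## Full Architecture Deep-Dive
## Deployment Runbook
## Who to Contact for What
...and 1,900 more lines.
```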
The difference? The first file tells the agent what constraints matter. The second file tells the agent everything except what matters.
The Token Economics
Let’s talk money, because this is where AGENTS.md files become indefensible.
Assume you’re using Claude Sonnet 4.6 (released Feb 17, 2026):
- Input tokens: $0.15 per million
- Your AGENTS.md: 2,000 lines ≈ 3,000 tokens
- Daily agent tasks per engineer: 50
- Daily wasted tokens: 150,000
- Monthly wasted tokens: 4.5 million
- Monthly cost for AGENTS.md alone: $0.68 per engineer
That doesn’t sound like much until you scale:
- Team of 10 engineers: $6.80/month
- Company with 100 developers: $68/month
- Enterprise with 1,000 developers: $680/month
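If you want to sanity-check the arithmetic or plug in your own numbers, the calculation is trivial. The figures below are this article's assumptions, not official pricing:

```python
# Back-of-the-envelope AGENTS.md token cost, using the article's assumptions.
PRICE_PER_MILLION = 0.15      # $ per million input tokens (assumed rate)
AGENTS_MD_TOKENS = 3_000      # the article's estimate for a 2,000-line AGENTS.md
TASKS_PER_DAY = 50            # agent tasks per engineer per day
DAYS_PER_MONTH = 30

monthly_tokens = AGENTS_MD_TOKENS * TASKS_PER_DAY * DAYS_PER_MONTH  # 4,500,000
per_engineer = monthly_tokens / 1_000_000 * PRICE_PER_MILLION       # $0.675/month

for engineers in (1, 10, 100, 1_000):
    # The article scales the rounded $0.68 figure; exact math gives $0.675.
    print(f"{engineers:>5} engineers: ${per_engineer * engineers:,.2f}/month")
```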
For what? A 3% performance decrease?
And that’s just the direct cost. The indirect cost—agents taking longer to complete tasks, making more mistakes, requiring human intervention—can easily run 10x higher.
The Real Problem: We’re Treating Agents Like Humans
Here’s the fundamental mistake: we’re writing AGENTS.md files as if agents are new team members who need onboarding.
Agents aren’t humans.
A human needs context about:
- Team culture
- Historical decisions
- Deployment procedures
- Who to contact for what
An agent needs:
- The specific files to modify
- The constraints that affect the task
- The test suite to run
Everything else is noise.
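In code, the entire useful payload fits in a handful of fields. A minimal sketch (field names are illustrative, not from the paper or any vendor API):

```python
from dataclasses import dataclass, field

@dataclass
class TaskContext:
    files_to_modify: list[str]                            # the specific files in scope
    constraints: list[str] = field(default_factory=list)  # only rules that affect this task
    test_command: str = "npm test"                        # how the agent verifies its work

ctx = TaskContext(
    files_to_modify=["src/auth/login.ts"],
    constraints=["Do not add new runtime dependencies."],
)
```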
The paper authors note: “human-written context files should describe only minimal requirements.”
Not “minimal requirements plus everything else we think might be useful.” Just the requirements.
What to Do Instead
1. Delete Your AGENTS.md File
Start here. Just delete it. Watch what happens.
Your agents will:
- Complete tasks faster (less context to process)
- Make fewer mistakes (less conflicting information)
- Cost less (fewer tokens)
2. Use Task-Specific Context
Instead of a monolithic AGENTS.md, provide context per task. Compare (both prompts are illustrative):
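```
# Bad: agent reads the entire AGENTS.md (~3,000 tokens, mostly irrelevant)
Task: Fix the authentication bug in login.ts
Context: <all 2,000 lines of AGENTS.md>

# Good: agent gets only what touches this task (~50 tokens)
Task: Fix the authentication bug in login.ts
Context: Auth flows live in src/auth/; run `npm test` before finishing.
```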
3. Implement Context Retrieval
Build a simple retrieval system that maps task keywords to short context snippets. A minimal sketch, with illustrative rules and snippets:
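```python
# context_retriever.py
# Minimal sketch: map task keywords to short, task-relevant snippets.
# Keywords and snippets here are illustrative; in practice, keep them in a
# small data file maintained alongside the repo.

CONTEXT_RULES = {
    "auth": "Auth flows live in src/auth/; tokens are validated in login.ts.",
    "database": "All queries go through src/db/client.ts; never import pg directly.",
    "deploy": "Agents never deploy; open a PR and let CI handle it.",
}

def retrieve_context(task_description: str) -> str:
    """Return only the snippets whose keyword appears in the task text."""
    task = task_description.lower()
    return "\n".join(
        snippet for keyword, snippet in CONTEXT_RULES.items() if keyword in task
    )

if __name__ == "__main__":
    print(retrieve_context("Fix the authentication bug in login.ts"))
    # -> Auth flows live in src/auth/; tokens are validated in login.ts.
```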
Now your agent gets 50 tokens of relevant context instead of 3,000 tokens of noise.
4. Monitor Agent Performance
Track these metrics:
- Task completion rate (before/after removing AGENTS.md)
- Token consumption per task
- Human intervention frequency
- Time to first correct solution
If removing AGENTS.md improves these metrics (and the paper suggests it will), you’ve just optimized your entire agent workflow.
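You don’t need a metrics platform for this; an append-only CSV is enough to start. A sketch (the schema is an assumption, adjust to taste):

```python
# metrics_log.py -- append one row per agent task so you can compare
# weeks with and without AGENTS.md. Schema is illustrative.
import csv
import datetime
import pathlib

LOG = pathlib.Path("agent_metrics.csv")
FIELDS = ["timestamp", "task_id", "agents_md_present",
          "completed", "tokens_used", "human_interventions", "minutes_to_solution"]

def log_task(task_id: str, agents_md_present: bool, completed: bool,
             tokens_used: int, human_interventions: int,
             minutes_to_solution: float) -> None:
    new_file = not LOG.exists()
    with LOG.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if new_file:
            writer.writeheader()
        writer.writerow({
            "timestamp": datetime.datetime.now().isoformat(timespec="seconds"),
            "task_id": task_id,
            "agents_md_present": agents_md_present,
            "completed": completed,
            "tokens_used": tokens_used,
            "human_interventions": human_interventions,
            "minutes_to_solution": minutes_to_solution,
        })
```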
The Vendor Incentive Problem
Here’s why nobody’s talking about this paper: vendors have every incentive to keep you using AGENTS.md files.
More context = more tokens = more revenue.
Anthropic, OpenAI, Cursor, GitHub Copilot—they all benefit from you dumping massive context files into every agent session. The 20% token increase isn’t a bug; it’s a feature.
The paper authors acknowledge this indirectly:
“While context files are widely recommended by AI coding assistant vendors, our results suggest that their benefits may be overstated.”
Translation: Vendors are lying to you.
The Path Forward
The AGENTS.md debate isn’t about whether context matters. It’s about signal-to-noise ratio.
Good context:
- Minimal (under 200 tokens)
- Task-specific
- Constraint-focused
- Regularly updated
Bad context:
- Massive (2,000+ lines)
- Repository-wide
- Information-dumped
- Written once, never updated
The paper gives us a framework for distinguishing between the two. It’s time we started using it.
Your Move
Here’s your challenge:
- Audit your AGENTS.md file - How many lines are actually constraints vs. fluff? (A rough audit script follows this list.)
- Measure your token usage - How much are you spending on context that doesn’t help?
- Run an A/B test - Try one week without AGENTS.md, track performance
- Share your results - The community needs real-world data, not vendor marketing
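A rough first pass at the audit, assuming a markdown AGENTS.md; the heuristic for what counts as a "constraint" is deliberately crude:

```python
# audit_agents_md.py -- rough signal-to-noise estimate for an AGENTS.md.
# Heuristic: lines that read like rules ("must", "never", "do not", "run")
# count as constraints; everything else is treated as fluff.
import re
import sys

CONSTRAINT_HINT = re.compile(r"\b(must|never|always|do not|don't|run|use only)\b", re.I)

def audit(path: str) -> None:
    with open(path, encoding="utf-8") as f:
        lines = [line.strip() for line in f if line.strip()]
    constraints = [line for line in lines if CONSTRAINT_HINT.search(line)]
    tokens = sum(len(line.split()) for line in lines) * 4 // 3  # ~4 tokens per 3 words
    print(f"{len(lines)} non-empty lines, ~{tokens} tokens")
    print(f"{len(constraints)} look like constraints "
          f"({100 * len(constraints) // max(len(lines), 1)}% of the file)")

if __name__ == "__main__":
    audit(sys.argv[1] if len(sys.argv) > 1 else "AGENTS.md")
```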
The research is clear. The economics are clear. The only thing standing between you and better agent performance is the courage to delete that file.
What Do You Think?
Is your AGENTS.md file helping or hurting? Have you measured the actual impact on your agent’s performance? Drop your findings in the comments—let’s build a data-driven understanding of what actually works.
Because right now, we’re all paying 20% more for 3% worse performance.
That’s not just bad engineering. It’s bad business.
Related Reading:
- Zero-Polling Agentic Workflows - Cut your agent token costs by 50%
- The Agentic Isolation Trap - Why enterprise AI agents fail in production
- Agentic ROI: Reliability Over Features - What actually matters for production agents
Primary Sources:
- "Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents?" (arXiv, February 2026)