The Mythical Agent-Month
Like a lot of people, I’ve found that AI is terrible for my sleep schedule. In the past I’d wake up briefly at 4 or 4:30 in the morning to have a sip of water or use the bathroom; now I have trouble going back to sleep. I could be doing things. Before I would get a solid 7-8 hours a night; now I’m lucky when I get 6. I’ve largely stopped fighting it: now when I’m rolling around restlessly in bed at 5:07am with ideas to feed my AI coding agents, I just get up and start my day.
Among my inner circle of engineering and data science friends, there is a lot of discussion about how long our competitive edge as humans will last. Will having good ideas (and lots of them) still matter as the agents begin having better ideas themselves? The human expert in the loop feels essential now for getting good results from the agents, but how long until our wildest ideas can be turned into working, tasteful software while we sleep? Will it be a gentle obsolescence where we happily hand off the reins, or something else?
For now, I feel needed. I don’t describe the way I work now as “vibe coding” as this sounds like a pejorative “prompt and chill” way of building AI slop software projects. I’ve been building tools like roborev to bring rigor and continuous supervision to my parallel agent sessions, and to heavily scrutinize the work that my agents are doing. With this radical new way of working it is hard not to be contemplative about the future of software engineering.
Probably the book I’ve referenced the most in my career is The Mythical Man-Month by Fred Brooks, whose now-famous Brooks’s Law argues that “adding manpower to a late software project makes it later”. Lately I find myself asking whether the lessons from this book still apply in this new era of agentic development. Will a talented developer orchestrating a swarm of AI agents be able to build complex software faster and better, and will the short-term productivity gains lead to long-term project success? Or will we run into the same bottlenecks – scope creep, architectural drift, and coordination overhead – that have plagued software teams for decades?
Revisiting The Mythical Man-Month (TMMM)
One of Brooks’s central arguments is that a small team of elite people outperforms a large team of average ones: his “surgical team” model puts one “chief surgeon” in charge, supported by specialists. This yields a high degree of conceptual integrity in the system design, as if “one mind designed it, even if many people built it”.
Agentic engineering appears to raise the stakes on conceptual integrity, since the quality of the software being built is now only as good as the humans in the loop curating and refining specs, saying yes or no to features, and taming unnecessary code and architectural complexity. One of the metaphors in TMMM is the “tar pit”: “everyone can see the beasts struggling in it, and it looks like any one of them could easily free itself, but the tar holds them all together.” Now we have a new “agentic tar pit”, where our parallel Claude Code sessions and git worktrees are engaged in combat with the code bloat and incidental complexity generated by their virtual colleagues. You can systematically refactor, but invariably an agentic codebase will end up larger and more overwrought than anything built by human hand. This is technical debt on an unprecedented scale, accrued at machine speed.
In TMMM, Brooks observed that a working program is maybe 1/9th of the way to a programming product: one that has the necessary testing, documentation, and hardening against edge cases, and that is maintainable by someone other than its author. Agents are now making the “working program” (or, more accurately, the “appears-to-work” program) a great deal more accessible, though many newly minted AI vibe coders clearly underestimate the work involved in going from prototype to production.
These problems compound when we consider the closely related Conway’s Law, which observes that the architecture of a software system tends to mirror the communication structure of the organization that builds it. What does that look like when applied to a virtual “team” of agents with no persistent memory and no shared understanding of the system they are building?
Another “big idea” from TMMM that has stuck with people is the coordination problem as teams scale: a team of n people has n(n-1)/2 pairwise communication channels. With agentic engineering there are fewer humans involved, so the coordination problem doesn’t disappear but rather changes shape. Different agent sessions may produce contradictory plans that humans have to reconcile. I’ll leave this agent orchestration question for another post.
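Brooks’s arithmetic is worth pausing on, because the blow-up is quadratic rather than linear. A quick sketch in Go (my illustration, not Brooks’s) makes the channel counts concrete:

```go
package main

import "fmt"

// channels returns the number of pairwise communication
// channels in a team of n members: n(n-1)/2.
func channels(n int) int {
	return n * (n - 1) / 2
}

func main() {
	// Doubling or tripling the team roughly quadruples
	// or ~9x-es the coordination surface.
	for _, n := range []int{3, 10, 30, 100} {
		fmt.Printf("%3d members -> %4d channels\n", n, channels(n))
	}
}
```

A team of 10 already has 45 channels to keep in sync; a team of 100 has 4,950, which is the heart of why “adding manpower to a late software project makes it later”.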
No Silver Bullet
“There is no single development, in either technology or management technique, which by itself promises even one order-of-magnitude improvement within a decade in productivity, in reliability, in simplicity” – “No Silver Bullet” (1986)

Brooks wrote a follow-up essay to TMMM that looks at software design through the lens of essential complexity and accidental complexity. Essential complexity is fundamental to achieving your goal: if you made the system any simpler, it would fall short of its problem statement. Accidental complexity is everything else imposed by how we build software: programming languages, tooling, and the layer of design and documentation needed to make the system understandable to engineers.
Coding agents are probably the most powerful tool ever created for attacking accidental complexity. To think: I basically do not type code anymore, yet I now ship tons of code in a language (Go) I have never written by hand. There is a lot of discussion about whether IDEs will still be relevant in a year or two, when maybe all we need is a text editor to review diffs. The productivity gains are enormous, and I say this as someone burning north of 10 billion tokens a month across Claude, Codex, and Gemini.
But Brooks’s “No Silver Bullet” argument predicts exactly the problem I’m experiencing in my agentic engineering: the accidental complexity is no problem at all anymore, and what’s left is the essential complexity, which was always the hard part. Agents can’t reliably tell the difference. LLMs are extraordinary pattern matchers trained on the entirety of humanity’s open source software, so while they are brilliant at dealing with accidental complexity (refactor this code, write these tests, clean up this mess), they struggle with the subtler essential design problems, which often have no precedent to pattern match against. They also tend to introduce unnecessary complexity, generating large amounts of defensive boilerplate that is rarely needed in real-world use.
Put another way, agents are so good at attacking accidental complexity that they generate new accidental complexity, which can get in the way of the essential structure you are trying to build. With a couple of my new projects, roborev and msgvault, I am already dealing with this problem as I reach the 100 KLOC mark and watch the agents chase their own tails and contextually choke on the bloated codebases they have generated. At some point beyond that (the next 100 KLOC, or 200 KLOC) things start to fall apart: every new change has to hack through the code jungle created by prior agents. Call it a “brownfield barrier”. At Posit we have seen agents struggle much more in million-plus-line codebases such as Positron, a VS Code fork. This seems to support Brooks’s complexity scaling argument.
I would hesitate to place a bet on whether the present is a ceiling or a plateau. The models are clearly getting better fast, and the problems I’m describing here may look charmingly quaint in two years. But Brooks’s essential/accidental distinction gives me some confidence that this isn’t just about the current limitations of the technology. Figuring out what to build was the hard part long before we had LLMs, and I don’t see how a flawless coding agent changes that.
Agentic Scope Creep
When generating code is free, knowing when to say “no” is your last defense.
With the cost of generating code now converging to zero, there is practically nothing stopping agents and their human taskmasters from pursuing all avenues that would have previously been cost- or time-prohibitive. The temptation to spend your day prompting “and now can you just…?” is overwhelming. But any new generated feature or subsystem, while cheap to create, is not costless to maintain, test, debug, and reason about. What seems free now carries a contextual burden for future agent sessions, and each new bell or whistle becomes a new vector for brittleness or bugs that can harm users.
From this perspective, building great software projects maybe never was about how fast you can type the code. We can “type” 10x, maybe 100x faster with agents than we could before. But we still have to make good design decisions, say no to most product ideas, maintain conceptual integrity, and know when something is “done”. Agents are accelerating the “easy part” while paradoxically making the “hard part” potentially even more difficult.
Agentic scope creep also seems to be actively destroying the open source software world. Now that the bar is lower than ever for contributors to jump in and offer help, projects are drowning in torrents of 3000-line “helpful” PRs that add new features. As developers become increasingly hands-off and disengaged from the design and planning process, the agents’ runaway scope creep can get out of control quickly. When the person submitting a pull request didn’t write or fully read the code in it, there’s likely no one involved who’s truly accountable for the design decisions.
I have seen in my own work on roborev and msgvault that agents will propose overwrought solutions to problems when a simple solution would do just fine. It takes judgment to know when to intervene and how to keep the agent in check.
Design and Taste as our Last Foothold
Brooks’s argument is that design talent and good taste are the scarcest resources, and with agents now doing all of the coding labor, I argue that these skills matter more than ever. The bottleneck was never hands on keyboards. In the new “Mythical Agent-Month”, design, product scoping, and taste remain the practical constraints on delivering high-quality software.
The developers who thrive in this new agentic era won’t be the ones who run the most parallel sessions or burn the most tokens. They’ll be the ones who are able to hold their projects’ conceptual models in their mind, who are shrewd about what to build and what to leave out, and exercise taste over the enormous volume of output.
The Mythical Man-Month was published in 1975, more than fifty years ago. In that time, a lot has happened: tremendous progress in hardware performance, programming languages, development environments, cloud computing, and now large language models. The tools have changed, but the constraints are still the same.
Maybe I’m trying to justify my own continued relevance, but the reality is more complex than that. Not all software is created equal: CRUD business productivity apps aren’t the same as databases and other critical systems software. I think the median software consulting shop is completely toast. But my thesis is more about development work in the 1% tail of the distribution: problems inaccessible to most engineers. This will continue to require expert humans in the loop, even if they aren’t doing much or any manual coding. As one recent adjacent example, my friend Alex Lupsasca at OpenAI and his world-class physicist collaborators were able to formulate a hard physics problem and arrive at a solution with AI’s help. Without such experts in the loop, it’s far less clear that LLMs could both pose the questions and come up with the solutions.
For the foreseeable future, I’ll probably still be getting out of bed at 5am to feed and tame my agents. The coding is easier now, and honestly more fun, and I can spend my time thinking about what to build rather than wrestling with the tools and systems around the engineering process.
Thanks to Martin Blais, Josh Bloom, Phillip Cloud, Jacques Nadeau and Dan Shapiro for giving feedback on drafts of this post.