The AI Engineering Manager

Jun 11, 2026 · 6 min read

“Review is now the bottleneck” is the new saying on Twitter. It’s totally right.

This might not be surprising to you at all if you’ve been heavily using AI coding tools. In the last 6-9 months, not only have models improved at an exponential rate, but we’ve also seen all kinds of new harnesses, control paradigms, and use cases (outside of writing code) for LLM-based “coding” agents.

One of the biggest side effects of this exponential rate of improvement is the volume of output that comes along with it. Not only is slop a problem, but reviewing actual quality work done by LLMs is also becoming significantly more time-consuming. With models like Opus 4.5+ (and now Fable) the concern is becoming less about the objective correctness of a task and more about the subjective correctness of a task.

Objectively, a model can write Next.js API routes or bash scripts or write technically accurate documentation all day long.

Subjectively, though, a model can’t read your mind about that route or script’s functionality and can’t perfectly predict all the things you’d add to that documentation to tailor it to your users.

A good example of this is building UI. I could say to Opus, for example, “build me a table UI that doesn’t look vibe coded.” Now, to Opus “not vibe coded” could mean something completely different than my idea of vibe coding with purple gradients and colored borders. This is a simple example of just not explaining yourself well to an agent but it scales up with harder and harder engineering tasks. AI can advance as much as it wants but if we can’t explain ourselves to it, we leave a lot of “intelligence” on the table.

This is where the bottleneck of review comes into play. Review at this stage in the AI hype cycle is less about whether the code functions and more about how close the AI got to what you expected it to do. For codebases, this might mean upholding certain semantics or rules that are hard to enforce programmatically. It also, like the example above, might be about how close the agent’s definition of “done” is to your definition of “done” for that given task.

This is where the beauty of a planning agent comes in.

The AI Engineering Manager

The following is what I’d consider to be the “AI engineering manager”. In recent projects of mine, I’ve thoroughly enjoyed this approach, which involves roughly three components:

Planning Agent

A planning agent is just a normal coding agent session that is used strictly for planning.

This is what I’d actually consider to be the “AI engineering manager”, though it takes the entire system to bring it all together. This agent allows you to talk through ideas the way a traditional team might, decoupled from the place where the work gets done, discussing function over form until form becomes a needed part of the discussion.

This interface allows you to become crystal clear with your agentic tools on what you’re actually trying to accomplish. I’ve been asking my planning agent to ask me questions on features or parts of an implementation it’s not perfectly clear on so we can try to become as aligned as possible.

When you’ve reached a place of aligned objectives, the next component of the system comes into play:

Planning Substrate

The planning substrate is the place where the actual plan is stored. AI chat is great for idea discussions but it’s pretty bad for storing structured information in a way that humans can easily understand and reference.

There are tons of options here: a markdown folder, Notion pages, or a traditional issue tracking tool (Linear, Jira, etc.). I’ve been using Linear as my “planning substrate” and it has been awesome for modeling true work and inter-task dependency. Linear has all the primitives: teams, projects, milestones, and issues. Linear allows my planning agent, through the MCP, to break down work into bite-sized chunks that I can hand off to “worker bee” agents to actually be implemented.

I’m mainly using projects to model, well, projects, and using the milestones and issues within them to break up work into those digestible chunks. I would be curious to see how teams in Linear could be used as an agent control paradigm (you might have a backend team, frontend team, etc.).
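To make the “primitives” idea concrete, here is a minimal sketch of a plan modeled as projects, milestones, and issues with inter-task dependencies. This is not Linear’s API or data model; every class, field, and issue key below (`Project`, `Milestone`, `Issue`, `blocked_by`, `T-1`, and so on) is a hypothetical name chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Issue:
    """One bite-sized chunk of work for a single worker agent (hypothetical shape)."""
    key: str
    title: str
    blocked_by: list[str] = field(default_factory=list)  # keys that must finish first
    done: bool = False


@dataclass
class Milestone:
    name: str
    issues: list[Issue] = field(default_factory=list)


@dataclass
class Project:
    name: str
    milestones: list[Milestone] = field(default_factory=list)

    def all_issues(self) -> dict[str, Issue]:
        """Flatten every milestone into a key -> issue lookup."""
        return {i.key: i for m in self.milestones for i in m.issues}

    def ready(self) -> list[str]:
        """Keys of issues that are not done and whose blockers are all done."""
        issues = self.all_issues()
        return [
            key for key, issue in issues.items()
            if not issue.done and all(issues[b].done for b in issue.blocked_by)
        ]


# Made-up example plan, loosely based on the table UI example earlier.
project = Project(
    name="Table UI",
    milestones=[
        Milestone("Data layer", [
            Issue("T-1", "Define row schema"),
            Issue("T-2", "Add API route", blocked_by=["T-1"]),
        ]),
        Milestone("UI", [
            Issue("T-3", "Build table component", blocked_by=["T-1"]),
            Issue("T-4", "Wire table to API", blocked_by=["T-2", "T-3"]),
        ]),
    ],
)

print(project.ready())  # only T-1 has no blockers at the start
```

The useful part is the `blocked_by` list: once dependencies are explicit, “what can a worker bee pick up right now?” becomes a lookup instead of a conversation.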

This leads to the last and most obvious component of the system:

Execution Surface

An execution surface is where the work planned by the planning agent actually gets completed by other agents. This has been Conductor or Devin for me as of late. These tools, using the Linear MCP, can read the issues or chunks of work to be done and just do them without needing any additional context. These tools are great for fanning out tasks that don’t depend on each other. I usually ask the planning agent to give me a parallelization diagram so that I can fan out work as much as possible.
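The parallelization diagram is something I ask the agent for in chat, but the underlying idea can be sketched in a few lines: group issues into “waves” where everything in a wave has no unfinished blockers. The function name `parallel_waves`, the input shape, and the issue keys are assumptions made up for this sketch, not something Conductor, Devin, or Linear provides.

```python
def parallel_waves(blocked_by: dict[str, list[str]]) -> list[list[str]]:
    """Group issues into waves; everything in one wave can run at the same time.

    `blocked_by` maps an issue key to the keys it depends on (hypothetical shape).
    """
    remaining = {key: set(deps) for key, deps in blocked_by.items()}
    waves: list[list[str]] = []
    while remaining:
        # An issue is ready once all of its blockers have been scheduled earlier.
        wave = sorted(key for key, deps in remaining.items() if not deps)
        if not wave:
            raise ValueError(f"cycle or unknown blocker among: {sorted(remaining)}")
        waves.append(wave)
        for key in wave:
            del remaining[key]
        for deps in remaining.values():
            deps.difference_update(wave)
    return waves


# Made-up plan: T-5 is independent, T-4 needs both T-2 and T-3.
plan = {
    "T-1": [],
    "T-2": ["T-1"],
    "T-3": ["T-1"],
    "T-4": ["T-2", "T-3"],
    "T-5": [],
}

for number, wave in enumerate(parallel_waves(plan), start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

Each wave is a batch I could hand to separate agent sessions at once; the next wave starts only after the previous one is merged.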

The Disconnect

My main motivation for writing this article is the lack of communication/automation I’ve observed between the first component (the planning agent) and the third component (the execution surface). Up to this point, I’ve used AI to do the planning and ideation, create the plan, read the plan, and do the work. But I still have to ask the planning agent, which has no knowledge of the worker bee agents, for the plan to parallelize work. I also find the dance of managing a bunch of agent sessions quite tedious, particularly when reviewing and merging changes.

This article is less of a proposition of a solution and more of an observation of a problem. That being said, I’d very much like to see these three components of this AI software factory connected into a loop to create this true AI engineering manager. These three components, when disconnected, are still incredibly useful but I, as the human, am playing this managerial role and routing information and tasks between agents. To a certain degree, I want my only interface in this to be the planning agent, which will then kick off work and manage all the worker bee agents for me, only bringing an individual worker bee to my attention when necessary. I think for about 90% of the work being done that’s all I would need with the current model capabilities (that 90% will only go up as models improve).
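Purely as an illustration of the loop I’m describing, and not a claim about how any existing tool works, here is a toy sketch of a manager that dispatches ready issues to workers in parallel and only surfaces the ones that need a human. Everything here is hypothetical: `run_worker` is a stand-in for a real agent session, and the hard-coded failure on `T-3` just simulates a worker that isn’t sure what “done” means.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Result:
    key: str
    ok: bool
    note: str = ""


def run_worker(key: str) -> Result:
    """Stand-in for a worker bee agent session (hypothetical; does no real work)."""
    if key == "T-3":  # simulate one task the worker is unsure about
        return Result(key, ok=False, note="ambiguous definition of done")
    return Result(key, ok=True)


def manage(blocked_by: dict[str, list[str]]):
    """Dispatch ready issues in parallel; escalate failures to the human."""
    done: set[str] = set()
    completed: list[str] = []
    escalated: list[Result] = []
    pending = dict(blocked_by)
    with ThreadPoolExecutor(max_workers=4) as pool:
        while True:
            # Ready means every blocker has finished successfully.
            ready = sorted(k for k, deps in pending.items() if set(deps) <= done)
            if not ready:
                break
            for key in ready:
                del pending[key]
            for result in pool.map(run_worker, ready):
                if result.ok:
                    done.add(result.key)
                    completed.append(result.key)
                else:
                    escalated.append(result)  # bring this one to my attention
    # Whatever is left is stuck behind an escalated issue.
    return completed, escalated, sorted(pending)


plan = {"T-1": [], "T-2": ["T-1"], "T-3": ["T-1"], "T-4": ["T-2", "T-3"]}
completed, escalated, blocked = manage(plan)
print("completed:", completed)
print("needs me:", [(r.key, r.note) for r in escalated])
print("still blocked:", blocked)
```

In this toy run the manager finishes what it can, hands me the one ambiguous issue, and leaves its dependents parked until I weigh in, which is roughly the interface I’d want.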

I think we're going to see this pattern a whole lot more in agentic tools in general: a manager agent (that the human user interacts with) and worker bee agents (that the manager agent spins up and manages).

At the end of the day, I’m incredibly excited to see where development continues to go. The feeling of being able to “engineer” engineering as it stands today is such a wonderful experience as somebody who gets geeked up about automation.

Just some thoughts : )