
22 April 2026 · 8 min read

Agentic development is a discipline, not a tool

Claude Code and similar tools are widely available. The discipline required to use them at production speed is not. Here is what it actually takes.

Agentic dev · Claude Code · Workflow · AI engineering

The tool is not the constraint

Twelve months ago, the bottleneck in AI-assisted development was the tool. The available tools were limited and brittle, and they required constant supervision.

That is no longer the bottleneck. Claude Code is good. The models underlying it are good. The tooling ecosystem has matured to the point where the tool is not what separates teams that ship fast from teams that do not.

The constraint now is discipline.

Discipline is the set of habits, practices, and decisions that determine whether an agentic development workflow compounds into leverage or compounds into chaos.

Most teams that adopt Claude Code experience the former for a week and the latter for a month. They get fast results on isolated tasks, then get burned by an agent that misunderstood scope, touched the wrong files, or produced code that passes tests but fails in production. They conclude that the tool is unreliable and return to slower workflows.

The tool is not the problem.

What discipline means in agentic development

Scope clarity before invocation. An agentic workflow is only as good as its initial context. Vague instructions produce vague results. Precise instructions — including explicit scope boundaries ("only modify files in /src/api/", "do not change the database schema", "the test suite must pass") — produce precise results.

This requires more upfront thinking than traditional development workflows. You have to know what you want before you ask for it. The payoff is that the implementation comes back faster and requires fewer revision cycles.
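A scope boundary is easier to hold when it can be checked mechanically. The following is a minimal sketch, not a prescribed tool: the function name `out_of_scope` and its parameters are my own illustration, and it assumes you already have the list of changed paths (for example from `git diff --name-only`) using forward slashes.

```python
from pathlib import PurePosixPath


def out_of_scope(changed_files, allowed_dirs, forbidden_dirs=()):
    """Return the changed paths that violate the agreed scope.

    A path is a violation if it sits under a forbidden directory,
    or if it sits under none of the allowed directories.
    All names here are hypothetical, chosen for illustration.
    """

    def inside(path, directory):
        # Compare whole path components, so "src/apix" is not inside "src/api".
        return PurePosixPath(directory) in PurePosixPath(path).parents

    violations = []
    for path in changed_files:
        allowed = any(inside(path, d) for d in allowed_dirs)
        forbidden = any(inside(path, d) for d in forbidden_dirs)
        if forbidden or not allowed:
            violations.append(path)
    return violations
```

Run against the agent's changes before review, an empty result means the agent stayed inside the boundary you wrote down; anything else is worth reading first.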

Verification at every checkpoint. Agentic development workflows can move faster than human review can follow if you let them. This is when errors compound. The discipline is to review at each logical checkpoint — not every file, but every meaningful unit of change — before proceeding to the next.

The checkpoint structure depends on the task. For a new feature: review the interface design before implementation, review the implementation before tests, review tests before integration. For a refactor: review the scope identification before any changes, review each module after its changes, verify the test suite after each module.
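One way to make the ordering explicit is a small gate that refuses to move on until the current stage has been reviewed. This is a sketch under my own assumptions: `CheckpointGate` and `FEATURE_REVIEWS` are hypothetical names, and the stage list simply mirrors the new-feature sequence described above.

```python
from dataclasses import dataclass, field

# Review stages for a new feature, in order (integration follows the last one).
FEATURE_REVIEWS = ["interface design", "implementation", "tests"]


@dataclass
class CheckpointGate:
    """Tracks which review stages have been approved, strictly in order."""

    stages: list
    approved: list = field(default_factory=list)

    def next_stage(self):
        """The stage awaiting review, or None once every stage is approved."""
        if len(self.approved) < len(self.stages):
            return self.stages[len(self.approved)]
        return None

    def approve(self, stage):
        """Record a review; raise if an earlier stage was skipped."""
        expected = self.next_stage()
        if stage != expected:
            raise ValueError(f"review {expected!r} before {stage!r}")
        self.approved.append(stage)
```

A refactor would use a different stage list (scope identification, then one entry per module), which is the point: the structure is data, so each team can define its own.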

Test infrastructure that runs in seconds. Agentic workflows generate code faster than traditional workflows. The bottleneck shifts from code generation to code verification. If your test suite takes ten minutes to run, you cannot verify in the tight loops that agentic development enables.

Investment in fast, reliable test infrastructure is not optional in an agentic workflow — it is the thing that determines whether you can trust the output.
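To keep the loop honest, it can help to treat "too slow" as a failure in its own right. The sketch below assumes the test suite is started by a single command; the function name `run_within_budget` and the ten-second default are illustrative choices of mine, not a recommendation for any particular codebase.

```python
import subprocess
import time


def run_within_budget(command, budget_seconds=10.0):
    """Run a test command and report pass/fail plus whether it fit the budget.

    `command` is an argument list such as ["pytest", "-q"] (an example only).
    The process is stopped if it runs longer than `budget_seconds`.
    """
    start = time.perf_counter()
    try:
        result = subprocess.run(command, timeout=budget_seconds)
    except subprocess.TimeoutExpired:
        # The suite did not finish in time: count it as a failed verification.
        return {"passed": False, "timed_out": True, "seconds": budget_seconds}
    elapsed = time.perf_counter() - start
    return {"passed": result.returncode == 0, "timed_out": False, "seconds": elapsed}
```

If the suite regularly times out, that is a signal to split out a fast subset for the inner loop rather than to raise the budget.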

Context management as a first-class concern. LLMs have finite context windows. An agentic workflow that accumulates context over a long session eventually becomes confused by its own history: the early decisions crowd out the current task.

The discipline is to start fresh sessions for new tasks, to summarize context rather than preserve it verbatim, and to notice when a session is getting confused and reset rather than trying to correct in the same context.
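The reset habit can be backed by a crude signal. The sketch below is an assumption-heavy illustration: `SessionLog` is a hypothetical helper, it measures size in characters rather than real tokens, and the threshold is arbitrary. It only shows the shape of the practice: track what has accumulated, and hand off a summary instead of the full history.

```python
from dataclasses import dataclass, field


@dataclass
class SessionLog:
    """Rough record of what has been sent to one agent session."""

    budget_chars: int = 40_000  # hypothetical threshold; tune to your setup
    entries: list = field(default_factory=list)

    def add(self, text):
        self.entries.append(text)

    def size(self):
        # Character count is a stand-in for tokens, not an exact measure.
        return sum(len(entry) for entry in self.entries)

    def should_reset(self):
        return self.size() > self.budget_chars

    def handoff(self, summary):
        """Start a fresh log seeded only with a human-written summary."""
        return SessionLog(budget_chars=self.budget_chars, entries=[summary])
```

The summary passed to `handoff` is written by the person driving the session; the code does not try to decide what mattered.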

The compounding dynamic

The teams that use agentic development well have a specific property: each week, they are faster than the week before.

This is not primarily because the tool is improving (though it is). It is because their own practices are compounding. They learn which task types benefit most from the workflow. They build libraries of patterns the agent can follow. They establish verification checkpoints that catch errors early. They get better at writing precise initial context.

The teams that do not compound are stuck in a cycle: use the tool enthusiastically, get burned by an unexpected failure, restrict usage, get used to the restriction, forget why they restricted it, try again. Their relationship with the tool oscillates rather than grows.

The tasks that benefit most

Not all development tasks benefit equally from agentic workflows. The ones that benefit most share three properties:

Well-specified interfaces. Tasks where the inputs and outputs are clearly defined — implement this API endpoint, write tests for this function, refactor this module to follow this pattern — are excellent candidates. The agent has a clear success criterion.

High repetition-to-novelty ratio. Tasks that involve applying a known pattern many times (adding logging to every function, writing type definitions from existing code, generating boilerplate for a new module type) benefit enormously. The agent does not tire; you review once and spot-check.

Contained blast radius. Tasks where the scope of potential errors is limited — a single file, a module with a clear boundary, a feature behind a feature flag — allow for faster iteration because failures are contained.

The tasks that benefit least are those that require deep domain knowledge the agent does not have, those with ambiguous success criteria, and those whose failure mode is silent (the code runs but produces wrong results that only surface in production).

What to install in a team

When I work with engineering teams on agentic workflow adoption, the output is not a set of tool configurations. It is a set of practices:

A task template. A standard format for writing agentic task descriptions that includes: goal, constraints, files in scope, files out of scope, success criteria, verification steps. Takes five minutes to fill out. Eliminates most scope failures.
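As a rough sketch, the template can live as a small data structure so that empty sections are visible before the agent is invoked. The six fields come straight from the list above; the class name `AgentTask`, its method names, and the output layout are my own assumptions.

```python
from dataclasses import dataclass, field, fields


@dataclass
class AgentTask:
    """One agentic task description, using the six template sections."""

    goal: str = ""
    constraints: list = field(default_factory=list)
    files_in_scope: list = field(default_factory=list)
    files_out_of_scope: list = field(default_factory=list)
    success_criteria: list = field(default_factory=list)
    verification_steps: list = field(default_factory=list)

    def missing_sections(self):
        """Names of sections left empty, so gaps are caught before invocation."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def render(self):
        """Format the task as plain text to paste into the agent session."""
        lines = [f"Goal: {self.goal}"]
        for f in fields(self):
            if f.name == "goal":
                continue
            title = f.name.replace("_", " ").capitalize()
            lines.append(f"{title}:")
            lines.extend(f"  - {item}" for item in getattr(self, f.name))
        return "\n".join(lines)
```

A simple team rule might be that a task is not sent while `missing_sections()` returns anything, which forces the "files out of scope" question to be answered every time.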

Checkpoint protocols. Team-specific definitions of when to stop and review, and what to look at. Not generic checklists — specific to the team's codebase, test infrastructure, and risk profile.

A failure taxonomy. A running log of the failure modes the team encounters, with the root cause (usually: scope ambiguity, missing context, or insufficient verification) and the practice change that would prevent recurrence. This is how practices improve over time.
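The log does not need special tooling. A minimal sketch, assuming three root-cause categories taken from the paragraph above plus a catch-all; `RootCause`, `FailureEntry`, and `causes_by_frequency` are hypothetical names used only for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class RootCause(Enum):
    SCOPE_AMBIGUITY = "scope ambiguity"
    MISSING_CONTEXT = "missing context"
    INSUFFICIENT_VERIFICATION = "insufficient verification"
    OTHER = "other"


@dataclass
class FailureEntry:
    """One observed failure, its cause, and the practice change it suggests."""

    description: str
    root_cause: RootCause
    practice_change: str


def causes_by_frequency(log):
    """Root causes ordered from most to least frequent, with counts."""
    return Counter(entry.root_cause for entry in log).most_common()
```

Reviewing the output of `causes_by_frequency` every few weeks shows which practice (the task description, the context provided, or the checkpoints) deserves attention next.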

Pairing habits. Agentic development works best when a human is reviewing output in real time, not after the fact. This changes the nature of pair programming — one person writes the task, the agent implements, the partner reviews continuously.

The honest picture

Agentic development is not a shortcut to software quality. It is a multiplier on the quality of thinking that precedes it. If you think clearly about what you are building and why, if you verify rigorously, and if you establish discipline before you go fast — the speed compounds into something remarkable.

If you go fast first and establish discipline later, you spend the discipline-building phase untangling the damage from the fast phase.

The teams I have seen use this workflow well built the discipline first. The speed followed.
