AI Development · Workflow

The Dark Factory Fallacy

Everyone is writing longer specifications before building. I think they are solving the wrong problem — and my own experience building Pastoral Rhythm taught me why.

June 2026 · Est. reading time: 6 min

There is a pattern I keep noticing in how people talk about agentic development right now. The conversation goes something like this: the AI agent made a mess of the build because the specification was not detailed enough. The solution? Write a better specification. A more complete one. One that anticipates every decision before a single line of code is written.

I understand the instinct. I have felt it myself. If I could just describe the product perfectly upfront, the agent could execute perfectly end to end — no back and forth, no corrections, no human in the loop at all. A dark factory for software. You put in a specification, a finished product comes out the other side.

The more I have built with agents, the more I think this is a category error. Not a workflow error. A conceptual one. And the distinction matters, because it changes what you should actually be trying to do.

Where the dream comes from

The dark factory idea is borrowed from manufacturing. In a car plant, the product is already known before the factory is built. The car exists as a complete design. The factory's job is to reproduce it identically, at scale, with minimal variation. Uncertainty has been removed. What remains is execution.

Software is almost the opposite of this. The product is not known before you build it. It is discovered during construction. You build a feature, realise the feature was the wrong thing, change direction, discover a constraint you could not have anticipated, then change the architecture accordingly. The building itself generates the information you needed to build it correctly.

"The product is not known before you build it. It is discovered during construction."

This is not a failure of planning. It is the nature of the work. And a specification, no matter how detailed, cannot remove that uncertainty — it can only move it. Instead of discovering things during the build, you discover them during the specification phase. The uncertainty does not disappear. It relocates.

What building Pastoral Rhythm actually looked like

I built Pastoral Rhythm — a platform for managing pastoral care in churches and ministries — with agents doing a significant portion of the implementation work. Before starting, I put real effort into specification. I documented the problem, the user flows, the data model, the key screens. I thought I had a clear picture.

What I had was maybe 20 percent of the decisions I actually needed to make.

The other 80 percent emerged from seeing the UI in the browser. Once I could actually interact with the care record flow, I realised the structure I had specified felt right on paper but awkward in practice. The way I had modelled pastoral notes did not match how a pastoral coordinator actually thinks about a person's journey. The assignment logic I had planned needed a concept — a "follow-up status" — that I had not anticipated at all.

None of this could have been written into the spec. Not because I was bad at specifying, but because the information did not exist until I could see and interact with the thing I was building. The UI made it visible.

What I planned

Specify everything → Agent builds → Done

What actually happened

Specify roughly → Agent builds → I see the UI → New decisions surface → Refine and continue

The second loop is not a failure mode. It is the process. The spec was always going to be incomplete. The question is whether you design your workflow around that reality or fight against it.
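The loop can be sketched in a few lines of Python. This is a minimal illustration, not the tooling I used on Pastoral Rhythm: `Build`, `discovery_loop`, `agent_build`, and `human_review` are all hypothetical names, and the stub functions stand in for a real agent run and a real review session. The point it shows is structural: the list of pending decisions is filled by the review step, not by the initial spec.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Build:
    """What the agent produced in one iteration (hypothetical type)."""
    iteration: int
    goal: str


def discovery_loop(
    first_goal: str,
    agent_build: Callable[[str, int], Build],
    human_review: Callable[[Build], List[str]],
    max_iterations: int = 10,
) -> List[Build]:
    """Alternate agent builds with human review until no new decisions surface."""
    builds: List[Build] = []
    pending = [first_goal]
    for i in range(1, max_iterations + 1):
        if not pending:
            break
        goal = pending.pop(0)
        build = agent_build(goal, i)
        builds.append(build)
        # Seeing the running build is what surfaces the next decisions.
        pending.extend(human_review(build))
    return builds


# Stubs for illustration only: a real agent call and a real review go here.
def stub_agent(goal: str, iteration: int) -> Build:
    return Build(iteration=iteration, goal=goal)


def stub_review(build: Build) -> List[str]:
    surfaced = {
        "care record flow": ["rework pastoral notes model", "add follow-up status"],
    }
    return surfaced.get(build.goal, [])


history = discovery_loop("care record flow", stub_agent, stub_review)
print([b.goal for b in history])
```

Only the first goal comes from the rough spec. The other two exist because a human looked at the first build.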

Two kinds of work agents are being asked to do

The reason the "write a better spec" instinct keeps appearing is that it works for one kind of agentic work but not the other. I have started thinking about this as a fundamental distinction.

Type 1: Execution Work

Build a CRUD screen
Write test coverage
Refactor a component
Generate an API client
Migrate a framework version

Type 2: Discovery Work

What should this feature actually be?
What is the right UX here?
What tradeoffs should we make?
What does a user actually need?
What does this data model miss?

Agents are genuinely excellent at Execution Work. The task is well defined, the output is verifiable, and a good specification maps almost directly to what needs to be built. Here, more detail in the spec generally means a better result.

Discovery Work is completely different. No amount of specification resolves it, because the information required to make the decision does not exist until you start building. Trying to solve Discovery Work with a more detailed spec is not just inefficient — it is impossible. You are asking the spec to contain answers it cannot contain yet.

The mistake most people are making right now is applying the logic of Execution Work to Discovery Work. They write longer specifications hoping to eliminate the uncertainty that emerges from building. They cannot. They are just moving those decisions to an earlier, less informed moment.

What a true dark factory would actually require

A genuinely autonomous dark factory for software — one that truly needs no human in the loop — would require something far more ambitious than a detailed specification. It would require the agent to handle the full Discovery loop itself.

That means building a feature, observing how it actually behaves, forming a hypothesis about what is wrong, making a judgment call about the right direction, and then changing course accordingly. Not executing a predetermined plan. Actively deciding as reality reveals new information.

At that point, the agent is no longer acting only as a developer. It is simultaneously the product manager, the designer, the architect, and the person who decides whether any of it was the right thing to build in the first place. That is a fundamentally different system, and we are not close to it yet. What we have today are agents that are excellent at execution and still quite weak at judgment.

What I think actually works

The mental model I have settled into is not "write the perfect spec and let the agent run." It is closer to how a founder works with a strong engineering team. You do not specify every detail. You hold the vision, set the direction, and make the calls that actually determine whether the product succeeds. The team fills in the thousands of smaller decisions along the way.

With agents, the division looks something like this:

Owner	What they hold
Human	Vision, goals, constraints, risk tolerance, and all strategic direction
Agent	Implementation, design options, refactoring, research, and tactical decisions

The key word in the human column is strategic. You stay responsible for the decisions that actually determine whether the product is the right thing — the ones that require judgment about users, markets, and tradeoffs that cannot be derived from a codebase. The agent handles the decisions that can be derived: the architecture choices, the component structure, the edge cases.
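One way to make that division explicit is a small routing table. The sketch below is only an illustration of the table above: the `Owner` enum, the `OWNERSHIP` mapping, and `owner_for` are hypothetical names, and the rule that anything unlisted goes to the human is my assumption, chosen because an unclassified decision is more likely to be a judgment call than a derivable one.

```python
from enum import Enum


class Owner(Enum):
    HUMAN = "human"
    AGENT = "agent"


# Categories copied from the table above; the mapping itself is illustrative.
OWNERSHIP = {
    "vision": Owner.HUMAN,
    "goals": Owner.HUMAN,
    "constraints": Owner.HUMAN,
    "risk tolerance": Owner.HUMAN,
    "strategic direction": Owner.HUMAN,
    "implementation": Owner.AGENT,
    "design options": Owner.AGENT,
    "refactoring": Owner.AGENT,
    "research": Owner.AGENT,
    "tactical decisions": Owner.AGENT,
}


def owner_for(kind: str) -> Owner:
    """Return who holds a decision of the given kind.

    Assumption: anything not listed is escalated to the human.
    """
    return OWNERSHIP.get(kind.strip().lower(), Owner.HUMAN)


print(owner_for("Refactoring").value)    # listed under the agent
print(owner_for("pricing model").value)  # unlisted, so it escalates
```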

In practice for Pastoral Rhythm, this meant my specifications were deliberately lighter than I initially thought they should be. I would describe the problem and the goal clearly, let the agent build something, then sit with it and make the decisions that only became visible once I could interact with what it had produced. The spec was not the source of truth. The running product was.

The spec is not the source of truth. The product is the source of truth. The spec is a snapshot of your current understanding — and as soon as reality disagrees with it, you update the spec, not the product.

What is actually worth preserving

If specifications are not the thing to invest in, what is? What I have found most valuable is a record of the reasoning behind decisions.

Six months into a project, you forget why you chose a particular data model. You forget why a feature was cut. You forget why the onboarding flow works the way it does. Agents, with no persistent memory across sessions, forget immediately. Without a record of the reasoning, both you and the agent will revisit and relitigate settled decisions indefinitely.

What I try to maintain now is closer to a living project memory than a living specification. Four things: the product vision, which is stable; the current state of the codebase, which the agent can inspect directly; a log of significant decisions and the reasoning behind them; and the next goal, which is short-lived and replaced after every iteration. That is enough context for most agent sessions, and it does not require keeping a sprawling spec document in sync with a codebase that is changing weekly.
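As a rough sketch, those four things fit in a very small data structure. Everything here is hypothetical: `ProjectMemory`, `Decision`, the field names, and the text layout of `to_context` are assumptions for illustration, not a format any agent tool requires. The codebase is stored as a path only, since the agent can inspect it directly.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Decision:
    summary: str
    reasoning: str  # the part that is forgotten first and matters most


@dataclass
class ProjectMemory:
    vision: str         # stable
    codebase_path: str  # pointer only; the agent inspects the code itself
    decisions: List[Decision] = field(default_factory=list)
    next_goal: str = ""  # short-lived, replaced after every iteration

    def record(self, summary: str, reasoning: str) -> None:
        """Append a significant decision together with its reasoning."""
        self.decisions.append(Decision(summary, reasoning))

    def start_iteration(self, goal: str) -> None:
        """Replace the previous goal; old goals are not kept."""
        self.next_goal = goal

    def to_context(self) -> str:
        """Render the memory as plain text to open an agent session with."""
        lines = [
            f"Vision: {self.vision}",
            f"Codebase: {self.codebase_path} (inspect directly)",
            "Decisions:",
        ]
        lines += [f"- {d.summary} (why: {d.reasoning})" for d in self.decisions]
        lines.append(f"Next goal: {self.next_goal}")
        return "\n".join(lines)


memory = ProjectMemory(
    vision="Help churches and ministries manage pastoral care",
    codebase_path=".",
)
memory.record(
    "Add a follow-up status to assignments",
    "The assignment logic needed it once the UI was visible",
)
memory.start_iteration("Rework the care record flow")
context = memory.to_context()
print(context)
```

The decision log only grows, the goal is overwritten, and nothing tries to mirror the codebase. That asymmetry is the difference between a project memory and a spec that has to be kept in sync.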

The specification is not the bottleneck

I do not think the people writing longer and longer specs are wrong to want clarity before building. Clarity is genuinely valuable. The problem is the assumption buried inside the effort — that if the spec is detailed enough, the human can exit the loop entirely.

That exit is not available yet. Not because agents cannot execute well enough, but because the information that drives the most important decisions does not exist at specification time. It emerges from building. The role of the human is not to front-load every decision into a document. It is to be present when reality surfaces the decisions that could not be anticipated.

The dark factory for software is probably coming. But it will not be powered by better specifications. It will be powered by agents becoming good enough at judgment to make the thousands of Discovery decisions themselves — and to know which ones still need a human in the room.

We are not there yet. In the meantime, the workflow that actually works is one that embraces the loop rather than tries to eliminate it.