The Duplicate Code Problem — Building the Same Thing Seven Times

By the time we built the eighth WordPress plugin, the same code had been written seven times. The component that connects to the Claude API — handling authentication, request formatting, rate limiting, error management, and response extraction — existed as seven slightly different implementations across seven different products. Same core logic. Seven separate files. Seven separate maintenance obligations. Seven separate points of failure when the API changed.

This is the most expensive mistake documented in this case study. Not the most dramatic, not the most technically interesting — the most expensive in actual hours, actual maintenance cost, and actual quality variance. And it is also the most common pattern in any fast-moving development environment, AI-assisted or otherwise. Understanding precisely how it happened and what it cost is the clearest possible argument for building shared infrastructure early.

The Psychology of How It Happened

The root cause is not carelessness. It is the natural psychology of fast, product-focused iteration applied without a systems lens. When building a new product, you need an API integration. The locally efficient thing is to build it for this product, in this product’s codebase, optimized for this product’s context. Building a shared, generalized version would take longer now, and the focus is on shipping this product.

That reasoning is locally rational. The product-local implementation is faster to build today. The shared generalization takes longer today. What the local reasoning misses is the global cost: every subsequent product that needs the same integration will also build it locally. Every bug fixed in one implementation needs to be fixed in six others. Every API update requires six parallel changes. Every improvement discovered in one product’s implementation never propagates to the others.

In a human development team, this pattern typically gets caught because senior developers or architects notice duplication accumulating and escalate it. In an AI-assisted solo practice, no one has that perspective except the human practitioner — and the human practitioner is focused on building the current product. The signal of duplication accumulates invisibly until the cost of maintaining it becomes impossible to ignore.

The AI contributed to the invisibility. Each new product session started fresh, with no memory of previous sessions. The AI did not notice that it had built the same API client six times before. It built the seventh from first principles, as it had built the first — producing a nearly identical but subtly different implementation, because the exact specification was slightly different each time. And because the build was fast, the cost felt negligible each time. The cumulative cost was invisible until the audit that preceded building the shared library.

The Full Inventory of Duplication

The audit before building the shared library found more duplication than expected:

Claude API client: six-plus independent implementations across six products. Each had slightly different error handling, different configuration structure, different method signatures for equivalent operations, and different approaches to API key management. Some were more robust than others. The most recent was better than the first. The first five products never received the improvements discovered in the sixth.

Tavily web search client: seven-plus implementations. Same underlying API. Seven different approaches to managing API keys, formatting search queries, handling pagination in results, and processing the response format. The seventh had better error handling and result formatting than the first. The first six products remained on their original implementations.

Pinecone vector database client: five-plus implementations. Vector database integration is complex enough that the differences between implementations were significant — some had better semantic search handling, some supported namespace isolation, some had performance optimizations. All five products had diverged from each other in ways that made each opaque without reading its specific code.

WordPress database base class: four-plus implementations. The same pattern of table name definition, CRUD operations, and query preparation, rewritten with product-specific naming and slightly different conventions. Four products, four maintenance burdens, zero shared improvement.

Total: approximately twenty-two implementations that should have been four. Eighteen redundant implementations at an average of two to four hours each. That is thirty-six to seventy-two hours of work — one to two full weeks of directed effort — that produced no net capability. And that is before counting the ongoing maintenance cost: every API change, every bug fix, every improvement to be made six times instead of once.
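What the consolidated client looks like in outline: one place for configuration, one place for request formatting, one place for response extraction. This is an illustrative sketch in Python, not the case study's actual PHP implementation; the class and field names (`SharedClaudeClient`, `ClientConfig`) are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class ClientConfig:
    """The per-product knobs; everything else is shared behavior."""
    api_key: str
    model: str = "claude-sonnet"   # placeholder model name, not a real identifier
    max_retries: int = 3
    timeout_seconds: int = 30


class SharedClaudeClient:
    """One implementation of auth, formatting, and parsing for all products."""

    def __init__(self, config: ClientConfig):
        self.config = config

    def build_request(self, prompt: str) -> dict:
        # Request formatting is defined exactly once: when the API changes,
        # this is the method that changes, for every product at the same time.
        return {
            "model": self.config.model,
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": 1024,
        }

    def parse_response(self, raw: dict) -> str:
        # Response extraction is also defined once, so a parsing bug fixed
        # here is fixed everywhere.
        try:
            return raw["content"][0]["text"]
        except (KeyError, IndexError) as exc:
            raise ValueError(f"unexpected response shape: {exc}")
```

Each product then supplies only its own `ClientConfig`; none of them re-implements retries, formatting, or parsing.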

The Real Cost: A Bug Fixed Six Times

The most concrete illustration of the maintenance cost: when Anthropic updated their API to deprecate a specific parameter in Claude’s request format, products using the old parameter would receive errors. The fix was a single line change per implementation.

Six implementations. Six separate products. Six separate sessions required to load the project, identify the relevant file, make the change, test the fix, and verify it worked in the actual plugin environment. What should have been a fifteen-minute maintenance task — one file, one change, one test — took the better part of a day. Not because any individual fix was difficult, but because the same fix had to be made six times with full context loading and testing each time.

With a shared library, that API change is one fix in one file. The change propagates to all six products immediately. One session, fifteen minutes, done. The difference between six sessions and one session for every API update, every bug fix, and every improvement multiplied across the remaining life of the products is substantial. The shared library that eventually captured this work took about thirty to forty hours to build and produces that return on every single maintenance event across the entire portfolio.
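The shape of that one-file fix, sketched in Python. The parameter names here are illustrative — `max_tokens_to_sample` is used as an example of a deprecated field, not a claim about which parameter Anthropic deprecated in this incident:

```python
def build_payload(prompt: str, max_tokens: int = 1024) -> dict:
    """Single request builder shared by every product."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        # Was: "max_tokens_to_sample": max_tokens  (deprecated example field)
        "max_tokens": max_tokens,  # the one-line change, made exactly once
    }
```

Every product that calls `build_payload` picks up the rename with no change of its own.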

When the Library Should Have Been Built

After the second product. This is the precise answer, and it is worth being specific about because the temptation to defer is always present.

After the second product, there is sufficient evidence that a component will be needed again — it has already been built twice. The cost of generalizing it is roughly equivalent to building it a third time as a project-specific implementation. The maintenance benefit begins with the third product. The break-even on the generalization investment is the third use. Every use after the third is compound return.
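The break-even arithmetic can be made explicit with a toy cost model. The numbers are assumptions (three hours per product-local build, generalization costing roughly one extra build, near-zero cost per product afterward), matching the reasoning above rather than measured data:

```python
from typing import Optional


def total_hours(n_products: int, build_hours: float = 3.0,
                generalize_after: Optional[int] = None) -> float:
    """Toy model: each product-local build costs `build_hours`. If we
    generalize after `generalize_after` builds, the generalization costs
    roughly one more build, and every later product is configuration-only."""
    if generalize_after is None:
        return n_products * build_hours            # rebuild it every time
    local_builds = min(n_products, generalize_after)
    cost = local_builds * build_hours
    if n_products >= generalize_after:
        cost += build_hours                        # the generalization itself
    return cost
```

Under these assumptions, generalizing after the second build costs the same as three local builds at the third product (break-even), and by the tenth product it is nine hours against thirty.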

The signal was visible from the very first product build — session notes from the period include an observation that the API integration pattern would be reusable. That observation did not translate into action because it felt like a future optimization, something to do when there was “more to share.” The second product should have triggered the decision. It didn’t. The third should have. It didn’t. By the time the shared library was built after the tenth product, fourteen to twenty redundant implementations had been created that should not have existed.

The Generalizable Rule

The rule that should have been operating from the beginning, and that applies to any knowledge work domain: if you have built the same component twice, build a shared version before the third use.

Not after the tenth. After the second. When you have built the same analysis framework twice, generalize it into a template before using it a third time. When you have written the same document structure twice, build a template before writing it a third time. When you have applied the same research approach twice, document it as a methodology before applying it a third time.

In each case, the cost of the generalization is roughly equivalent to one more project-specific implementation. The return begins immediately on the third use. By the tenth use, the return is substantial. By the thirtieth, it has produced more value than almost any other single investment in the practice.

AI accelerates this pattern because AI is very good at generalizing specific implementations into reusable templates — once the decision to generalize is made. The decision is the hard part. Everything else the AI handles well. Make the decision after the second implementation. Not later.

The Detection Problem

One of the reasons the duplicate code problem persisted through ten products without being fully addressed is that it was invisible in the individual product view. Looking at any single product, the API client implementation was appropriate and well-built for that product’s context. The duplication was only visible in the cross-product view — and the cross-product view was never the default perspective during active product development.

This is a general pattern in complex systems: local optimization produces globally suboptimal outcomes. Each product-local decision to build the API client inline was the locally optimal choice. The global optimum — build once, share everywhere — was invisible from the local perspective. Only a periodic cross-portfolio audit would have revealed it, and periodic audits were not built into the workflow until the shared library project was specifically undertaken.

The fix is structural: create the habit of the cross-portfolio audit as a standing event, not as something triggered by visible problems. A quarterly thirty-minute review asking “what have we built multiple times that should be shared?” would have triggered the shared library project after the second or third product. The audit costs almost nothing. The problem it prevents costs weeks of rework and ongoing maintenance overhead.

What the Shared Library Feels Like to Use

After the shared library was in place, starting a new product with API integrations changed in a concrete, experiential way. Previously: “Let’s build the Claude API integration. Here’s what we need it to do…” — and a session of architecture and implementation would follow. After: “Include the shared library’s Claude API client. The configuration follows the standard pattern. We’ll customize these three settings for this product’s context.” — and the integration was operational in a few minutes of configuration rather than a session of implementation.
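The "three settings" moment, sketched as a config fragment. The keys shown are hypothetical, invented to illustrate the shape of a product-level override rather than the case study's actual configuration:

```python
# A new product's entire API integration: one import of the shared
# client plus a handful of product-specific overrides.
product_config = {
    "api_key_option": "myplugin_claude_key",  # where this product stores its key
    "model": "claude-sonnet",                 # illustrative model name
    "timeout_seconds": 20,                    # product-specific tuning
}
```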

That shift — from build to configure — is the experiential meaning of the technical debt reduction numbers. Seventy percent code reduction means the session that previously produced the API integration is now a few lines of configuration. The time saved is not abstract. It is concrete: one hour of directed session time recovered per API integration per product, multiplied across every subsequent product that uses the shared library.

The shared library also changed the quality trajectory. When a bug was found in the Claude API client — a subtle issue with how error responses were being parsed — the fix went into the shared library and was immediately applied to all products. Previously, that fix would have been applied to the product where the bug was found, and the other six implementations would have continued to carry the same bug until it was independently discovered and fixed in each of them. The shared library made improvement propagation automatic. Every product became as good as the best product, not as good as its own implementation history.

How Recent AI Innovations Change This Picture

The duplicate code problem described in this post — building the same foundational components seven times before systematizing them — is one of the most instructive failures in this case study. The good news about AI innovation is that the tooling now available makes this failure substantially easier to avoid from the start.

Agent Skills are the direct platform-supported answer to the problem this post describes. The shared library that was laboriously built through painful retrospective consolidation is exactly what Agent Skills are designed to formalize. Skills contain instructions, patterns, templates, and resources that persist across sessions and projects. Building a “shared library skill” from the beginning — defining your foundational patterns once and loading them automatically — is now a platform feature rather than a custom infrastructure project.

The 1-million-token context window also changes the duplicate code detection problem. In the methodology described here, the AI was typically unaware of conventions established in other products because they were in separate sessions with separate contexts. With a context window large enough to hold the entire shared library plus the active project, the AI can check a new implementation against every existing implementation and flag inconsistencies in real time rather than having them accumulate undetected.

MCP filesystem connections mean the AI can scan the entire shared library before starting new work, flagging patterns that already exist and should be reused rather than reimplemented. The human discipline of “check before you build” becomes AI-assisted: the AI checks automatically, surfaces relevant existing code, and asks for confirmation before proceeding with a new implementation that duplicates something existing. The architectural discipline is the same; the automation supporting it is stronger.
