Six Lessons We Should Have Learned Sooner — The Honest Retrospective
The most useful part of any case study is the retrospective — the honest accounting of what should have been done differently, stated plainly enough that a reader can apply the lesson without repeating the mistake. This post provides that accounting for the twelve-month, twenty-product case study documented in this series. Six lessons. Stated directly. With the specific rule or habit that would have made the difference for each one.
Lesson 1: Build the Shared Component After the Second Use, Not the Tenth
The rule that should have been operating from the beginning: when you have built the same component twice, generalize it before the third use. Not when there is “more to share.” Not when the duplication “becomes a problem.” After the second use. At that point, the evidence that the component will be needed again is established. The cost of generalizing it is roughly equivalent to building a third project-specific version. The maintenance benefit begins immediately. The break-even on the generalization investment is the third use.
The signal was visible from the first major product build: session notes from that period observe that the API integration pattern would be reusable. That observation did not translate into action because it felt like a future optimization. By the time it translated into action, fourteen to twenty redundant implementations had been created. Applying the “after the second use” rule would have prevented most of them.
The practice: create a component registry after the first product — a simple document listing components that have been built. When a component appears a second time, the registry is the trigger to stop and generalize. The registry costs nothing to create and maintain. What it prevents costs weeks of rework.
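As a sketch of how small this trigger can be, here is an illustrative Python version of the registry check; the component names, project names, and dictionary format are invented for the example, not a prescribed structure:

```python
# Minimal component registry check (illustrative; all names are hypothetical).
# The registry maps each component to the projects that have built a copy of it.
registry = {
    "api_integration": ["invoice-tool", "crm-sync"],
    "csv_exporter": ["invoice-tool"],
    "auth_middleware": ["crm-sync", "status-page", "invoice-tool"],
}

def components_to_generalize(registry: dict[str, list[str]]) -> list[str]:
    """Return components built twice or more -- the 'after the second use'
    trigger from Lesson 1."""
    return sorted(name for name, projects in registry.items()
                  if len(projects) >= 2)

print(components_to_generalize(registry))
```

Running this at the start of each new project turns the registry from a passive document into an active trigger: anything the function returns is generalized before the project's first build session.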
Lesson 2: Create the Context File Before the First Session
The context file does not need to be comprehensive on day one. It needs to exist. A context file with five sentences on day one, updated after each session, is more valuable than a comprehensive context file written from memory six months later. The cumulative context accuracy of a file that grows session by session is substantially higher than a file reconstructed retrospectively, because session-by-session updates capture the reasoning at the moment of the decision rather than reconstructed reasoning at a distance.
The practice: before any AI session on a new project, create the context file. Five sentences: project purpose, primary technology choice, main constraint, first architectural decision, what success looks like for v1. That is enough. Update it at the end of every session with what changed. By session ten, you have a comprehensive context file that reflects ten sessions of accurate, timely documentation. By session thirty, it is the most valuable single artifact in the project.
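The five-sentence starter can be scaffolded in a few lines. This is an illustrative sketch, not a prescribed format; the headings and placeholder wording are suggestions to adapt:

```python
from datetime import date

def context_file_template(project: str) -> str:
    """Return a day-one context file skeleton: purpose, technology choice,
    main constraint, first architectural decision, v1 success criterion."""
    return "\n".join([
        f"# Context: {project} (started {date.today().isoformat()})",
        "Purpose: <one sentence on what this product does and for whom>",
        "Primary technology: <main stack choice and why>",
        "Main constraint: <budget, deadline, platform, or compliance limit>",
        "First architectural decision: <the decision and a one-sentence why>",
        "V1 success: <what must work for v1 to count as done>",
        "",
        "## Session log",  # append one entry here at the end of every session
    ])

print(context_file_template("invoice-tool"))
```

The point of the scaffold is that filling in five placeholders takes minutes, which removes the last excuse for starting a first session without a context file.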
Lesson 3: Scope Discipline Must Be Imposed Before the First Build Session
AI systems build what you describe. They do not push back on scope. They do not have a backlog grooming function. They do not have competing priorities that limit what they can take on. The scope governance that protects a project from feature bloat is entirely the human’s responsibility, and it has to be applied before the first build session begins — not after scope has already expanded.
The practice: write the v1 list before writing any other requirements. Define the minimum set of features that must be true for the product to be useful at all. Everything else goes on the later list. The v1 list is what gets built first. The later list does not get touched until v1 is stable and tested. This discipline applied from the beginning of the first product would have saved a week on that product and established a habit that would have compounded across all subsequent products.
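The "not ready without both lists" rule can be made mechanical. A minimal sketch, assuming the requirements document is markdown and that the two heading names below are your own convention rather than a fixed standard:

```python
def build_ready(requirements: str) -> bool:
    """True only if the requirements doc contains both a 'V1 list' and a
    'Later list' heading -- the gate from Lesson 3."""
    text = requirements.lower()
    return "## v1 list" in text and "## later list" in text

doc = """# Requirements: invoice-tool
## V1 list
- create invoice
- export PDF
## Later list
- multi-currency support
"""
assert build_ready(doc)
assert not build_ready("# Requirements with no scope lists")
```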
Lesson 4: Test Before Building the Next Feature — Without Exception
The most consequential single discipline in the full twelve months: test what was just built before building anything that depends on it. Twenty minutes of manual testing against the primary user scenarios that exercise the new feature, before any new build session begins. Non-negotiable.
The cost of a bug caught immediately after the session that introduced it is approximately equal to the cost of catching it at end-of-session testing. The cost of the same bug caught three sessions later — after features have been built on the assumption that the broken component works — is three to five times higher. The discipline of immediate testing eliminates the higher-cost scenario. Every session that skips testing transfers risk from “cheap to fix now” to “expensive to fix later.”
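The non-negotiable gate can be enforced mechanically at the start of every session brief. A minimal sketch, assuming the brief is represented as a dictionary and the field name is hypothetical:

```python
def may_start_build_session(brief: dict) -> bool:
    """A build session may start only if the previous session's output has
    been tested -- the non-negotiable checkbox from Lesson 4. An absent or
    unchecked field blocks the session."""
    return brief.get("previous_output_tested", False) is True

assert may_start_build_session({"previous_output_tested": True,
                                "feature": "export PDF"})
assert not may_start_build_session({"feature": "export PDF"})
```

The design choice worth noting is the default: an absent checkbox counts as unchecked, so forgetting to test blocks the session rather than silently passing.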
Lesson 5: Start Specialist Agent Profiles After the Third Specialist Task, Not the Twentieth
The specialist agent system was built after fifteen products. It should have been started after three or four. At the third occurrence of a recurring specialist task type — architectural decisions, database design, debugging complex failures — the pattern is established, the recurring context requirements are known, and the cost of building the profile is justified by the benefit of having it for all future occurrences.
The practice: when you have performed the same category of AI task three times, create a specialist context profile for that category. You have the content — it was created during those three tasks. Building the profile is organizing and formalizing what already exists. The fifteen minutes it takes is recovered in the next session that uses the profile, and every session after that.
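The "three occurrences" trigger is easy to automate once sessions are tagged by task type. A sketch, where the task log, category names, and threshold default are illustrative (the threshold of three comes from the lesson itself):

```python
from collections import Counter

# Hypothetical log of specialist task types, one entry per session.
task_log = [
    "architecture", "debugging", "architecture", "database-design",
    "debugging", "architecture", "debugging",
]
existing_profiles = {"architecture"}

def profiles_to_create(log: list[str], existing: set[str],
                       threshold: int = 3) -> list[str]:
    """Task categories seen `threshold`+ times that still lack a
    specialist profile -- the monthly review from Lesson 5."""
    counts = Counter(log)
    return sorted(cat for cat, n in counts.items()
                  if n >= threshold and cat not in existing)

print(profiles_to_create(task_log, existing_profiles))
```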
Lesson 6: Document the Why in the Moment, Not in Retrospect
Code captures decisions. Documentation captures reasoning. Reasoning not captured at the time of the decision is recovered at a high cost or not at all. Future sessions that encounter an undocumented decision must either infer the reasoning from context, which is slow and error-prone, or make a new decision without knowing why the previous one was made, which produces inconsistency.
The practice: every non-obvious decision gets one sentence of explanation in the context file at the time the decision is made. “Using the token bucket approach for rate limiting rather than fixed window because fixed window allows burst traffic at window boundaries that could exceed API limits.” One sentence. At the moment. The cost is thirty seconds. The value is every future session that references the context file and can apply the decision intelligently rather than re-discovering or accidentally reversing it.
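The one-sentence record can be formatted consistently with a small helper. An illustrative sketch, reusing the rate-limiting example above; the entry format is a suggestion, not a standard:

```python
from datetime import date

def decision_record(decision: str, why: str) -> str:
    """Format a single 'what and why' line for the context file, dated at
    the moment the decision is made (Lesson 6)."""
    return f"- {date.today().isoformat()}: {decision}, because {why}"

entry = decision_record(
    "use token bucket for rate limiting rather than fixed window",
    "fixed window allows burst traffic at window boundaries that could "
    "exceed API limits",
)
print(entry)
```

Appending the returned line to the context file's decision section keeps every record in the same shape, which makes the file easy for future sessions to scan.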
The Common Thread
All six lessons share a structure: invest in infrastructure earlier than feels necessary, and maintain it more rigorously than feels strictly required. The shared component should have been built earlier. The context file should have existed from day one. Scope discipline should have been formalized on the first project. Testing should have been non-negotiable from the start. Specialist profiles should have been started earlier. Architectural decisions should have been documented in the moment, always.
In every case, the correct behavior felt like over-engineering or premature optimization at the time. In every case, the retrospective judgment is that it should have been done sooner. This is the reliable pattern in any compounding practice: the investments that feel premature are typically exactly on time, and the ones deferred until they feel obviously necessary are already late. Build the infrastructure when it first seems like it might be needed. The cost of premature investment is small. The cost of delayed investment is large and grows with each session of delay.
The Systematic Pattern Behind All Six Lessons
Looking at all six lessons together, the systematic pattern is clear: each lesson involves an infrastructure investment that was deferred until it became obviously necessary, at which point the cost of building it had increased substantially relative to building it when it first became appropriate.
This is not a unique pattern. It is the standard pattern of deferred maintenance and deferred investment across almost every type of complex system. The technical debt in software systems. The maintenance backlog in physical infrastructure. The organizational debt in growing companies that didn’t invest in processes when the team was small. The pattern is universal: early investment in infrastructure is consistently cheaper and more effective than late investment, and the gap between early and late cost grows with the complexity of what is being maintained.
In an AI-assisted practice, the gap is particularly large because the infrastructure compounds. A context file created on day one and maintained consistently is worth substantially more than the same context file created on day sixty from memory. A shared component built after the second use is worth substantially more than the same component built after the tenth use, because it prevented eight redundant implementations and their associated maintenance cost. The infrastructure’s value is not just its intrinsic usefulness — it is the compounding of that usefulness across all the sessions and products that benefit from it.
What Applying These Lessons Actually Looks Like
Abstract lessons produce limited behavior change. Concrete implementation practices produce durable behavior change. For each of the six lessons, here is the specific, schedulable action that implements the lesson:
For Lesson 1 (shared components after the second use): add “component registry review” as a standing agenda item at the start of every new project. Five minutes to scan the registry for components the new project will need. If any appear for the second time, the first session of the new project is a generalization session for that component rather than a build-from-scratch session.
For Lesson 2 (context file before the first session): add “create context file” as the first step in every new project checklist, before any AI sessions. The context file creation is not an AI session — it is a human writing exercise that produces the document that the first AI session will load.
For Lesson 3 (scope discipline before the first build): add “v1 list and later list” as the final section of the requirements document before it goes to an AI session. If the document does not have these two clearly labeled lists, it is not ready for a build session.
For Lesson 4 (test before building next feature): add “testing complete” as a checkbox at the start of every build session brief. The checkbox cannot be checked unless the previous session’s output has been tested. The session does not start until the checkbox is checked.
For Lesson 5 (specialist profiles after the third specialist task): add “specialist profile review” as a standing monthly practice. Review which task types have appeared three or more times without a specialist profile. Create profiles for any that qualify.
For Lesson 6 (document why in the moment): add “decision record” as a standing component of every context file update. For every architectural decision made in the session, one sentence explaining why. If no non-obvious decisions were made, write “no new decisions requiring documentation.”
These six practices, implemented as standing habits and scheduled actions, prevent the six most expensive patterns documented in this case study. They are available to implement from day one. The professionals who implement them from day one will not need to learn them the expensive way.
How Recent AI Innovations Change This Picture
The six lessons documented in this retrospective were learned through expensive failures in the original methodology. It is worth examining which of them are made less painful by current AI innovations and which remain hard lessons that no amount of tooling will shortcut.
The lesson “build shared infrastructure from day one” is the one most directly addressed by AI platform innovations. Agent Skills provide a standardized infrastructure path that did not exist when this case study was conducted. A professional starting today can build their shared knowledge system as an Agent Skill from the first day, using a platform-native format that scales with their practice. The lesson is still true — start with infrastructure — but the cost of acting on it is substantially lower.
The lesson “maintain explicit requirements” is unchanged in importance but improved in toolability. The Files API allows requirements documents to persist across sessions as accessible, AI-queryable artifacts. Rather than pasting requirements at the start of each session and risking drift, requirements can be stored once and referenced continuously. Requirements drift becomes detectable: the AI can compare current session direction against the stored requirements document and flag divergences.
The lessons about human judgment — “never outsource architectural decisions,” “test everything in real environments” — are unchanged and unaffected by AI capability improvements. More capable AI produces more plausible outputs, which if anything makes rigorous human judgment more important, not less. The failure mode of trusting a plausible-sounding wrong answer is more dangerous when the AI is better at sounding right. The retrospective lesson about maintaining critical evaluation applies with full force to current models. Capability improvements make this lesson more important to internalize, not less.