How to buy a software company: check every invoice, skim the software
Let me describe how a software company gets bought, and I want you to listen for the moment it goes strange.
The financials get a quality-of-earnings review: outside accountants reconstruct revenue transaction by transaction, test the cohorts, chase every deferred dollar. Weeks of work, to the decimal. The contracts get lawyers — every customer agreement read, every change-of-control clause flagged. Insurance, tax, employment: specialists, checklists, months.
Then there's the codebase. The thing being, in the most literal sense, purchased — the machine that generates every number the accountants just verified. Its diligence, at most deals, is: a senior engineer from the acquirer gets repository access and looks around for a couple of days. They skim the architecture, run a dependency scan, check the commit graph, interview the CTO (Part 17 covered what the CTO structurally can't know), and write a memo with the word "reasonable" in it.
Two days. For the asset. Did you hear it go strange? Everyone in M&A has heard it; the process rolls on anyway because, as with every blind spot in this series, the instrument for doing better didn't exist.
Buying a house from photographs of the blueprints
Here's what the two-day repo review actually is, in house terms: the buyer's inspector examined the blueprints. Good inspector! Real findings — the blueprints are tidy, the materials list is modern, there's a weird load-bearing note on page 40. But nobody visited the house. Nobody ran the taps, tested the furnace under load, or checked which rooms are heated to no purpose and which staircase sways when three people use it.
Code review is blueprint review. It answers "is this well made, as text?" — a question worth asking. It cannot answer "what does this DO all day?", and every question an acquirer actually cares about lives on the DO side:
- Which parts of this asset produce the revenue? Production traffic concentrates brutally through a minority of functions (Part 14; in CodeNSM's own fleet telemetry, strongly enough that we pre-registered it as a research hypothesis). The repo presents every function as equally weighted text. The buyer cannot see which handful of functions ARE the business.
- How much of the purchase is inventory that doesn't exist? Dormant code (Part 13) — routinely a quarter or more of functions in our fleet's censuses — compiles beautifully, reads plausibly, and does nothing. Blueprint review counts every room, including the ones that have been sealed since 2023. The buyer is paying maintenance-forever on square footage no one can identify from the text.
- What are the real defect economics? Tornhill and Borg's Code Red study quantified what buyers actually inherit: their lowest-quality code band carried fifteen times more defects than healthy code, took on average more than twice as long to resolve issues, and showed maximum cycle times up to nine times longer, which is to say far less PREDICTABLE change times. Predictability is the acquirer's whole game: the integration plan, the roadmap synergies, the retention math all assume the asset responds to investment on schedule. That variance is invisible in a skim and fully visible in runtime history.
- Is the crown jewel actually installed? The pitch deck celebrates the proprietary algorithm. Runtime data answers a question nobody thinks to ask: does production traffic actually FLOW through it — or did a 2023 workaround quietly bypass the moat, leaving it as ceremonial code the sellers themselves forgot isn't load-bearing? (Part 19's glue-versus-gold census is, in a diligence room, the difference between buying an engine and buying its wrapper.)
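The first two questions are, mechanically, counting problems once call counts exist. A minimal sketch in Python, assuming a hypothetical mapping from function name to production calls observed over the telemetry window; the function names, the numbers, and the helper names are invented for illustration and are not CodeNSM's API:

```python
from typing import Dict, List


def traffic_concentration(calls: Dict[str, int], top_fraction: float) -> float:
    """Share of all observed calls carried by the busiest `top_fraction` of functions."""
    total = sum(calls.values())
    if total == 0:
        return 0.0
    ranked = sorted(calls.values(), reverse=True)
    # Always count at least one function, even for tiny codebases.
    k = max(1, int(len(ranked) * top_fraction))
    return sum(ranked[:k]) / total


def dormant_functions(calls: Dict[str, int]) -> List[str]:
    """Functions with zero observed calls in the window (dormant, not provably dead)."""
    return sorted(name for name, n in calls.items() if n == 0)


# Hypothetical telemetry: calls per function over the observation window.
observed = {
    "checkout": 9000,
    "price_quote": 700,
    "login": 200,
    "export_csv": 60,
    "send_invoice": 40,
    "legacy_sync": 0,
    "old_report": 0,
    "beta_import": 0,
}

top_share = traffic_concentration(observed, top_fraction=0.25)
dormant = dormant_functions(observed)
dormant_share = len(dormant) / len(observed)

print(f"Top 25% of functions carry {top_share:.0%} of calls")
print(f"Dormant: {dormant} ({dormant_share:.1%} of functions)")
```

Zero calls in a window shows dormancy over that window only. A function that runs once a year needs a longer window before anyone prices it as dead weight.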
Why the gap persists (it's not laziness)
Acquirers aren't dumb, and their two-day review isn't negligence — it's the rational amount of effort for the instrument available. Static inspection has steeply diminishing returns: day ten of reading unfamiliar code yields little that day two didn't, because the limiting factor isn't reading time, it's that the TEXT doesn't contain the answers. Peter Naur's old point (the program is the theory in its builders' heads, not the text) becomes, in an acquisition, a genuinely alarming observation: the buyer is purchasing the residue and interviewing the theory, and the theory has retention risk and an earn-out incentive to sound calm.
Meanwhile every incentive rounds the code review UP to "fine." The deal has momentum. The engineer writing the memo knows a scathing assessment kills months of work by people senior to them, on the strength of a two-day impression they can't fully defend. So codebases pass diligence the way Victorian buildings pass a drive-by: no visible smoke.
Then the deal closes, and the familiar integration story plays out: the roadmap assumed the asset could absorb change at the modeled rate; Ward Cunningham's interest (in his original 1992 framing, every minute spent on not-quite-right code counts as interest on the debt) starts compounding on the buyer's books instead of the seller's; and eighteen months later an engineering VP is explaining to a disappointed steering committee that the synergies are "taking longer than modeled." Nobody connects this back to the two-day skim, because by then the skim is ancient history and the memo said reasonable.
The missing exhibit
Here's what should sit in every software data room next to the quality-of-earnings report — call it the quality-of-code exhibit, and notice that every line is a RUNTIME artifact, impossible to fake retroactively and impossible to compile from text:
- The traffic map. Functions ranked by observed production calls. Which code is the business.
- The occupancy schedule. The dormant share, enumerated. What fraction of the purchase does no work.
- The seam map. Glue versus unique IP, as a census — what part of this could any team rebuild, and what part is the moat.
- The fragility register. Fragile-AND-load-bearing intersections, plus the quiet experts (Part 18) — the rarely-run critical paths with their last-successful-run dates.
- The trend lines. Latency and error baselines over trailing quarters, per function. Is the asset appreciating or decaying in place?
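Two of these lines, the fragility register and the trend lines, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not CodeNSM's implementation: the record fields, the cutoffs, and the working definition of fragile (error rate above a threshold) are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FunctionRecord:
    name: str
    calls: int                     # observed production calls in the window
    error_rate: float              # errors / calls over the same window
    quarterly_p95_ms: List[float]  # p95 latency per trailing quarter, oldest first


def fragility_register(
    records: List[FunctionRecord], min_calls: int, max_error_rate: float
) -> List[str]:
    """Names of functions that are both load-bearing and fragile.

    Assumption: load-bearing means calls >= min_calls, and fragile means
    error_rate > max_error_rate. Both cutoffs are illustrative.
    """
    return sorted(
        r.name
        for r in records
        if r.calls >= min_calls and r.error_rate > max_error_rate
    )


def latency_trend(record: FunctionRecord) -> float:
    """Least-squares slope of p95 latency in ms per quarter (positive = slowing)."""
    ys = record.quarterly_p95_ms
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two quarters to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


# Hypothetical records; every value here is invented.
records = [
    FunctionRecord("checkout", 9000, 0.031, [100.0, 110.0, 120.0, 130.0]),
    FunctionRecord("price_quote", 700, 0.002, [80.0, 80.0, 80.0, 80.0]),
    FunctionRecord("export_csv", 60, 0.090, [400.0, 390.0, 410.0, 400.0]),
]

register = fragility_register(records, min_calls=500, max_error_rate=0.01)
trends = {r.name: latency_trend(r) for r in records}

print("Fragile and load-bearing:", register)
for name, slope in trends.items():
    print(f"{name}: {slope:+.1f} ms per quarter")
```

In this toy data only checkout lands on the register: export_csv fails more often, but carries too little traffic to count as load-bearing under these cutoffs. The intersection is the point; either list alone is much longer and much less useful.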
(Producing exactly that pack from a few months of telemetry is what CodeNSM amounts to in a diligence context — and the incentive cuts both ways: a seller who can hand this over six months early is negotiating from evidence; a seller who can't is inviting a discount priced off the buyer's imagination.)
One detail in that parenthetical deserves its own sentence, because it changes who has to act and when: runtime evidence takes MONTHS of observation to accumulate, which means it cannot be produced inside a deal window. The quality-of-earnings team can reconstruct three years of revenue from records that already exist; there is no equivalent archaeology for code behavior that was never recorded. If the instrument wasn't running before the letter of intent, the exhibit simply doesn't exist for this transaction — for either side. Which makes this the rare diligence gap that only the SELLER can close, and only by starting early.
The precedent for this shift is exact. Quality-of-earnings reviews weren't always standard: buyers once took revenue largely on the seller's word, until enough deals blew up that verification became the price of admission. Runtime evidence for code is the same transition, one instrument behind schedule. DORA's research, most recently the 2024 report, keeps demonstrating that software-delivery performance is measurable and predictive of organizational outcomes; the acquisition market just hasn't yet demanded the measurement at the moment it matters most, the moment of purchase.
Until it does, here's the honest summary of this entire series, compressed into one transaction: a company can spend nine figures acquiring a codebase — after checking every invoice, reading every contract, and verifying every dollar — without anyone, on either side, ever having SEEN it run.
The invisible codebase isn't just an operating problem. It's a pricing problem. And somewhere out there, the exhibit that would have changed your biggest number was never in the room.
References
- Tornhill, A. & Borg, M. (2022). Code Red: The Business Impact of Code Quality.
- Cunningham, W. (1992). The WyCash Portfolio Management System. OOPSLA '92 — origin of the technical-debt metaphor.
- Naur, P. (1985). Programming as Theory Building.
- Google Cloud (2024). DORA Accelerate State of DevOps Report.