The Moat That Compounds

Why Behavioral Data Creates Competitive Distance That Physical Bottlenecks Never Could

The Moat Regulators Know How to Break

Most competitive advantages are things you own. A railroad owns track. A utility owns the transmission grid. A drug company owns a patent. The moat is positional — it sits between you and the competitor, and the competitor cannot cross it because you control the asset that makes crossing possible. Regulators have spent a century learning how to contest moats like these. The toolkit is mature: mandate access, set prices, require interconnection, separate ownership from operation. It is not a perfect toolkit. But it is a proven one.

Google's search advantage is not that kind of moat.

It does not rest on an asset Google controls and others cannot build. Bing has a search index. DuckDuckGo has a crawler. Any well-capitalized team can replicate the infrastructure of web indexing. The hardware is available. The engineering talent is hireable. The open-source model weights are downloadable. What cannot be replicated — what has not been replicated, across fifteen years of serious attempts — is the behavioral signal that makes the index relevant. Every query Google processes returns a signal: what the user clicked, how long they stayed, whether they reformulated the question, whether they came back. That signal feeds the ranking model. The model improves. The next query in that category is served more accurately; more accurate serving generates more engagement; more engagement generates more signal.

This is not a positional moat. It is a procedural one. The advantage is not in what Google holds — it is in what Google's system does, continuously, at a scale that makes the improvement self-sustaining.

The EU's Digital Markets Act search data mandate is the first serious attempt to apply access logic to a compounding process. My read is that it addresses the right market at the wrong structural level. Understanding why requires being precise about the mechanism — and precise about the gap between the mechanism and the instrument.


How the Loop Runs

A query arrives. Google serves a ranked list. The user scans it — clicks the third result, stays for ninety seconds, does not return to the results page. That behavior is a signal: the third result was probably the right answer for that query, at that moment. Now multiply that signal by billions of queries per day, across more than two decades of operation, across every language, geography, and intent category that exists in human information-seeking. The result is not a database. It is a continuously running relevance factory — one that gets more precise with every cycle, labels its own training data through user behavior, and produces a product that improves without anyone explicitly instructing it to.

This is what distinguishes a behavioral feedback loop from a static data asset. A database accumulates. A feedback loop compounds. Accumulation is additive: each new record increases the size of the corpus. Compounding is multiplicative: each new signal improves the model's ability to interpret the next signal, which improves the ranking, which generates better signals still. The corpus and the model co-evolve. Neither is separable from the other without destroying what makes the system work.

The loop runs on three interlocking operations. First, every user interaction (click, dwell, skip, reformulation, abandonment) is a labeled data point; the labeling cost to Google is zero, because the user's attention and reformulation effort produce it at no marginal cost to the platform. Second, those signals feed into ranking models that update continuously, adjusting the weight assigned to hundreds of document and behavioral features. Third, a more accurate ranking generates higher-quality signals on the next cycle, because users who find what they need produce cleaner positive signals than users who do not. Accuracy begets signal quality. Signal quality begets accuracy.
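The three operations can be made concrete with a toy simulation. This is a minimal sketch under stated assumptions, not a model of any real ranking system: the function name `simulate_loop`, the starting accuracy, and the learning rate are all hypothetical. It compares a loop in which signal quality tracks current accuracy with a baseline in which signal quality never improves.

```python
def simulate_loop(cycles, learn_rate=0.1, compounding=True):
    """Return the accuracy after each cycle of a toy relevance loop.

    Every number here is an illustrative assumption, not a measurement.
    """
    accuracy = 0.5  # assumed starting accuracy
    history = []
    for _ in range(cycles):
        # Compounding variant: signal quality tracks current accuracy,
        # because users who find what they need leave cleaner signals.
        # Baseline variant: signal quality stays at its starting level.
        signal_quality = accuracy if compounding else 0.5
        # Each cycle closes part of the remaining error, scaled by quality.
        accuracy += learn_rate * signal_quality * (1.0 - accuracy)
        history.append(accuracy)
    return history


if __name__ == "__main__":
    looped = simulate_loop(30, compounding=True)
    fixed = simulate_loop(30, compounding=False)
    for cycle in (1, 10, 30):
        print(cycle, round(looped[cycle - 1], 3), round(fixed[cycle - 1], 3))
```

Both variants start from the same point and see the same number of cycles. The only difference is whether better ranking feeds back into better signal, and under these assumptions that difference alone is enough to open a gap that the baseline does not close within the simulated horizon.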

This is where the two dimensions that will structure the rest of this analysis, Access and Effort, first come into play.

Access, the formal ability to obtain data, is a threshold condition. The DMA's mandate addresses this dimension directly, requiring Google to share query, click, view, and ranking signals on fair, reasonable, and non-discriminatory (FRAND) terms. Formal access is what the instrument provides, and it is not nothing. But Effort, the work required to convert that data into competitive relevance parity, is where the compounding mechanism lives, and it is almost entirely untouched by access mandates. The data Google is required to share is anonymized, aggregated, and delivered with temporal delay. Those three modifications are not incidental privacy protections; they are the precise operations that sever the data from the feedback loop. Anonymization removes user-level behavioral trajectories, collapsing the multi-step reformulation sequences that tell a ranking model what a query actually meant after the first attempt failed. Aggregation dissolves long-tail signal density: tail queries occur at low frequency per category, so aggregating them across users collapses the within-category reformulation pattern into statistical noise. Temporal delay converts a live training input into a historical record, and during that gap Google's loop continues to run uninterrupted.
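A small sketch shows what the three modifications do to a log. Everything in it is hypothetical: the rows in `RAW_LOG`, the function `share_view`, and in particular the `min_count` suppression threshold, which is an assumed stand-in for aggregation and not something taken from the DMA or from any published Google specification.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical raw log rows: (user_id, day, query, clicked_result).
# User u4 reformulates a failed tail query and then clicks.
RAW_LOG = [
    ("u1", date(2024, 3, 1), "weather", "site-a"),
    ("u2", date(2024, 3, 1), "weather", "site-a"),
    ("u3", date(2024, 3, 1), "weather", "site-b"),
    ("u4", date(2024, 3, 1), "rare tremor clinic lisbon", None),
    ("u4", date(2024, 3, 1), "essential tremor specialist lisbon", "site-c"),
]


def share_view(raw_log, min_count, delay_days, today):
    """Return the aggregated view a recipient would see under the assumptions."""
    # Temporal delay: rows newer than the cutoff are withheld.
    cutoff = today - timedelta(days=delay_days)
    visible = [row for row in raw_log if row[1] <= cutoff]
    # Anonymization: user_id is dropped, so u4's two queries are no longer
    # linked and the reformulation sequence cannot be reconstructed.
    counts = Counter((query, clicked) for _, _, query, clicked in visible)
    # Aggregation (assumed form): pairs seen fewer than min_count times
    # are suppressed, which removes the low-frequency tail.
    return {pair: n for pair, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    print(share_view(RAW_LOG, min_count=2, delay_days=1, today=date(2024, 3, 2)))
```

In this toy log only the head query survives. The tail query, its reformulation, and the link between them are all absent from the shared view, and a request made on the same day the events occurred returns nothing at all.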

What arrives at the challenger's door is a photograph of the factory's output, not access to the factory itself.


The Loop Doesn't Care Whose Name Is on the Door

Before applying this framework to Google specifically, I want to establish it in markets where Google never competed — because the strongest version of the behavioral moat thesis is the one that holds across industries, not the one that might reflect something particular about Google's history or distribution position.

Amazon's product search is the first case. Amazon's ranking system is driven substantially by behavioral signals: click-through rates, conversion rates, session dwell time, and the post-click behavioral record of a shopper's interaction with a listing. A new merchant with a superior product — better specifications, lower price, stronger materials — but no sales history enters the ranking system at a structural disadvantage. The model has no behavioral evidence that this listing satisfies intent. The incumbent seller with an inferior product but two years of purchase history, dense click-conversion signals, and accumulated reviews ranks above it because the model has learned, from behavioral evidence, that users who clicked that listing bought it and did not return it. The distance between the two merchants is not a quality distance. It is a behavioral history distance. The only path to closing it is accumulation — which requires sales, which requires ranking, which requires the accumulated signal.

The loop is the barrier. Third-party seller data and academic analysis of Amazon marketplace dynamics have documented this pattern consistently: new entrants with demonstrably superior products routinely lose ranking position to incumbents with inferior products and dense behavioral histories, particularly in mid-tail categories where no navigational query disambiguates intent. I note this framing — "documented pattern" rather than "established algorithmic fact" — because Amazon's ranking weights are not publicly disclosed. The mechanism is observable in seller behavior and market outcomes, not confirmed in technical specification. The directionality is robust. The precision claim about algorithmic weighting is not mine to make.
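The circularity can be illustrated with a deliberately crude simulation. It assumes a ranker that orders two listings purely by accumulated sales and gives the top slot a fixed share of shopper attention. That is a simplifying assumption for illustration only; Amazon's actual ranking weights are undisclosed, and the names `run_marketplace` and `top_slot_share` and all the rates below are hypothetical.

```python
def run_marketplace(rounds, shoppers_per_round=1000, top_slot_share=0.9):
    """Return accumulated sales per listing after the given number of rounds."""
    # true_rate is the underlying conversion quality, which the toy ranker
    # never observes directly. It only sees accumulated sales.
    listings = {
        "incumbent": {"true_rate": 0.05, "sales": 200.0},
        "entrant": {"true_rate": 0.10, "sales": 0.0},
    }
    for _ in range(rounds):
        # Rank by behavioral history alone (assumed ranking rule).
        ranked = sorted(
            listings, key=lambda name: listings[name]["sales"], reverse=True
        )
        exposure = [top_slot_share, 1.0 - top_slot_share]
        for name, share in zip(ranked, exposure):
            item = listings[name]
            # Expected sales this round: exposure times true conversion rate.
            item["sales"] += shoppers_per_round * share * item["true_rate"]
    return {name: item["sales"] for name, item in listings.items()}


if __name__ == "__main__":
    print(run_marketplace(rounds=10))
    print(run_marketplace(rounds=10, top_slot_share=0.5))
```

With exposure concentrated in the top slot, the entrant's higher conversion rate never translates into rank, because rank is what supplies the traffic needed to demonstrate it. Setting `top_slot_share` to 0.5 removes the link between history and exposure, and in that variant the better product overtakes within the ten simulated rounds.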

TikTok makes a different argument — one that settles the firm-specificity question. When TikTok entered short-form video, YouTube had fifteen years of engagement data, billions of labeled watch-history signals, and infrastructure that dwarfed any new entrant's resources. By the behavioral moat thesis, YouTube should have been unassailable. Instead, TikTok built a dominant recommendation engine within four years — not by defeating YouTube's behavioral corpus, but by entering a content format for which YouTube had almost no behavioral history. Short-form vertical video was a cold-start environment for every participant. TikTok's algorithm began accumulating signal in an empty field and compounded it into structural dominance before any incumbent could transfer their existing corpus.

The behavioral feedback loop built the moat in real time, from zero, in favor of the entrant.

The obvious objection this case raises is also the strongest counterargument the thesis will face: if TikTok could build a dominant behavioral moat in four years from nothing, why couldn't a well-funded DMA-enabled challenger do the same in search? The answer is the specific condition TikTok exploited. Short-form vertical video was genuinely new. No platform held a behavioral corpus in that format. The competitive field was empty, and TikTok accumulated first. That empty-field condition does not exist in web search. The DMA challenger is not entering a new query format. They are entering a market where every query category — including the furthest reaches of the long tail — has been behaviorally mapped by Google for two decades. The tail query about a specific neurological condition in a specific regional context has been issued by prior users, reformulated, clicked, and accumulated into Google's training corpus. The intent disambiguation signal exists. It exists on one side of the competitive field.

The DMA challenger arrives at a farm that is already fully cultivated. The mandate gives them seeds, not soil.

TikTok's success is evidence for the mechanism, not against it. It confirms that behavioral compounding builds structural dominance from zero in empty fields. It does not suggest that behavioral compounding is reversible in occupied ones.


Four Conditions the Railroad Met That the Feedback Loop Cannot

The history of bottleneck regulation is not a record of failure. It is a record of interventions that worked reliably within a specific structural envelope and failed when applied outside it. I am not arguing that regulation cannot produce contestability in complex markets. I am arguing that the instrument has to fit the structure of the problem — and that the fit, here, is poor in ways that are precise and diagnosable.

The successful cases share four conditions. This taxonomy is my own, derived from working through the case history of infrastructure regulation, not extracted from a prior published framework. The bottleneck asset was physically isolatable from the service built on top of it. It was static or slowly changing. It was standardizable across providers. And it was fully transferable without quality degradation. In railroads, all four conditions were present: track is separable from freight services, a track laid in 1910 could still carry trains in 1970, rail gauges can be standardized and mandated, and sharing track access does not degrade the track for the incumbent operator. The conditions were not consciously designed into bottleneck regulation. They were simply present in the industries that attracted it, which is why the instruments worked.

None of the four conditions holds for behavioral search data. The signal is not isolatable from the model it trains: separating them produces a record, not a training input. The data is not static: a click in March interacts with every subsequent click in ways that alter what the model concludes about the query, whereas a fixed event such as a missed credit payment stays the same record whatever follows it. It is not standardizable: Google's behavioral labels are proprietary, the feature engineering is unpublished, and there is no industry-wide signal taxonomy that would make a shared standard enforceable. And it does not transfer without quality degradation: the anonymization and aggregation requirements that accompany any legally viable mandate are precisely the operations that sever the data from the compounding process.

The credit bureau case deserves a precise reading here, because it is the one prior instance where behavioral data was regulated as shared infrastructure, and it is frequently invoked as the template for search. Under the Fair Credit Reporting Act, the major bureaus furnish consumer payment history, credit utilization, and delinquency records to lenders that have a permissible purpose for requesting them. The system functions. New entrants to lending markets can access behavioral credit histories they did not generate. That is a genuine competitive opening.

The template does not transfer, and the four conditions explain why with precision. Credit data is isolatable: a payment event is separable from the credit model that uses it, in a way that search behavioral data is not. Credit data is static: a missed payment in March is a missed payment in March regardless of subsequent payments, and it does not improve the model's ability to interpret the next default in a non-linear way. Credit data is standardizable: all bureaus collect the same event types — payment status, utilization, delinquency — against the same consumer identifiers, with definitions that can be made enforceable. And credit data transfers without quality degradation: Equifax's credit file on you does not become less accurate because Experian also holds one. Google's behavioral corpus, shared under anonymization and aggregation requirements, becomes a different and lesser thing in the very act of transfer.
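The comparison across the three cases can be laid out as a simple checklist. This sketch only encodes the judgments made in the text above; the class name `BottleneckAsset` and its field names are hypothetical labels chosen for illustration.

```python
from dataclasses import dataclass

# The four structural conditions, in the order the text introduces them.
CONDITIONS = ("isolatable", "static", "standardizable", "transferable")


@dataclass(frozen=True)
class BottleneckAsset:
    name: str
    isolatable: bool      # separable from the service built on top of it
    static: bool          # static or slowly changing
    standardizable: bool  # definable in a common, enforceable form
    transferable: bool    # shareable without quality degradation

    def unmet_conditions(self):
        """Return the names of the conditions this asset fails."""
        return [c for c in CONDITIONS if not getattr(self, c)]


# Values reflect the essay's own assessments, not an external dataset.
ASSETS = [
    BottleneckAsset("rail track", True, True, True, True),
    BottleneckAsset("credit bureau data", True, True, True, True),
    BottleneckAsset("behavioral search data", False, False, False, False),
]

if __name__ == "__main__":
    for asset in ASSETS:
        print(asset.name, asset.unmet_conditions())
```

Read this way, the credit bureau case sits with the railroad, not with search: it passes every condition that behavioral search data fails.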

The credit bureau comparison, examined closely, does not strengthen the case for search data regulation. It clarifies exactly why the case is harder — and why applying the proven toolkit to this problem is not so much a mistake as a structural mismatch.


The Mandate Is Well-Constructed at the Wrong Level

The Commission's proposed measures against Google are, by the standards of digital regulation, unusually specific. The document does not simply require data sharing — it specifies what data (click, query, view, and ranking signals), at what frequency (daily or near-real-time), under what pricing constraint (FRAND terms), and with what eligibility logic (search engines and, notably, AI chatbots with search functionality). This level of operational specificity is deliberate: the Commission's prior DMA proceedings taught it that vague obligations produce formal compliance without competitive effect. The measures are designed to close that gap.

Read against the framework this monograph has built, they close it at the wrong level.

Every component targets Access — the formal threshold that prevents challengers from obtaining behavioral data at all. Eligibility criteria remove the exclusion barrier. Data scope specifications define what must be shared. Frequency requirements prevent strategic delay. FRAND pricing prevents exclusion through prohibitive cost. On the Access dimension, the instrument is well-constructed. A challenger operating in complete behavioral darkness today would receive something real: a daily feed of query, click, and view signals from the world's largest search corpus, at terms Google cannot arbitrarily control.

What the instrument does not touch is Effort — the continuous asymmetry between converting that data into competitive relevance and inheriting the live feedback loop. The anonymization, aggregation, and delay requirements that accompany the mandate each sever a specific thread of the compounding process, in the ways the previous sections established. The result is an instrument that delivers access to the outputs of a compounding process without delivering access to the process itself. Google keeps the factory. The challenger receives the invoice.

On a third dimension, Perceived Risk, the mandate does shift something real. Google's ability to silently exclude challengers through data starvation is now legally constrained. That constraint changes the deterrence landscape for investors and builders evaluating search-adjacent AI products, not because the Effort gap has closed, but because the legal architecture around the Access gap has changed. Whether that shift is enough to trigger the accumulation sequence the Commission's theory of change depends on is the question Monograph 2 takes up.


The Distance That Remains, and the Bet Embedded in Accepting It

Here is what I believe the evidence has established. Behavioral feedback loops compound in a way that physical bottlenecks do not, and the compounding is the mechanism of the advantage, not its side effect. Equivalent capital cannot substitute for behavioral history, as Bing's fifteen-year effort demonstrates on its own terms, controlling for the strongest alternative hypothesis. The mechanism is not firm-specific: it operates in the same way in e-commerce and content recommendation wherever behavioral signals drive relevance, and it builds from zero in favor of whoever accumulates first in an empty field. The regulatory instruments that successfully contested prior bottlenecks worked because those bottlenecks satisfied four structural conditions that behavioral feedback systems do not satisfy, a conclusion the credit bureau comparison makes precise rather than approximate.

The competitive distance that remains after the DMA mandate is implemented is not zero, and not trivially reducible. On the Access dimension, the distance narrows. On the Effort dimension — the asymmetric cost of bootstrapping behavioral relevance against an incumbent whose corpus continues to compound uninterrupted — the distance holds.

But I want to be honest about the limit of what that conclusion establishes. The distance holding is not the same as the distance being permanent. Partial reductions in competitive distance sometimes change market structure in ways that full reductions could not: they alter the entry calculation, shift the deterrence landscape, and create conditions under which challengers can move on one dimension while building toward parity on another. The Access reduction the mandate delivers is real. The question is whether it is sufficient — not to close the behavioral gap, but to change what rational actors do in the presence of it.

That is a different kind of question than the mechanism question. It requires reading the instrument against the market it is trying to create, and the challengers it is trying to enable. And it surfaces the possibility I find most interesting, and most unresolved: that the Commission may not have designed this instrument to break the behavioral moat at all. It may have designed it for what was regulatorily possible, with a calculated bet embedded in the architecture — a bet that the Access reduction is enough to change behavior at the margin, and that changed behavior at the margin is enough to start an accumulation sequence the mechanism alone cannot produce.

Whether that bet is well-placed is not a diagnostic question. It is a strategic one. And I don't yet know the answer.