[ FIELD NOTES ] · PART 2 OF 5 · THE PROBLEM

Citing is not the same as being supported

A citation is a pointer, and a pointer can point at anything — a paper that supports you, a paper that contradicts you, or a paper that merely gestured at a third paper which never actually made the claim at all. Follow enough chains and you find they bottom out in nothing. This is how a field ends up certain of things no experiment ever showed.

2026-06-21 · 8 min · Written for: Postdocs · Principal investigators · R&D leads

Read a sentence like this often enough and you stop seeing it: "Treatment X is effective in condition Y [12, 13, 14]." Three references. It reads as three independent lines of support, a small consensus. But a citation is only a pointer. It tells you where the author looked; it does not tell you what they found there. And when someone finally follows the pointers, the little consensus has a habit of collapsing.

The citation network that authorised a belief no one had shown

The definitive demonstration of this belongs to Steven Greenberg, who in 2009 published in the BMJ a forensic analysis of the citation network behind a specific biomedical claim: that β-amyloid, a protein associated with Alzheimer's disease, is produced by and injures the skeletal muscle of patients with inclusion body myositis. Greenberg traced the citation paths through that literature and found something that should be taught in every first-year methods course. The belief was supported by a dense web of citations, which made it look robust. But when he followed the paths back to their origins, the papers that were supposed to ground the claim either did not support it or turned out to be reviews and opinion pieces citing other reviews. He gave the phenomenon its proper names: citation bias (citing supportive papers and ignoring contrary ones), amplification (reviews citing reviews, inflating apparent support without adding evidence), and invention (citing papers that, read closely, do not actually make the claim attributed to them).

The result was a self-reinforcing network of what Greenberg called "unfounded authority": a claim treated as established fact, supported by hundreds of citations, resting on almost no data. The citations were real. The support was not.

A citation says "look here." It does not say "and here you will find what I told you was there." The entire authority of a literature depends on a promise that almost no one ever checks.

The three ways a chain bottoms out in nothing

Once you know the pattern, you see its species everywhere:

  • The chain that circles. Paper A cites B for a claim; B cites C; C cites A. No experiment anywhere on the loop — just three papers vouching for each other. It looks like triangulation. It is a rumour with footnotes.
  • The chain that drifts. The original paper reported an effect under a narrow condition, with hedges. The first citation drops the hedges. The second drops the condition. By the fifth, a cautious, boundaried finding has become a universal law, and every link was individually defensible.
  • The chain that dead-ends in an opinion. You follow the references expecting data and arrive at an editorial, a textbook, or a conference abstract — a place where the claim is asserted, not shown. The empirical floor you assumed was under the field was never poured.
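Two of these species can be told apart mechanically once the citations are written down as a graph. What follows is a minimal sketch, not a description of any real tool: the paper labels, the `kind` tags ("data", "opinion", and a default of "citing") and the function name `classify_chain` are all assumptions made for illustration, and each paper is assumed to cite exactly one source for the claim. Drift is left out on purpose, because it cannot be seen in the shape of the graph; it only shows up when the wording of each link is compared with the one before it.

```python
from typing import Dict, Optional

# Hypothetical toy data: each paper maps to the single paper it cites for the
# claim (None means it cites nothing further).
CITES: Dict[str, Optional[str]] = {
    "A": "B", "B": "C", "C": "A",      # a loop with no experiment on it
    "D": "E", "E": "editorial",        # ends where the claim is only asserted
    "F": "G", "G": "trial",            # ends where data is reported
    "editorial": None,
    "trial": None,
}

# Hand-assigned kinds (an assumption); any paper not listed counts as "citing".
KIND: Dict[str, str] = {"editorial": "opinion", "trial": "data"}


def classify_chain(start: str) -> str:
    """Follow the claim from `start` and report where the chain ends."""
    seen = set()
    paper = start
    while True:
        if paper in seen:
            return "circular"          # we are back at a paper already visited
        seen.add(paper)
        kind = KIND.get(paper, "citing")
        if kind == "data":
            return "reaches data"      # the chain has found an empirical floor
        nxt = CITES.get(paper)
        if nxt is None:
            # Nothing further is cited and no data was found on the way.
            if kind == "opinion":
                return "dead-ends in opinion"
            return "dead-ends unsupported"
        paper = nxt


for start in ("A", "D", "F"):
    print(start, "->", classify_chain(start))
```

In a real literature each paper cites many sources, so the same walk would have to branch, and deciding whether a node counts as "data" is itself the expensive reading step that the sketch takes as given.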

None of these require anyone to have acted in bad faith. They are the natural sediment of a system where citing is cheap, checking is expensive, and no one is assigned to check.

Why "it's highly cited" makes this worse, not better

The intuitive defence is that heavily cited claims must have been scrutinised: surely someone, among all those citers, went back and checked. But amplification works in exactly the opposite direction. The more a claim is cited, the more it looks established, the less anyone feels the need to return to the source, and the more freely the next author cites it on the strength of the crowd. Citation count is not evidence of verification. Past a certain threshold it is evidence that verification stopped, because the claim graduated to "well known."

This is why tools that surface how a paper was cited — whether the citing work supported, mentioned, or contradicted it — matter. When Josh Nicholson and colleagues built exactly such a classifier and analysed citation statements at scale, they found that explicit disputing citations are rare in the raw counts: the overwhelming majority of citations are neutral mentions, and contradicting citations are a sliver. A raw citation total flattens all of that into a single number that cannot distinguish "one hundred labs confirmed this" from "one hundred authors mentioned it in passing" from "one hundred papers cited it while disagreeing." The support and the doubt are averaged into silence.
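The flattening is easy to see in a toy tally. The sketch below uses invented data: three imaginary papers, each with one hundred citing statements that have already been labelled by hand as supported, mentioned, or contradicted. The names `raw_count` and `breakdown` are hypothetical helpers, not part of any citation tool.

```python
from collections import Counter

LABELS = ("supported", "mentioned", "contradicted")

# Hypothetical, hand-labelled citing statements for three imaginary papers.
papers = {
    "confirmed by others": ["supported"] * 100,
    "mentioned in passing": ["mentioned"] * 100,
    "cited in disagreement": ["contradicted"] * 100,
}


def raw_count(labels):
    """The usual metric: how many times the paper was cited."""
    return len(labels)


def breakdown(labels):
    """The same citations, split by what the citing author did with them."""
    counts = Counter(labels)
    return {label: counts.get(label, 0) for label in LABELS}


for name, labels in papers.items():
    print(name, "| raw:", raw_count(labels), "| by type:", breakdown(labels))
```

All three papers report the same raw total, and only the per-type breakdown separates them.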

What being supported would actually require

Strip it back and the requirement is almost embarrassingly simple. For a citation to mean "supported," someone has to have read the cited paper, confirmed it makes the claim, confirmed the claim survives the paper's own limitations, and confirmed the chain beneath it eventually reaches evidence rather than another pointer. That is four checks per citation, and the reason we skip them is not laziness but arithmetic — nobody has time to audit a chain by hand for every reference in every paper they read.
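To put rough numbers on that arithmetic, here is a back-of-envelope sketch. Only the figure of four checks per citation comes from the text above; the reference count, the chain depth and the minutes per check are hypothetical inputs chosen for illustration, and the model assumes each reference is followed down a single chain with no branching.

```python
CHECKS_PER_CITATION = 4  # read it, claim is there, survives limitations, chain is grounded


def audit_minutes(references: int, chain_depth: int, minutes_per_check: float) -> float:
    """Minutes needed to audit one paper by hand.

    Assumes every reference is followed down one chain of `chain_depth`
    links, and that each link costs the four checks named in the text.
    """
    links = references * chain_depth
    return links * CHECKS_PER_CITATION * minutes_per_check


# Hypothetical inputs: 40 references, chains 3 links deep, 15 minutes per check.
minutes = audit_minutes(references=40, chain_depth=3, minutes_per_check=15)
print(f"{minutes / 60:.0f} hours to audit one paper")
```

Under those assumed inputs a single paper costs 120 hours of checking, and real chains branch, so the true figure would be higher.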

Which is precisely why the checking has to become something other than a heroic manual act. Not a better intention to read carefully, but a discipline that decomposes a claim into what actually supports it and follows the chain until it either reaches ground or reveals that it never did. The next part takes on the metric we reach for instead — citation counts — and asks what, exactly, they are measuring when they are so clearly not measuring this.

REFERENCES

  1. Greenberg, S.A. (2009). How citation distortions create unfounded authority: analysis of a citation network. BMJ 339:b2680.
  2. Nicholson, J.M. et al. (2021). scite: A smart citations index that displays the context of citations. Quantitative Science Studies.
  3. Simkin, M.V. & Roychowdhury, V.P. (2003). Read before you cite! Complex Systems.