The Prior-Art Funnel: From 150 Million Documents to the Critical Few

The funnel is where the budget goes

A search doesn’t cost money because documents are expensive — they’re mostly free. It costs money because narrowing the corpus takes attorney hours, and attorney hours are the most expensive input in the building. The traditional funnel front-loads that cost into manual keyword iteration and screening: roughly 32 hours to go from 150 million documents to a shortlist of about a dozen. The AI-assisted funnel does most of the narrowing computationally and spends the human time only at the end, on a pre-ranked set.

Not all narrowing is equal: recall risk

Toggle “Show what’s discarded.” Every stage throws documents away, and every discard carries a chance that the one decisive reference went with it. The traditional keyword stage is the dangerous one: it cuts thousands of documents on exact-word matches, so anything described in different words — a different field, an older term, another language — is gone, silently. That’s the vocabulary gap doing its damage inside the funnel. Concept-based narrowing discards on meaning, which keeps far more of the genuinely relevant art in the set.

The trade-off in one view. A faster funnel is only better if it doesn’t quietly drop the reference you needed. The goal isn’t just fewer hours — it’s fewer hours without sacrificing recall. That’s the combination an AI-assisted workflow is built to hit.

How this connects

The numbers here are the business case behind the other explainers. The mechanism that makes the AI funnel both faster and higher-recall is in How AI Prior-Art Search Actually Works; the reason the keyword stage leaks so badly is in The Vocabulary Gap.

A note on these numbers: the document counts, hours, and costs are illustrative order-of-magnitude figures to show the shape of the trade-off, not a benchmark of any particular tool or engagement. Real searches vary widely by technology, scope, and standard.

The funnel is where the budget goes

Not all narrowing is equal: recall risk

How this connects

Compress your own funnel