Interactive explainer

The Prior-Art Funnel: From 150 Million Documents to the Critical Few

Every prior-art search is a funnel from the whole corpus down to the few references that matter. Two workflows reach a similar shortlist, but they differ sharply in the time, money, and risk it takes to get there. Switch between them and watch the counters move.

From 150M documents to the critical few
Full corpus: 150M remaining
Classification filter (CPC/IPC): 140k remaining
Boolean keyword queries: 3.5k remaining
Manual screening: 280 remaining
Close review → shortlist: 11 remaining (the critical few)
Traditional workflow

Classification + Boolean keyword queries + manual screening.

Attorney time: 32 hrs
Cost @ $350/hr: $11,200
References to review: 11

The funnel is where the budget goes

A search doesn’t cost money because documents are expensive; they’re mostly free. It costs money because narrowing the corpus takes attorney hours, and attorney hours are the most expensive input in the building. The traditional funnel front-loads that cost into manual keyword iteration and screening: roughly 32 hours, or $11,200 at $350/hr, to go from 150 million documents to a shortlist of 11 references. The AI-assisted funnel does most of the narrowing computationally and spends the human time only at the end, on a pre-ranked set.
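The arithmetic behind the traditional funnel can be sketched in a few lines of Python. This is only an illustration using the figures quoted above; the names `STAGES`, `stage_reductions`, and `attorney_cost` are made up for this sketch and do not come from any tool.

```python
# Illustrative figures from the traditional funnel above; not a benchmark.
STAGES = [
    ("Full corpus", 150_000_000),
    ("Classification filter (CPC/IPC)", 140_000),
    ("Boolean keyword queries", 3_500),
    ("Manual screening", 280),
    ("Close review -> shortlist", 11),
]

ATTORNEY_HOURS = 32      # illustrative attorney time for the traditional workflow
HOURLY_RATE_USD = 350    # illustrative billing rate


def stage_reductions(stages):
    """For each narrowing stage, return (name, documents discarded, fraction kept)."""
    rows = []
    for (_, before), (name, after) in zip(stages, stages[1:]):
        rows.append((name, before - after, after / before))
    return rows


def attorney_cost(hours, rate):
    """Cost of the search is simply attorney hours times the hourly rate."""
    return hours * rate


if __name__ == "__main__":
    for name, discarded, kept in stage_reductions(STAGES):
        print(f"{name}: discards {discarded:,} documents, keeps {kept:.4%}")
    print(f"Attorney cost: ${attorney_cost(ATTORNEY_HOURS, HOURLY_RATE_USD):,}")
```

Changing `ATTORNEY_HOURS` or `HOURLY_RATE_USD` shows how directly the total tracks the hours spent narrowing, which is the point of this section.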

Not all narrowing is equal: recall risk

Toggle “Show what’s discarded.” Every stage throws documents away, and every discard carries a chance that the one decisive reference went with it. The traditional keyword stage is the dangerous one: it cuts the set from about 140k documents to 3.5k on exact-word matches, so anything described in different words (a different field, an older term, another language) is gone, silently. That’s the vocabulary gap doing its damage inside the funnel. Concept-based narrowing discards on meaning, which keeps far more of the genuinely relevant art in the set.
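A toy example makes the recall risk concrete. The sketch below is an assumption-laden illustration, not how any real search engine works: the five documents, the relevance labels, and the hand-built `CONCEPT_TERMS` set are all invented, and the term set is only a crude stand-in for whatever model a concept-based system would actually use to match on meaning.

```python
# Invented toy corpus: four documents describe the same idea in different words.
DOCS = {
    "D1": "a fastener with a threaded shaft for joining panels",
    "D2": "a screw having a helical ridge for joining panels",
    "D3": "eine Schraube zum Verbinden von Platten",
    "D4": "a bolt and nut assembly for joining panels",
    "D5": "a method of brewing coffee",
}
RELEVANT = {"D1", "D2", "D3", "D4"}  # invented relevance labels

# Exact-word query, as in a Boolean keyword stage.
QUERY_TERMS = {"screw"}

# Hypothetical hand-built concept group; a stand-in for matching on meaning.
CONCEPT_TERMS = {"screw", "fastener", "bolt", "schraube"}


def tokens(text):
    """Lowercase whitespace tokenization; deliberately simplistic."""
    return set(text.lower().split())


def term_filter(docs, terms):
    """Keep only the documents that contain at least one of the given terms."""
    return {doc_id for doc_id, text in docs.items() if tokens(text) & terms}


def recall(kept, relevant):
    """Fraction of the relevant documents that survived the filter."""
    return len(kept & relevant) / len(relevant)


if __name__ == "__main__":
    keyword_kept = term_filter(DOCS, QUERY_TERMS)
    concept_kept = term_filter(DOCS, CONCEPT_TERMS)
    print("keyword recall:", recall(keyword_kept, RELEVANT))
    print("concept recall:", recall(concept_kept, RELEVANT))
```

On this toy set the exact-word filter keeps only D2, so three of the four relevant documents are discarded without any signal that they existed, while the broader concept group keeps all four and still drops the unrelated D5.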

The trade-off in one view. A faster funnel is only better if it doesn’t quietly drop the reference you needed. The goal isn’t just fewer hours — it’s fewer hours without sacrificing recall. That’s the combination an AI-assisted workflow is built to hit.

How this connects

The numbers here are the business case behind the other explainers. The mechanism that makes the AI funnel both faster and higher-recall is in How AI Prior-Art Search Actually Works; the reason the keyword stage leaks so badly is in The Vocabulary Gap.

A note on these numbers: the document counts, hours, and costs are illustrative order-of-magnitude figures to show the shape of the trade-off, not a benchmark of any particular tool or engagement. Real searches vary widely by technology, scope, and standard.

Compress your own funnel

PatentScan does the narrowing computationally and hands you a ranked shortlist — so attorney time goes to judgment, not to keyword iteration.