Interactive explainer

How AI Prior-Art Search Actually Works

A keyword search for “self-balancing scooter” can miss the 1998 patent that invalidates yours, because that patent never used the word “scooter.” Here is exactly why that happens, and what finds it instead. Type your own invention below and watch the search think.

Interactive prior-art search
Balance control · Tilt sensing · Electric drive · Personal transport · Steering · Two-wheel layout

The AI reads meaning, not words. These concepts — not your exact phrasing — are what it searches with.

3 · The meaning map
Legend: Your invention · Patent · Journal / NPL
Threshold: Wider net (more recall) · Stricter (more precision)
Recall: 88% (7/8 relevant)
Precision: 70% (7/10 returned)
Missed art: 1 (below threshold)
5 · Ranked prior art (10)
98% · Patent Segway2004 (relevant) · Personal mobility device with dual coaxial wheels and active balancing
93% · Patent US5791425 (relevant) · Self-stabilizing single-track human transport vehicle
91% · Patent US8830188 (relevant) · Self-balancing scooter with foot platform
84% · Patent US6234261 (relevant) · Dynamically balanced two-wheel conveyance
84% · Patent US9101817 (relevant) · Two-wheeled electric balancing vehicle
81% · NPL IEEE1996 (relevant) · Inverted-pendulum control of a coaxial wheeled robot
77% · Patent US7250000 · Electric wheelchair with tilt control
73% · Patent US9415835 · Electric kick scooter with folding frame
69% · NPL NPL2003-Gyro (relevant) · Gyroscopic stabilization of an unstable rider platform
66% · Patent US7000001 · Motorized skateboard / electric longboard

From a sentence to a shortlist, one stage at a time

Every prior-art search answers one question: of the 150 million-plus patents and papers ever published, which few actually matter to this invention? The interactive tool above runs a miniature version of how AI semantic search answers it. It moves through five stages — the same five stages PatentScan runs at full scale.

1 · Your disclosure

You start with the invention in plain language — no Boolean operators, no classification codes. The three preset buttons describe the same device three ways: everyday words, dense patent-ese, and a mix. That is the whole problem in one click: the invention is constant, but the words people use for it are not.

2 · Concept extraction

Instead of indexing your literal words, the model reads the disclosure for meaning and lights up the underlying concepts — balance control, tilt sensing, electric drive, and so on. This is the step keyword search simply does not have. Edit the text and watch the concept chips change; swap to “plain words” and notice the same concepts still appear even though almost none of the technical terms do.
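
As a rough illustration of why two very different wordings can light up the same concepts, here is a minimal sketch in Python. It is not how a production model works: a real system learns these associations from data, whereas this toy uses a hand-written lookup table. The cue phrases, the `extract_concepts` function, and both sample disclosures are assumptions made up for this example.

```python
# Hypothetical cue phrases per concept, hand-written for this illustration.
# A real semantic model learns these associations; it does not use a lookup table.
CONCEPT_CUES = {
    "Balance control": ["self-balancing", "stays upright", "dynamically stabilized"],
    "Tilt sensing": ["gyroscope", "lean", "inclination"],
    "Electric drive": ["battery", "electric motor", "electrically driven"],
    "Two-wheel layout": ["two wheels", "two-wheeled", "coaxial wheels"],
}


def extract_concepts(disclosure: str) -> set[str]:
    """Return every concept with at least one cue phrase in the text."""
    text = disclosure.lower()
    return {
        concept
        for concept, cues in CONCEPT_CUES.items()
        if any(cue in text for cue in cues)
    }


# Two made-up descriptions of the same device: everyday words and patent-ese.
plain = "A battery-powered board with two wheels that stays upright when you lean on it."
patentese = (
    "An electrically driven conveyance having coaxial wheels, "
    "dynamically stabilized in response to an inclination signal."
)

print(sorted(extract_concepts(plain)))
print(sorted(extract_concepts(patentese)))
```

Both calls print the same four concepts even though the two sentences share almost no vocabulary. The substring matching is deliberately naive (for instance, "lean" would also match inside "clean"), which is one reason real systems do not work this way.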

3 · The meaning map

Every document — and your invention — becomes a point whose position is determined by the concepts it covers, not the words it uses. Things that mean the same thing land near each other. Your invention is the blue diamond; the closer a document sits, the more it shares the meaning of your disclosure. Hover any point to see what it is and how strongly it matches. The thick lines are the search reaching out to its nearest neighbours — the AI equivalent of “these are the ones worth your attention.”
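
One common way to make "position is determined by concepts" concrete is to give each document a vector of concept weights and measure closeness with cosine similarity. The sketch below assumes that approach; the weights are invented for illustration, and the `cosine` and `nearest` helpers are hypothetical names, not part of any PatentScan API.

```python
import math

# One axis per concept in the demo. The weights below are made up for illustration.
CONCEPTS = [
    "Balance control", "Tilt sensing", "Electric drive",
    "Personal transport", "Steering", "Two-wheel layout",
]

INVENTION = [1.0, 1.0, 1.0, 1.0, 0.5, 1.0]

CORPUS = {
    "Dynamically balanced two-wheel conveyance": [1.0, 0.8, 0.6, 1.0, 0.4, 1.0],
    "Electric kick scooter with folding frame": [0.0, 0.0, 1.0, 1.0, 0.8, 0.6],
    "Inverted-pendulum control of a coaxial wheeled robot": [1.0, 1.0, 0.7, 0.0, 0.2, 1.0],
}


def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means no shared concepts."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest(query, docs, k):
    """Titles of the k documents whose vectors sit closest to the query."""
    return sorted(docs, key=lambda title: cosine(query, docs[title]), reverse=True)[:k]


for title in nearest(INVENTION, CORPUS, k=2):
    print(f"{cosine(INVENTION, CORPUS[title]):.2f}  {title}")
```

With these invented weights, the two balancing documents rank ahead of the kick scooter, even though "scooter" is the only title word a keyword query would recognise.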

Try the toggle. Flip from Semantic (AI) to Keyword. Documents that describe your invention in different words go dim, and the genuinely relevant ones the keyword search can no longer see are ringed in red. The recall number on the right is the share of the truly relevant prior art each method actually surfaced.
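
To see the keyword side of that toggle in numbers, the sketch below runs a naive all-terms keyword match over the ten titles in the demo's ranked list and measures recall against the seven references the demo marks as relevant. Two simplifying assumptions: it searches titles only (a real keyword search covers full text), and the matcher itself is a hypothetical stand-in, not the demo's actual code.

```python
import re

# Titles and relevance marks copied from the demo's ranked list.
DEMO = {
    "Segway2004": ("Personal mobility device with dual coaxial wheels and active balancing", True),
    "US5791425": ("Self-stabilizing single-track human transport vehicle", True),
    "US8830188": ("Self-balancing scooter with foot platform", True),
    "US6234261": ("Dynamically balanced two-wheel conveyance", True),
    "US9101817": ("Two-wheeled electric balancing vehicle", True),
    "IEEE1996": ("Inverted-pendulum control of a coaxial wheeled robot", True),
    "US7250000": ("Electric wheelchair with tilt control", False),
    "US9415835": ("Electric kick scooter with folding frame", False),
    "NPL2003-Gyro": ("Gyroscopic stabilization of an unstable rider platform", True),
    "US7000001": ("Motorized skateboard / electric longboard", False),
}


def tokens(text):
    """Lowercased word set; hyphenated terms stay whole."""
    return set(re.findall(r"[a-z0-9-]+", text.lower()))


def keyword_hits(query):
    """IDs of documents whose title contains every query term."""
    wanted = tokens(query)
    return {doc_id for doc_id, (title, _) in DEMO.items() if wanted <= tokens(title)}


def recall(found):
    """Share of the relevant demo references that appear in `found`."""
    relevant = {doc_id for doc_id, (_, is_relevant) in DEMO.items() if is_relevant}
    return len(found & relevant) / len(relevant)


hits = keyword_hits("self-balancing scooter")
print(sorted(hits), f"recall = {recall(hits):.0%}")
```

On these titles the query matches a single document, so six of the seven relevant references stay invisible to the keyword search.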

4 · The relevance threshold

Semantic search returns a ranked list, so you choose how wide to cast the net. Drag the threshold down and you recover more of the real prior art (higher recall) at the cost of more noise (lower precision); drag it up and the results get cleaner but you risk dropping a reference that matters. That trade-off — coverage versus reviewing time — is the entire economics of a search. Keyword search gives you no such dial: a word is either present or it is not.
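
The trade-off can be worked through with the demo's own numbers. The sketch below copies the ten scores and relevance marks from the ranked list in the demo panel and takes the total of eight relevant documents from its "7/8 relevant" readout. The eighth relevant document scored below every row shown and its score is not given, so the function is only meaningful for thresholds of 0.66 or higher. The name `precision_recall` is a hypothetical helper.

```python
# (score, is_relevant) pairs copied from the demo's ranked list.
RANKED = [
    (0.98, True), (0.93, True), (0.91, True), (0.84, True), (0.84, True),
    (0.81, True), (0.77, False), (0.73, False), (0.69, True), (0.66, False),
]

# From the demo's "7/8 relevant" readout. The eighth relevant document sits
# below every score listed, so only thresholds >= 0.66 are valid here.
TOTAL_RELEVANT = 8


def precision_recall(threshold):
    """Precision and recall of everything scoring at or above the threshold."""
    returned = [is_relevant for score, is_relevant in RANKED if score >= threshold]
    hits = sum(returned)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / TOTAL_RELEVANT
    return precision, recall


for threshold in (0.66, 0.80):
    precision, recall = precision_recall(threshold)
    print(f"threshold {threshold:.2f}: precision {precision:.0%}, recall {recall:.1%}")
```

At 0.66 this reproduces the panel's figures: 7 of 10 returned are relevant (70% precision) and 7 of 8 relevant are found (87.5% recall, shown as 88%). Raising the threshold to 0.80 returns six documents, all relevant, so precision climbs to 100% while recall drops to 75%.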

5 · The ranked shortlist

What lands on the right is the deliverable: a ranked, scored set of references an attorney can actually review — patents and non-patent literature together, ordered by how closely they read on the invention rather than by which ones happened to reuse your vocabulary.
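
Putting patents and non-patent literature in one list amounts to merging the sources and sorting by score alone. Here is a minimal sketch, using five entries from the demo; the `Reference` class and `shortlist` function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    kind: str      # "Patent" or "NPL"
    ref_id: str
    score: float   # similarity to the disclosure, 0.0 to 1.0


# A subset of the demo's results, split by source type.
patents = [
    Reference("Patent", "Segway2004", 0.98),
    Reference("Patent", "US5791425", 0.93),
    Reference("Patent", "US7250000", 0.77),
]
npl = [
    Reference("NPL", "IEEE1996", 0.81),
    Reference("NPL", "NPL2003-Gyro", 0.69),
]


def shortlist(*sources, limit=10):
    """Merge any number of result lists and rank them purely by score."""
    merged = [ref for source in sources for ref in source]
    return sorted(merged, key=lambda ref: ref.score, reverse=True)[:limit]


for rank, ref in enumerate(shortlist(patents, npl), start=1):
    print(f"{rank}. {ref.score:.0%} {ref.kind} {ref.ref_id}")
```

Because the sort key ignores the source type, the IEEE paper lands above a lower-scoring patent rather than being relegated to a separate list.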

Why this matters for invalidation and clearance

The references most likely to sink a patent are often the ones written years earlier, in a different field, or in another language — exactly the references that share an invention’s meaning but none of its words. That is the vocabulary gap, and it is the single biggest reason thorough keyword searches still miss decisive prior art. Semantic search closes it by matching on concepts, which is why a modern workflow pairs the two: keyword for precision on known terms, semantic for the recall that catches what you didn’t know to search for.
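
One simple way to pair the two methods is to take the union of their results and record which method found each reference. The sketch below assumes that design; it is an illustration, not a description of any particular product's workflow. The `hybrid` function is hypothetical, and the inputs reuse IDs and scores from the demo.

```python
def hybrid(keyword_ids, semantic_scores, threshold):
    """Union of both result sets, tagged with the method(s) that found each ID."""
    semantic_ids = {doc_id for doc_id, score in semantic_scores.items() if score >= threshold}
    merged = {}
    for doc_id in keyword_ids | semantic_ids:
        sources = []
        if doc_id in keyword_ids:
            sources.append("keyword")
        if doc_id in semantic_ids:
            sources.append("semantic")
        merged[doc_id] = sources
    return merged


# Demo documents whose titles contain the word "scooter".
keyword_ids = {"US8830188", "US9415835"}

# Demo similarity scores for a few references.
semantic_scores = {
    "Segway2004": 0.98,
    "US8830188": 0.91,
    "IEEE1996": 0.81,
    "US9415835": 0.73,
}

for doc_id, sources in sorted(hybrid(keyword_ids, semantic_scores, threshold=0.80).items()):
    print(doc_id, "+".join(sources))
```

References found by both methods are strong candidates for early review, while the semantic-only entries are the ones a keyword search alone would never have surfaced.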

A note on this demo: the corpus, concepts, and scores above are a small, hand-built illustration so the mechanism is easy to see. PatentScan runs the real version against the full global corpus with production embeddings — the principle is identical, the scale is not.

Run this on your real invention

PatentScan turns a plain-English disclosure into a ranked prior-art report against the full global corpus — patents and non-patent literature — in minutes.