Core concepts

Understand how FindIP's semantic patent search engine works under the hood.

What is vector embedding?

Vector embedding converts text into a vector of numbers, typically with hundreds of dimensions. Texts with similar meanings end up near each other in vector space, so you can find related documents based on meaning even when their keywords differ.
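
To make "near each other in vector space" concrete, here is a minimal sketch that compares hand-made three-dimensional vectors with cosine similarity. The vectors are invented for illustration, and cosine similarity is only one common distance measure; this page does not state which metric FindIP uses.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made 3-dimensional vectors, for illustration only.
# Real embeddings come from a model and have hundreds of dimensions.
thermal_runaway = [0.9, 0.1, 0.2]
battery_overheating = [0.8, 0.2, 0.3]
door_hinge = [0.1, 0.9, 0.1]

sim_related = cosine_similarity(thermal_runaway, battery_overheating)
sim_unrelated = cosine_similarity(thermal_runaway, door_hinge)

print(f"related:   {sim_related:.2f}")
print(f"unrelated: {sim_unrelated:.2f}")
```

The two battery-related vectors point in almost the same direction, so their similarity is close to 1, while the unrelated vector scores much lower.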

FindIP's embedding pipeline

1. Split each patent into sections (title, abstract, claims, description)
2. Vectorize each section with the embedding model and store in a vector DB
3. Vectorize the query with the same model and find the nearest vectors
4. Rerank with a reranking model for precise final ordering
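
The indexing and lookup steps of this pipeline (splitting, vectorizing, and nearest-vector search) can be sketched in a few lines. Everything here is an assumption made for illustration: `embed` is a toy word-count function standing in for a real embedding model, the in-memory list stands in for a vector DB, and the patent IDs and texts are invented. Reranking is left out of this sketch.

```python
import math
from collections import Counter

# Toy vocabulary; a real embedding model learns dense vectors instead.
VOCAB = ["battery", "thermal", "cooling", "hinge", "door", "cell"]

def embed(text):
    """Stand-in for an embedding model: word counts over VOCAB."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in VOCAB]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Split: each (hypothetical) patent is already divided into sections.
patents = {
    "P-001": {
        "title": "battery pack with thermal barrier",
        "abstract": "a cooling plate between each battery cell limits thermal spread",
    },
    "P-002": {
        "title": "door hinge assembly",
        "abstract": "a hinge that holds a door open",
    },
}

# Vectorize and store: one entry per section in an in-memory "vector DB".
index = [
    (patent_id, section, embed(text))
    for patent_id, sections in patents.items()
    for section, text in sections.items()
]

def nearest(query, top_k=2):
    """Embed the query with the same function and return the closest sections."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, vec), pid, sec) for pid, sec, vec in index]
    return sorted(scored, reverse=True)[:top_k]

results = nearest("thermal cooling for battery cell")
for score, patent_id, section in results:
    print(f"{patent_id} {section}: {score:.2f}")
```

Using the same `embed` function for documents and queries is the important design point: similarity scores are only meaningful when both sides live in the same vector space.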

Semantic search

Unlike traditional keyword matching, FindIP uses semantic embeddings to capture the intent and meaning of a query. You can search with natural-language sentences, technical problems, or solution approaches and still find relevant patents, even when the wording differs.

Search example

Keyword search: "lithium battery overheating" matches only documents that contain those exact words.

Semantic search: "how to prevent thermal runaway in EV battery packs" is matched on its technical context and intent, not on the literal words.

How ranking works: vector retrieval + reranking

FindIP does not run keyword (BM25) matching. The pipeline is purely semantic and works in two stages:

Stage 1 — Vector retrieval (paragraph / chunk level)

Each patent is split into paragraph- and claim-level chunks that are embedded into vectors. Your query is embedded with the same model, and the engine retrieves the nearest chunks by vector similarity (similarity_score).

Stage 2 — Reranking

A reranking model re-scores the retrieved candidates against your query for precise final ordering (rerank_score). Results are returned ordered by this rerank score.

Because matching happens at the paragraph / chunk level, describing a specific technical problem or solution in natural language tends to retrieve more relevant results than a short bag of keywords.
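
The hand-off between the two stages can be sketched as follows. The field names `similarity_score` and `rerank_score` come from the description above; everything else is assumed for illustration. In particular, `toy_rerank_score` is a simple word-overlap placeholder, not FindIP's reranking model, and the chunk IDs, texts, and scores are invented.

```python
def toy_rerank_score(query, chunk_text):
    """Placeholder for a reranking model: fraction of query words found in the chunk."""
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    chunk_words = set(chunk_text.lower().split())
    return len(query_words & chunk_words) / len(query_words)

def rerank(query, candidates):
    """Attach a rerank_score to each stage-1 candidate and sort by it, best first."""
    rescored = [
        {**c, "rerank_score": toy_rerank_score(query, c["text"])}
        for c in candidates
    ]
    return sorted(rescored, key=lambda c: c["rerank_score"], reverse=True)

# Stage 1 output (invented values): chunks ordered by similarity_score.
candidates = [
    {
        "chunk_id": "A-claim-1",
        "text": "battery pack housing with vents",
        "similarity_score": 0.82,
    },
    {
        "chunk_id": "B-para-7",
        "text": "coolant channels prevent thermal runaway in battery packs",
        "similarity_score": 0.79,
    },
]

query = "prevent thermal runaway in battery packs"

# Stage 2: the final order follows rerank_score, not similarity_score.
final = rerank(query, candidates)
for c in final:
    print(c["chunk_id"], c["similarity_score"], round(c["rerank_score"], 2))
```

In this sketch the chunk with the lower `similarity_score` ends up first after reranking, which shows why the final order can differ from the stage 1 order.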

Supported countries

FindIP indexes patent data from the world's major patent offices.

Country code	Office	Language
`US`	United States (USPTO)	English
`CN`	China (CNIPA)	Chinese
`JP`	Japan (JPO)	Japanese
`KR`	Korea (KIPO)	Korean
`EP`	European Patent Office (EPO)	English, French, German
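
If you handle country codes in client code, a small lookup table helps catch typos before a request is sent. This is a client-side sketch only: the dictionary mirrors the table above, while the `normalize_country_code` helper and its behavior are assumptions for illustration, not part of FindIP.

```python
# Mirrors the table above: code -> (office, languages).
SUPPORTED_OFFICES = {
    "US": ("USPTO", ["English"]),
    "CN": ("CNIPA", ["Chinese"]),
    "JP": ("JPO", ["Japanese"]),
    "KR": ("KIPO", ["Korean"]),
    "EP": ("EPO", ["English", "French", "German"]),
}

def normalize_country_code(code):
    """Hypothetical helper: trim, upper-case, and validate a country code."""
    normalized = code.strip().upper()
    if normalized not in SUPPORTED_OFFICES:
        raise ValueError(f"Unsupported country code: {code!r}")
    return normalized

office, languages = SUPPORTED_OFFICES[normalize_country_code(" ep ")]
print(office, languages)
```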

Patent document structure

Each patent document is composed of sections that can be searched and retrieved individually:

Abstract

A brief summary of the invention.

Claims

The legal scope of the patent right.

Description

Detailed technical description including embodiments.

Figures

Drawings and diagrams that aid technical understanding.

Metadata

Filing date, publication date, applicant, IPC classification, etc.
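
One way to picture this structure in client code is a small data class. This is a sketch under assumptions: the class name, field names, metadata keys, and sample values are all hypothetical and do not describe FindIP's actual response schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class PatentDocument:
    """Hypothetical container mirroring the sections listed above."""
    abstract: str
    claims: List[str]
    description: str
    figures: List[str] = field(default_factory=list)        # e.g. figure captions
    metadata: Dict[str, str] = field(default_factory=dict)  # dates, applicant, IPC

    def text_sections(self):
        """Return the text sections by name, with claims joined into one string."""
        return {
            "abstract": self.abstract,
            "claims": "\n".join(self.claims),
            "description": self.description,
        }

# Invented sample values, for illustration only.
doc = PatentDocument(
    abstract="A battery pack with a thermal barrier between cells.",
    claims=["1. A battery pack comprising a plurality of cells.",
            "2. The battery pack of claim 1, further comprising a cooling plate."],
    description="In one embodiment, a cooling plate is placed between adjacent cells.",
    figures=["FIG. 1: exploded view of the battery pack"],
    metadata={"filing_date": "2021-03-15", "applicant": "Example Corp"},
)

for name, text in doc.text_sections().items():
    print(f"{name}: {len(text)} characters")
```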
