Dense Retrieval
What it is
Dense retrieval finds relevant documents by embedding queries and documents into a shared continuous vector space, then ranking by vector similarity. A query embedding is compared against document embeddings (exhaustively, or via an index) using nearest-neighbour search. Dense methods capture semantic relatedness beyond exact keyword matching.
[illustrate: Embedding space with query point (red) surrounded by relevant documents (green) and irrelevant documents (gray); ANN index overlay showing search trajectory]
How it works
Offline indexing:
- Encode each document or passage into a dense vector using a shared encoder (e.g., Sentence-BERT)
- Store vectors in an approximate nearest-neighbour (ANN) index (HNSW, IVF, etc.)
Online retrieval:
- Encode query using the same encoder
- Search ANN index to retrieve top-k nearest neighbours
- Return ranked list of documents
Scoring: typically dot-product or cosine similarity between the query and document vectors
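The indexing and retrieval steps above can be sketched in a few lines. This is a minimal illustration with made-up 3-dimensional embeddings and exhaustive search standing in for an ANN index; real systems use a learned encoder and hundreds of dimensions.

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product divided by the product of L2 norms.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Offline: a toy "index" of precomputed document embeddings.
# In practice these come from the shared encoder and live in an ANN index.
index = {
    "doc_1": [0.9, 0.1, 0.0],
    "doc_2": [0.7, 0.6, 0.1],
    "doc_3": [0.0, 0.1, 0.9],
}

def retrieve(query_vec, index, k=2):
    # Online: score every document (exhaustive search stands in for ANN here)
    # and return the top-k document ids by similarity.
    scored = sorted(index.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

q = [1.0, 0.2, 0.0]  # would come from encoding the query text
print(retrieve(q, index))  # ['doc_1', 'doc_2']
```

Swapping the brute-force loop for an ANN library (HNSW, IVF) changes only the `retrieve` step; the scoring and the offline/online split stay the same.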
Example
Query: "benefits of regular exercise"
→ encode to vector q
Documents indexed:
"Physical fitness improves health" → doc_1_vec (similarity: 0.87)
"Exercise and mental wellbeing" → doc_2_vec (similarity: 0.85)
"History of Ancient Rome" → doc_3_vec (similarity: 0.12)
Top-1 result: doc_1
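A practical detail behind scores like the ones above: if embeddings are L2-normalized at index time, dot product and cosine similarity coincide, so a fast inner-product index is sufficient. A small check with assumed 2-d vectors:

```python
import math

def normalize(v):
    # Scale a vector to unit L2 norm.
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

q = normalize([3.0, 4.0])
d = normalize([4.0, 3.0])
# For unit vectors, the dot product IS the cosine similarity.
print(dot(q, d))  # 24/25 = 0.96
```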
Variants and history
Dense retrieval became practical around 2019–2020 with improvements in encoder quality and ANN indexes. Early systems such as DPR and ColBERT showed that dense retrieval can outperform BM25 on MS MARCO and other benchmarks. Hybrid search combines dense and sparse methods. Modern variants include late-interaction models (ColBERT), multi-representation systems, and instruction-tuned encoders. Matryoshka representation learning reduces storage and search cost by making truncated embeddings usable on their own.
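The Matryoshka idea can be sketched in one function: keep only the first d dimensions of an embedding and renormalize. This assumes the encoder was trained so that prefixes of the vector remain useful; the vector below is hypothetical.

```python
import math

def truncate_embedding(v, d):
    # Keep the first d dimensions, then renormalize so cosine/dot
    # scoring still behaves as expected on the shorter vector.
    prefix = v[:d]
    n = math.sqrt(sum(x * x for x in prefix))
    return [x / n for x in prefix]

full = [3.0, 4.0, 0.2, -0.1]        # hypothetical 4-d embedding
small = truncate_embedding(full, 2)  # half the storage and search cost
print(small)  # [0.6, 0.8]
```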
When to use it
Choose dense retrieval when:
- Semantic matching matters more than exact keywords
- You have resources for ANN indexing
- Query and document distributions are similar
- Latency budgets allow nearest-neighbour search (~10–100ms at scale)
- You want to capture paraphrases and synonymy
Dense methods are slower at query time than inverted-index BM25 and require selecting, and often fine-tuning, an embedding model. Hybrid search often balances accuracy and efficiency.
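One common way to combine dense and sparse results is reciprocal rank fusion (RRF), which needs only the two ranked lists, not comparable scores. A minimal sketch with hypothetical rankings (k=60 is a common default):

```python
def rrf_fuse(rankings, k=60):
    # Reciprocal rank fusion: each input ranking contributes 1 / (k + rank)
    # for every document it contains; higher fused score ranks first.
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_ranking = ["doc_2", "doc_1", "doc_3"]   # sparse/keyword results
dense_ranking = ["doc_1", "doc_4", "doc_2"]  # dense/semantic results
print(rrf_fuse([bm25_ranking, dense_ranking]))
# ['doc_1', 'doc_2', 'doc_4', 'doc_3']
```

Because RRF uses only ranks, it sidesteps the problem that BM25 scores and cosine similarities live on different scales.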