DeepImpact
What it is
DeepImpact (Mallia et al., 2021) is a learned sparse retrieval model that uses a BERT encoder to assign an importance (impact) score to each document term, then stores those scores in a standard inverted index. Unlike SPLADE, DeepImpact does not learn vocabulary expansion inside the model: it scores only the terms present in the text it is given. In the original paper that text is the document after expansion with DocT5Query, so expansion is a separate preprocessing step rather than part of the scoring model. This keeps the model simple to implement and lets it run on existing inverted-index infrastructure.
[illustrate: Document tokens → BERT → per-token scalar impact scores → sparse posting list with scores; query terms look up impact scores at retrieval time]
How it works
Impact scoring:
- Encode the document with BERT
- For each unique term in the document, take the contextualized representation of its first occurrence and project it to a scalar score
- Scores are quantized to integers for inverted index storage
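The scoring steps above can be sketched in plain Python. This is a toy illustration, not the authors' implementation: the per-token vectors stand in for BERT outputs, the weights `w1`, `b1`, `w2`, `b2` are made-up values for a small two-layer ReLU projection, and the linear 8-bit quantization with a caller-supplied `max_score` is an assumption about how scores are mapped to integers.

```python
from typing import Dict, Sequence


def relu(x: float) -> float:
    return x if x > 0.0 else 0.0


def mlp_score(vec: Sequence[float],
              w1: Sequence[Sequence[float]], b1: Sequence[float],
              w2: Sequence[float], b2: float) -> float:
    """Project one contextual vector to a non-negative scalar impact."""
    hidden = [relu(sum(w * x for w, x in zip(row, vec)) + b)
              for row, b in zip(w1, b1)]
    return relu(sum(w * h for w, h in zip(w2, hidden)) + b2)


def first_occurrence_scores(tokens: Sequence[str],
                            vectors: Sequence[Sequence[float]],
                            w1, b1, w2, b2) -> Dict[str, float]:
    """Score each unique term once, using the vector of its first occurrence."""
    scores: Dict[str, float] = {}
    for tok, vec in zip(tokens, vectors):
        if tok not in scores:  # later occurrences of the same term are skipped
            scores[tok] = mlp_score(vec, w1, b1, w2, b2)
    return scores


def quantize(scores: Dict[str, float], max_score: float,
             bits: int = 8) -> Dict[str, int]:
    """Linearly map scores in [0, max_score] to integers in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    return {term: int(round(min(max(s, 0.0), max_score) / max_score * levels))
            for term, s in scores.items()}


# Toy inputs (hypothetical): 2-dimensional "contextual" vectors for three tokens.
tokens = ["solar", "panel", "solar"]
vectors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
w1, b1 = [[1.0, 0.0], [0.0, 2.0]], [0.0, 0.0]
w2, b2 = [1.0, 1.5], 0.0

raw = first_occurrence_scores(tokens, vectors, w1, b1, w2, b2)
quantized = quantize(raw, max_score=4.0)
print(raw, quantized)
```

In a real pipeline `max_score` would typically be derived from the whole collection so that all documents share one integer scale.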
No learned vocabulary expansion:
- Only terms present in the input text are scored (the original document, or the document after DocT5Query expansion as in the paper)
- Simpler than SPLADE; no FLOPS regularization needed
Indexing:
- Store (term, impact_score) pairs in a standard inverted index
- Query terms look up their precomputed impact scores at retrieval time
- Compatible with Anserini / Lucene infrastructure
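As a rough sketch of the indexing step, the snippet below turns per-document impact maps into an in-memory inverted index. The function name `build_impact_index`, the dictionary layout, and the choice to drop zero-impact terms are assumptions made for illustration; a production system would write these postings into Lucene or a similar engine instead.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Posting = Tuple[str, int]  # (doc_id, quantized impact score)


def build_impact_index(
    doc_impacts: Dict[str, Dict[str, int]]
) -> Dict[str, List[Posting]]:
    """Map each term to a posting list of (doc_id, impact) pairs."""
    index: Dict[str, List[Posting]] = defaultdict(list)
    for doc_id in sorted(doc_impacts):  # sorted for a deterministic posting order
        for term, impact in doc_impacts[doc_id].items():
            if impact > 0:  # assumption: zero-impact terms are not indexed
                index[term].append((doc_id, impact))
    return dict(index)


# Hypothetical quantized impacts for two documents.
doc_impacts = {
    "d1": {"solar": 64, "panel": 191},
    "d2": {"solar": 120, "roof": 80, "the": 0},
}
index = build_impact_index(doc_impacts)
print(index)
```

The stored integer plays the role that term frequency plays in a BM25 index, which is why existing inverted-index engines can hold it without structural changes.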
Scoring at retrieval time:
- For each query term, retrieve its impact score from the index
- Sum impact scores across matched query terms
- No encoder inference at query time
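Retrieval then reduces to summing stored impacts over the query terms that match each document, i.e. score(q, d) = sum of impact(t, d) for t in both q and d. The sketch below assumes the query is already tokenized and treated as a set of terms; the `search` function and the tiny hard-coded index are illustrative only.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Posting = Tuple[str, int]  # (doc_id, quantized impact score)


def search(index: Dict[str, List[Posting]],
           query_terms: Sequence[str],
           k: int = 10) -> List[Tuple[str, int]]:
    """Rank documents by the sum of impacts of matched query terms."""
    scores: Dict[str, int] = defaultdict(int)
    for term in set(query_terms):  # assumption: repeated query terms count once
        for doc_id, impact in index.get(term, []):
            scores[doc_id] += impact
    # Highest score first; ties broken by doc_id for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k]


# Hypothetical index contents.
index = {
    "solar": [("d1", 64), ("d2", 120)],
    "panel": [("d1", 191)],
    "roof": [("d2", 80)],
}

print(search(index, ["solar", "panel"]))
print(search(index, ["roof", "solar"]))
```

Nothing here calls a neural model: the query side needs only tokenization and posting-list lookups, which is what keeps query latency close to that of a classical lexical retriever.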
Variants and history
DeepImpact (2021) was an early demonstration that learned impact scores can improve over BM25 without ANN infrastructure. uniCOIL followed with a similar approach, reducing COIL's per-token vectors to a single scalar weight per token. SPLADE extended the concept with learned vocabulary expansion and reported substantially higher effectiveness at the cost of more complex training. DeepImpact remains a useful baseline for systems that want semantic scoring without changes to index infrastructure.
When to use it
Use DeepImpact when:
- You have a standard inverted index and cannot add ANN infrastructure
- Query-time latency constraints prevent encoder inference
- A moderate improvement over BM25 with minimal architectural change is acceptable
- Vocabulary expansion (SPLADE) is too complex for your setup