SPLADE
What it is
SPLADE (SParse Lexical AnD Expansion) is a neural model that learns sparse embeddings: vectors over the vocabulary with mostly zero entries, directly compatible with inverted-index retrieval. SPLADE bridges the dense and sparse paradigms: it learns semantic representations like dense embeddings but indexes them like BM25-compatible sparse vectors.
[illustrate: Query producing sparse vector with high values for semantically relevant terms; low/zero values for irrelevant terms; compatible with inverted index lookup]
How it works
Model architecture:
- Encode the query or document with a BERT-like encoder
- For each token, the MLM (masked language model) head predicts a weight for every term in the vocabulary; log(1 + ReLU(w)) keeps weights non-negative and dampens large activations
- Aggregate by term across tokens: sparse embedding = term → max(token_weight), so each vocabulary term keeps its largest activation (see the sketch below)
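A minimal encoding sketch, assuming PyTorch, Hugging Face transformers, and the public naver/splade-cocondenser-ensembledistil checkpoint (any BERT-style MLM model shows the same mechanics):

import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

name = "naver/splade-cocondenser-ensembledistil"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name)

def splade_encode(text):
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits            # (1, seq_len, vocab_size)
    # w_j = max over tokens i of log(1 + ReLU(logit_ij))
    weights = torch.log1p(torch.relu(logits)).amax(dim=1).squeeze(0)
    nonzero = weights.nonzero().squeeze(1)
    return {tokenizer.convert_ids_to_tokens(i.item()): weights[i].item()
            for i in nonzero}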
Training:
- Contrastive loss between queries and their relevant documents
- Sparsity regularization (L1 or FLOPS) works with the ReLU to push most weights to exactly zero
- Expansion is built in: because weights are predicted over the whole vocabulary, semantically related terms absent from the text can receive non-zero weight (a loss sketch follows this list)
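A sketch of the objective, assuming batched sparse vectors and in-batch negatives (lam is a hypothetical regularization weight):

import torch
import torch.nn.functional as F

def flops_reg(docs):
    # FLOPS regularizer: squared mean activation per vocabulary dimension,
    # penalizing terms that fire across many documents (long posting lists)
    return (docs.mean(dim=0) ** 2).sum()

def splade_loss(q, docs, lam=1e-3):
    # q: (B, V) query vectors; docs: (B, V) their positive documents
    logits = q @ docs.T                  # (B, B) dot-product scores
    labels = torch.arange(q.size(0))     # each query's positive is its own row
    return F.cross_entropy(logits, labels) + lam * flops_reg(docs)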
Indexing:
- Store the sparse vector as (term, weight) pairs
- Compatible with existing inverted-index infrastructure (Elasticsearch, etc.)
- No special ANN index needed (see the sketch below)
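A minimal indexing sketch, assuming an Elasticsearch 8.x cluster and its rank_features field type (index name and weights are illustrative):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

# rank_features stores a term -> weight map in the inverted index
es.indices.create(index="splade-docs", mappings={
    "properties": {"splade": {"type": "rank_features"}}
})

doc_vec = {"machine": 0.9, "learning": 0.8, "applications": 0.7,
           "neural": 0.4, "model": 0.3}
es.index(index="splade-docs", id="1", document={"splade": doc_vec})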
Retrieval:
- Look up the query's non-zero terms in the inverted index (like BM25)
- Score candidates by the dot product of query and document term weights, summed over matching terms (like embeddings; see the sketch below)
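The matching query side, using rank_feature clauses with a linear function so the bool-should sum approximates the sparse dot product (same illustrative index as above):

query_vec = {"machine": 0.9, "learning": 0.8, "applications": 0.7}

resp = es.search(index="splade-docs", query={
    "bool": {"should": [
        # linear scoring: clause score = boost * stored weight
        {"rank_feature": {"field": f"splade.{term}", "boost": weight,
                          "linear": {}}}
        for term, weight in query_vec.items()
    ]}
})
for hit in resp["hits"]["hits"]:
    print(hit["_id"], hit["_score"])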
Example
# Dense embedding: 768 floats per query/document
# SPLADE sparse embedding: ~20–50 non-zero terms per query/document
Query: "machine learning applications"
SPLADE embedding:
{
  "machine": 0.9,
  "learning": 0.8,
  "applications": 0.7,
  "neural": 0.4,   # learned expansion
  "model": 0.3,    # learned expansion
  ...              # rest of the vocabulary is zero
}
# Index in Elasticsearch as sparse vector
# Retrieval: match non-zero terms, weight by values
Variants and history
SPLADE was introduced in 2021 (Formal et al., SIGIR 2021) as a way to combine the benefits of dense and sparse retrieval. SPLADE v2 swapped sum pooling for max pooling and added distillation, improving both efficiency and effectiveness. SPLADE++ refined the training recipe further with better distillation and hard-negative mining. Variants include multi-vector SPLADE, learned document expansion, and hybrid combinations with dense reranking. SPLADE addresses a practical need: semantic understanding without replicating an entire dense-embedding infrastructure.
When to use it
Use SPLADE when:
- You want semantic understanding with inverted-index efficiency
- Existing sparse infrastructure (Elasticsearch) is in place
- You need interpretable term-based explanations
- Resource constraints prevent dense ANN indexing
- Hybrid search with dense reranking is acceptable
SPLADE is slower to index than BM25 (each document requires a neural forward pass), but at query time it reuses inverted-index machinery instead of a dense ANN stack, and it stays interpretable at the term level. Trade-off: moderate effectiveness gains over BM25 without standing up full dense-retrieval infrastructure.