SPLADE

What it is

SPLADE (SParse Lexical AnD Expansion model) is a neural model that learns sparse embeddings—vectors over the vocabulary with only a few non-zero dimensions—that are directly compatible with inverted-index retrieval. SPLADE bridges the dense and sparse paradigms: it learns semantic representations like dense embeddings but indexes them like BM25-compatible sparse vectors.

[illustrate: Query producing sparse vector with high values for semantically relevant terms; low/zero values for irrelevant terms; compatible with inverted index lookup]

How it works

  1. Model architecture:

    • Encode query and document with a BERT-like encoder
    • Project each token onto the vocabulary via the masked-language-model head, yielding per-term importance scores
    • Aggregate across tokens: sparse embedding = term → max over tokens of log(1 + ReLU(score))
  2. Training:

    • Contrastive loss between queries and relevant documents
    • Sparsity regularization (FLOPS or L1) to keep vectors sparse
    • Term expansion emerges from the MLM head, which assigns weight to semantically related terms absent from the text
  3. Indexing:

    • Store sparse vector as (term, weight) pairs
    • Compatible with existing inverted-index infrastructure (Elasticsearch, etc.)
    • No special ANN index needed
  4. Retrieval:

    • Look up terms in inverted index (like BM25)
    • Weight scoring by learned term weights (like embeddings)
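The aggregation step above can be sketched in a few lines. This is a toy illustration with made-up logits and a four-term vocabulary, not the real model: for each vocabulary term, take the max over tokens of log(1 + ReLU(score)), which both saturates large weights and zeroes out negative ones.

```python
import math

def splade_pool(token_logits):
    """Toy SPLADE aggregation (hypothetical shapes, not the real encoder).

    token_logits: one list of vocabulary scores per input token.
    Returns a single sparse vector: for each vocabulary term, the max
    over tokens of log(1 + ReLU(score)).
    """
    vocab_size = len(token_logits[0])
    pooled = []
    for v in range(vocab_size):
        best = 0.0
        for tok in token_logits:
            # ReLU then log-saturation; negative scores contribute nothing
            best = max(best, math.log1p(max(tok[v], 0.0)))
        pooled.append(best)
    return pooled

# Toy input: 2 tokens, vocabulary of 4 terms
logits = [
    [2.0, -1.0, 0.5, 0.0],
    [0.1,  3.0, -2.0, 0.0],
]
vec = splade_pool(logits)
# Terms never activated by any token stay exactly zero, keeping the vector sparse
```

The ReLU guarantees non-negative weights (a requirement for inverted indexes), and the log dampens any single token dominating a term's weight.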

Example

# Dense embedding: 768 floats per query/document
# SPLADE sparse embedding: ~20–50 non-zero terms per query/document

Query: "machine learning applications"
SPLADE embedding:
{
  "machine": 0.9,
  "learning": 0.8,
  "applications": 0.7,
  "neural": 0.4,    # learned expansion
  "model": 0.3,     # learned expansion
  ...
  (rest are zero)
}

# Index in Elasticsearch as sparse vector
# Retrieval: match non-zero terms, weight by values
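Retrieval over these sparse vectors is an ordinary inverted-index traversal. A minimal sketch, using hypothetical document vectors and weights (any production system would use Elasticsearch or similar rather than Python dicts):

```python
from collections import defaultdict

# Hypothetical sparse vectors (term -> weight), as SPLADE would emit
docs = {
    "d1": {"machine": 0.8, "learning": 0.7, "neural": 0.3},
    "d2": {"cooking": 0.9, "recipes": 0.6},
}

# Build an inverted index: term -> list of (doc_id, weight) postings
index = defaultdict(list)
for doc_id, vec in docs.items():
    for term, weight in vec.items():
        index[term].append((doc_id, weight))

def search(query_vec):
    """Accumulate a dot product over shared terms, BM25-style traversal."""
    scores = defaultdict(float)
    for term, q_weight in query_vec.items():
        for doc_id, d_weight in index.get(term, []):
            scores[doc_id] += q_weight * d_weight
    return sorted(scores.items(), key=lambda kv: -kv[1])

results = search({"machine": 0.9, "learning": 0.8, "neural": 0.4})
# Only documents sharing at least one non-zero term with the query are scored
```

The scoring loop is identical in shape to BM25's; only the term weights differ, which is why SPLADE slots into existing sparse infrastructure.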

Variants and history

SPLADE was introduced in 2021 (Formal et al., SIGIR 2021) as a way to combine the benefits of dense and sparse retrieval. SPLADE v2 improved effectiveness with max pooling and distillation; SPLADE++ refined training further with better distillation and hard negatives. Variants include multi-vector SPLADE, learned document expansion, and hybrid combinations with dense reranking. SPLADE addresses a practical need: semantic understanding without replicating an entire dense-embedding infrastructure.

When to use it

Use SPLADE when:

  • You want semantic understanding with inverted-index efficiency
  • Existing sparse infrastructure (Elasticsearch) is in place
  • You need interpretable term-based explanations
  • Resource constraints prevent dense ANN indexing
  • Hybrid search with dense reranking is acceptable

SPLADE is slower at indexing than BM25 (every document passes through a neural encoder), but it serves queries from a standard inverted index with no ANN machinery, and its term weights remain interpretable. The trade-off: meaningful effectiveness gains over BM25 without standing up a full dense-retrieval stack.

See also