SPLADE

What it is

SPLADE (SParse Lexical AnD Expansion model) is a neural model that learns sparse embeddings—vectors over the vocabulary with only a few non-zero dimensions—that are directly compatible with inverted-index retrieval. SPLADE bridges the dense and sparse paradigms: it learns semantic representations like dense embeddings but indexes them like BM25-compatible sparse vectors.

[illustrate: Query producing sparse vector with high values for semantically relevant terms; low/zero values for irrelevant terms; compatible with inverted index lookup]

How it works

  1. Model architecture:

    • Encode query and document with a BERT-like encoder
    • Project each token onto the vocabulary via the masked-language-model head, yielding per-term importance scores
    • Aggregate across tokens: sparse embedding = term → max over tokens of log(1 + ReLU(score))
  2. Training:

    • Contrastive loss between queries and relevant documents
    • Sparsity regularization (FLOPS or L1) to keep vectors sparse
    • Term expansion emerges from the MLM head, which assigns weight to semantically related terms absent from the text
  3. Indexing:

    • Store sparse vector as (term, weight) pairs
    • Compatible with existing inverted-index infrastructure (Elasticsearch, etc.)
    • No special ANN index needed
  4. Retrieval:

    • Look up terms in inverted index (like BM25)
    • Weight scoring by learned term weights (like embeddings)
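The aggregation step above can be sketched in a few lines. This is a toy illustration with made-up logits and a four-term vocabulary, not the real model: for each vocabulary term, take the max over tokens of log(1 + ReLU(score)), which both saturates large weights and zeroes out negative ones.

```python
import math

def splade_pool(token_logits):
    """Toy SPLADE aggregation (hypothetical shapes, not the real encoder).

    token_logits: one list of vocabulary scores per input token.
    Returns a single sparse vector: for each vocabulary term, the max
    over tokens of log(1 + ReLU(score)).
    """
    vocab_size = len(token_logits[0])
    pooled = []
    for v in range(vocab_size):
        best = 0.0
        for tok in token_logits:
            # ReLU then log-saturation; negative scores contribute nothing
            best = max(best, math.log1p(max(tok[v], 0.0)))
        pooled.append(best)
    return pooled

# Toy input: 2 tokens, vocabulary of 4 terms
logits = [
    [2.0, -1.0, 0.5, 0.0],
    [0.1,  3.0, -2.0, 0.0],
]
vec = splade_pool(logits)
# Terms never activated by any token stay exactly zero, keeping the vector sparse
```

The ReLU guarantees non-negative weights (a requirement for inverted indexes), and the log dampens any single token dominating a term's weight.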

Example

# Dense embedding: 768 floats per query/document
# SPLADE sparse embedding: ~20–50 non-zero terms per query/document

Query: "machine learning applications"
SPLADE embedding:
{
  "machine": 0.9,
  "learning": 0.8,
  "applications": 0.7,
  "neural": 0.4,    # learned expansion
  "model": 0.3,     # learned expansion
  ...
  (rest are zero)
}

# Index in Elasticsearch as sparse vector
# Retrieval: match non-zero terms, weight by values
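Retrieval over these sparse vectors is an ordinary inverted-index traversal. A minimal sketch, using hypothetical document vectors and weights (any production system would use Elasticsearch or similar rather than Python dicts):

```python
from collections import defaultdict

# Hypothetical sparse vectors (term -> weight), as SPLADE would emit
docs = {
    "d1": {"machine": 0.8, "learning": 0.7, "neural": 0.3},
    "d2": {"cooking": 0.9, "recipes": 0.6},
}

# Build an inverted index: term -> list of (doc_id, weight) postings
index = defaultdict(list)
for doc_id, vec in docs.items():
    for term, weight in vec.items():
        index[term].append((doc_id, weight))

def search(query_vec):
    """Accumulate a dot product over shared terms, BM25-style traversal."""
    scores = defaultdict(float)
    for term, q_weight in query_vec.items():
        for doc_id, d_weight in index.get(term, []):
            scores[doc_id] += q_weight * d_weight
    return sorted(scores.items(), key=lambda kv: -kv[1])

results = search({"machine": 0.9, "learning": 0.8, "neural": 0.4})
# Only documents sharing at least one non-zero term with the query are scored
```

The scoring loop is identical in shape to BM25's; only the term weights differ, which is why SPLADE slots into existing sparse infrastructure.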

Variants and history

SPLADE was introduced in 2021 (Formal et al., SIGIR 2021) as a way to combine the benefits of dense and sparse retrieval. SPLADE v2 improved effectiveness with max pooling and distillation; SPLADE++ refined training further with better distillation and hard negatives. Variants include multi-vector SPLADE, learned document expansion, and hybrid combinations with dense reranking. SPLADE addresses a practical need: semantic understanding without replicating an entire dense-embedding infrastructure.

When to use it

Use SPLADE when:

  • You want semantic understanding with inverted-index efficiency
  • Existing sparse infrastructure (Elasticsearch) is in place
  • You need interpretable term-based explanations
  • Resource constraints prevent dense ANN indexing
  • Hybrid search with dense reranking is acceptable

SPLADE is slower at indexing than BM25 (every document passes through a neural encoder), but it serves queries from a standard inverted index with no ANN machinery, and its term weights remain interpretable. The trade-off: meaningful effectiveness gains over BM25 without standing up a full dense-retrieval stack.

See also