DRMM
What it is
DRMM (Deep Relevance Matching Model, Guo et al., 2016) is an interaction-based neural ranking model that scores query-document relevance from histograms of term-level similarity scores. Unlike representation-based models (e.g., DSSM), which encode the query and the document separately and compare the two vectors, DRMM explicitly computes interactions between query terms and document terms, which makes it better at capturing exact and near-exact matches. The paper introduced a principled distinction between semantic matching (what representation-based models are built for) and relevance matching (what ad-hoc retrieval requires).
[illustrate: Query terms × document terms → cosine similarity matrix → histogram bins per query term → feed-forward → term gate weights → relevance score]
How it works
- Interaction matrix:
  - Represent query and document terms with pre-trained word embeddings
  - Compute cosine similarity between every query-term and document-term pair
- Histogram mapping:
  - For each query term, bin its similarities to all document terms into a fixed histogram (e.g., 30 bins from -1 to 1)
  - The result is a fixed-length vector that summarizes how well that query term matches the document, independent of document length; term positions are discarded
- Feed-forward scoring:
  - Each query term’s histogram → small MLP (weights shared across query terms) → scalar score
- Term gating:
  - Weight each query term’s score by an IDF-based gate (a softmax over the query terms in the original paper)
  - High-IDF (rare) terms contribute more to the final score
- Aggregation:
  - Sum gated term scores → final relevance score
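The five steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`cosine_matrix`, `log_count_histogram`, `drmm_score`), the random embeddings and weights, the hidden size, and the IDF values are all hypothetical. The sketch uses a log-count histogram and folds exact matches into the top bin rather than giving them a dedicated bin.

```python
import numpy as np

def cosine_matrix(q_emb, d_emb):
    """Cosine similarity of every query term (rows) with every document term (cols)."""
    q = q_emb / np.linalg.norm(q_emb, axis=1, keepdims=True)
    d = d_emb / np.linalg.norm(d_emb, axis=1, keepdims=True)
    return q @ d.T

def log_count_histogram(sims, n_bins=30):
    """Bin one query term's similarities over [-1, 1] and return log(1 + count)."""
    # Clip first: floating-point drift can push a cosine slightly past 1.0,
    # and np.histogram silently drops values outside `range`.
    clipped = np.clip(sims, -1.0, 1.0)
    counts, _ = np.histogram(clipped, bins=n_bins, range=(-1.0, 1.0))
    return np.log1p(counts)

def drmm_score(q_emb, d_emb, idf, w1, b1, w2, b2, w_gate, n_bins=30):
    """Return (relevance score, gate weights, per-term scores) for one pair."""
    sims = cosine_matrix(q_emb, d_emb)                                  # (n_q, n_d)
    hists = np.stack([log_count_histogram(row, n_bins) for row in sims])  # (n_q, n_bins)
    # The same small MLP is applied to every query term's histogram.
    hidden = np.tanh(hists @ w1 + b1)                                   # (n_q, n_hidden)
    term_scores = np.tanh(hidden @ w2 + b2)                             # (n_q,)
    # IDF-based gate: softmax over query terms of a scaled IDF value.
    logits = w_gate * idf
    gate = np.exp(logits - logits.max())
    gate = gate / gate.sum()
    # Aggregation: gated sum of the per-term scores.
    return float(gate @ term_scores), gate, term_scores

# Toy usage with random (untrained) parameters; all values are made up.
rng = np.random.default_rng(0)
n_q, n_d, dim, n_bins, n_hidden = 3, 50, 16, 30, 5
q_emb = rng.normal(size=(n_q, dim))      # stand-in for pre-trained embeddings
d_emb = rng.normal(size=(n_d, dim))
idf = np.array([1.2, 4.5, 2.0])          # second query term is the rarest
w1 = rng.normal(scale=0.1, size=(n_bins, n_hidden))
b1 = np.zeros(n_hidden)
w2 = rng.normal(scale=0.1, size=n_hidden)
b2 = 0.0
w_gate = 1.0                             # a learned scalar in a trained model

score, gate, term_scores = drmm_score(q_emb, d_emb, idf, w1, b1, w2, b2, w_gate)
print("gate weights:", gate)
print("per-term scores:", term_scores)
print("relevance score:", score)
```

With a positive `w_gate`, the term with the highest IDF receives the largest gate weight, so its histogram dominates the final score. In a trained model, `w1`, `b1`, `w2`, `b2`, and `w_gate` would be learned from relevance labels, typically with a pairwise ranking loss.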
Variants and history
DRMM (2016) was influential in framing the interaction-based vs. representation-based dichotomy. It inspired K-NRM (which replaces the hard histogram bins with differentiable kernel pooling, so the model can be trained end to end) and PACRR (which applies a CNN over the interaction matrix). Its explicit relevance-matching framing also helps explain why dense retrieval models, which compress each text into a single vector, sometimes fail on entity-heavy or technical queries where exact term matches matter.
When to use it
DRMM is now primarily of historical interest. For practical deployment:
- Use ColBERT or cross-encoders for interaction-based relevance matching
- DRMM’s histogram and term-gating ideas remain useful for interpreting retrieval failures