DRMM

What it is

DRMM (Deep Relevance Matching Model, Guo et al., 2016) is an interaction-based neural ranking model that scores query-document relevance from histograms of term-level similarity scores, one histogram per query term. Unlike representation-based models such as DSSM, which encode the query and the document separately and then compare the two vectors, DRMM explicitly computes interactions between query and document terms, making it better at capturing exact and near-exact matches. The paper introduced a principled distinction between semantic matching (what representation models do) and relevance matching (what retrieval requires).

[illustrate: Query terms × document terms → cosine similarity matrix → histogram bins per query term → feed-forward → term gate weights → relevance score]

How it works

  1. Interaction matrix:

    • Look up pre-trained word embeddings for query and document terms
    • Compute cosine similarity between every query-term and document-term pair
  2. Histogram mapping:

    • For each query term, bin its similarities to all document terms into a fixed-size histogram (e.g., 30 bins over [-1, 1])
    • Captures the distribution of how well that query term matches the document, independent of document length and term positions
  3. Feed-forward scoring:

    • Each query term’s histogram → small MLP → scalar score
  4. Term gating:

    • Weight each query term’s score by a gate computed from its IDF and softmax-normalized across the query’s terms
    • High-IDF (rare) terms contribute more to the final score
  5. Aggregation:

    • Sum gated term scores → final relevance score
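The five steps above can be sketched end to end in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the tanh activations, the log1p transform on the counts, the scalar gate weight w_g, and the randomly initialized (untrained) MLP weights are all hypothetical choices made for the example. It also assumes every embedding vector is non-zero.

```python
import numpy as np

def cosine_matrix(q_emb, d_emb):
    """Step 1: cosine similarity of every query term (rows) with every document term (columns)."""
    q = q_emb / np.linalg.norm(q_emb, axis=1, keepdims=True)
    d = d_emb / np.linalg.norm(d_emb, axis=1, keepdims=True)
    return np.clip(q @ d.T, -1.0, 1.0)  # clip guards against floating-point overshoot

def matching_histograms(sim, n_bins=30):
    """Step 2: per query term, count similarities in n_bins equal-width bins over [-1, 1]."""
    edges = np.linspace(-1.0, 1.0, n_bins + 1)
    return np.stack([np.histogram(row, bins=edges)[0] for row in sim])

def term_scores(hist, W1, b1, w2, b2):
    """Step 3: small MLP mapping each histogram to a scalar (log1p on counts is an assumption)."""
    hidden = np.tanh(np.log1p(hist) @ W1 + b1)
    return np.tanh(hidden @ w2 + b2)

def idf_gate(idf, w_g=1.0):
    """Step 4: softmax over query terms of (hypothetical) gate weight w_g times IDF."""
    logits = w_g * idf
    e = np.exp(logits - logits.max())  # subtract max for numerical stability
    return e / e.sum()

def drmm_score(q_emb, d_emb, idf, params, n_bins=30):
    """Step 5: gate-weighted sum of the per-term scores."""
    hist = matching_histograms(cosine_matrix(q_emb, d_emb), n_bins)
    return float(idf_gate(idf) @ term_scores(hist, *params))

# Toy usage with random embeddings and untrained weights (illustrative only).
rng = np.random.default_rng(0)
n_bins, hidden_size = 30, 5
q_emb = rng.normal(size=(3, 8))    # 3 query terms, 8-dim embeddings
d_emb = rng.normal(size=(20, 8))   # 20 document terms
idf = np.array([1.2, 4.5, 0.3])    # made-up IDF values, one per query term
params = (rng.normal(size=(n_bins, hidden_size)), np.zeros(hidden_size),
          rng.normal(size=hidden_size), 0.0)
print(drmm_score(q_emb, d_emb, idf, params, n_bins))
```

Because the gate is a softmax, the final score is a convex combination of the per-term scores, so a rare query term with a poorly matching histogram can pull the score down even when common terms match well. In a trained model the MLP and gate weights would be learned from relevance labels rather than sampled at random.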

Variants and history

DRMM (2016) was influential in framing the interaction-based vs. representation-based dichotomy. It inspired KNRM (kernel-based smoothing of the histogram) and PACRR (CNN over the interaction matrix). The explicit relevance-matching framing also helps explain why dense retrieval models sometimes fail on entity-heavy or technical queries, where exact term matches carry much of the relevance signal.
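To make "kernel-based smoothing of the histogram" concrete, the sketch below contrasts hard binning with soft counts from Gaussian (RBF) kernels, roughly in the spirit of KNRM. The function names, the kernel centers, and the width sigma are illustrative assumptions, not values taken from either paper.

```python
import numpy as np

def hard_histogram(sims, n_bins=4):
    """DRMM-style hard counts: each similarity falls into exactly one bin."""
    edges = np.linspace(-1.0, 1.0, n_bins + 1)
    return np.histogram(sims, bins=edges)[0]

def kernel_soft_counts(sims, mus, sigma=0.25):
    """Soft counts: each similarity contributes to every kernel, weighted by an RBF."""
    sims = np.asarray(sims, dtype=float)[:, None]   # shape (n_sims, 1)
    mus = np.asarray(mus, dtype=float)[None, :]     # shape (1, n_kernels)
    return np.exp(-((sims - mus) ** 2) / (2.0 * sigma ** 2)).sum(axis=0)

mus = [-0.75, -0.25, 0.25, 0.75]  # hypothetical kernel centers (the bin midpoints)

# Two nearly identical similarities on either side of the bin edge at 0.5:
print(hard_histogram([0.49]), hard_histogram([0.51]))                  # land in different bins
print(kernel_soft_counts([0.49], mus), kernel_soft_counts([0.51], mus))  # change only slightly
```

Hard bins jump when a similarity crosses a bin edge and give no gradient with respect to the embeddings. The soft counts vary smoothly, which is what lets a kernel-based model train its embeddings end to end.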

When to use it

DRMM is now primarily of historical interest. For practical deployment:

  • Use ColBERT or cross-encoders for interaction-based relevance matching
  • DRMM’s histogram and term-gating ideas remain useful for interpreting retrieval failures, such as checking whether a rare query term has any strong match in the document

See also