Mean Reciprocal Rank
What it is
Mean Reciprocal Rank (MRR) is the average, over queries, of the reciprocal rank of the first relevant result. It measures how quickly a system surfaces the first correct answer: for a single query, the reciprocal rank is 1 / (rank of first relevant result), and MRR averages this value across queries. It is ideal for tasks where only one correct answer exists (QA, entity lookup) or where users care primarily about the first result.
[illustrate: Rankings with first relevant position highlighted; reciprocal rank for each; MRR as average]
How it works
Reciprocal Rank (single query):
RR = 1 / rank_of_first_relevant
Properties:
- RR = 1 if first result relevant (rank 1)
- RR = 0.5 if first relevant at rank 2
- RR = 0 if no relevant result
Mean Reciprocal Rank (multiple queries):
MRR = (1/Q) × Σ RR_q over all Q queries
Interpretation:
- MRR ∈ [0, 1]
- MRR = 1: the answer is always found at rank 1
- MRR = 0.5: the first answer is typically found around rank 2 (e.g., always at rank 2)
- High MRR: users quickly find the answer
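The two formulas above can be sketched directly in code. This is a minimal illustration (function names are my own, not from a standard library): each query is represented as a list of 0/1 relevance flags in rank order.

```python
def reciprocal_rank(relevances):
    """RR for one query: 1/rank of the first relevant result, 0 if none found."""
    for rank, rel in enumerate(relevances, start=1):
        if rel:
            return 1.0 / rank
    return 0.0  # no relevant result anywhere in the ranking

def mean_reciprocal_rank(all_relevances):
    """MRR: arithmetic mean of per-query reciprocal ranks."""
    return sum(reciprocal_rank(r) for r in all_relevances) / len(all_relevances)

# First relevant at rank 1, rank 3, never, rank 2:
queries = [[1, 0, 0], [0, 0, 1], [0, 0, 0], [0, 1, 0]]
print(mean_reciprocal_rank(queries))
```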
Example
Query 1: First relevant at rank 1 → RR = 1/1 = 1.0
Query 2: First relevant at rank 3 → RR = 1/3 ≈ 0.33
Query 3: No relevant results → RR = 0
Query 4: First relevant at rank 2 → RR = 1/2 = 0.5
MRR = (1.0 + 0.33 + 0 + 0.5) / 4 ≈ 0.46
Interpretation: 1/MRR ≈ 2.2, so users roughly find the first relevant result around rank 2 on average.
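The arithmetic above can be checked with a short sketch that works from the rank of the first relevant result per query (using None when no relevant result exists):

```python
# Rank of the first relevant result for each of the four queries; None = not found.
ranks = [1, 3, None, 2]

# RR = 1/rank, with the convention RR = 0 when no relevant result is returned.
rrs = [1.0 / r if r is not None else 0.0 for r in ranks]

mrr = sum(rrs) / len(rrs)
print(round(mrr, 2))  # → 0.46
```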
Variants and history
MRR was popularized by the TREC Question Answering track (late 1990s), where it suited exact-answer QA: finding the first correct answer is the goal. Expected Reciprocal Rank (ERR) is a related variant that weights the first few positions using a cascade model of user behavior. Modern QA and retrieval systems often report MRR, and it appears in other tasks where surfacing the first correct item is what matters. It receives less attention in ranking research than MAP or NDCG.
When to use it
Use MRR when:
- Exactly one correct answer exists (QA, entity lookup)
- Users care only about the first result
- Finding the correct answer quickly is the priority
- Ranking of results after first answer is less important
- A quick, easily communicated usability metric is needed
MRR is simple and interpretable. Its limitation: it ignores ranking quality beyond the first relevant result, so complement it with Precision@1 or MAP when deeper ranking quality matters.