Mean Reciprocal Rank

What it is

Mean Reciprocal Rank (MRR) is the average, across queries, of the reciprocal rank of the first relevant result. It measures how quickly a system surfaces the first correct answer: for a single query, the reciprocal rank is 1 / (rank of first relevant result). MRR is ideal for tasks where only one correct answer exists (QA, entity lookup) or where users care primarily about the first result.

[illustrate: Rankings with first relevant position highlighted; reciprocal rank for each; MRR as average]

How it works

Reciprocal Rank (single query):

RR = 1 / rank_of_first_relevant

Properties:

  • RR = 1 if first result relevant (rank 1)
  • RR = 0.5 if first relevant at rank 2
  • RR = 0 if no relevant result
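The single-query computation above can be sketched in Python (a minimal illustration; the function and argument names are hypothetical):

```python
def reciprocal_rank(ranking, relevant):
    """Reciprocal rank of the first relevant item; 0.0 if none is found."""
    for rank, item in enumerate(ranking, start=1):  # ranks are 1-based
        if item in relevant:
            return 1.0 / rank
    return 0.0

print(reciprocal_rank(["d3", "d1", "d7"], {"d1"}))  # → 0.5 (first hit at rank 2)
```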

Mean Reciprocal Rank (multiple queries):

MRR = (1/Q) × Σ RR_q  over all Q queries

Interpretation:

  • MRR ∈ [0, 1]
  • MRR = 1: always find answer at rank 1
  • MRR = 0.5: answer found at rank 2 on average (in the harmonic-mean sense, since MRR averages reciprocals)
  • High MRR: user quickly finds answer
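The averaging step can also be sketched in Python (a minimal illustration; names are hypothetical, and a query with no relevant result contributes RR = 0):

```python
def mean_reciprocal_rank(rankings, relevant_sets):
    """MRR = (1/Q) * sum of per-query reciprocal ranks."""
    def rr(ranking, relevant):
        # Reciprocal rank of the first relevant item, or 0.0 if none appears.
        return next((1.0 / i for i, d in enumerate(ranking, 1) if d in relevant), 0.0)
    return sum(rr(r, rel) for r, rel in zip(rankings, relevant_sets)) / len(rankings)

# Two queries: first hit at rank 2 (RR = 0.5), then no hit (RR = 0).
print(mean_reciprocal_rank([["a", "b", "c"], ["x", "y"]], [{"b"}, {"z"}]))  # → 0.25
```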

Example

Query 1: First relevant at rank 1 → RR = 1/1 = 1.0
Query 2: First relevant at rank 3 → RR = 1/3 = 0.33
Query 3: No relevant results → RR = 0
Query 4: First relevant at rank 2 → RR = 1/2 = 0.5

MRR = (1.0 + 0.33 + 0 + 0.5) / 4 = 0.46

Interpretation: an MRR of 0.46 corresponds to finding the first relevant result at roughly rank 1/0.46 ≈ 2.2 (in the harmonic-mean sense).
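The worked example can be checked directly from the four first-relevant ranks (a minimal sketch; the unrounded value is 0.4583..., which rounds to 0.46):

```python
# Ranks of the first relevant result per query; None means no relevant result.
first_relevant_ranks = [1, 3, None, 2]

rrs = [1.0 / r if r is not None else 0.0 for r in first_relevant_ranks]
mrr = sum(rrs) / len(rrs)
print(round(mrr, 2))  # → 0.46
```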

Variants and history

MRR emerged from the TREC question answering (QA) track evaluations (starting in 1999). It is particularly useful for exact-answer QA, where finding the first correct answer is the goal. Expected Reciprocal Rank (ERR) is a related metric that weights the first few positions more heavily. Modern QA and retrieval systems often report MRR; it is also a standard metric in knowledge-graph link prediction. It receives less attention in general ranking research than MAP or NDCG.

When to use it

Use MRR when:

  • Exactly one correct answer exists (QA, entity lookup)
  • Users care primarily about the first result
  • Fast finding of correct answer is priority
  • Ranking of results after first answer is less important
  • Quick usability metrics needed

MRR is simple and interpretable. Its limitation: it ignores ranking quality beyond the first relevant result; complement it with P@1 or MAP when the full ranking matters.

See also