Recall
What it is
Recall is the fraction of relevant documents in the collection that a system retrieves. It measures completeness: recall = (# relevant retrieved) / (# relevant total). High recall indicates few false negatives (missed relevant documents). Recall is crucial for exhaustive retrieval scenarios (legal discovery, medical research) where missing relevant information is costly.
[illustrate: All relevant documents with retrieved subset highlighted; recall calculation]
How it works
Formula:
Recall = (# relevant retrieved) / (# relevant total in collection)
At rank k:
R@k = (# relevant in top-k) / (total relevant)
Properties:
- Range: 0–1 (or 0–100%)
- High recall: few missed relevant documents
- Trade-off: optimizing for recall often reduces precision
- Emphasis: quantity (covering all relevant) over purity
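Both formulas above translate directly into code. A minimal Python sketch (function names are illustrative; `ranked` is assumed to be ordered best-first, and at least one relevant document is assumed to exist):

```python
def recall(retrieved, relevant):
    """Overall recall: fraction of all relevant documents retrieved."""
    return len(set(retrieved) & set(relevant)) / len(relevant)

def recall_at_k(ranked, relevant, k):
    """Recall@k: same ratio, but counting only the top-k of a ranked list."""
    return len(set(ranked[:k]) & set(relevant)) / len(relevant)
```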
Example
Query: "best machine learning libraries"
Relevant documents in collection: {A, B, C, D, E} (5 total)
Retrieved (ranked, best first): [A, B, X, Y, C] (5 retrieved)
Relevant retrieved: {A, B, C}
Recall = 3/5 = 0.6
(Missed relevant documents: D, E)
R@3 = 2/5 = 0.4 (top-3 = A, B, X; relevant hits: A, B)
R@5 = 3/5 = 0.6 (all five results; relevant hits: A, B, C)
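The same numbers, checked with the recall_at_k sketch from above (repeated here so the snippet is self-contained):

```python
def recall_at_k(ranked, relevant, k):
    return len(set(ranked[:k]) & set(relevant)) / len(relevant)

relevant = {"A", "B", "C", "D", "E"}
ranked = ["A", "B", "X", "Y", "C"]   # ranked best-first

print(recall_at_k(ranked, relevant, 3))  # 0.4  (top-3 = A, B, X; hits: A, B)
print(recall_at_k(ranked, relevant, 5))  # 0.6  (hits: A, B, C; missed: D, E)
```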
Variants and history
Precision and recall are classical IR metrics dating to the 1950s–60s. Recall@k restricts the measurement to the top-k results. Average Recall aggregates recall across cutoffs, though it is less common than Average Precision. Discounted Cumulative Gain (DCG) and NDCG weight results by ranking position. Precision-recall curves visualize the trade-off between the two metrics. In practice, systems optimize for both (F1, MAP) or for rank-aware metrics (NDCG).
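To make the contrast with recall's position-blindness concrete, here is a textbook-style DCG/NDCG sketch (binary or graded relevance scores per rank; the log2(rank + 1) discount is one common convention, not the only one):

```python
import math

def dcg_at_k(rels, k):
    """DCG: sum of relevance gains, discounted by log2 of the rank."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels[:k]))

def ndcg_at_k(rels, k):
    """NDCG: DCG normalized by the ideal (perfectly sorted) ranking."""
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0

# Binary relevance for the ranked list [A, B, X, Y, C] from the example:
print(ndcg_at_k([1, 1, 0, 0, 1], 5))  # ~0.95: two hits early, one hit late
```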
When to use it
Use recall when:
- Exhaustive retrieval is required (missing items costly)
- Legal/medical/scientific discovery (completeness critical)
- Covering all relevant results matters
- Users will review all results (not just top-k)
- Minimizing false negatives is priority
Recall alone is insufficient; combine it with precision using F1 or a summary metric such as MAP.
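A short sketch of that combination (standard F1 definition; the numbers reuse the worked example above, where precision and recall both happen to be 3/5):

```python
def f1_score(precision, recall):
    """F1: harmonic mean of precision and recall; penalizes imbalance."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Worked example: 3 of 5 retrieved are relevant (P = 0.6) and 3 of 5
# relevant are retrieved (R = 0.6), so F1 is also 0.6.
print(f1_score(0.6, 0.6))  # 0.6
```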