Recall
What it is
Recall is the fraction of relevant documents in the collection that a system retrieves. It measures completeness: recall = (# relevant retrieved) / (# relevant total). High recall indicates few false negatives (missed relevant documents). Recall is crucial for exhaustive retrieval scenarios (legal discovery, medical research) where missing relevant information is costly.
[illustrate: All relevant documents with retrieved subset highlighted; recall calculation]
How it works
Formula:
Recall = (# relevant retrieved) / (# relevant total in collection)
At rank k:
R@k = (# relevant in top-k) / (total relevant)
Properties:
- Range: 0–1 (or 0–100%)
- High recall: few missed relevant documents
- Trade-off: optimizing for recall often reduces precision
- Emphasis: quantity (covering all relevant) over purity
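Both formulas above translate directly into code. A minimal Python sketch (function names are illustrative; `ranked` is assumed to be ordered best-first, and at least one relevant document is assumed to exist):

```python
def recall(retrieved, relevant):
    """Overall recall: fraction of all relevant documents retrieved."""
    return len(set(retrieved) & set(relevant)) / len(relevant)

def recall_at_k(ranked, relevant, k):
    """Recall@k: same ratio, but counting only the top-k of a ranked list."""
    return len(set(ranked[:k]) & set(relevant)) / len(relevant)
```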
Example
Query: "best machine learning libraries"
Relevant documents in collection: {A, B, C, D, E} (5 total)
Retrieved (ranked, best first): [A, B, X, Y, C] (5 retrieved)
Relevant retrieved: {A, B, C}
Recall = 3/5 = 0.6
(Missed relevant documents: D, E)
R@3 = 2/5 = 0.4 (top-3 = A, B, X; relevant hits: A, B)
R@5 = 3/5 = 0.6 (all five results; relevant hits: A, B, C)
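The same numbers, checked with the recall_at_k sketch from above (repeated here so the snippet is self-contained):

```python
def recall_at_k(ranked, relevant, k):
    return len(set(ranked[:k]) & set(relevant)) / len(relevant)

relevant = {"A", "B", "C", "D", "E"}
ranked = ["A", "B", "X", "Y", "C"]   # ranked best-first

print(recall_at_k(ranked, relevant, 3))  # 0.4  (top-3 = A, B, X; hits: A, B)
print(recall_at_k(ranked, relevant, 5))  # 0.6  (hits: A, B, C; missed: D, E)
```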
Variants and history
Precision and recall are classical IR metrics dating to the 1950s–60s. Recall@k restricts the measurement to the top-k results. Average Recall aggregates recall across cutoffs, though it is less common than Average Precision. Discounted Cumulative Gain (DCG) and NDCG weight results by ranking position. Precision-recall curves visualize the trade-off between the two metrics. In practice, systems optimize for both (F1, MAP) or for rank-aware metrics (NDCG).
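To make the contrast with recall's position-blindness concrete, here is a textbook-style DCG/NDCG sketch (binary or graded relevance scores per rank; the log2(rank + 1) discount is one common convention, not the only one):

```python
import math

def dcg_at_k(rels, k):
    """DCG: sum of relevance gains, discounted by log2 of the rank."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels[:k]))

def ndcg_at_k(rels, k):
    """NDCG: DCG normalized by the ideal (perfectly sorted) ranking."""
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0

# Binary relevance for the ranked list [A, B, X, Y, C] from the example:
print(ndcg_at_k([1, 1, 0, 0, 1], 5))  # ~0.95: two hits early, one hit late
```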
When to use it
Use recall when:
- Exhaustive retrieval is required (missing items costly)
- Legal/medical/scientific discovery (completeness critical)
- Covering all relevant results matters
- Users will review all results (not just top-k)
- Minimizing false negatives is priority
Recall alone is insufficient; combine it with precision using F1 or a summary metric such as MAP.
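A short sketch of that combination (standard F1 definition; the numbers reuse the worked example above, where precision and recall both happen to be 3/5):

```python
def f1_score(precision, recall):
    """F1: harmonic mean of precision and recall; penalizes imbalance."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Worked example: 3 of 5 retrieved are relevant (P = 0.6) and 3 of 5
# relevant are retrieved (R = 0.6), so F1 is also 0.6.
print(f1_score(0.6, 0.6))  # 0.6
```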