ANCE
What it is
ANCE (Approximate Nearest Neighbor Negative Contrastive Estimation) is a training technique and model (Xiong et al., 2020) for dense retrieval that addresses the weak negatives problem in DPR-style training. Rather than relying on static BM25 negatives or random in-batch negatives, ANCE periodically rebuilds the ANN index from the current model’s weights and mines hard negatives: passages that the current model ranks highly for a given query even though they are not relevant.
[illustrate: Training loop alternating between model update and ANN index refresh; hard negatives pulled from current model’s top-k retrievals]
How it works
The weak negatives problem:
- In-batch negatives are mostly easy to distinguish from positives
- BM25 negatives are lexically hard (they share terms with the query) but often become easy once the model learns dense semantic representations
- Result: training signal saturates early, model plateaus
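To make the saturation concrete, here is a minimal sketch of the softmax negative log-likelihood commonly used for contrastive bi-encoder training, evaluated once with easy negatives and once with hard ones. The similarity scores are made-up illustrative values, and `nll_loss` is a hypothetical helper, not code from the ANCE paper.

```python
import math

def nll_loss(pos_score, neg_scores):
    """Negative log-likelihood of the positive under a softmax over all candidates."""
    scores = [pos_score] + list(neg_scores)
    m = max(scores)  # subtract the max for numerical stability
    log_denom = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_denom - pos_score

# Illustrative (invented) query-passage similarity scores.
pos = 8.0
easy_negs = [0.5, -1.0, 0.2, 1.1]   # e.g. random in-batch passages
hard_negs = [7.5, 7.9, 6.8, 8.2]    # e.g. passages the model itself ranks highly

easy_loss = nll_loss(pos, easy_negs)
hard_loss = nll_loss(pos, hard_negs)
print(f"easy negatives: loss = {easy_loss:.4f}")
print(f"hard negatives: loss = {hard_loss:.4f}")
```

With easy negatives the loss is already close to zero, so the gradient carries little information; the hard negatives keep the loss, and therefore the training signal, large.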
ANCE training loop:
- Train bi-encoder on current negatives
- Periodically re-encode all passages with updated model weights
- Rebuild ANN index; for each query, mine top-k passages that are not relevant as new hard negatives
- Resume training with refreshed negatives
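The mining step in the loop above can be sketched as follows. This is a toy version under stated assumptions: exact dot-product ranking stands in for a real ANN index, the two-dimensional embeddings are invented, and `mine_hard_negatives` and the passage ids are hypothetical names.

```python
def dot(u, v):
    """Dot-product similarity between two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def mine_hard_negatives(query_vec, passage_vecs, positive_ids, k):
    """Return ids of the top-k scored passages that are not labeled relevant.

    Exact search is used here for clarity; a real pipeline would query an
    ANN index built from the freshly re-encoded passages.
    """
    ranked = sorted(
        passage_vecs,
        key=lambda pid: dot(query_vec, passage_vecs[pid]),
        reverse=True,
    )
    return [pid for pid in ranked[:k] if pid not in positive_ids]

# Toy passage embeddings as produced by the "current" encoder (invented values).
passage_vecs = {
    "p_photosynthesis": [0.9, 0.1],
    "p_chlorophyll": [0.8, 0.3],
    "p_respiration": [0.7, 0.2],
    "p_football": [0.0, 1.0],
}
query_vec = [1.0, 0.0]

negatives = mine_hard_negatives(
    query_vec, passage_vecs, positive_ids={"p_photosynthesis"}, k=3
)
print(negatives)
```

The unrelated passage never enters the top-k, so only the near-miss passages are returned as negatives for the next training round.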
Asynchronous refresh:
- Full re-encoding is expensive, so ANCE refreshes the index only every N steps, asynchronously from a recent checkpoint while training continues
- A slightly stale index is still much better than random negatives
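The refresh cadence can be illustrated with a small simulation that records which checkpoint the negatives index was built from at each step. It is a simplified, synchronous sketch: the refresh happens inline rather than in a parallel process, and `run_schedule` and its parameters are hypothetical names.

```python
def run_schedule(total_steps, refresh_every):
    """Record, for each training step, the checkpoint step the index was built from."""
    index_version = 0  # step of the checkpoint used to build the current index
    log = []
    for step in range(1, total_steps + 1):
        # One optimizer update would go here, using negatives mined
        # from the index built at `index_version`.
        log.append((step, index_version))
        if step % refresh_every == 0:
            # Re-encode the corpus with the current weights and rebuild the index.
            index_version = step
    return log

log = run_schedule(total_steps=10, refresh_every=4)
max_staleness = max(step - version for step, version in log)
print(log)
print("max staleness (steps):", max_staleness)
```

In this simplified schedule the index is never more than `refresh_every` steps behind the model, which is the trade-off the cadence controls: a larger N means cheaper training but staler negatives.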
Result:
- Substantially better recall than DPR-style baselines on MS MARCO and NQ
- The gap has since narrowed as later methods caught up, but ANCE established dynamic negative mining as standard practice
Example
Training iteration k:
  Negatives: random in-batch passages
After N steps, refresh index:
  Re-encode all 8.8M MS MARCO passages with current encoder
  Build new FAISS index
  For query "What is photosynthesis?":
    ANN top-10: ["Cell respiration...", "Chlorophyll pigment...", ...]
    Remove known positives → remaining = hard negatives
Training iteration k+N:
  Negatives: dynamically mined hard negatives
  Training signal: much stronger
Variants and history
ANCE (2020) was one of the first papers to systematically study negative selection for dense retrieval. Follow-on work: RocketQA (2021) added cross-batch negatives and denoised hard negatives; ADORE optimized the query encoder while keeping the document encoder fixed for stability; AR2 trained the retriever adversarially against a cross-encoder ranker. Dynamic hard negative mining is now standard in retrieval training pipelines.
When to use it
Use ANCE-style training when:
- Dense retrieval model performance has plateaued with static negatives
- You have infrastructure to periodically re-encode a large corpus
- Recall@k is the primary metric and you want to push it higher
The index refresh cost is non-trivial, since every refresh re-encodes the full corpus. For smaller corpora the refresh is affordable and ANCE-style training is a good fit; with limited compute or billion-scale corpora, consider distillation-based alternatives.
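A back-of-envelope estimate shows how the refresh cost scales with corpus size. The throughput figure below is a hypothetical placeholder, not a measurement, and `reencode_hours` is an illustrative helper that ignores index build time.

```python
def reencode_hours(num_passages, passages_per_sec_per_gpu, num_gpus):
    """Wall-clock hours for one full corpus re-encoding pass (index build excluded)."""
    seconds = num_passages / (passages_per_sec_per_gpu * num_gpus)
    return seconds / 3600.0

THROUGHPUT = 1000  # hypothetical passages/sec per GPU; measure your own encoder

small = reencode_hours(8_800_000, THROUGHPUT, num_gpus=4)      # 8.8M-passage corpus
large = reencode_hours(1_000_000_000, THROUGHPUT, num_gpus=4)  # billion-scale corpus

print(f"8.8M passages: {small:.2f} h per refresh")
print(f"1B passages:   {large:.2f} h per refresh")
```

Under these assumed numbers a single refresh takes well under an hour for the smaller corpus and several days for the billion-scale one, and that cost is paid again at every refresh.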