Late-Interaction
-
ColBERTv2
Improved ColBERT with cross-encoder distillation and residual compression; reduces index size by roughly 6-10x while matching or exceeding v1 effectiveness.
-
PLAID
Performance-optimized Late Interaction Driver; efficient serving engine for ColBERTv2 that uses centroid-based candidate filtering so that full MaxSim scoring runs only on a small candidate set rather than over the entire index.
-
ColBERT
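Contextualized Late Interaction over BERT; late-interaction ranking model that encodes queries and documents independently into per-token embeddings and scores them with MaxSim (for each query token, the maximum similarity over all document tokens, summed across query tokens) for efficient dense retrieval.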
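
To make the MaxSim scoring named in the ColBERT entry concrete, here is a minimal sketch in plain Python. It assumes token embeddings are already available as lists of floats and uses cosine similarity; the function names `normalize` and `maxsim_score` and the toy vectors are illustrative, not part of any ColBERT codebase.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def maxsim_score(query_embs, doc_embs):
    """Late-interaction score: for each query token, take the maximum
    cosine similarity over all document tokens, then sum over query tokens."""
    if not query_embs or not doc_embs:
        return 0.0
    q = [normalize(v) for v in query_embs]
    d = [normalize(v) for v in doc_embs]
    score = 0.0
    for qv in q:
        # Best-matching document token for this query token.
        score += max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
    return score


# Toy 2-dimensional embeddings (hypothetical values, for illustration only).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]  # has an exact match per query token
doc_b = [[1.0, 1.0]]                          # only a partial match

ranked = sorted(
    [("doc_a", maxsim_score(query, doc_a)), ("doc_b", maxsim_score(query, doc_b))],
    key=lambda pair: pair[1],
    reverse=True,
)
print(ranked)
```

Because each query token is matched only against its best document token, document embeddings can be precomputed and indexed offline; only the cheap max-and-sum step runs at query time. Real systems batch this as matrix operations rather than Python loops.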