Embedding
-
Word2Vec
Efficient neural method for learning word embeddings using skip-gram or CBOW objectives, published by Mikolov et al. in 2013.
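A minimal sketch of training skip-gram embeddings with the gensim library; the toy corpus and hyperparameters are illustrative, not values from the original paper.

```python
from gensim.models import Word2Vec

# Toy corpus: each document is a list of tokens (illustrative only).
corpus = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
    ["cats", "and", "dogs", "are", "pets"],
]

# sg=1 selects the skip-gram objective; sg=0 would select CBOW.
model = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=1, epochs=50)

vec = model.wv["cat"]                         # dense vector for a word
print(vec.shape)                              # (50,)
print(model.wv.most_similar("cat", topn=2))   # nearest neighbours in embedding space
```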
-
Word Embedding
Dense vector representation of a word in low-dimensional space, capturing semantic and syntactic relationships.
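At its simplest, an embedding table is a matrix with one row per vocabulary word; a plain NumPy sketch (the vocabulary and random vectors are placeholders for trained values):

```python
import numpy as np

vocab = {"cat": 0, "dog": 1, "mat": 2}       # word -> row index
rng = np.random.default_rng(0)
E = rng.normal(size=(len(vocab), 8))          # embedding matrix, 8-dim for illustration

def embed(word: str) -> np.ndarray:
    """Look up the dense vector for a word."""
    return E[vocab[word]]

print(embed("cat"))   # one dense 8-dimensional vector
```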
-
SPLADE
Sparse Lexical and Expansion model; learns sparse vocabulary-space embeddings that remain compatible with inverted indexes while capturing learned term expansion and semantics.
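A rough sketch of how a SPLADE-style sparse vector is produced from a masked-language-model head: apply log(1 + ReLU(·)) to the MLM logits and max-pool over token positions. The checkpoint name is one published SPLADE model; the rest is an illustrative reading of the paper's formulation, not the official implementation.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

name = "naver/splade-cocondenser-ensembledistil"  # one public SPLADE checkpoint
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name)

batch = tok("sparse lexical retrieval", return_tensors="pt")  # single sequence, no padding
with torch.no_grad():
    logits = model(**batch).logits                # (1, seq_len, vocab_size)

# SPLADE-max pooling: w_j = max_i log(1 + ReLU(logit_ij))
weights = torch.log1p(torch.relu(logits)).amax(dim=1).squeeze(0)  # (vocab_size,)
nonzero = weights.nonzero().squeeze(-1)
print(len(nonzero), "active vocabulary terms")    # sparse: most entries are zero
```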
-
Sentence Embedding
Dense vector representation of a sentence or passage, aggregating token information into a single low-dimensional vector that preserves semantic meaning.
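A common way to obtain sentence embeddings is the sentence-transformers library; the model name below is a popular public checkpoint, used purely as an example.

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")   # example checkpoint, 384-dim output
emb = model.encode(["The cat sat on the mat.", "A dog lay on the rug."])
print(emb.shape)   # (2, 384): one dense vector per sentence
```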
-
Matryoshka Representation Learning
Training method that makes every prefix of an embedding vector a useful lower-dimensional embedding in its own right; enables efficient storage and search at multiple granularities. Abbreviated MRL.
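The practical payoff is that an embedding can be truncated to its first k dimensions and re-normalised; a sketch (the random vector stands in for an MRL-trained embedding, without which truncation would lose quality):

```python
import numpy as np

def truncate(v: np.ndarray, k: int) -> np.ndarray:
    """Keep the first k dimensions and re-normalise to unit length."""
    p = v[:k]
    return p / np.linalg.norm(p)

full = np.random.default_rng(0).normal(size=768)   # stand-in for an MRL-trained embedding
full /= np.linalg.norm(full)
small = truncate(full, 64)   # 12x smaller index footprint, same vector family
print(small.shape)           # (64,)
```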
-
GloVe
Global Vectors for Word Representation; learns embeddings by factorising global word co-occurrence statistics, combining the strengths of matrix-factorisation and local context-window methods.
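GloVe's objective in code form, on a toy co-occurrence matrix; this transcribes the paper's weighted least-squares loss, with the paper's default weighting parameters and illustrative random inputs.

```python
import numpy as np

def glove_loss(X, W, W_ctx, b, b_ctx, x_max=100.0, alpha=0.75):
    """Weighted least-squares objective from Pennington et al. (2014)."""
    loss = 0.0
    for i, j in zip(*np.nonzero(X)):                   # only observed co-occurrences
        f = min((X[i, j] / x_max) ** alpha, 1.0)       # down-weight rare pairs, cap frequent ones
        err = W[i] @ W_ctx[j] + b[i] + b_ctx[j] - np.log(X[i, j])
        loss += f * err ** 2
    return loss

rng = np.random.default_rng(0)
V, d = 4, 8                                            # toy vocabulary size and dimension
X = rng.integers(0, 5, size=(V, V)).astype(float)      # toy co-occurrence counts
print(glove_loss(X, rng.normal(size=(V, d)), rng.normal(size=(V, d)),
                 np.zeros(V), np.zeros(V)))
```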
-
fastText
Word embedding method using character n-grams to handle out-of-vocabulary words and morphological variants; published by Bojanowski et al. in 2017.
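The core trick is representing a word by its character n-grams plus the word itself, with `<` and `>` as boundary markers; a sketch of the n-gram extraction (the 3-to-6 range matches the paper's default):

```python
def char_ngrams(word: str, n_min: int = 3, n_max: int = 6) -> list[str]:
    """Character n-grams with boundary markers, as in fastText."""
    marked = f"<{word}>"
    grams = [marked[i:i + n] for n in range(n_min, n_max + 1)
             for i in range(len(marked) - n + 1)]
    return grams + [marked]   # the full word is kept as its own feature

print(char_ngrams("where", 3, 3))   # ['<wh', 'whe', 'her', 'ere', 're>', '<where>']
```

A word's vector is then the sum of its n-gram vectors, which is how unseen and morphologically related words still receive sensible representations.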
-
Dot Product Similarity
Inner product of two vectors; equivalent to cosine similarity when the vectors are unit-normalised, and the standard fast scoring function in dense retrieval.
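A quick numerical check of the equivalence claimed above:

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = rng.normal(size=384), rng.normal(size=384)

cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))   # cosine of the raw vectors
a_hat, b_hat = a / np.linalg.norm(a), b / np.linalg.norm(b)
dot = a_hat @ b_hat                                      # plain dot product after unit-normalising

print(np.isclose(dot, cos))   # True: on unit vectors the two coincide
```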
-
Dense Retrieval
Retrieval method using nearest-neighbour search over dense embedding vectors; contrasts with inverted-index sparse retrieval like BM25.
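Stripped to its essentials, dense retrieval is a matrix-vector product followed by top-k selection; a brute-force NumPy sketch (real systems replace this with an approximate index such as HNSW or IVF):

```python
import numpy as np

rng = np.random.default_rng(0)
docs = rng.normal(size=(10_000, 384))              # stand-in pre-computed document embeddings
docs /= np.linalg.norm(docs, axis=1, keepdims=True)
query = rng.normal(size=384)
query /= np.linalg.norm(query)

scores = docs @ query                               # one dot product per document
top_k = np.argsort(-scores)[:5]                     # indices of the 5 best matches
print(top_k, scores[top_k])
```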
-
ColBERT
Contextualized Late Interaction over BERT; late-interaction ranking using per-token embeddings with MaxSim scoring for efficient dense retrieval.
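The MaxSim rule in NumPy: for each query token, take its maximum similarity over all document tokens, then sum across query tokens. The scoring rule is from the ColBERT paper; the random matrices stand in for real per-token embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)
Q = rng.normal(size=(8, 128))     # 8 query token embeddings (stand-ins)
D = rng.normal(size=(120, 128))   # 120 document token embeddings (stand-ins)
Q /= np.linalg.norm(Q, axis=1, keepdims=True)
D /= np.linalg.norm(D, axis=1, keepdims=True)

sim = Q @ D.T                     # (8, 120) token-to-token similarities
score = sim.max(axis=1).sum()     # MaxSim: best doc token per query token, summed
print(score)
```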
-
Bi-Encoder
Neural architecture encoding query and document independently into separate embeddings, enabling fast retrieval via approximate nearest-neighbour search.
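A bi-encoder in a few lines with sentence-transformers: query and documents never see each other inside the model, so document vectors can be pre-computed and indexed. The model name is an example checkpoint.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")    # example checkpoint

doc_emb = model.encode(["ColBERT uses late interaction.",
                        "GloVe factorises co-occurrence counts."])   # offline, indexable
query_emb = model.encode("What is late interaction?")                # online, independent

scores = util.cos_sim(query_emb, doc_emb)          # similarity computed only afterwards
print(scores)
```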