Data Structures

Term Vector

A term vector is a per-document record of the terms, frequencies, and optionally positions and offsets produced by index-time analysis. It enables highlighting, More Like This queries, and forward-index access.
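
As a rough illustration of what such a record holds, the sketch below builds a term vector for one document in plain Python. The function name `term_vector` and the lowercase, `\w+`-based tokenisation are assumptions made for this example; they stand in for a real analysis chain and do not reflect Lucene's on-disk format.

```python
import re
from collections import defaultdict

def term_vector(text):
    """Return {term: {"freq", "positions", "offsets"}} for a single document."""
    vector = defaultdict(lambda: {"freq": 0, "positions": [], "offsets": []})
    # Each regex match is one token; its index in the stream is its position.
    for position, match in enumerate(re.finditer(r"\w+", text)):
        entry = vector[match.group().lower()]
        entry["freq"] += 1
        entry["positions"].append(position)
        # Character offsets into the original text, useful for highlighting.
        entry["offsets"].append((match.start(), match.end()))
    return dict(vector)

tv = term_vector("The quick fox saw the lazy fox")
print(tv["fox"])
```

A highlighter could use the stored offsets to mark up the original text without re-running analysis.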
Stored Field

A stored field retains the original value of a field verbatim in the index so it can be returned in search results. Stored fields are separate from the inverted index and from DocValues.
Segment

A segment is an immutable, self-contained unit of a Lucene index. New documents are written to new segments; segments are periodically merged to keep the index efficient.
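
The write-once behaviour can be sketched with a toy in-memory model. The class `SegmentedIndex`, its `buffer_limit` parameter, and the use of tuples as "segments" are all hypothetical choices for this example; real segments are sets of files on disk and merging is driven by a merge policy rather than a manual call.

```python
class SegmentedIndex:
    """Toy model: documents are buffered, then flushed as immutable segments."""

    def __init__(self, buffer_limit=2):
        self.segments = []          # each segment is an immutable tuple of docs
        self.buffer = []            # documents not yet written to a segment
        self.buffer_limit = buffer_limit

    def add(self, doc):
        self.buffer.append(doc)
        if len(self.buffer) >= self.buffer_limit:
            self.flush()

    def flush(self):
        # New documents always go into a new segment; old ones are never edited.
        if self.buffer:
            self.segments.append(tuple(self.buffer))
            self.buffer = []

    def merge_all(self):
        # Merging writes one new segment that replaces the old ones.
        merged = tuple(doc for segment in self.segments for doc in segment)
        self.segments = [merged] if merged else []

index = SegmentedIndex(buffer_limit=2)
for doc in ["d1", "d2", "d3", "d4", "d5"]:
    index.add(doc)
index.flush()
print(len(index.segments))  # three segments before merging
index.merge_all()
print(len(index.segments))  # one segment after merging
```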
Postings List

A postings list is the ordered sequence of postings for a single term in an inverted index — the list of all documents containing that term, with optional frequencies and positions.
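
Because postings are kept in document order, two lists can be intersected in a single linear pass, which is how a conjunctive (AND) query is commonly answered. The sketch below assumes each postings list is a sorted Python list of integer document IDs; the function name `intersect` is chosen for this example only.

```python
def intersect(postings_a, postings_b):
    """Return the doc IDs present in both sorted postings lists."""
    i = j = 0
    result = []
    while i < len(postings_a) and j < len(postings_b):
        if postings_a[i] == postings_b[j]:
            result.append(postings_a[i])
            i += 1
            j += 1
        elif postings_a[i] < postings_b[j]:
            i += 1  # advance whichever list is behind
        else:
            j += 1
    return result

print(intersect([1, 3, 5, 8, 13], [2, 3, 8, 9, 13, 21]))
```

Production engines typically add skip data and compression on top of this idea; neither is shown here.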
Posting

A posting is a single record in an inverted index, linking a term to one document in which it appears — optionally including term frequency and token positions.
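
One way to picture a single posting is as a small immutable record. The `Posting` dataclass and the `make_posting` helper below are hypothetical names for this sketch, and the token list is assumed to be already analysed.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Posting:
    doc_id: int                 # the document the term appears in
    term_freq: int = 1          # optional: how many times it appears
    positions: tuple = ()       # optional: token positions of each occurrence

def make_posting(doc_id, tokens, term):
    """Build the posting for `term` in one document, or None if it is absent."""
    positions = tuple(i for i, token in enumerate(tokens) if token == term)
    if not positions:
        return None
    return Posting(doc_id, len(positions), positions)

print(make_posting(7, ["to", "be", "or", "not", "to", "be"], "be"))
```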
Positional Index

A positional index extends the inverted index by storing the token position of each term occurrence within a document, enabling phrase queries and proximity queries.
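
The sketch below shows, under simplifying assumptions, how stored positions make a phrase query possible: a document matches only if the query terms occur at consecutive positions. Whitespace tokenisation and the names `build_positional_index` and `phrase_query` are assumptions of this example.

```python
from collections import defaultdict

def build_positional_index(docs):
    """Map term -> {doc_id: [positions]} for a dict of {doc_id: text}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for position, term in enumerate(text.lower().split()):
            index[term][doc_id].append(position)
    return index

def phrase_query(index, phrase):
    """Return sorted doc IDs in which the phrase occurs as consecutive tokens."""
    terms = phrase.lower().split()
    if not terms or any(term not in index for term in terms):
        return []
    # Only documents containing every term can contain the phrase.
    candidates = set(index[terms[0]])
    for term in terms[1:]:
        candidates &= set(index[term])
    hits = []
    for doc_id in sorted(candidates):
        starts = index[terms[0]][doc_id]
        # The k-th term must appear exactly k positions after the first.
        if any(all(start + k in index[term][doc_id]
                   for k, term in enumerate(terms))
               for start in starts):
            hits.append(doc_id)
    return hits

docs = {1: "new york city", 2: "york is new", 3: "a new york minute"}
index = build_positional_index(docs)
print(phrase_query(index, "new york"))
```

Document 2 contains both terms but not adjacently, so a plain inverted index would match it while the positional check rejects it.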
Merge Policy

A merge policy defines the rules governing when and how Lucene index segments are merged. Merging controls the tradeoff between indexing throughput, search performance, and disk usage.
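
To make the idea of "rules for when and how to merge" concrete, here is a deliberately simple toy policy: once the segment count exceeds a limit, merge the smallest few segments. The function names and the `max_segments` and `merge_factor` parameters are invented for this sketch; it is not a description of any of Lucene's built-in merge policies.

```python
def select_merge(segment_sizes, max_segments=4, merge_factor=3):
    """Return indices of the segments to merge, or [] if no merge is needed."""
    if len(segment_sizes) <= max_segments:
        return []
    # Merging the smallest segments keeps the amount of rewritten data low.
    by_size = sorted(range(len(segment_sizes)), key=lambda i: segment_sizes[i])
    return sorted(by_size[:merge_factor])

def apply_merge(segment_sizes, chosen):
    """Replace the chosen segments with a single segment of their combined size."""
    if not chosen:
        return list(segment_sizes)
    merged_size = sum(segment_sizes[i] for i in chosen)
    kept = [size for i, size in enumerate(segment_sizes) if i not in chosen]
    return kept + [merged_size]

sizes = [1000, 10, 12, 500, 8]
chosen = select_merge(sizes)
print(chosen, apply_merge(sizes, chosen))
```

Raising `max_segments` in this model favours indexing throughput (fewer rewrites), while lowering it favours search speed (fewer segments to visit), which mirrors the tradeoff described above.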
Forward Index

A forward index maps each document to the list of terms it contains. It is the natural output of document ingestion and the starting point for building an inverted index.
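
A minimal sketch of that relationship, assuming whitespace tokenisation and hypothetical function names: the forward index is built first, then inverted by walking it document by document.

```python
def build_forward_index(docs):
    """Map each doc ID to the list of terms it contains, in order."""
    return {doc_id: text.lower().split() for doc_id, text in docs.items()}

def invert(forward_index):
    """Turn {doc_id: [terms]} into {term: [doc_ids]} with sorted, unique doc IDs."""
    inverted = {}
    for doc_id in sorted(forward_index):
        for term in forward_index[doc_id]:
            postings = inverted.setdefault(term, [])
            # Documents are visited in order, so checking the tail avoids duplicates.
            if not postings or postings[-1] != doc_id:
                postings.append(doc_id)
    return inverted

forward = build_forward_index({1: "red apple", 2: "green apple red"})
print(forward)
print(invert(forward))
```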
Field Type

A field type is a named schema definition that specifies how a field’s values are stored, indexed, and analysed. It bundles an analyzer, storage options, and index behaviour into a reusable configuration.
DocValues

DocValues is a column-oriented on-disk data structure in Lucene that stores field values per document, enabling efficient sorting, faceting, and aggregations without loading the entire index into memory.
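
The column-oriented layout can be pictured as one array per field, indexed by document ID. The in-memory sketch below uses hypothetical helper names and sample data to show why that layout suits sorting and faceting; it does not model the on-disk encoding.

```python
from collections import Counter

def build_columns(docs, fields):
    """Store each field as one list, where list index == doc ID."""
    return {field: [doc.get(field) for doc in docs] for field in fields}

def sort_by(columns, field, doc_ids):
    """Order matching doc IDs by a field, reading only that field's column."""
    column = columns[field]
    return sorted(doc_ids, key=lambda doc_id: column[doc_id])

def facet(columns, field, doc_ids):
    """Count the distinct values of a field across the matching docs."""
    column = columns[field]
    return Counter(column[doc_id] for doc_id in doc_ids)

docs = [
    {"title": "Widget", "price": 30, "brand": "acme"},
    {"title": "Gadget", "price": 10, "brand": "zenith"},
    {"title": "Gizmo", "price": 20, "brand": "acme"},
]
columns = build_columns(docs, ["price", "brand"])
print(sort_by(columns, "price", [0, 1, 2]))
print(facet(columns, "brand", [0, 1, 2]))
```

Both operations touch only the column they need, in contrast to stored fields, where the whole document would be loaded to read one value.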
Trie

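A trie is a tree in which each path from the root to a node spells out a prefix. It supports term lookup in O(k) time, where k is the length of the query string, and makes prefix enumeration and autocomplete straightforward: walk to the prefix node, then collect the terms beneath it.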
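
A small dictionary-based sketch of the structure follows. The class and method names (`Trie`, `insert`, `contains`, `complete`) are choices made for this example, and no compression or memory optimisation is attempted.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # character -> child TrieNode
        self.is_term = False  # True if the path to this node is a complete term

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, term):
        node = self.root
        for ch in term:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_term = True

    def _walk(self, prefix):
        """Follow `prefix` from the root; return the node reached, or None."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains(self, term):
        node = self._walk(term)
        return node is not None and node.is_term

    def complete(self, prefix):
        """Return all stored terms that start with `prefix`, sorted."""
        start = self._walk(prefix)
        if start is None:
            return []
        results, stack = [], [(start, prefix)]
        while stack:
            node, word = stack.pop()
            if node.is_term:
                results.append(word)
            for ch, child in node.children.items():
                stack.append((child, word + ch))
        return sorted(results)

trie = Trie()
for term in ["search", "sear", "seat", "index"]:
    trie.insert(term)
print(trie.contains("sea"), trie.complete("sea"))
```

Reaching the prefix node costs one step per character; the enumeration that follows is proportional to the size of the subtree under that node.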
Inverted Index

An inverted index maps each unique term in a corpus to the documents, and optionally the positions, where it appears. A query reads only the postings for its own terms instead of scanning every document, which keeps full-text search fast as the corpus grows.
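
The following sketch builds a small inverted index with term frequencies and answers a conjunctive query against it. Whitespace tokenisation, the function names, and the frequency-sum ordering are assumptions for illustration; real engines use proper analysis and scoring models.

```python
from collections import Counter, defaultdict

def build_inverted_index(docs):
    """Map term -> {doc_id: term frequency} for a dict of {doc_id: text}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, freq in Counter(text.lower().split()).items():
            index[term][doc_id] = freq
    return index

def search(index, query):
    """Return doc IDs containing every query term, highest total frequency first."""
    terms = query.lower().split()
    if not terms:
        return []
    # Only the postings of the query terms are read; other terms are never touched.
    postings = [index.get(term, {}) for term in terms]
    matching = set(postings[0]).intersection(*postings[1:])
    return sorted(matching,
                  key=lambda doc_id: (-sum(p[doc_id] for p in postings), doc_id))

docs = {
    1: "fast search fast index",
    2: "search engine",
    3: "fast engine search search",
}
index = build_inverted_index(docs)
print(search(index, "search"))
print(search(index, "fast search"))
```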