Stored Field

What it is

A stored field is a field whose original value is persisted in the Lucene index and can be retrieved when a document matches a search query. When you see the _source content in an Elasticsearch hit, you are seeing stored field data.

Stored fields are entirely separate from the inverted index (which stores analysed terms for matching) and from DocValues (which stores field values for sorting and aggregation). A field can independently have any combination of these three storage types enabled.
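
The separation between the three structures can be pictured with a toy model. This is only an illustrative sketch in plain Python, not how Lucene lays out data on disk; the sample documents and the names inverted, doc_values and stored are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical sample documents; the doc ID is the list position.
docs = [
    {"title": "Red Shoes", "price": 40.0},
    {"title": "Blue Shoes", "price": 25.0},
]

inverted = defaultdict(set)  # term -> doc IDs, used for matching
doc_values = []              # column of prices by doc ID, used for sorting/aggregation
stored = []                  # original title by doc ID, used for retrieval

for doc_id, doc in enumerate(docs):
    for term in doc["title"].lower().split():
        inverted[term].add(doc_id)
    doc_values.append(doc["price"])
    stored.append(doc["title"])

# Match with the inverted index, sort with doc values, return stored values.
hits = sorted(inverted["shoes"], key=lambda d: doc_values[d])
results = [stored[d] for d in hits]
print(results)
```

Each structure answers a different question: which documents match, in what order, and what to show for them. Dropping any one of the three lists removes only the capability it serves.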

How it works

When a document is indexed with storage enabled for a field (store: true in Elasticsearch, stored="true" in Solr), Lucene writes the raw bytes of the field value into the stored fields files (.fdt for the data, .fdx for the index into it). At retrieval time, the engine looks up the document ID in these files to return the original value.

Lucene compresses stored fields in blocks rather than value by value, using LZ4 (fast, lower compression ratio) or Deflate (slower, higher compression ratio). Documents are grouped into blocks of roughly 16KB (the exact size depends on the Lucene version and compression mode), so reading one stored field means decompressing the whole block that contains it.
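
The cost of block compression can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Lucene's actual file format: it uses the standard zlib module (a Deflate implementation), a fixed 16KB threshold, and hypothetical helper names build_blocks and read_stored. The index list plays a role loosely analogous to the .fdx file.

```python
import zlib

BLOCK_SIZE = 16 * 1024  # assumed uncompressed block threshold (roughly 16KB)

def build_blocks(values, block_size=BLOCK_SIZE):
    """Group encoded values into blocks and compress each block as a unit."""
    blocks = []  # compressed blocks
    index = []   # index[doc_id] = (block number, offset, length)
    buf = bytearray()
    for value in values:
        data = value.encode("utf-8")
        index.append((len(blocks), len(buf), len(data)))
        buf += data
        if len(buf) >= block_size:
            blocks.append(zlib.compress(bytes(buf)))
            buf = bytearray()
    if buf:
        blocks.append(zlib.compress(bytes(buf)))
    return blocks, index

def read_stored(blocks, index, doc_id):
    """Return one value; the whole containing block is decompressed first."""
    block_no, offset, length = index[doc_id]
    raw = zlib.decompress(blocks[block_no])
    return raw[offset:offset + length].decode("utf-8")

values = [f"document {i} " + "x" * 1000 for i in range(100)]
blocks, index = build_blocks(values)
print(read_stored(blocks, index, 42)[:12])
```

Fetching document 42 pays for decompressing every neighbour in the same block, which is why retrieving many stored fields per hit gets expensive when documents are large.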

In Elasticsearch, the _source field is a special stored field containing the entire original JSON document. Individual field-level store: true settings create additional per-field stored entries:

"title": {
  "type": "text",
  "store": true
}

This is useful for returning specific fields without fetching and parsing the full _source document — relevant when documents are large and only a few fields are needed in results.
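
As a sketch of how such a field might be requested, the search body below asks Elasticsearch for the stored title only and skips _source. The body is built as a Python dict purely for illustration; the query text is made up, and sending the request (for example with an HTTP client) is left out.

```python
import json

# Request body for the _search endpoint: return only the stored "title" value.
search_body = {
    "query": {"match": {"title": "stored fields"}},  # hypothetical query
    "_source": False,             # do not load or parse the full _source
    "stored_fields": ["title"],   # read the per-field stored entry instead
}

print(json.dumps(search_body, indent=2))
```

Values requested this way come back under each hit's fields key as arrays rather than inside _source.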

Example

Field                 Indexed (inverted)   DocValues      Stored
body (text)           ✓ (full-text)        -              via _source
price (double)        ✓ (range queries)    ✓ (sort/agg)   via _source
category (keyword)    ✓ (filter)           ✓ (facet)      via _source
image_bytes (binary)  -                    -              ✓ (retrieval only)

The image_bytes field has no inverted index (can’t full-text search binary data) and no DocValues (can’t sort by it), but is stored for retrieval.
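
One way to express this layout is the Elasticsearch mapping sketched below, written as a Python dict. It relies on the usual field-type defaults (text is indexed without doc values, double and keyword get both, binary gets neither), so only image_bytes needs an explicit setting. Treat it as an assumption-laden illustration and check the defaults for your version.

```python
import json

# Index-creation body matching the example fields above.
mapping = {
    "mappings": {
        "properties": {
            "body": {"type": "text"},          # inverted index only
            "price": {"type": "double"},       # indexed + doc values
            "category": {"type": "keyword"},   # indexed + doc values
            "image_bytes": {"type": "binary", "store": True},  # stored only
        }
    }
}

print(json.dumps(mapping, indent=2))
```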

Variants and history

Stored fields have been part of Lucene since version 1.0 and exist in every Lucene-based engine. The decision of what to store has grown more complex over time:

  • Elasticsearch: _source is the primary stored field; individual field store: true is rarely needed. Disabling _source saves disk but breaks reindexing, partial updates, and highlighting of fields that are not stored separately.
  • Solr: stored="true" is the per-field attribute; there is no _source equivalent, so you must explicitly store each field you want to retrieve.
  • Source filtering: the Elasticsearch _source mapping accepts _source: {includes: [...], excludes: [...]} so that only a subset of the source is stored, reducing disk usage for large documents containing fields that never need to be returned.
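
The effect of a mapping-level excludes list can be approximated as follows. The mapping dict reflects the _source setting described above; prune_source is a hypothetical helper that only handles top-level field names with simple wildcards, and the field names are invented for the example.

```python
import fnmatch

# Mapping fragment: keep everything in _source except the listed fields.
mapping = {"mappings": {"_source": {"excludes": ["body", "raw_*"]}}}

def prune_source(doc, excludes):
    """Return a copy of doc without top-level fields matching any exclude pattern."""
    return {
        key: value
        for key, value in doc.items()
        if not any(fnmatch.fnmatchcase(key, pattern) for pattern in excludes)
    }

doc = {"title": "Stored fields", "body": "long text", "raw_html": "<p>long text</p>"}
stored_source = prune_source(doc, mapping["mappings"]["_source"]["excludes"])
print(stored_source)
```

The excluded fields are still indexed and searchable; they are simply absent from what is stored and returned.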

When to use it

  • Keep _source enabled (the default) in Elasticsearch unless you have strong disk-saving requirements; it is what makes highlighting, partial updates, and reindexing possible.
  • Disable storage on large fields that are only used for matching, never retrieval (e.g. a large body field when you return only title and URL).
  • Prefer DocValues over stored fields for fields used in sorting and faceting — DocValues are faster for those access patterns.
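
A request that follows the last guideline might look like the sketch below: sorting, the terms aggregation, and the returned price all read from DocValues, and no stored field is loaded. The field names reuse the earlier example and the query value is made up.

```python
import json

# Search body that relies on doc values instead of stored fields.
search_body = {
    "query": {"term": {"category": "shoes"}},   # hypothetical filter value
    "sort": [{"price": {"order": "asc"}}],      # sort reads doc values
    "aggs": {"by_category": {"terms": {"field": "category"}}},  # facet counts
    "_source": False,                           # skip the stored _source
    "docvalue_fields": ["price"],               # return price from doc values
}

print(json.dumps(search_body, indent=2))
```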

See also