Stored Field
What it is
A stored field is a field whose original value is persisted in the Lucene index and can be retrieved when a document matches a search query. When you see the _source content in an Elasticsearch hit, you are seeing stored field data.
Stored fields are entirely separate from the inverted index (which stores analysed terms for matching) and from DocValues (which stores field values for sorting and aggregation). A field can independently have any combination of these three storage types enabled.
How it works
When a document is indexed with a field marked as stored (store: true in an Elasticsearch mapping, Field.Store.YES in Lucene), Lucene writes the raw bytes of the field value into the stored fields files (.fdt for the data, .fdx for the index into it). At retrieval time, the engine looks up the document ID in these files to return the original value.
Lucene compresses stored fields in blocks, using LZ4 (fast, lower compression ratio) or Deflate (slower, higher compression ratio). Documents are grouped into blocks of roughly 16KB (the exact size depends on the Lucene version and compression mode), so reading one stored field means decompressing the whole block that contains it.
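The cost of block compression can be illustrated with a small simulation. This is only a sketch of the idea, not Lucene's actual file format: it uses Python's zlib (Deflate) module, and the names build_blocks, fetch, and BLOCK_SIZE are hypothetical.

```python
import zlib

BLOCK_SIZE = 16 * 1024  # target uncompressed block size, taken from the text above


def build_blocks(docs, block_size=BLOCK_SIZE):
    """Pack documents into compressed blocks.

    Returns (blocks, index) where index[doc_id] = (block_no, offset, length).
    """
    blocks, index = [], []
    buffer = bytearray()
    for doc in docs:
        # Flush the current block once it has reached the target size.
        if buffer and len(buffer) >= block_size:
            blocks.append(zlib.compress(bytes(buffer)))
            buffer = bytearray()
        index.append((len(blocks), len(buffer), len(doc)))
        buffer.extend(doc)
    if buffer:
        blocks.append(zlib.compress(bytes(buffer)))
    return blocks, index


def fetch(blocks, index, doc_id):
    """Return one document; the whole containing block is decompressed first."""
    block_no, offset, length = index[doc_id]
    raw = zlib.decompress(blocks[block_no])
    return raw[offset:offset + length]


docs = [f"document {i} ".encode() * 100 for i in range(50)]
blocks, index = build_blocks(docs)
assert fetch(blocks, index, 17) == docs[17]
```

Even though fetch returns a single document, it pays for decompressing every neighbour in the same block, which is why fetching many stored fields per hit can become expensive.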
In Elasticsearch, the _source field is a special stored field containing the entire original JSON document. Individual field-level store: true settings create additional per-field stored entries:
"title": {
  "type": "text",
  "store": true
}
This is useful for returning specific fields without fetching and parsing the full _source document — relevant when documents are large and only a few fields are needed in results.
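As a rough illustration, a search request can ask for stored fields instead of _source by using the stored_fields and _source parameters of the Elasticsearch search API. The sketch below only builds the request body; the helper name stored_fields_request and the query text are made up for the example, and sending the body to a cluster is left out.

```python
import json


def stored_fields_request(query_text, fields):
    """Build a search body that returns only the named stored fields."""
    return {
        "_source": False,         # do not return the _source document
        "stored_fields": fields,  # each field must be mapped with "store": true
        "query": {"match": {"title": query_text}},
    }


body = stored_fields_request("lucene", ["title"])
print(json.dumps(body, indent=2))
```

Fields requested this way come back under each hit's fields key as arrays rather than inside _source.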
Example
| Field | Indexed (inverted) | DocValues | Stored |
|---|---|---|---|
| body (text) | ✓ | ✗ | via _source |
| price (double) | ✓ (range queries) | ✓ (sort/agg) | via _source |
| category (keyword) | ✓ (filter) | ✓ (facet) | via _source |
| image_bytes (binary) | ✗ | ✗ | ✓ (retrieval only) |
The image_bytes field has no inverted index (can’t full-text search binary data) and no DocValues (can’t sort by it), but is stored for retrieval.
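One possible Elasticsearch mapping that would produce the table above is sketched here as a Python dict. It assumes the default settings for each field type (text, double, and keyword are indexed; binary is neither indexed nor given DocValues); the helper explicitly_stored is hypothetical and exists only to show which fields carry store: true.

```python
import json

# Mapping for the four example fields; only image_bytes sets "store".
mapping = {
    "mappings": {
        "properties": {
            "body": {"type": "text"},
            "price": {"type": "double"},
            "category": {"type": "keyword"},
            "image_bytes": {"type": "binary", "store": True},
        }
    }
}


def explicitly_stored(mapping):
    """List the fields that set "store": true in the mapping."""
    props = mapping["mappings"]["properties"]
    return [name for name, spec in props.items() if spec.get("store", False)]


print(json.dumps(mapping, indent=2))
print(explicitly_stored(mapping))
```

The other three fields need no store setting because their values are still recoverable from _source.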
Variants and history
Stored fields exist in all Lucene-based engines since Lucene 1.0. The decision of what to store has grown more complex over time:
- Elasticsearch: _source is the primary stored field; individual field store: true is rarely needed. Disabling _source saves disk but breaks reindexing, updates, and highlighting.
- Solr: stored="true" is the per-field attribute; there is no _source equivalent, so you must explicitly store each field you want to retrieve.
- Source filtering: in the mapping, Elasticsearch allows _source: {includes: [...], excludes: [...]} to store only a subset of the source, reducing disk usage for large documents containing fields that never need to be retrieved.
When to use it
- Keep _source enabled in Elasticsearch unless you have strong disk-saving requirements: it is what makes highlighting, partial updates, and reindexing possible.
- Disable storage on large fields that are only used for matching, never retrieval (e.g. a large body field when you return only title and URL).
- Prefer DocValues over stored fields for fields used in sorting and faceting; DocValues are faster for those access patterns.