Prefix Query
What it is
A prefix query retrieves all documents containing any term starting with a specified prefix string. It is a special case of wildcard queries optimised for the prefix-only pattern.
How it works
Prefix queries exploit index structure for efficiency:
- Inverted index approach: Index terms are alphabetically sorted. Binary search finds the first term matching the prefix, then enumerate terms until prefix no longer matches.
- Trie approach: Navigate trie nodes matching the prefix; enumerate all terms in the resulting subtree.
- Ngram approach: Use edge-ngrams indexed at query time to match the prefix.
All approaches are logarithmic or linear in matching term count, not total index size.
[illustrate: Sorted inverted index with prefix “in” and range of matching terms; contrast with trie traversal]
Example
Query: title:in* (prefix query for terms starting with “in”)
Matches: “information”, “indexing”, “insert”, “inference”, etc.
Common in autocomplete: user types “inf”, system suggests “information”, “inference”.
Variants and history
Prefix trie: explicit trie structure. Sorted index binary search: leverages term ordering. Ngram indexing: edge-ngrams enable prefix matching at index time. Elasticsearch and Solr offer efficient prefix queries via term index traversal.
When to use it
Essential for autocomplete and type-ahead search. Efficient with proper index structure (trie or sorted). Preferred over wildcard queries for prefix matching due to lower latency. Can return large result sets if prefix is short.