Prefix Query

What it is

A prefix query retrieves all documents containing any term starting with a specified prefix string. It is a special case of wildcard queries optimised for the prefix-only pattern.

How it works

Prefix queries exploit index structure for efficiency:

  1. Inverted index approach: Index terms are alphabetically sorted. Binary search finds the first term matching the prefix, then enumerate terms until prefix no longer matches.
  2. Trie approach: Navigate trie nodes matching the prefix; enumerate all terms in the resulting subtree.
  3. Ngram approach: Use edge-ngrams indexed at query time to match the prefix.

All approaches are logarithmic or linear in matching term count, not total index size.

[illustrate: Sorted inverted index with prefix “in” and range of matching terms; contrast with trie traversal]

Example

Query: title:in* (prefix query for terms starting with “in”)

Matches: “information”, “indexing”, “insert”, “inference”, etc.

Common in autocomplete: user types “inf”, system suggests “information”, “inference”.

Variants and history

Prefix trie: explicit trie structure. Sorted index binary search: leverages term ordering. Ngram indexing: edge-ngrams enable prefix matching at index time. Elasticsearch and Solr offer efficient prefix queries via term index traversal.

When to use it

Essential for autocomplete and type-ahead search. Efficient with proper index structure (trie or sorted). Preferred over wildcard queries for prefix matching due to lower latency. Can return large result sets if prefix is short.

See also