HNSW

What it is

HNSW (Hierarchical Navigable Small World) is a graph-based approximate nearest-neighbour algorithm published by Malkov and Yashunin (2018). It organizes vectors into a multi-layered navigable graph where each layer is sparser than the one below, enabling efficient logarithmic-time traversal to find nearest neighbours.

[illustrate: HNSW multi-layer structure with top sparse layer (few nodes), bottom dense layer (all nodes); search path from entry point through layers]

How it works

  1. Insertion:

    • Draw a random top layer for the new vector from an exponentially decaying distribution (most vectors land only in layer 0)
    • Insert it into every layer from that top layer down to layer 0, connecting it to its M nearest neighbours in each layer
  2. Search:

    • Start from top layer with single entry point
    • Search layer greedily (move to nearest unvisited candidate)
    • Descend to next layer when no improvement found
    • At the bottom layer, widen the candidate list to ef and return the k nearest neighbours found
  3. Parameters:

    • M: max connections per node (the paper recommends 5–48; 16 is a common default)
    • ef_construction: candidate-list width during insertion (typically 100–200; larger values build a better graph at higher cost)
    • ef: candidate-list width at query time (must be ≥ k; typically tens to a few hundred, tuned against the recall target)
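The insertion and search steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not a production implementation: `random_level` draws the exponentially decaying layer assignment (the normalization constant m_L is commonly set to 1/ln(M)), and `greedy_search_layer` is the width-1 greedy descent used on the upper layers. The function names are mine, not from any library.

```python
import math
import random

def random_level(m_l: float, rng: random.Random) -> int:
    """Draw a node's top layer as floor(-ln(U) * m_L) with U ~ Uniform(0, 1):
    layer 0 is most likely, and each higher layer is exponentially rarer."""
    return int(-math.log(rng.random()) * m_l)

def greedy_search_layer(graph, vectors, entry, query, dist):
    """Greedy descent within one layer (candidate width 1): repeatedly hop
    to the closest neighbour of the current node; stop at a local minimum."""
    current = entry
    current_d = dist(vectors[current], query)
    improved = True
    while improved:
        improved = False
        for nb in graph[current]:
            d = dist(vectors[nb], query)
            if d < current_d:
                current, current_d = nb, d
                improved = True
    return current

# Toy usage: 1-D points 0..9 in a chain graph; query 7.2 entered at node 0.
rng = random.Random(42)
m_l = 1.0 / math.log(16)                       # common choice for M = 16
levels = [random_level(m_l, rng) for _ in range(1000)]
vectors = list(range(10))
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 9] for i in range(10)}
nearest = greedy_search_layer(chain, vectors, 0, 7.2, lambda a, b: abs(a - b))
print(nearest)  # → 7 (the chain node closest to 7.2)
```

With m_L = 1/ln(16), roughly 1 in 16 vectors reaches layer 1 or higher, which is what makes the upper layers sparse.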

Example

# Insert vector into HNSW(M=16, ef=200)
# Vector assigned to layers [0, 1, 2, 3] (exponential distribution)
# Connected to 16 nearest neighbours in each layer

# Query: find 10 nearest neighbours
# Start at layer 3 (top), navigate greedily
# Descend through layers 2, 1
# Return top-10 from layer 0
# Illustrative only: ~50ms/query at 1B vectors, ~0.95 recall@10 (hardware- and parameter-dependent)
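The query walk-through above can be made concrete. The sketch below builds each layer's graph by brute force on a toy 2-D dataset purely for brevity (a real HNSW wires the graphs up incrementally at insert time, as described earlier); the layered search itself follows the procedure described: greedy descent on the upper layers, then a best-first search with candidate width ef at layer 0. All names and parameter values here are illustrative.

```python
import heapq
import math
import random

random.seed(0)
N, M = 200, 5

def dist(a, b):
    return math.dist(a, b)

points = [(random.random(), random.random()) for _ in range(N)]

# Each point draws a top layer with exponential decay (m_L = 1/ln(M)).
m_l = 1.0 / math.log(M)
top = [int(-math.log(random.random()) * m_l) for _ in range(N)]
max_layer = max(top)

# Brute-force M-nearest-neighbour graph per layer (for brevity only).
layers = []
for lvl in range(max_layer + 1):
    members = [i for i in range(N) if top[i] >= lvl]
    layers.append({
        i: sorted((j for j in members if j != i),
                  key=lambda j: dist(points[i], points[j]))[:M]
        for i in members
    })

def search(query, k=3, ef=20):
    entry = next(iter(layers[max_layer]))          # any node in the top layer
    # Greedy descent through the upper layers (candidate width 1).
    for lvl in range(max_layer, 0, -1):
        improved = True
        while improved:
            improved = False
            for nb in layers[lvl][entry]:
                if dist(points[nb], query) < dist(points[entry], query):
                    entry, improved = nb, True
    # Layer 0: best-first search, keeping the ef closest nodes seen so far.
    visited = {entry}
    cand = [(dist(points[entry], query), entry)]   # min-heap: frontier
    best = [(-cand[0][0], entry)]                  # max-heap: ef best results
    while cand:
        d, node = heapq.heappop(cand)
        if len(best) >= ef and d > -best[0][0]:
            break                                  # frontier can't improve results
        for nb in layers[0][node]:
            if nb not in visited:
                visited.add(nb)
                nd = dist(points[nb], query)
                if len(best) < ef or nd < -best[0][0]:
                    heapq.heappush(cand, (nd, nb))
                    heapq.heappush(best, (-nd, nb))
                    if len(best) > ef:
                        heapq.heappop(best)
    return sorted((-nd, i) for nd, i in best)[:k]  # (distance, index) pairs

result = search((0.5, 0.5), k=3)
```

Raising ef widens the layer-0 beam and improves recall at the cost of more distance computations, which is the knob the recall@10 figure above depends on.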

Variants and history

HNSW appeared in 2018 as a refinement of Navigable Small World (NSW) graphs, adding the layer hierarchy. It quickly came to dominate the ANN landscape thanks to its simplicity, strong recall-speed trade-off, and support for incremental insertion without a separate training phase. Variants include HNSW+DistTransform (improved distance estimation), learned entry points, and dynamic layer assignment. Implemented in FAISS, Weaviate, Qdrant, and other production systems.

When to use it

Choose HNSW when:

  • You need sub-100ms query latency on billion-scale data
  • Slower, incremental insertion (relative to IVF-style bulk builds) is acceptable
  • The index can be held in RAM (most HNSW implementations are memory-resident)
  • Memory overhead for graph links (about 4 bytes per link, on the order of 100–150 bytes per vector at M=16) plus keeping the vectors themselves in RAM is tolerable
  • 0.95+ recall@k is sufficient
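The link-overhead bullet above can be justified with a back-of-envelope count. The assumptions are mine, not from the original text: 4-byte neighbour ids, up to 2M links at layer 0 and M per upper layer, and m_L = 1/ln(M), under which P(top layer ≥ l) = M^(-l), so a vector's expected number of upper layers is 1/(M-1). Real implementations add allocator and bookkeeping overhead on top.

```python
def link_overhead_bytes(M: int, id_bytes: int = 4) -> float:
    """Approximate per-vector memory for graph links only (vectors excluded)."""
    # With m_L = 1/ln(M), P(top layer >= l) = M**(-l), so the expected
    # number of upper layers is sum over l >= 1 of M**(-l) = 1/(M - 1).
    expected_upper_layers = 1.0 / (M - 1)
    return id_bytes * (2 * M + M * expected_upper_layers)

print(round(link_overhead_bytes(16)))  # → 132 bytes of link overhead at M=16
```

Doubling M roughly doubles this figure, which is why M trades memory (and build time) against graph quality.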

HNSW is now the de facto standard index for dense retrieval in vector databases. Trade-offs: insertion is slower than IVF, memory overhead is higher than quantization-based methods, and M, ef_construction, and ef must be tuned jointly against recall and latency targets.

See also