Skip List
What it is
A skip list is a probabilistic data structure extending linked lists with additional “express” lanes that skip ahead. It enables O(log n) search, insertion, and deletion while maintaining simplicity of linked lists. Skip lists are used in IR engines for efficient postings-list merging and set intersection operations.
[illustrate: Multi-level skip list with express lanes; search path showing how express lanes speed up traversal]
How it works
-
Structure:
- Multiple levels of linked lists
- Bottom level: all elements in sorted order
- Higher levels: subset of elements (probabilistically chosen)
- Each element: forward pointers to next element at each level
-
Search (for value x):
- Start at top level
- Move right while element < x
- Drop down level when can’t go further
- O(log n) expected time
-
Insertion/Deletion:
- Find position via search
- Insert/delete node and update pointers
- Randomize level for inserted node
- O(log n) expected time
Example
Skip list (5 levels):
Level 3: [1] ────────────────────→ [9]
Level 2: [1] ─────→ [3] ────→ [7] ─→ [9]
Level 1: [1] ─→ [2] ─→ [3] → [5] → [7] → [8] → [9]
Level 0: [1] → [2] → [3] → [5] → [7] → [8] → [9]
Search for 7:
Start at top (Level 3): 1 < 7, move right
At [1], next is [9] > 7, drop to Level 2
Level 2: 1 < 7, move right
At [3], next is [7], move right, found!
Time: ~3 comparisons (vs. ~7 for naive linked list)
Variants and history
Skip lists invented by Pugh (1989) as alternative to balanced trees. Probabilistic balancing eliminates rebalancing complexity. Implemented in:
- ConcurrentSkipListMap (Java)
- Redis (sorted sets)
- Lucene (postings list merging)
Weighted skip lists enable efficient sampling. Skip lists have same complexity as balanced trees but simpler implementation.
When to use it
Use skip lists when:
- Postings-list merging (intersection, union)
- Sorted set operations (ranking, range queries)
- Simpler alternative to balanced trees
- Concurrent data structures (lock-free skip lists possible)
- Memory efficiency with O(log n) guarantees
Skip lists are practical and well-studied. Memory overhead ~1–2x vs. sorted array but dynamic updates faster.