
perf: batched WAND and new WAND structure, ~50% faster#6241

Merged
BubbleCal merged 7 commits into main from yang/batched-wand on Mar 23, 2026
Conversation

@BubbleCal
Contributor

@BubbleCal BubbleCal commented Mar 20, 2026

  • advances posting iterators in batch
  • splits posting iterators into head, lead and tail, reducing the cost of updating posting iterators
  • reuses query_weight
  • prunes documents more aggressively with block-max scores
  • uses a Lucene-style conjunction path for phrase queries, with conjunction intersection and block-max pruning separated from OR WAND
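
The head/lead/tail split above can be sketched roughly as follows. All names and fields here are illustrative stand-ins, not the actual `wand.rs` API: the idea is that terms whose summed upper bounds stay below the score threshold can be parked in a tail that is never advanced, so only head/lead iterators pay the update cost.

```rust
// Hypothetical sketch of the head/lead/tail partition; not the wand.rs API.

#[derive(Debug, Clone)]
struct Posting {
    doc_id: u64,      // doc the iterator is currently positioned on
    upper_bound: f32, // max possible score contribution of this term
}

struct Wand {
    lead: Vec<Posting>, // positioned on the current candidate doc
    head: Vec<Posting>, // positioned past the candidate, ordered by doc_id
    tail: Vec<Posting>, // never advanced; only their bounds matter
    tail_max_score: f32,
}

impl Wand {
    fn new(mut postings: Vec<Posting>, threshold: f32) -> Self {
        // Move low-impact terms to the tail: as long as the summed tail
        // bound stays below the threshold, those iterators never need to
        // be advanced to rule a candidate in or out.
        postings.sort_by(|a, b| a.upper_bound.total_cmp(&b.upper_bound));
        let (mut head, mut tail) = (Vec::new(), Vec::new());
        let mut tail_max_score = 0.0f32;
        for p in postings {
            if tail_max_score + p.upper_bound < threshold {
                tail_max_score += p.upper_bound;
                tail.push(p);
            } else {
                head.push(p);
            }
        }
        head.sort_by_key(|p| p.doc_id);
        Wand { lead: Vec::new(), head, tail, tail_max_score }
    }

    /// Upper bound for the current candidate: exact bounds of the lead
    /// terms plus whatever score budget still sits in the tail.
    fn candidate_upper_bound(&self) -> f32 {
        self.lead.iter().map(|p| p.upper_bound).sum::<f32>() + self.tail_max_score
    }
}

fn main() {
    let w = Wand::new(
        vec![
            Posting { doc_id: 3, upper_bound: 1.0 },
            Posting { doc_id: 1, upper_bound: 0.1 },
            Posting { doc_id: 7, upper_bound: 2.0 },
            Posting { doc_id: 2, upper_bound: 0.2 },
        ],
        0.5,
    );
    println!("head={} tail={} budget={}", w.head.len(), w.tail.len(), w.tail_max_score);
}
```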
Query Type | Version | Mode | QPS | Avg Latency | P90 | P99
-- | -- | -- | -- | -- | -- | --
match | current | single-thread | — | 2.93 ms | 5.62 ms | 7.50 ms
match | main (v2) | single-thread | — | 4.22 ms | 7.56 ms | 7.85 ms
match | current | 8-concurrency | 612.80 | 13.05 ms | 17.80 ms | 20.78 ms
match | main (v2) | 8-concurrency | 599.17 | 13.35 ms | 19.61 ms | 21.86 ms
phrase | current | single-thread | — | 2.02 ms | 2.59 ms | 2.62 ms
phrase | main (v2) | single-thread | — | 3.60 ms | 4.62 ms | 4.67 ms
phrase | current | 8-concurrency | 1597.37 | 5.01 ms | 6.57 ms | 8.60 ms
phrase | main (v2) | 8-concurrency | 1040.66 | 7.69 ms | 9.86 ms | 11.23 ms

Signed-off-by: BubbleCal <bubble-cal@outlook.com>
@github-actions
Contributor

Review: perf: batched WAND and new WAND structure

Nice performance improvement — the benchmarks show meaningful gains across the board, especially for phrase queries (~40% latency reduction single-threaded). The head/lead/tail split is a well-known WAND optimization.

Issues to consider

P0: Potential infinite loop in next_and_candidate

next_and_candidate (wand.rs) has a loop with no exit condition other than early return via ? (which returns None if a posting is exhausted). However, if all postings are non-empty but never align to the same target, the loop will spin forever as target keeps increasing via target = target.max(doc.doc_id()). If postings are genuinely disjoint (no shared doc_id), target will keep jumping and eventually all postings exhaust — but only if next() eventually returns None when past the last doc. Please verify that posting.next(target) when target is beyond all docs causes posting.doc() to return None, breaking the loop. If there's a gap where next() clamps to the last doc instead of returning None, this is an infinite loop.
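
A toy model of the property to verify. Everything here is illustrative (a slice-backed cursor, not the real `PostingIterator`), but it shows the hinge: the loop terminates only because advancing past the last doc yields `None`; if `next(target)` ever clamped to the last doc instead, `target` would stop growing past it and the loop could spin.

```rust
// Toy conjunction loop; names are illustrative, not the wand.rs API.

struct Cursor<'a> {
    docs: &'a [u64], // sorted doc ids
    pos: usize,
}

impl<'a> Cursor<'a> {
    /// Advance to the first doc id >= target; None once exhausted.
    fn next(&mut self, target: u64) -> Option<u64> {
        while self.pos < self.docs.len() && self.docs[self.pos] < target {
            self.pos += 1;
        }
        self.docs.get(self.pos).copied() // None past the end -> loop exits
    }
}

/// Find the next doc present in *all* postings, starting at `target`.
fn next_and_candidate(postings: &mut [Cursor], mut target: u64) -> Option<u64> {
    loop {
        let mut aligned = true;
        for p in postings.iter_mut() {
            let doc = p.next(target)?; // exhaustion terminates the loop
            if doc > target {
                target = doc; // restart alignment at the larger doc
                aligned = false;
            }
        }
        if aligned {
            return Some(target);
        }
    }
}

fn main() {
    let a = [1u64, 4, 7];
    let b = [2u64, 4, 9];
    let mut cursors = [Cursor { docs: &a, pos: 0 }, Cursor { docs: &b, pos: 0 }];
    println!("{:?}", next_and_candidate(&mut cursors, 0)); // Some(4)
}
```

With fully disjoint lists (e.g. `[1,3,5]` vs `[2,4,6]`), `target` leapfrogs until one cursor runs out and the `?` returns `None`, which is exactly the behavior the review asks to confirm in the real iterator.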

P1: advance_lead_to_head unconditionally clears tail

advance_lead_to_head calls self.clear_tail() at the end, which discards all tail postings. In the search_flat path, this is called after collecting a match (advance_lead_to_head(doc_id + 1) at line ~769), but those tail postings may still be relevant for future candidates. Contrast with push_back_leads which carefully reinserts into tail/head. Is the tail always guaranteed to be empty at this point, or is this losing postings?
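
Whether this is actually a bug depends on the invariants in `wand.rs`; purely as an illustration, the lossless alternative (in the spirit of push_back_leads) moves every posting back into head or tail based on the remaining score budget, so no score mass can be silently dropped. All names below are hypothetical:

```rust
// Hypothetical sketch of reinserting leads instead of dropping postings;
// not the actual wand.rs code.

#[derive(Debug)]
struct Posting {
    doc_id: u64,
    upper_bound: f32,
}

/// Move leads back into head/tail based on the remaining score budget.
/// Every posting is preserved, so future candidates keep their score mass.
fn push_back(
    lead: Vec<Posting>,
    head: &mut Vec<Posting>,
    tail: &mut Vec<Posting>,
    threshold: f32,
) {
    let mut budget: f32 = tail.iter().map(|p| p.upper_bound).sum();
    for p in lead {
        if budget + p.upper_bound < threshold {
            budget += p.upper_bound;
            tail.push(p); // low-impact: parked in the tail
        } else {
            head.push(p); // still impactful: stays live in the head
        }
    }
}

fn main() {
    let mut head = Vec::new();
    let mut tail = vec![Posting { doc_id: 1, upper_bound: 0.1 }];
    let lead = vec![
        Posting { doc_id: 2, upper_bound: 0.2 },
        Posting { doc_id: 3, upper_bound: 2.0 },
    ];
    push_back(lead, &mut head, &mut tail, 0.5);
    println!("head={} tail={}", head.len(), tail.len()); // head=1 tail=2
}
```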

P1: Floating-point subtraction in insert_tail_with_overflow

self.tail_max_score = self.tail_max_score - evicted.upper_bound + upper_bound;

Repeated add/subtract of f32 upper bounds will accumulate rounding errors in tail_max_score. Over many iterations this drift could cause premature pruning (if tail_max_score becomes slightly too low). Consider periodically recomputing tail_max_score from scratch (sum of all tail entries), or accept this as a minor recall risk and document it.
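
A deterministic, if contrived, illustration: the magnitudes below are chosen so a single add is absorbed by f32 rounding, whereas real upper bounds are small and the drift would accumulate gradually. Note the effect depends on evaluation order and magnitudes; the point is only that incremental updates and a fresh sum can disagree, and `recompute` shows the suggested mitigation.

```rust
// Contrived f32 drift demo: at 2^24, adding 1.0 is absorbed by rounding,
// so an add-then-subtract pair leaves the accumulator off by 1.0.

fn recompute(tail_bounds: &[f32]) -> f32 {
    // The suggested mitigation: periodically re-sum from scratch.
    tail_bounds.iter().sum()
}

fn main() {
    let mut tail_max_score: f32 = 16_777_216.0; // 2^24
    tail_max_score += 1.0; // rounds back down: 2^24 + 1 is not representable
    tail_max_score -= 1.0;
    println!("drifted to {tail_max_score}"); // 16777215, off by 1.0

    let exact = recompute(&[1.5, 2.5, 0.25]);
    println!("recomputed {exact}"); // 4.25
}
```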

P1: update_max_scores rebuilds both heaps from scratch

update_max_scores calls std::mem::take on both self.head and self.tail, iterates over into_vec(), and rebuilds. This is O(n) per call. Since this is called in the hot loop (next()) whenever target > self.up_to, if up_to advances in small steps (short blocks), this could regress to O(n²) for large posting lists. Worth profiling with a pathological case (many terms, small blocks).
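
For reference, the rebuild pattern being described amounts to the following (`Entry` and `block_max` are stand-ins, not the actual types). `BinaryHeap::from(Vec)` heapifies in O(n), so each call is linear in the number of head entries, which is fine once per block but adds up if `up_to` advances in tiny steps:

```rust
use std::collections::BinaryHeap;

// Stand-in types: `Entry` orders by max_score (derived Ord compares
// fields in declaration order), `block_max` fakes a per-block lookup.

#[derive(PartialEq, Eq, PartialOrd, Ord, Debug)]
struct Entry {
    max_score: u32,
    term: u32,
}

fn block_max(term: u32, up_to: u64) -> u32 {
    term + (up_to % 3) as u32
}

/// Drain the heap, refresh every block-max bound, and heapify again.
/// O(n) per call; invoked from the hot loop whenever target > up_to.
fn update_max_scores(head: &mut BinaryHeap<Entry>, up_to: u64) {
    let mut entries = std::mem::take(head).into_vec();
    for e in entries.iter_mut() {
        e.max_score = block_max(e.term, up_to);
    }
    *head = BinaryHeap::from(entries); // O(n) heapify
}

fn main() {
    let mut head = BinaryHeap::new();
    head.push(Entry { max_score: 0, term: 10 });
    head.push(Entry { max_score: 0, term: 5 });
    update_max_scores(&mut head, 7);
    println!("{:?}", head.peek()); // Some(Entry { max_score: 11, term: 10 })
}
```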

Minor nits

  • collect_tail_matches pushes non-matching postings to remaining then re-inserts them into head. These were in tail (i.e., low-priority) and are now being promoted to head, which seems intentional but worth a brief comment.
  • The #[cfg(test)] on PostingIterator::new is fine, but if any downstream crate's tests relied on it, they'd break. Looks safe since it's pub(crate).

Overall a solid optimization. The main concern is verifying the next_and_candidate termination and the advance_lead_to_head tail clearing behavior.

Signed-off-by: BubbleCal <bubble-cal@outlook.com>
@BubbleCal BubbleCal changed the title perf: batched WAND and new WAND structure perf: batched WAND and new WAND structure, ~50% faster Mar 20, 2026
@BubbleCal BubbleCal requested review from Xuanwo and westonpace March 20, 2026 16:30
@codecov

codecov bot commented Mar 20, 2026

Codecov Report

❌ Patch coverage is 80.78740% with 122 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
rust/lance-index/src/scalar/inverted/wand.rs 80.69% 104 Missing and 18 partials ⚠️


@BubbleCal BubbleCal merged commit 384fb55 into main Mar 23, 2026
29 checks passed
@BubbleCal BubbleCal deleted the yang/batched-wand branch March 23, 2026 16:47
westonpace pushed a commit that referenced this pull request Mar 24, 2026
- advances posting iterators in batch
- splits posting iterators into `head`, `lead` and `tail`, reducing the
cost of updating posting iterators
- reuses `query_weight`
- prunes documents more aggressively with block-max scores
- uses a Lucene-style conjunction path for phrase queries, with
conjunction intersection and block-max pruning separated from OR WAND

Query Type | Version | Mode | QPS | Avg Latency | P90 | P99
-- | -- | -- | -- | -- | -- | --
match | current | single-thread | — | 2.93 ms | 5.62 ms | 7.50 ms
match | main (v2) | single-thread | — | 4.22 ms | 7.56 ms | 7.85 ms
match | current | 8-concurrency | 612.80 | 13.05 ms | 17.80 ms | 20.78 ms
match | main (v2) | 8-concurrency | 599.17 | 13.35 ms | 19.61 ms | 21.86 ms
phrase | current | single-thread | — | 2.02 ms | 2.59 ms | 2.62 ms
phrase | main (v2) | single-thread | — | 3.60 ms | 4.62 ms | 4.67 ms
phrase | current | 8-concurrency | 1597.37 | 5.01 ms | 6.57 ms | 8.60 ms
phrase | main (v2) | 8-concurrency | 1040.66 | 7.69 ms | 9.86 ms | 11.23 ms

---------

Signed-off-by: BubbleCal <bubble-cal@outlook.com>