Abstract
Range-filtering approximate nearest neighbor search (RFANNS) has gained significant attention recently. Consider a set D of highdimensional vectors, each associated with a numeric attribute value, e.g., price or timestamp. An RFANNS query consists of a query vector D and a query range, reporting the approximate nearest neighbors of D among data vectors whose attributes fall in the query range. Existing work on RFANNS only considers a static set D of data vectors while in many real-world scenarios, vectors arrive in the system in an arbitrary order. This paper studies dynamic RFANNS where both data vectors and queries arrive in a mixed stream: a query is posed on all the data vectors that have already arrived in the system. Existing work on RFANNS is difficult to be extended to the streaming setting as they construct the index in the order of the attribute values while the vectors arrive in the system in an arbitrary order. The main challenge to the dynamic RFANNS lies in the difference between the two orders. A naive approach to RFANNS maintains multiple hierarchical navigable small-world (HNSW) graphs, one for each of the O (|D|2 ) possible query ranges – too expensive to construct and maintain. To design an index structure that can integrate new data vectors with a low index size increment for efficient and effective query processing, we propose a structure called dynamic segment graph. It compresses the set of HNSW graphs of the naive approach, proven to be lossless under certain conditions, with only a linear to log |D| new edges in expectation when inserting a new vector. This dramatically reduces the index size while largely preserving the search performance. We further propose heuristics to significantly reduce the index cost of our dynamic segment graph in practice. Extensive experimental results show that our approach outperforms existing methods for static RFANNS and is scalable in handling dynamic RFANNS.
| Original language | American English |
|---|---|
| Pages (from-to) | 3256-3268 |
| Number of pages | 13 |
| Journal | Proceedings of the VLDB Endowment |
| Volume | 18 |
| Issue number | 10 |
| DOIs | |
| State | Published - 2025 |
| Externally published | Yes |
| Event | 51st International Conference on Very Large Data Bases, VLDB 2025 - London, United Kingdom Duration: Sep 1 2025 → Sep 5 2025 |
ASJC Scopus subject areas
- Computer Science (miscellaneous)
- General Computer Science
Fingerprint
Dive into the research topics of 'Dynamic Range-Filtering Approximate Nearest Neighbor Search'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver