Details

Extending Hierarchical Epsilon-Net Navigation (HENN) Graphs for Hybrid Range Queries and Hierarchical Outlier Detection

Project Image

Year: 2026

Term: Winter

Student Name: Peien Liu

Supervisor: Michiel Smid

Abstract: Graph-based Approximate Nearest Neighbor (ANN) structures excel in pure similarity search but struggle with hybrid attribute queries and anomaly detection. Traditional post-filtering approaches for range-constrained queries frequently return empty result sets when target attributes are sparse. Simultaneously, standalone outlier detection models like the Local Outlier Factor (LOF) require computationally expensive O(n^2) neighborhood graph constructions. This paper extends the Hierarchical epsilon-Net Navigation (HENN) graph to resolve these limitations through two structural modifications. First, we introduce BT-HENN, a hybrid architecture that integrates HENN sub-indices within a balanced B-tree over a numerical attribute. This approach structurally guarantees adherence to range constraints, maintaining a perfect recall of 1.0 where naive post-filtering fails completely. By restricting graph traversals to explicitly bounded canonical nodes, BT-HENN reduces empirical query latency by an order of magnitude compared to global index searches, trading an increased O(n log^2 n) index construction cost for high-reliability retrieval. Second, we propose EndTree, a hierarchical propagation mechanism that embeds local density and distance statistics directly into the epsilon-net layers during index construction. EndTree calculates local anomaly scores concurrently during standard ANN traversals. Empirical evaluations demonstrate that when the exponential decay parameter of the index is properly tuned, EndTree matches the exact precision and recall of the LOF baseline. This integration provides a highly scalable solution for joint similarity and anomaly detection without the need for secondary data structures.