chore: introduce VectorIndexManager runtime framework with incremental sync, ANN search and versioned persistence by hahahahbenny · Pull Request #2922 · apache/hugegraph

hahahahbenny · 2025-12-19T11:34:08Z

Purpose of the PR

This PR implements the vector-index runtime management framework (VectorIndexManager), which coordinates data synchronization between the RocksDB storage layer and the JVector in-memory index, and supports incremental vector updates, ANN search, and index persistence.

Architecture

Overall Architecture

flowchart TB
    %% --- Node style definitions (ClassDefs) ---
    classDef manager fill:#c8e6c9,stroke:#2e7d32,stroke-width:2px,color:#000;
    classDef memoryComp fill:#bbdefb,stroke:#0d47a1,stroke-width:2px,color:#000;
    classDef memoryStruct fill:#e1bee7,stroke:#4a148c,stroke-width:2px,stroke-dasharray: 5 5,color:#000;
    classDef disk fill:#ffe0b2,stroke:#e65100,stroke-width:2px,shape:cylinder,color:#000;
    classDef file fill:#fff9c4,stroke:#fbc02d,stroke-width:2px,shape:note,color:#000;

    %% ================= Main Architecture =================

    %% --- 1. Top: Orchestration Layer (green background) ---
    subgraph Orchestration ["Orchestration Layer"]
        direction TB
        M["VectorIndexManager<br/>(Main Coordinator)"]:::manager
    end

    %% --- 2. Middle: Memory Layer (blue background) ---
    subgraph MemoryLayer ["Memory Space"]
        direction TB
        
        %% Three core component containers (white background to highlight inner components)
        subgraph Components ["Core Components"]
            direction LR
            
            subgraph Components_vector ["Vector-Related Components"]
               %% StateStore
                SS["VectorStateStore<br/>(KV Data Abstraction)"]:::memoryComp

                %% Runtime
                RT["VectorRuntime<br/>(JVector In-Memory Graph)"]:::memoryComp
            end
            
            %% Scheduler & EventHub
            subgraph Scheduler_Wrap ["Async Scheduling"]
                style Scheduler_Wrap fill:none
                SC["VectorTaskScheduler<br/>(Task Scheduler)"]:::memoryComp
                EH[("EventHub<br/>(In-Memory Queue/RingBuffer)")]:::memoryStruct
            end
        end
    end

    %% --- 3. Bottom: Disk Persistence Layer (orange background) ---
    subgraph DiskLayer ["Disk Space"]
        direction LR
        
        %% RocksDB
        ROCKS[("RocksDB<br/>(WAL / SSTables)")]:::disk
        
        %% JVector Files
        subgraph JVectorFiles ["JVector Persistence"]
            JV_IDX["Index File<br/>(Vector-Graph Data)"]:::file
            JV_META["Meta Data<br/>(Sequence / VectorID)"]:::file
        end
    end

    %% ================= Connection Logic =================

    %% Manager interactions
    M -->|Interactive| SS
    M -->|Interactive| RT

    %% Inside scheduler
    SC -- "push/poll" --> EH

    %% Update coordination flow (dashed for async/data flow)
    M -.->|Submit Update Task| SC
    SC -.->|1. Async: Scan Deltas| SS
    SC -.->|2. Async: Build/Update| RT

    %% Persistence interactions
    SS -- "scanDeltas / getVertex" --- ROCKS
    RT -- "Load / Flush" --- JV_IDX
    RT -- "Read / Write" --- JV_META

    %% Force alignment
    SS ~~~ SC ~~~ RT

    %% ================= Subgraph color styling =================
    %% 1. Orchestration Layer: light green
    style Orchestration fill:#e8f5e9,stroke:#4caf50,stroke-width:2px,stroke-dasharray: 5 5

    %% 2. Memory Layer: light blue
    style MemoryLayer fill:#e3f2fd,stroke:#2196f3,stroke-width:2px,stroke-dasharray: 5 5
    
    %% 3. Core Components area: pure white (makes blue nodes stand out)
    style Components fill:#ffffff,stroke:#90caf9,stroke-width:1px

    %% 4. Disk Layer: light orange
    style DiskLayer fill:#fff3e0,stroke:#ff9800,stroke-width:2px,stroke-dasharray: 5 5
    
    %% 5. JVector file area: transparent or light yellow
    style JVectorFiles fill:#fffde7,stroke:#fbc02d,stroke-width:1px,stroke-dasharray: 3 3

Data Flow

sequenceDiagram
    participant GIT as GraphIndexTransaction
    participant M as VectorIndexManager
    participant SC as Scheduler
    participant SS as StateStore
    participant RT as Runtime
    participant JV as JVector

    Note over GIT,JV: Write Flow (async)
    GIT->>M: signal(indexLabelId)
    M->>SC: execute(task)
    SC->>SS: scanDeltas(indexLabelId, fromSeq)
    SS-->>SC: Iterator VectorRecord
    SC->>RT: update(indexLabelId, records)
    RT->>JV: addGraphNode / markNodeDeleted

    Note over GIT,JV: Search Flow (sync)
    GIT->>M: searchVector(indexLabelId, vector, topK)
    M->>RT: search(indexLabelId, vector, topK)
    RT->>JV: GraphSearcher.search()
    JV-->>RT: Iterator vectorId
    RT-->>M: Iterator vectorId
    M->>SS: getVertex(indexLabelId, vectorIds)
    SS-->>M: Set vertexId
    M-->>GIT: Set Id
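
The asynchronous half of the write flow above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `SketchManager`, `runsFor` and `stop(long)` are hypothetical names, and a plain single-thread `ExecutorService` stands in for the EventHub-based scheduler.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the manager/scheduler pair; names are illustrative only.
final class SketchManager {
    // Assumption: one worker thread, so sync tasks run in submission order.
    private final ExecutorService scheduler = Executors.newSingleThreadExecutor();
    private final Map<String, AtomicInteger> syncRuns = new ConcurrentHashMap<>();

    // Write path: returns immediately; the sync task runs on the scheduler thread.
    void signal(String indexLabelId) {
        scheduler.execute(() ->
            // A real task would call scanDeltas(...) and then update(...);
            // here we only count how often a sync ran for the label.
            syncRuns.computeIfAbsent(indexLabelId, k -> new AtomicInteger())
                    .incrementAndGet());
    }

    int runsFor(String indexLabelId) {
        AtomicInteger runs = syncRuns.get(indexLabelId);
        return runs == null ? 0 : runs.get();
    }

    // Drains pending tasks; returns false on timeout or interruption.
    boolean stop(long timeoutMs) {
        scheduler.shutdown();
        try {
            return scheduler.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The caller of `signal` never waits for the index update, which is the property the "Write Flow (async)" note describes; search stays synchronous and does not go through the scheduler.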

Main Changes

hugegraph-common (abstraction layer)

| File | Responsibility |
| --- | --- |
| `VectorIndexManager` | Coordinator, manages lifecycle and interaction of the three components |
| `VectorIndexRuntime` | Runtime interface, defines operations such as update/search/flush |
| `AbstractVectorRuntime` | Abstract runtime implementation, manages IndexContext and versioned persistence |
| `VectorIndexStateStore` | State storage interface, defines scanDeltas/getVertex operations |
| `VectorTaskScheduler` | Task scheduling interface, supports async task execution |
| `VectorRecord` | Vector record DTO, contains vectorId/vector/deleted/sequence |

hugegraph-core (server-side implementation)

| File | Responsibility |
| --- | --- |
| `ServerVectorRuntime` | JVector runtime implementation, supports COSINE/EUCLIDEAN/DOT_PRODUCT |
| `ServerVectorStateStore` | RocksDB state storage implementation, scans increments based on IdPrefixQuery |
| `ServerVectorScheduler` | Event-driven scheduling implementation based on EventHub |

Core Design

1. Incremental Sync Mechanism

Uses sequence watermarks to track and sync only newly added/modified vector records to the JVector in-memory index.
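
A minimal sketch of the watermark idea, under stated assumptions: `SketchRecord`, `SketchStateStore` and `SketchIncrementalSync` are hypothetical stand-ins for `VectorRecord`, the state store and the runtime, a `TreeMap` replaces RocksDB, and `fromSeq` is treated as exclusive (the PR may define it differently).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the VectorRecord DTO (vectorId/vector/deleted/sequence).
final class SketchRecord {
    final int vectorId;
    final float[] vector;
    final boolean deleted;
    final long sequence;

    SketchRecord(int vectorId, float[] vector, boolean deleted, long sequence) {
        this.vectorId = vectorId;
        this.vector = vector;
        this.deleted = deleted;
        this.sequence = sequence;
    }
}

// Hypothetical in-memory state store: records kept in sequence order.
final class SketchStateStore {
    private final TreeMap<Long, SketchRecord> bySequence = new TreeMap<>();

    void append(SketchRecord r) {
        bySequence.put(r.sequence, r);
    }

    // Assumption: fromSeq is exclusive, so only records with sequence > fromSeq are returned.
    Iterator<SketchRecord> scanDeltas(long fromSeq) {
        return new ArrayList<>(bySequence.tailMap(fromSeq, false).values()).iterator();
    }
}

final class SketchIncrementalSync {
    private final SketchStateStore store;
    private final Map<Integer, float[]> liveVectors = new HashMap<>();
    private long watermark = 0L; // highest sequence already applied

    SketchIncrementalSync(SketchStateStore store) {
        this.store = store;
    }

    // Applies only the records newer than the watermark; returns how many were applied.
    int syncOnce() {
        int applied = 0;
        Iterator<SketchRecord> it = store.scanDeltas(watermark);
        while (it.hasNext()) {
            SketchRecord r = it.next();
            if (r.deleted) {
                liveVectors.remove(r.vectorId);
            } else {
                liveVectors.put(r.vectorId, r.vector);
            }
            watermark = Math.max(watermark, r.sequence);
            applied++;
        }
        return applied;
    }

    long watermark() { return watermark; }

    int liveCount() { return liveVectors.size(); }
}
```

A second `syncOnce()` with no new records applies nothing, so repeated signals for the same index label are cheap.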

2. IndexContext Management

Each IndexLabel corresponds to one IndexContext, which encapsulates vector data, JVector builder, and metadata.
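
One way to picture the one-context-per-IndexLabel rule is a concurrent registry keyed by index label ID. The class and field names below are hypothetical, and the real IndexContext also holds the JVector builder, which is left out here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, reduced IndexContext: vector data plus metadata (no JVector builder).
final class SketchIndexContext {
    final String indexLabelId;
    final Map<Integer, float[]> vectors = new ConcurrentHashMap<>();
    final AtomicLong watermark = new AtomicLong(0L);

    SketchIndexContext(String indexLabelId) {
        this.indexLabelId = indexLabelId;
    }
}

// Hypothetical registry: creates a context lazily, at most once per index label.
final class SketchContextRegistry {
    private final Map<String, SketchIndexContext> contexts = new ConcurrentHashMap<>();

    SketchIndexContext obtain(String indexLabelId) {
        return contexts.computeIfAbsent(indexLabelId, SketchIndexContext::new);
    }

    int size() {
        return contexts.size();
    }
}
```

`computeIfAbsent` makes creation atomic per key, so two concurrent updates for the same label share one context.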

3. Versioned Persistence

Uses symbolic-link switching to support atomic version updates and rollback to older versions.

flowchart LR
    subgraph Dir["Directory Structure"]
        BASE["{basePath}/{indexLabelId}/"]
        BASE --> CUR["current → version_xxx (symlink)"]
        BASE --> V1["version_20250101_120000/"]
        BASE --> V2["version_20250101_110000/"]
        V1 --> IDX1["index.inline"]
        V1 --> META1["vector_meta.json"]
    end
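
The link switch in the directory layout above could look roughly like this. It is a sketch under assumptions, not the `AbstractVectorRuntime` code: `SketchVersionSwitcher` and its method names are hypothetical, only `index.inline` is written, and the rename is a single replacing step on POSIX file systems while other platforms may behave differently.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper illustrating "write new version, then repoint current".
final class SketchVersionSwitcher {

    // Writes a new version directory, then repoints "current" at it.
    static Path publish(Path indexDir, String versionName, byte[] indexBytes) {
        try {
            Path versionDir = indexDir.resolve(versionName);
            Files.createDirectories(versionDir);
            Files.write(versionDir.resolve("index.inline"), indexBytes);
            switchTo(indexDir, versionName);
            return versionDir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Repoints "current" at an existing version directory.
    // Rollback is the same call with an older version name.
    static void switchTo(Path indexDir, String versionName) {
        try {
            Path tmpLink = indexDir.resolve("current.tmp");
            Files.deleteIfExists(tmpLink);
            // Relative target: the link stays valid if indexDir is relocated.
            Files.createSymbolicLink(tmpLink, Paths.get(versionName));
            // Rename the temporary link over "current"; readers see either the
            // old target or the new one, never a missing link (POSIX assumption).
            Files.move(tmpLink, indexDir.resolve("current"),
                       StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Resolves the link and returns the version directory name it points at.
    static String currentVersion(Path indexDir) {
        try {
            return indexDir.resolve("current").toRealPath().getFileName().toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because old version directories are left in place, rolling back needs no data copy, only another `switchTo`.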

4. Soft Delete Strategy

Deletion operations only mark nodes as deleted; actual cleanup occurs during flush.
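
A small sketch of mark-then-clean, assuming non-negative integer vector IDs. `SketchSoftDeleteIndex` is a hypothetical plain-Java stand-in; it does not call JVector, whose own delete and cleanup calls are what the runtime would use.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical index that separates logical deletion from physical cleanup.
final class SketchSoftDeleteIndex {
    private final Map<Integer, float[]> nodes = new HashMap<>();
    private final BitSet deleted = new BitSet(); // assumption: vectorId >= 0

    void add(int vectorId, float[] vector) {
        nodes.put(vectorId, vector);
        deleted.clear(vectorId); // re-adding an id revives it
    }

    // Soft delete: the node stays stored but is hidden from readers.
    void markDeleted(int vectorId) {
        if (nodes.containsKey(vectorId)) {
            deleted.set(vectorId);
        }
    }

    boolean isVisible(int vectorId) {
        return nodes.containsKey(vectorId) && !deleted.get(vectorId);
    }

    int storedCount() {
        return nodes.size();
    }

    // Physical cleanup, done at flush time; returns the number of nodes removed.
    int flush() {
        int removed = 0;
        for (int id = deleted.nextSetBit(0); id >= 0; id = deleted.nextSetBit(id + 1)) {
            if (nodes.remove(id) != null) {
                removed++;
            }
        }
        deleted.clear();
        return removed;
    }
}
```

Deferring the removal keeps deletes cheap on the update path and batches the expensive structural work into flush.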

Search Flow

Search returns Set<Id> (vertexId), which can be directly used to build FixedIdHolder for IdHolderList, seamlessly integrating with the existing index query framework.
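
The two-step shape of the search (vector IDs from the runtime, then vertex IDs from the state store) can be sketched as below. Assumptions: `SketchSearchFlow` is hypothetical, an exact brute-force cosine ranking replaces the ANN search, vertex IDs are plain strings instead of `Id`, and all vectors are non-zero.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of the search path; not the PR's implementation.
final class SketchSearchFlow {

    // Cosine similarity; assumes equal lengths and non-zero vectors.
    static double cosine(float[] a, float[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    // Step 1 (runtime): exact top-K by cosine similarity, standing in for ANN search.
    static List<Integer> topK(Map<Integer, float[]> vectors, float[] query, int k) {
        List<Integer> ids = new ArrayList<>(vectors.keySet());
        ids.sort(Comparator.comparingDouble(
                (Integer id) -> -cosine(vectors.get(id), query)));
        return new ArrayList<>(ids.subList(0, Math.min(k, ids.size())));
    }

    // Step 2 (state store): map vectorIds to vertexIds, skipping ids with no mapping.
    static Set<String> toVertexIds(List<Integer> vectorIds,
                                   Map<Integer, String> vectorToVertex) {
        Set<String> vertexIds = new LinkedHashSet<>();
        for (Integer id : vectorIds) {
            String vertexId = vectorToVertex.get(id);
            if (vertexId != null) {
                vertexIds.add(vertexId);
            }
        }
        return vertexIds;
    }
}
```

Returning a set of vertex IDs rather than vectors or scores is what lets the result feed the existing ID-holder based query path without new result types.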

New Dependencies

| Dependency | Version | Purpose |
| --- | --- | --- |
| jvector | 3.0.0 | HNSW vector index implementation |

Follow-up Work

To be completed

- Integrate into the GraphIndexTransaction.queryByUserprop() query path
- Implement the doVectorIndex() method
- Support vector search syntax in the REST API / Gremlin Step
- Clean up old version files during stop()

Tests to be added

| Test Type | Test Content |
| --- | --- |
| Unit test | VectorIndexManager lifecycle |
| Unit test | ServerVectorRuntime incremental update and search |
| Unit test | AbstractVectorRuntime versioned persistence |
| Integration test | End-to-end search with RocksDB + JVector |
| Performance test | Search latency under different vector scales |

Verifying these changes

Trivial rework / code cleanup without any test coverage. (No Need)
Already covered by existing tests, such as (please modify tests here).
Need tests and can be verified as follows:
- xxx

Does this PR potentially affect the following parts?

Dependencies (add/update license info & regenerate_known_dependencies.sh)
Modify configurations
The public API
Other effects (adds a new framework to manage vector indexes)
Nope

Documentation Status

Doc - TODO
Doc - Done
Doc - No Need

codecov · 2025-12-30T03:25:08Z

Codecov Report

❌ Patch coverage is 0% with 587 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (vector-index@c92710c). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...java/org/apache/hugegraph/api/auth/ManagerAPI.java | 0.00% | 105 Missing ⚠️ |
| ...g/apache/hugegraph/vector/ServerVectorRuntime.java | 0.00% | 76 Missing ⚠️ |
| ...pache/hugegraph/vector/ServerVectorStateStore.java | 0.00% | 60 Missing ⚠️ |
| ...apache/hugegraph/structure/HugeVectorIndexMap.java | 0.00% | 46 Missing ⚠️ |
| ...hugegraph/backend/serializer/BinarySerializer.java | 0.00% | 39 Missing ⚠️ |
| ...n/java/org/apache/hugegraph/core/GraphManager.java | 0.00% | 33 Missing ⚠️ |
| ...va/org/apache/hugegraph/api/filter/PathFilter.java | 0.00% | 22 Missing ⚠️ |
| ...apache/hugegraph/type/define/IndexVectorState.java | 0.00% | 20 Missing ⚠️ |
| ...he/hugegraph/store/client/query/QueryExecutor.java | 0.00% | 15 Missing ⚠️ |
| ...he/hugegraph/backend/tx/GraphIndexTransaction.java | 0.00% | 14 Missing ⚠️ |
| ... and 37 more | | |

Additional details and impacted files

@@              Coverage Diff               @@
##             vector-index   #2922   +/-   ##
==============================================
  Coverage                ?   0.07%           
  Complexity              ?      22           
==============================================
  Files                   ?     785           
  Lines                   ?   65385           
  Branches                ?    8367           
==============================================
  Hits                    ?      51           
  Misses                  ?   65332           
  Partials                ?       2

JisoLya and others added 17 commits October 31, 2025 19:03

docs(store): update guidance for store module (apache#2894)

126885d

docs(pd): update test commands and improve documentation clarity (apa…

f92c5a4

…che#2893) * docs(pd): update test commands and improve documentation clarity * Update README.md --------- Co-authored-by: imbajin <jin@apache.org>

chore(server): bump rocksdb version from 7.2.2 to 8.10.2 (apache#2896)

d7697f4

fix(store): handle NPE in getVersion for file (apache#2897)

00e040b

* fix(store): fix duplicated definition log root

feat(server): add path filter for graphspace (apache#2898)

2e0cffe

fix(server): support GraphAPI for rocksdb & add tests (apache#2900)

ca5fc0c

refactor(server): remove graph param in auth api path (apache#2899)

b7998c1

fix: migrate to LTS jdk11 in all Dockerfile (apache#2901)

de0360b

feat: init serena memory system & add memories (apache#2902)

496b150

fix(server): fix reflect bug in init-store.sh (apache#2905)

41d0dbc

fix: add missing license and remove binary license.txt & fix tinkerpo…

b12425c

…p ci & remove duplicate module (apache#2910) * add missing license and remove binary license.txt * remove dist in commons * fix tinkerpop test open graph panic and other bugs * empty commit to trigger ci

refactor(store): fix reflection parameter error and extract duplicate…

9a3daf8

… methods to RaftReflectionUtil (apache#2906)

docs: migrate 1.5.0 in readme to 1.7.0 (apache#2914)

18569c4

fix: use slim docker image (apache#2903)

534c81e

feat: add slack channel (apache#2920)

c6d94b4

Co-authored-by: imbajin <jin@apache.org>

fix(pd): pd raft-follower failed to get leader address due to npe (ap…

d28526e

…ache#2919)

dosubot Bot added the size:XXL (This PR changes 1000+ lines, ignoring generated files), dependencies (Incompatible dependencies of package), and feature (New feature) labels Dec 19, 2025

Tsukilc and others added 8 commits January 4, 2026 16:27

fix(server): fix npe in non-auth mode (apache#2912)

2432603

fix: optimize code and update risky deps (apache#2918)

423ede0

docs: fix Cypher documentation link in README (apache#2925)

d641fdb

chore(server): remove outdated ConfigAuthenticator (apache#2927)

a93cc21

test(cluster-test): bump ct to version 1.7.0 (apache#2921)

37be6cd

refactor(server): support update TTL in labels & enhance configs (apa…

e0f572b

…che#2938)

docs: fix some typos in comments (apache#2943)

99baf2b

Signed-off-by: slightsharp <slightsharp@outlook.com>

dpol1 and others added 14 commits May 15, 2026 11:25

chore(ci): enable hugegraph-struct tests (apache#3038)

b9a3dd9

fix(server): avoid extracting text range filters (apache#3034)

beb30ee

docs(server): add Docker with HBase validation runbook

a49bf66

fix(server): handle match() in no index case (apache#3039)

7f0a44a

hahahahbenny force-pushed the vector-manager branch from 08d57a6 to de256a7 Compare June 5, 2026 13:50

hahahahbenny and others added 15 commits June 6, 2026 21:14

feat(server): Add the vector index type and the detection of related …

c5f9ebb

…fields to the index label.

fix code format

f6833b4

add annsearch API

c27ead8

add doc to explain the plan

8dc2408

feat: add RocksDB CF for vector index with serialize/deserialize support

67d56ba

fix master merge conflict

e7c4afd

feat(server): support vector index in graphdb (apache#2856)

5e0016c

* feat(server): Add the vector index type and the detection of related fields to the index label. * fix code format * add annsearch API * add doc to explain the plan

delete redundant method

68e0877

introduce VectorIndexManager runtime framework with incremental sync,…

f0f806d

… ANN search and versioned persistence

Move vector-related components under hugegraph-struct and Add the Seq…

2a3122f

…uenceAllocator interface and the VectorIdAllocator class.

add license to vector Sequence allocator and vectorId allocator

d7cd4ad

Add unit tests for vector index and integrate vector index operations…

e8013fb

… into the original server pipeline

add vector index end to end test and fix bug

de256a7

Merge branch 'vector-index' into vector-manager

b1b464d

hahahahbenny wants to merge 85 commits into apache:vector-index from hahahahbenny:vector-manager

20 participants
