chore: introduce VectorIndexManager runtime framework with incremental sync, ANN search and versioned persistence #2922

Open: hahahahbenny wants to merge 85 commits into apache:vector-index from hahahahbenny:vector-manager
hahahahbenny (Contributor) commented Dec 19, 2025

Purpose of the PR

This PR implements the vector-index runtime management framework (VectorIndexManager), which coordinates data synchronization between the RocksDB storage layer and the JVector in-memory index, and supports incremental vector updates, ANN search, and index persistence.

Architecture

Overall Architecture

flowchart TB
    %% --- Node style definitions (ClassDefs) ---
    classDef manager fill:#c8e6c9,stroke:#2e7d32,stroke-width:2px,color:#000;
    classDef memoryComp fill:#bbdefb,stroke:#0d47a1,stroke-width:2px,color:#000;
    classDef memoryStruct fill:#e1bee7,stroke:#4a148c,stroke-width:2px,stroke-dasharray: 5 5,color:#000;
    classDef disk fill:#ffe0b2,stroke:#e65100,stroke-width:2px,shape:cylinder,color:#000;
    classDef file fill:#fff9c4,stroke:#fbc02d,stroke-width:2px,shape:note,color:#000;

    %% ================= Main Architecture =================

    %% --- 1. Top: Orchestration Layer (green background) ---
    subgraph Orchestration ["Orchestration Layer"]
        direction TB
        M["VectorIndexManager<br/>(Main Coordinator)"]:::manager
    end

    %% --- 2. Middle: Memory Layer (blue background) ---
    subgraph MemoryLayer ["Memory Space"]
        direction TB
        
        %% Three core component containers (white background to highlight inner components)
        subgraph Components ["Core Components"]
            direction LR
            
            subgraph Components_vector ["Vector-Related Components"]
               %% StateStore
                SS["VectorStateStore<br/>(KV Data Abstraction)"]:::memoryComp

                %% Runtime
                RT["VectorRuntime<br/>(JVector In-Memory Graph)"]:::memoryComp
            end
            
            %% Scheduler & EventHub
            subgraph Scheduler_Wrap ["Async Scheduling"]
                style Scheduler_Wrap fill:none
                SC["VectorTaskScheduler<br/>(Task Scheduler)"]:::memoryComp
                EH[("EventHub<br/>(In-Memory Queue/RingBuffer)")]:::memoryStruct
            end
        end
    end

    %% --- 3. Bottom: Disk Persistence Layer (orange background) ---
    subgraph DiskLayer ["Disk Space"]
        direction LR
        
        %% RocksDB
        ROCKS[("RocksDB<br/>(WAL / SSTables)")]:::disk
        
        %% JVector Files
        subgraph JVectorFiles ["JVector Persistence"]
            JV_IDX["Index File<br/>(Vector-Graph Data)"]:::file
            JV_META["Meta Data<br/>(Sequence / VectorID)"]:::file
        end
    end

    %% ================= Connection Logic =================

    %% Manager interactions
    M -->|Interactive| SS
    M -->|Interactive| RT

    %% Inside scheduler
    SC -- "push/poll" --> EH

    %% Update coordination flow (dashed for async/data flow)
    M -.->|Submit Update Task| SC
    SC -.->|1. Async: Scan Deltas| SS
    SC -.->|2. Async: Build/Update| RT

    %% Persistence interactions
    SS -- "scanDeltas / getVertex" --- ROCKS
    RT -- "Load / Flush" --- JV_IDX
    RT -- "Read / Write" --- JV_META

    %% Force alignment
    SS ~~~ SC ~~~ RT

    %% ================= Subgraph color styling =================
    %% 1. Orchestration Layer: light green
    style Orchestration fill:#e8f5e9,stroke:#4caf50,stroke-width:2px,stroke-dasharray: 5 5

    %% 2. Memory Layer: light blue
    style MemoryLayer fill:#e3f2fd,stroke:#2196f3,stroke-width:2px,stroke-dasharray: 5 5
    
    %% 3. Core Components area: pure white (makes blue nodes stand out)
    style Components fill:#ffffff,stroke:#90caf9,stroke-width:1px

    %% 4. Disk Layer: light orange
    style DiskLayer fill:#fff3e0,stroke:#ff9800,stroke-width:2px,stroke-dasharray: 5 5
    
    %% 5. JVector file area: transparent or light yellow
    style JVectorFiles fill:#fffde7,stroke:#fbc02d,stroke-width:1px,stroke-dasharray: 3 3

Data Flow

sequenceDiagram
    participant GIT as GraphIndexTransaction
    participant M as VectorIndexManager
    participant SC as Scheduler
    participant SS as StateStore
    participant RT as Runtime
    participant JV as JVector

    Note over GIT,JV: Write Flow (async)
    GIT->>M: signal(indexLabelId)
    M->>SC: execute(task)
    SC->>SS: scanDeltas(indexLabelId, fromSeq)
    SS-->>SC: Iterator VectorRecord
    SC->>RT: update(indexLabelId, records)
    RT->>JV: addGraphNode / markNodeDeleted

    Note over GIT,JV: Search Flow (sync)
    GIT->>M: searchVector(indexLabelId, vector, topK)
    M->>RT: search(indexLabelId, vector, topK)
    RT->>JV: GraphSearcher.search()
    JV-->>RT: Iterator vectorId
    RT-->>M: Iterator vectorId
    M->>SS: getVertex(indexLabelId, vectorIds)
    SS-->>M: Set vertexId
    M-->>GIT: Set Id
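
The asynchronous write path in the diagram above (a `signal(indexLabelId)` call that hands a task to a scheduler) can be sketched as follows. This is a minimal illustration, not the PR's `ServerVectorScheduler`: the class `SignalSchedulerSketch`, its single-thread executor, and the coalescing of repeated signals are all assumptions made for the example, and the EventHub is replaced by a plain `ExecutorService`.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongConsumer;

// Hypothetical stand-in for an event-driven scheduler; names are illustrative only.
final class SignalSchedulerSketch {

    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    // Index labels that already have a queued (not yet started) sync pass.
    private final Set<Long> pending = ConcurrentHashMap.newKeySet();
    private final AtomicInteger runs = new AtomicInteger();
    private final LongConsumer syncTask;

    SignalSchedulerSketch(LongConsumer syncTask) {
        this.syncTask = syncTask;
    }

    /** Non-blocking: the caller returns immediately, the sync runs on the worker. */
    void signal(long indexLabelId) {
        // Assumed coalescing: skip the enqueue if a pass for this label is still queued.
        if (pending.add(indexLabelId)) {
            worker.execute(() -> {
                // Clear the flag first so a signal arriving during the sync queues a new pass.
                pending.remove(indexLabelId);
                runs.incrementAndGet();
                syncTask.accept(indexLabelId);
            });
        }
    }

    int runs() {
        return runs.get();
    }

    /** Lets queued passes finish, then reports whether the worker stopped in time. */
    boolean shutdownAndWait(long timeoutMillis) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Clearing the pending flag before the sync body runs means a write that lands mid-sync still triggers another pass, at the cost of an occasional redundant scan.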

Main Changes

hugegraph-common (abstraction layer)

| File | Responsibility |
| --- | --- |
| VectorIndexManager | Coordinator; manages the lifecycle and interaction of the three components |
| VectorIndexRuntime | Runtime interface; defines operations such as update/search/flush |
| AbstractVectorRuntime | Abstract runtime implementation; manages IndexContext and versioned persistence |
| VectorIndexStateStore | State storage interface; defines the scanDeltas/getVertex operations |
| VectorTaskScheduler | Task scheduling interface; supports async task execution |
| VectorRecord | Vector record DTO; contains vectorId/vector/deleted/sequence |

hugegraph-core (server-side implementation)

| File | Responsibility |
| --- | --- |
| ServerVectorRuntime | JVector runtime implementation; supports COSINE/EUCLIDEAN/DOT_PRODUCT |
| ServerVectorStateStore | RocksDB state storage implementation; scans increments based on IdPrefixQuery |
| ServerVectorScheduler | Event-driven scheduling implementation based on EventHub |
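
For readers unfamiliar with the three metrics named for `ServerVectorRuntime`, the sketch below spells out the textbook definitions. It is illustrative only: `SimilaritySketch` is a hypothetical helper, not code from this PR, and JVector's own scoring functions may normalize or scale scores differently.

```java
// Hypothetical helper showing the plain mathematical definitions of the three metrics.
final class SimilaritySketch {

    private static void checkSameLength(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch: " + a.length + " vs " + b.length);
        }
    }

    /** DOT_PRODUCT: sum of element-wise products; larger means more similar. */
    static double dot(float[] a, float[] b) {
        checkSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += (double) a[i] * b[i];
        }
        return sum;
    }

    /** COSINE: dot product divided by the product of the two vector norms. */
    static double cosine(float[] a, float[] b) {
        double norms = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
        if (norms == 0.0) {
            throw new IllegalArgumentException("cosine is undefined for a zero vector");
        }
        return dot(a, b) / norms;
    }

    /** EUCLIDEAN: straight-line (L2) distance; smaller means more similar. */
    static double euclidean(float[] a, float[] b) {
        checkSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = (double) a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }
}
```

Note that cosine and dot product rank by similarity while Euclidean ranks by distance, so a runtime that supports all three has to convert them to a common "higher is better" score or flip the comparison.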

Core Design

1. Incremental Sync Mechanism

A sequence watermark records the last change already applied to each index, so every sync pass pushes only the newly added or modified vector records into the JVector in-memory index.
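
A minimal sketch of the watermark idea, under stated assumptions: the change log is an in-memory `TreeMap` standing in for RocksDB, the "index" is just a set of live vector IDs, and every name here (`WatermarkSyncSketch`, `Rec`, `sync`) is hypothetical. Only `scanDeltas` and the `vectorId`/`deleted`/`sequence` fields echo names from the PR description.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration of watermark-based incremental sync.
final class WatermarkSyncSketch {

    /** Simplified stand-in for the VectorRecord DTO (the vector payload is omitted). */
    static final class Rec {
        final int vectorId;
        final boolean deleted;
        final long sequence;

        Rec(int vectorId, boolean deleted, long sequence) {
            this.vectorId = vectorId;
            this.deleted = deleted;
            this.sequence = sequence;
        }
    }

    private final TreeMap<Long, Rec> log = new TreeMap<>(); // stand-in for the KV store
    private final Set<Integer> liveIds = new HashSet<>();   // stand-in for the in-memory index
    private long nextSeq = 1;
    private long watermark = 0; // highest sequence already applied to the index

    /** Appends a change with the next sequence number. */
    void write(int vectorId, boolean deleted) {
        long seq = nextSeq++;
        log.put(seq, new Rec(vectorId, deleted, seq));
    }

    /** Returns all records with sequence strictly greater than fromSeq, in order. */
    List<Rec> scanDeltas(long fromSeq) {
        return new ArrayList<>(log.tailMap(fromSeq, false).values());
    }

    /** Applies only unseen records and advances the watermark; returns how many were applied. */
    int sync() {
        List<Rec> deltas = scanDeltas(watermark);
        for (Rec r : deltas) {
            if (r.deleted) {
                liveIds.remove(r.vectorId);
            } else {
                liveIds.add(r.vectorId);
            }
            watermark = r.sequence;
        }
        return deltas.size();
    }

    long watermark() {
        return watermark;
    }

    Set<Integer> liveIds() {
        return liveIds;
    }
}
```

Calling `sync()` twice in a row applies nothing the second time, which is the property that makes repeated signals cheap.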

2. IndexContext Management

Each IndexLabel corresponds to one IndexContext, which encapsulates vector data, JVector builder, and metadata.
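
One way to keep exactly one context per index label is a concurrent map keyed by label ID. The sketch below assumes that shape; `IndexContextRegistrySketch` and the fields of its `IndexContext` are hypothetical and far simpler than what the PR's `AbstractVectorRuntime` would hold (vector data, JVector builder, metadata).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry: one context object per index label, created lazily.
final class IndexContextRegistrySketch {

    /** Simplified context; the real one would also wrap vector storage and a graph builder. */
    static final class IndexContext {
        final long indexLabelId;
        final int dimension;
        volatile long watermark; // last sequence applied for this index

        IndexContext(long indexLabelId, int dimension) {
            this.indexLabelId = indexLabelId;
            this.dimension = dimension;
        }
    }

    private final Map<Long, IndexContext> contexts = new ConcurrentHashMap<>();

    /** Returns the existing context or atomically creates it on first use. */
    IndexContext obtain(long indexLabelId, int dimension) {
        return contexts.computeIfAbsent(indexLabelId, id -> new IndexContext(id, dimension));
    }

    /** Drops the context, for example when the index label is removed. */
    IndexContext remove(long indexLabelId) {
        return contexts.remove(indexLabelId);
    }

    int size() {
        return contexts.size();
    }
}
```

`computeIfAbsent` guarantees that two threads asking for the same label at the same time receive the same instance, so update and search paths never see diverging state for one index.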

3. Versioned Persistence

Each flush writes a new version directory, and a `current` symbolic link is then switched to it. This makes a version update atomic and allows rollback to an older version.

flowchart LR
    subgraph Dir["Directory Structure"]
        BASE["{basePath}/{indexLabelId}/"]
        BASE --> CUR["current → version_xxx (symlink)"]
        BASE --> V1["version_20250101_120000/"]
        BASE --> V2["version_20250101_110000/"]
        V1 --> IDX1["index.inline"]
        V1 --> META1["vector_meta.json"]
    end
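
The symlink switch over the directory layout above can be sketched with `java.nio.file`. This is an assumption-laden illustration, not the PR's code: `VersionSwitchSketch`, the `current.tmp` name, and the `version_a`/`version_b` directories are hypothetical, and the atomic replace relies on POSIX `rename` semantics (behaviour of `ATOMIC_MOVE` onto an existing target is implementation specific on other platforms).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch of publishing a new index version by switching a symlink.
final class VersionSwitchSketch {

    /** Points {indexDir}/current at versionName by renaming a temporary link over it. */
    static void publish(Path indexDir, String versionName) throws IOException {
        Path target = indexDir.resolve(versionName);
        if (!Files.isDirectory(target)) {
            throw new IOException("missing version directory: " + target);
        }
        Path tmp = indexDir.resolve("current.tmp");
        Files.deleteIfExists(tmp); // removes a stale link itself, never its target
        // A relative link target keeps the tree relocatable.
        Files.createSymbolicLink(tmp, Paths.get(versionName));
        // Moving a symlink moves the link, not what it points to.
        Files.move(tmp, indexDir.resolve("current"),
                   StandardCopyOption.ATOMIC_MOVE, StandardCopyOption.REPLACE_EXISTING);
    }

    /** Resolves the directory that current points to. */
    static Path resolveCurrent(Path indexDir) throws IOException {
        return indexDir.resolve("current").toRealPath();
    }

    /** Publishes two versions in a temp directory and returns the name current ends up at. */
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("vector-index");
            Files.createDirectory(dir.resolve("version_a"));
            Files.createDirectory(dir.resolve("version_b"));
            publish(dir, "version_a");
            publish(dir, "version_b");
            return resolveCurrent(dir).getFileName().toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Rollback in this scheme is just another `publish` call naming an older version directory, which is why old versions have to be kept until an explicit cleanup step.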

4. Soft Delete Strategy

Deletion operations only mark nodes as deleted; actual cleanup occurs during flush.
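
The mark-now, clean-later pattern can be illustrated with a bit set of tombstones. The sketch is a simplification under assumptions: `SoftDeleteSketch` is hypothetical, it stores bare vectors instead of a graph, and its `flush()` renumbers ordinals, which a real graph index would have to handle by remapping or by keeping IDs stable.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical illustration of soft delete with physical cleanup at flush time.
final class SoftDeleteSketch {

    private final List<float[]> vectors = new ArrayList<>();
    private final BitSet deleted = new BitSet(); // tombstones, indexed by ordinal

    /** Stores a vector and returns its ordinal. */
    int add(float[] vector) {
        vectors.add(vector);
        return vectors.size() - 1;
    }

    /** Cheap delete: only flips a bit, the vector stays in memory. */
    void markDeleted(int ordinal) {
        if (ordinal < 0 || ordinal >= vectors.size()) {
            throw new IndexOutOfBoundsException("no such ordinal: " + ordinal);
        }
        deleted.set(ordinal);
    }

    /** Searches would consult this to skip tombstoned entries. */
    boolean isLive(int ordinal) {
        return ordinal >= 0 && ordinal < vectors.size() && !deleted.get(ordinal);
    }

    int liveCount() {
        return vectors.size() - deleted.cardinality();
    }

    int physicalCount() {
        return vectors.size();
    }

    /** Physically drops tombstoned entries; returns how many were removed. */
    int flush() {
        int removed = deleted.cardinality();
        List<float[]> kept = new ArrayList<>(liveCount());
        for (int i = 0; i < vectors.size(); i++) {
            if (!deleted.get(i)) {
                kept.add(vectors.get(i));
            }
        }
        vectors.clear();
        vectors.addAll(kept);
        deleted.clear();
        return removed;
    }
}
```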

Search Flow

Search returns a Set<Id> of vertex IDs, which can be used directly to build a FixedIdHolder and, from it, an IdHolderList, so results plug into the existing index query framework.
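
The two-phase shape of the search (find nearest vector IDs, then map them to vertex IDs) is sketched below. Everything here is a stand-in: `SearchFlowSketch` is hypothetical, phase 1 is an exact brute-force scan by dot product rather than JVector's ANN graph search, and vertex IDs are plain strings instead of HugeGraph `Id` objects.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical two-phase search: vector IDs first, then vertex IDs.
final class SearchFlowSketch {

    private final Map<Integer, float[]> vectors = new HashMap<>(); // stand-in for the runtime
    private final Map<Integer, String> vertexOf = new HashMap<>(); // stand-in for the state store

    void put(int vectorId, String vertexId, float[] vector) {
        vectors.put(vectorId, vector);
        vertexOf.put(vectorId, vertexId);
    }

    private static double dot(float[] a, float[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += (double) a[i] * b[i];
        }
        return sum;
    }

    /** Phase 1: exact top-K by dot product (an ANN index would approximate this). */
    List<Integer> searchVectorIds(float[] query, int topK) {
        List<Integer> ids = new ArrayList<>(vectors.keySet());
        ids.sort(Comparator.comparingDouble((Integer id) -> -dot(query, vectors.get(id))));
        return new ArrayList<>(ids.subList(0, Math.min(topK, ids.size())));
    }

    /** Phase 2: translate vector IDs to vertex IDs, skipping unknown ones. */
    Set<String> getVertex(Iterable<Integer> vectorIds) {
        Set<String> result = new LinkedHashSet<>();
        for (Integer id : vectorIds) {
            String vertexId = vertexOf.get(id);
            if (vertexId != null) {
                result.add(vertexId);
            }
        }
        return result;
    }

    /** Combined call, mirroring the searchVector step in the sequence diagram. */
    Set<String> searchVector(float[] query, int topK) {
        return getVertex(searchVectorIds(query, topK));
    }
}
```

Keeping the vector-ID-to-vertex-ID mapping in the state store rather than in the index means the in-memory graph only deals with compact integer IDs.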

New Dependencies

| Dependency | Version | Purpose |
| --- | --- | --- |
| jvector | 3.0.0 | HNSW vector index implementation |

Follow-up Work

To be completed

  • Integrate into GraphIndexTransaction.queryByUserprop() query path
  • Implement doVectorIndex() method
  • REST API / Gremlin Step support for vector search syntax
  • Old version file cleanup during stop()

Tests to be added

| Test Type | Test Content |
| --- | --- |
| Unit test | VectorIndexManager lifecycle |
| Unit test | ServerVectorRuntime incremental update and search |
| Unit test | AbstractVectorRuntime versioned persistence |
| Integration test | End-to-end search with RocksDB + JVector |
| Performance test | Search latency under different vector scales |

Verifying these changes

  • Trivial rework / code cleanup without any test coverage. (No Need)
  • Already covered by existing tests, such as (please modify tests here).
  • Need tests and can be verified as follows:
    • xxx

Does this PR potentially affect the following parts?

Documentation Status

  • Doc - TODO
  • Doc - Done
  • Doc - No Need

JisoLya and others added 17 commits October 31, 2025 19:03
…che#2893)

* docs(pd): update test commands and improve documentation clarity

* Update README.md

---------

Co-authored-by: imbajin <jin@apache.org>
* update(store): fix some problems and clean up code

- chore(store): clean some comments
- chore(store): using Slf4j instead of System.out to print log
- update(store): update more reasonable timeout setting
- update(store): add close method for CopyOnWriteCache to avoid potential memory leak
- update(store): delete duplicated beginTx() statement
- update(store): extract parameter for compaction thread pool(move to configuration file in the future)
- update(store): add default logic in AggregationFunctions
- update(store): fix potential concurrency problem in QueryExecutor

* Update hugegraph-store/hg-store-common/src/main/java/org/apache/hugegraph/store/query/func/AggregationFunctions.java

---------

Co-authored-by: Peng Junzhi <78788603+Pengzna@users.noreply.github.com>
* fix(store): fix duplicated definition log root
…p ci & remove duplicate module (apache#2910)

* add missing license and remove binary license.txt

* remove dist in commons

* fix tinkerpop test open graph panic and other bugs

* empty commit to trigger ci
Co-authored-by: imbajin <jin@apache.org>
dosubot (bot) added the labels size:XXL (This PR changes 1000+ lines, ignoring generated files), dependencies (Incompatible dependencies of package), and feature (New feature) on Dec 19, 2025
* chore: update the status of distributed modules

Eliminated mentions of BETA status from AGENTS.md, README.md, and configuration files for HugeGraph PD and Store. This clarifies the current development status and streamlines documentation for production use.

* docs: update README with requirements and architecture info

Added sections for Requirements and Architecture, specifying Java and Maven versions and deployment options. Updated Docker command to use version 1.7.0. Included build from source instructions with Maven command.

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update run-api-test.sh

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: VGalaxies <vgalaxies@apache.org>
codecov (bot) commented Dec 30, 2025

Codecov Report

❌ Patch coverage is 0% with 587 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (vector-index@c92710c). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...java/org/apache/hugegraph/api/auth/ManagerAPI.java | 0.00% | 105 Missing ⚠️ |
| ...g/apache/hugegraph/vector/ServerVectorRuntime.java | 0.00% | 76 Missing ⚠️ |
| ...pache/hugegraph/vector/ServerVectorStateStore.java | 0.00% | 60 Missing ⚠️ |
| ...apache/hugegraph/structure/HugeVectorIndexMap.java | 0.00% | 46 Missing ⚠️ |
| ...hugegraph/backend/serializer/BinarySerializer.java | 0.00% | 39 Missing ⚠️ |
| ...n/java/org/apache/hugegraph/core/GraphManager.java | 0.00% | 33 Missing ⚠️ |
| ...va/org/apache/hugegraph/api/filter/PathFilter.java | 0.00% | 22 Missing ⚠️ |
| ...apache/hugegraph/type/define/IndexVectorState.java | 0.00% | 20 Missing ⚠️ |
| ...he/hugegraph/store/client/query/QueryExecutor.java | 0.00% | 15 Missing ⚠️ |
| ...he/hugegraph/backend/tx/GraphIndexTransaction.java | 0.00% | 14 Missing ⚠️ |

... and 37 more
Additional details and impacted files
@@              Coverage Diff               @@
##             vector-index   #2922   +/-   ##
==============================================
  Coverage                ?   0.07%           
  Complexity              ?      22           
==============================================
  Files                   ?     785           
  Lines                   ?   65385           
  Branches                ?    8367           
==============================================
  Hits                    ?      51           
  Misses                  ?   65332           
  Partials                ?       2           

Tsukilc and others added 8 commits January 4, 2026 16:27
Signed-off-by: slightsharp <slightsharp@outlook.com>
…e#2941)

Add a unit test that explicitly covers the failure scenario described in the PR,
where ContextCallable fails before entering runAndDone().

The test verifies that Consumers.await() does not hang when the worker task
fails during ContextCallable execution, relying on safeRun() to always
decrement the latch in its finally block.

This test would deadlock on the previous implementation and passes with the
current fix, ensuring the issue cannot regress.
dpol1 and others added 14 commits May 15, 2026 11:25
…e#3017)

- Align legacy cache invalidation producers and listeners on ACTION_INVALID and
  ACTION_CLEAR, removing the obsolete ACTION_INVALIDED/ACTION_CLEARED constants.
- Add EventHub.notifyExcept(...) so cache transactions and the cache notifier
  bridge can avoid re-processing their own local listener while still delivering
  events to other listeners.
- Track registered graph/schema cache listeners per graph so notifyExcept(...)
  uses the listener instance actually registered on the EventHub, including
  multi-transaction cases where later transactions reuse the first listener.
- Update cache notifier forwarding to prevent local RPC bridge loops after action
  names are unified.
- Add regression coverage for notifyExcept semantics, graph/schema action names,
  listener teardown/re-registration, and notifier no-loop behavior.

- The holder keeps the EventHub listener registered while any transaction for the
  graph is alive, and unregisters/removes it only when the last transaction
  releases it. The registry update, ref-count decrement, and hub unlisten now run
  inside ConcurrentMap.compute() to avoid owner-closes-first invalidation gaps.

  Also add graph/schema regression coverage for owner-first close and last-close
  cleanup, including graph close/reopen handling for stale EventHub holders.
## Main Changes

Change `ServerInfoManager.selfNodeId()`, which previously returned "server-1", to return "{graphname}/server-1".

## Upgrade impact

This change namespaces the server id by graph name, so old unfinished tasks in the non-PD local scheduler may still reference the previous bare server id, for example `server-1`.

Those historical tasks can remain visible, but they may not be restored or cancelled by the new namespaced server id after upgrade. The impact is limited to unfinished local-scheduler tasks that already existed before upgrading; newly scheduled tasks use the new namespaced id. To avoid this compatibility edge case, finish or cancel pending local tasks before upgrading.
- Fix .gitattribut -> .gitattributes typo in .dockerignore
- Fix **/*.tar.gz* -> **/*.tar.gz (remove unintended trailing wildcard)
- Remove **/target/dist/ (redundant, already covered by **/target/)
- Restore cron to apt-get install in all 4 Dockerfiles to keep the
  existing start-hugegraph.sh -m true monitor path working
Normalize server-side schema ~create_time userdata in SchemaElement so serializer reloads and fromMap paths keep the Date contract.

Add SchemaElement, TextSerializer, and BinarySerializer coverage.

The builder accumulates userdata via Userdata.put() before eliminate()
runs, so `.userdata(CREATE_TIME, "").eliminate()` parsed "" as a date
and threw before the key-only removal path. Pass a blank ~create_time
through unchanged; non-empty malformed values still throw on the add
path, so the existing contract is unchanged.
…compose way (apache#3021)

- Added hbase-shaded-client and hbase-endpoint dependencies instead of custom hbase-shaded-endpoint library.
- Added docker files and HBASE.md containing instructions for HBase backend
- Updated known-dependencies.txt to reflect the minimal allowlist.
Improved pom.xml comments to document exclusion rationales and
addressed automated review feedback regarding dependency management.
- Replace exploratory README steps with the actual packaged archive path
- Use the version placeholder instead of hard-coded 1.7.0
- Keep the PR focused on the source-build documentation fix

---------

Co-authored-by: imbajin <jin@apache.org>
…3035)

Normalizes PropertyKey default values to their declared data type upon retrieval. Previously, values stored in userdata could lose their original type during serialization and deserialization (e.g., Date becoming String), leading to type mismatches.

The `defaultValue()` method now converts deserialized string representations back to their expected runtime types. This change is verified with extensive tests covering schema parsing, vertex property assignment, and both binary and text serializers.
…ipts (apache#3044)

The previous implementation captured $! after the daemon/foreground
if/else block. The script blocked at hugegraph-server.sh until Java
exited, then $! was empty, the pid file got an empty string, and the
script exited 0, losing Java's exit code entirely.
…he#3047)

In foreground mode (-d false), start-hugegraph-pd.sh had no foreground
branch — the script always backgrounded Java with exec ... &, wrote $!
to the pid file, and exited 0, losing Java's exit code entirely.

Fix: add DAEMON="true" default and -d flag to getopts. In the daemon
branch, keep the existing exec ... & pattern. In the foreground branch,
write $$ to the pid file before exec (exec replaces the shell with Java,
so $$ == Java's PID after exec), then exec java without & so the process
blocks and Java's exit code propagates out directly.

No trap needed in the foreground branch — exec replaces the shell
process with Java, so signals from Docker/systemd go directly to Java
without a wrapper to forward through.

Add test-start-hugegraph-pd.sh with 4 tests (daemon regression,
foreground blocking, exit code propagation on SIGKILL, SIGTERM
forwarding via exec) — 12 assertions, all pass after the fix.

Baseline on unmodified code: 3 passed, 9 failed.
After fix: 12 passed, 0 failed.

Wire test into pd-store-ci.yml for the RocksDB backend.

Related to: apache#3043
* optimize: Optimize RocksDB batch query performance

* Refactor getByIds to queryByIds in RocksDBTable

* Modify queryByIds to use super method temporarily

Temporarily use super.queryByIds() instead of getByIds() for batch version support.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
hahahahbenny and others added 15 commits June 6, 2026 21:14
* feat(server): Add the vector index type and the detection of related fields to the index label.

* fix code format

* add annsearch API

* add doc to explain the plan
# This is the 1st commit message:

add Licensed to files

# This is the commit message apache#2:

feat(server): support vector index in graphdb  (apache#2856)

* feat(server): Add the vector index type and the detection of related fields to the index label.

* fix code format

* add annsearch API

* add doc to explain the plan

delete redundancy in vertexapi
…uenceAllocator interface and the VectorIdAllocator class.
Labels

dependencies (Incompatible dependencies of package), feature (New feature), size:XXL (This PR changes 1000+ lines, ignoring generated files)
