*: Phase 2.B — vendor pqarrow/builder + arrowutils into pkg/query/internal #6355

Open: thorfour wants to merge 1 commit into remove-frostdb-phase2 from remove-frostdb-phase2b

thorfour commented May 8, 2026

Summary

Phase 2.B of the FrostDB removal. Stacks on #6354 (Phase 2.A).

Drops the pqarrow imports from the query path by vendoring the two helper packages.

Why vendor instead of replacing with arrow-go stdlib builders

The Opt* builders (OptInt64Builder, OptInt32Builder, OptBooleanBuilder) expose random-access mutation methods — Set(idx, v), Add(idx, delta), Value(idx), AppendData([]int64) — that the flamegraph and table query algorithms rely on for backfilling cumulative counts and child indices. arrow-go's array.*Builder is append-only, so swapping would require rewriting both algorithms. Vendoring is mechanical, low-risk, and unblocks Phase 4 (drop frostdb from go.mod).
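To illustrate why random-access mutation matters here, the following is a minimal sketch of backfilling cumulative counts in a flattened flamegraph. The `optInt64` type is a hypothetical stand-in that only mimics the `Set`/`Add`/`Value` shape described above; it is not the vendored builder, and `flame` and `addSample` are invented names for the example.

```go
package main

import "fmt"

// optInt64 is a hypothetical stand-in for an Opt*-style builder: a growable
// int64 column that also allows in-place updates of rows already appended.
type optInt64 struct{ data []int64 }

func (b *optInt64) Append(v int64)         { b.data = append(b.data, v) }
func (b *optInt64) Set(i int, v int64)     { b.data[i] = v }
func (b *optInt64) Add(i int, delta int64) { b.data[i] += delta }
func (b *optInt64) Value(i int) int64      { return b.data[i] }

// flame is a toy columnar flamegraph: one row per node, with a parent index
// column and a cumulative-count column (both names are illustrative).
type flame struct {
	names      []string
	parent     []int
	cumulative optInt64
}

func newFlame() *flame {
	f := &flame{names: []string{"root"}, parent: []int{-1}}
	f.cumulative.Append(0)
	return f
}

// addSample walks the stack from the root. Existing rows are updated in
// place via Add, which an append-only builder could not do.
func (f *flame) addSample(stack []string, value int64) {
	cur := 0
	f.cumulative.Add(cur, value)
	for _, name := range stack {
		row := -1
		for i := range f.names {
			if f.parent[i] == cur && f.names[i] == name {
				row = i
				break
			}
		}
		if row == -1 { // first time this frame appears under cur
			row = len(f.names)
			f.names = append(f.names, name)
			f.parent = append(f.parent, cur)
			f.cumulative.Append(0)
		}
		f.cumulative.Add(row, value) // backfill the cumulative count
		cur = row
	}
}

func main() {
	f := newFlame()
	f.addSample([]string{"main", "a"}, 3)
	f.addSample([]string{"main", "b"}, 2)
	for i, n := range f.names {
		fmt.Println(n, f.cumulative.Value(i))
	}
}
```

With an append-only builder, the `root` and `main` rows could not be touched again once written, so the totals would have to be computed in a separate pass before any row is emitted.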

What's vendored

  • pkg/query/internal/builder/listbuilder.go, optbuilders.go, recordbuilder.go, utils.go from frostdb/pqarrow/builder. Plus optbuilders_test.go (passes against the vendored copy). The AppendParquetValues methods on the Opt builders and the parquet-go import are stripped — parca uses these builders only on the query side and never to convert parquet values, so they're dead code.
  • pkg/query/internal/arrowutils/groupranges.go, merge.go, nullarray.go, schema.go, sort.go, utils.go from frostdb/pqarrow/arrowutils. Plus sort_test.go. merge_test.go and schema_test.go are skipped — they depend on frostdb/internal/records, which is not importable from outside the frostdb module.

Each vendored package gets a doc.go explaining provenance.

Consumers updated

  • pkg/query/{flamegraph_arrow,table}.go: import path swap from frostdb/pqarrow/builder to internal/builder.
  • pkg/query/columnquery.go: import path swap from frostdb/pqarrow/arrowutils to internal/arrowutils.
  • pkg/parcacol/querier.go: still on the upstream arrowutils because pkg/parcacol can't import pkg/query/internal/arrowutils. That file goes away in Phase 3.
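The pkg/parcacol restriction follows from Go's internal-directory rule: a package whose path contains an `internal` element can only be imported by code rooted at the parent of that `internal` directory. The sketch below approximates that check on slash-separated paths relative to the module root. `internalParent` and `canImport` are hypothetical helper names, and the logic is a simplification of what the go command does, not a copy of it.

```go
package main

import (
	"fmt"
	"strings"
)

// internalParent returns the directory that owns the last "internal" element
// of pkgPath, and whether pkgPath contains such an element at all.
func internalParent(pkgPath string) (string, bool) {
	switch {
	case strings.HasSuffix(pkgPath, "/internal"):
		return strings.TrimSuffix(pkgPath, "/internal"), true
	case strings.Contains(pkgPath, "/internal/"):
		return pkgPath[:strings.LastIndex(pkgPath, "/internal/")], true
	case pkgPath == "internal" || strings.HasPrefix(pkgPath, "internal/"):
		return "", true
	}
	return "", false
}

// canImport reports whether importer may import pkgPath under the
// internal-directory rule (simplified: paths are relative to one module).
func canImport(importer, pkgPath string) bool {
	parent, ok := internalParent(pkgPath)
	if !ok || parent == "" {
		// No internal element, or a top-level internal directory that is
		// visible to the whole module in this simplified model.
		return true
	}
	return importer == parent || strings.HasPrefix(importer, parent+"/")
}

func main() {
	target := "pkg/query/internal/arrowutils"
	fmt.Println(canImport("pkg/query", target))    // true
	fmt.Println(canImport("pkg/parcacol", target)) // false
}
```

Under this rule, pkg/query and anything beneath it can use the vendored copy, while pkg/parcacol sits outside the pkg/query tree and cannot.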

go.mod: parquet-go is demoted from direct to indirect (still pulled in by frostdb itself; goes away in Phase 4).

Test plan

  • go build ./...
  • go vet ./...
  • go test -short ./...
  • Vendored builder tests pass (go test ./pkg/query/internal/builder/...)
  • Vendored arrowutils sort tests pass (go test ./pkg/query/internal/arrowutils/...)
  • CI green
  • Manual smoke: open a flamegraph against a ClickHouse-backed Parca

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

alwaysmeticulous Bot commented May 8, 2026

✅ Meticulous spotted 0 visual differences across 288 screens tested: view results.

Meticulous evaluated ~4 hours of user flows against your PR.

Last updated for commit 299a0c0 (*: Vendor pqarrow/builder + pqarrow/arrowutils into pkg/query/internal). This comment will update as new commits are pushed.
