Add Cassandra compaction progress metrics via JMX handler#2912

Open
jkoronaAtCisco wants to merge 6 commits into
open-telemetry:mainfrom
jkoronaAtCisco:cassandra_compaction_progress

@jkoronaAtCisco

Description:

Adds two new Cassandra gauges to the JMX scraper:

  • cassandra.compaction.progress.bytes — bytes completed for in-flight compactions
  • cassandra.compaction.progress.total — total bytes for in-flight compactions

Both metrics carry taskType, keyspace, and columnfamily attributes and are grouped by that composite key. Only compactions measured in bytes are reported. Values are summed across parallel compaction tasks sharing the same key.

Implemented as a code-based ExperimentalJmxMetricHandler (SPI introduced in opentelemetry-java-instrumentation 2.29.0) because the Compactions MBean attribute requires iteration, grouping, and BigInteger string parsing — none of which can be expressed in declarative YAML rules.
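A minimal sketch of the grouping described above, not the PR's actual handler code. It assumes the Compactions attribute is exposed as a list of string maps with the keys `taskType`, `keyspace`, `columnfamily`, `unit`, `completed`, and `total`; those key names, the `"bytes"` unit value, and every class and method name here are assumptions made for illustration.

```java
import java.math.BigInteger;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CompactionProgressSketch {

  /** Running sums for one (taskType, keyspace, columnfamily) group. */
  static final class Progress {
    BigInteger completed = BigInteger.ZERO;
    BigInteger total = BigInteger.ZERO;
  }

  /** Parses a decimal string, returning null when it is missing or malformed. */
  static BigInteger parseOrNull(String value) {
    if (value == null) {
      return null;
    }
    try {
      return new BigInteger(value.trim());
    } catch (NumberFormatException e) {
      return null;
    }
  }

  /** Sums completed/total bytes per composite key, skipping non-byte entries. */
  static Map<List<String>, Progress> groupByKey(List<Map<String, String>> compactions) {
    Map<List<String>, Progress> result = new LinkedHashMap<>();
    for (Map<String, String> entry : compactions) {
      // Assumed unit value: only byte-based compactions are reported.
      if (!"bytes".equals(entry.get("unit"))) {
        continue;
      }
      BigInteger completed = parseOrNull(entry.get("completed"));
      BigInteger total = parseOrNull(entry.get("total"));
      if (completed == null || total == null) {
        continue; // unparseable values are dropped rather than reported as zero
      }
      // Arrays.asList tolerates null elements and compares by content.
      List<String> key =
          Arrays.asList(
              entry.get("taskType"), entry.get("keyspace"), entry.get("columnfamily"));
      Progress progress = result.computeIfAbsent(key, k -> new Progress());
      progress.completed = progress.completed.add(completed);
      progress.total = progress.total.add(total);
    }
    return result;
  }

  public static void main(String[] args) {
    List<Map<String, String>> sample =
        List.of(
            Map.of("taskType", "Compaction", "keyspace", "ks", "columnfamily", "t1",
                "unit", "bytes", "completed", "10", "total", "100"),
            Map.of("taskType", "Compaction", "keyspace", "ks", "columnfamily", "t1",
                "unit", "bytes", "completed", "5", "total", "50"),
            Map.of("taskType", "Validation", "keyspace", "ks", "columnfamily", "t1",
                "unit", "keys", "completed", "1", "total", "2"));
    groupByKey(sample)
        .forEach((key, p) -> System.out.println(key + " " + p.completed + "/" + p.total));
  }
}
```

Each resulting group would map to one data point per gauge, with the three key parts as attributes. Summing with `BigInteger` avoids overflow in the sketch; how the real handler converts to a gauge value is not shown here.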

Testing:

Unit tests added to cover new CassandraCompactionProgressHandler.

The Cassandra integration test was refactored to reliably produce visible compaction activity: data seeding now happens before the scraper starts, and compaction is triggered only after the scraper is running, eliminating a race condition in the original approach.
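The ordering described above can be sketched as a template method with lifecycle hooks. Only the `afterScraperStarted` hook name comes from this PR; the `beforeScraperStarted` hook, the class names, and the recorded event strings are hypothetical and stand in for real container and scraper calls.

```java
import java.util.ArrayList;
import java.util.List;

public class ScraperLifecycleSketch {

  /** Hypothetical base test that fixes the order of lifecycle steps. */
  abstract static class TargetSystemTestSketch {
    final List<String> events = new ArrayList<>();

    final void run() {
      beforeScraperStarted(); // target-side setup that must finish first
      events.add("scraper started");
      afterScraperStarted(); // target-side actions the scraper should observe
      events.add("verify metrics");
    }

    /** Hypothetical hook; no-op by default. */
    protected void beforeScraperStarted() {}

    /** Hook named after the one added in this PR; no-op by default. */
    protected void afterScraperStarted() {}
  }

  /** Cassandra-like test: seed first, compact only once the scraper is running. */
  static class CassandraLikeTest extends TargetSystemTestSketch {
    @Override
    protected void beforeScraperStarted() {
      events.add("seed data");
    }

    @Override
    protected void afterScraperStarted() {
      events.add("trigger compaction");
    }
  }

  public static void main(String[] args) {
    CassandraLikeTest test = new CassandraLikeTest();
    test.run();
    System.out.println(test.events);
  }
}
```

Because the compaction is triggered only after the scraper step, the scraper cannot miss a compaction that finished before it started polling, which is the race the description refers to.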

Notes:

Action required before merge: dependencyManagement/build.gradle.kts currently pins otelInstrumentationVersion = "2.29.0-alpha-SNAPSHOT". This must be updated to the stable published release (e.g. 2.29.0-alpha) before the PR is merged, as SNAPSHOT versions are non-deterministic across CI builds and do not resolve from Maven Central.

@jkoronaAtCisco jkoronaAtCisco requested a review from a team as a code owner June 15, 2026 13:57
Copilot AI review requested due to automatic review settings June 15, 2026 13:57

Copilot AI left a comment

Contributor

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds Cassandra compaction byte-progress metrics to the JMX scraper via a new code-based handler, and wires it into the Cassandra target system configuration and tests.

Changes:

  • Introduces CassandraCompactionProgressHandler emitting cassandra.compaction.progress.bytes / .total grouped by taskType, keyspace, columnfamily.
  • Registers the handler via Java SPI and Cassandra YAML, and adds unit + integration test coverage.
  • Adds an integration-test lifecycle hook to trigger long-running compactions after the scraper starts.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| jmx-scraper/src/main/java/io/opentelemetry/contrib/jmxscraper/handler/CassandraCompactionProgressHandler.java | New handler that queries Cassandra CompactionManager and emits progress gauges. |
| jmx-scraper/src/main/resources/META-INF/services/io.opentelemetry.instrumentation.jmx.internal.ExperimentalJmxMetricHandler | Registers the new handler via ServiceLoader. |
| jmx-scraper/src/main/resources/cassandra.yaml | Enables the handler for the Cassandra target system. |
| jmx-scraper/src/test/java/io/opentelemetry/contrib/jmxscraper/handler/CassandraCompactionProgressHandlerTest.java | Adds unit tests for grouping/parsing/skipping behavior. |
| jmx-scraper/src/integrationTest/java/io/opentelemetry/contrib/jmxscraper/target_systems/TargetSystemIntegrationTest.java | Adds an afterScraperStarted hook to support target-side actions post-start. |
| jmx-scraper/src/integrationTest/java/io/opentelemetry/contrib/jmxscraper/target_systems/CassandraIntegrationTest.java | Seeds data + triggers compaction to validate new progress metrics in integration tests. |
| jmx-scraper/build.gradle.kts | Adds Mockito dependency for new unit tests. |
| dependencyManagement/build.gradle.kts | Updates instrumentation version to a SNAPSHOT. |
| CHANGELOG.md | Documents the new compaction progress metrics. |

Comment thread dependencyManagement/build.gradle.kts Outdated
@SylvainJuge

Contributor

A while ago, @robsunday and I started working on aligning JMX metrics across multiple implementations. The goal is to use instrumentation as the reference for JMX metric definitions, which are then inherited by jmx-scraper.

There are still parts that have not yet been processed; they are listed in the comments and sub-tasks of open-telemetry/opentelemetry-java-instrumentation#12158.

Unfortunately, the Cassandra part (open-telemetry/opentelemetry-java-instrumentation#14277) hasn't been prioritized yet, so I think it would be better to first contribute to enhancing and aligning the Cassandra metrics in instrumentation before adding new metrics.

@jkoronaAtCisco the benefit for you would be Cassandra metrics that are better (closer to semconv recommendations) and more stable (defined in a single central place), so anything you build on top of them would require less maintenance over time. Would you be interested in contributing to that? If so, I think @robsunday and I could help you if needed.

In addition, I think it would be worth checking whether JMX metrics captured with a handler, as in this PR, can be inherited from instrumentation into jmx-scraper. I assume it works like YAML-only metrics, but it would be worth double-checking to be sure.

@jkoronaAtCisco

jkoronaAtCisco commented Jun 23, 2026

Author

Hey @SylvainJuge, I'm ok with taking on the Cassandra alignment for instrumentation. My understanding of the scope is as follows (please correct me if I'm off):

  • Author a cassandra.yaml rule set in instrumentation/jmx-metrics/library/.../jmx/rules/ covering the Cassandra MBeans, aligned to semconv as we go
  • Add the unit + testcontainers integration tests for the target system.
  • Sort out the compatibility/migration story with the existing cassandra.yaml in jmx-scraper.
  • Then add the compaction metrics from this PR on top of the aligned baseline.

If that all sounds right, I'd suggest we treat this PR as on hold, and I'll open a draft PR for the aligned cassandra.yaml first.

On your second point — whether the handler-based approach in this PR can be inherited by jmx-scraper like YAML-only metrics — I went and checked, and it looks like yes. The ExperimentalJmxMetricHandler SPI from #18782 lives in the shared jmx-metrics library, and discovery goes through ComponentLoader/BeanFinder, which resolves a handler by the name referenced in YAML. Since jmx-scraper already consumes that library via JmxTelemetry/JmxTelemetryBuilder, a handler defined upstream and referenced from cassandra.yaml should be picked up automatically — the one thing to confirm is that the handler is registered as an SPI service (META-INF/services) and ships on the scraper's classpath. I could write a small test on the scraper side to prove the end-to-end inheritance if that'd be useful.
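A minimal sketch of name-based handler lookup over SPI-discovered services, to illustrate the mechanism discussed above. The `NamedHandler` interface, its `name()` method, and the handler name string are hypothetical stand-ins; this does not reproduce the actual ExperimentalJmxMetricHandler API or the ComponentLoader/BeanFinder code. Only `java.util.ServiceLoader` is real JDK API here.

```java
import java.util.List;
import java.util.Optional;
import java.util.ServiceLoader;

public class HandlerLookupSketch {

  /** Hypothetical stand-in for a handler that YAML can reference by name. */
  public interface NamedHandler {
    String name();
  }

  /** Returns the first candidate whose name matches the one referenced in YAML. */
  static Optional<NamedHandler> findByName(Iterable<NamedHandler> candidates, String name) {
    for (NamedHandler handler : candidates) {
      if (handler.name().equals(name)) {
        return Optional.of(handler);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    String referenced = "cassandra-compaction-progress"; // hypothetical name

    // ServiceLoader only finds implementations listed under META-INF/services
    // on the classpath. This sketch ships no such file, so nothing is found,
    // which mirrors the "is it registered and on the classpath" question.
    Iterable<NamedHandler> fromSpi = ServiceLoader.load(NamedHandler.class);
    System.out.println("via SPI: " + findByName(fromSpi, referenced).isPresent());

    // With a registered implementation available, the lookup by name succeeds.
    NamedHandler registered = () -> referenced;
    Iterable<NamedHandler> simulated = List.of(registered);
    System.out.println("simulated: " + findByName(simulated, referenced).isPresent());
  }
}
```

An end-to-end test on the scraper side would replace the simulated list with the real service file and assert that the handler referenced from cassandra.yaml is resolved.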

@SylvainJuge

Contributor

> Hey @SylvainJuge, I'm ok with taking on the Cassandra alignment for instrumentation.

This would be awesome! I would be happy to help by reviewing your PRs in instrumentation. From experience, I would suggest starting with an integration test and a single JMX rule as a first step.

One of the common challenges we have with those JMX metric definitions is that we (on the instrumentation side) often lack expertise in how the target system is actually used in practice, so if you have this in your skillset, or have requirements from your own users, that is usually a good driver.

3 participants