Skip to content

feat(analytics): log request duration and client IP#153

Open
alukach wants to merge 5 commits into
mainfrom
feat/analytics-duration-client-ip
Open

feat(analytics): log request duration and client IP#153
alukach wants to merge 5 commits into
mainfrom
feat/analytics-duration-client-ip

Conversation

@alukach

@alukach alukach commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Summary

Adds two fields to the Analytics Engine request event:

  • double3: duration_ms — wall-clock time spent dispatching the request through the gateway (measured with Date.now() around the gateway dispatch, so it reflects time awaiting the origin). Also added to the response tracing log line.
  • blob8: client_ip — taken from the CF-Connecting-IP header as a first-class attribute. user_id (blob5) is unchanged and remains empty for anonymous requests.

Both fields are appended after the existing columns, so existing queries against blob1–7 / double1–2 are unaffected.

Refactor for testability

RequestEvent now exposes index() / blobs() / doubles() so the schema ordering lives in pure functions; log_request iterates over them. The worker-dependent writer is cfg-gated to wasm32, which lets tests/analytics.rs compile the module natively via the same #[path] pattern used by the existing routing/pagination tests.

Tests

New tests/analytics.rs (9 tests) covering:

  • extract_path_segments (empty, account-only, account/product, nested key)
  • index format and blob/double schema ordering (including the new client_ip and duration_ms columns)
  • file_path truncation to 256 bytes, including UTF-8 char-boundary backoff

🤖 Generated with Claude Code

Adds two fields to the Analytics Engine request event:

- double3: duration_ms — wall-clock time spent dispatching the request
  through the gateway, also added to the response tracing log
- blob8: client_ip — taken from the CF-Connecting-IP header; user_id
  (blob5) remains empty for anonymous requests

Refactors RequestEvent to expose index()/blobs()/doubles() so the
schema ordering is pure and unit-testable on native targets, with the
worker-specific writer cfg-gated to wasm32. Adds tests/analytics.rs
covering path segment extraction, schema ordering, and blob truncation.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jun 13, 2026

Copy link
Copy Markdown

🚀 Latest commit deployed to https://source-data-proxy-pr-153.source-coop.workers.dev

  • Date: 2026-06-16T21:46:55Z
  • Commit: fc95dee

@alukach alukach requested a review from tylere June 13, 2026 00:02
@claude

claude Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

Claude finished @alukach's task in 1m 20s —— View job


✅ No blocking issues — safe to merge.

  • src/analytics.rs:35-47blobs() returns [&str; 8] with lifetime tied to &self rather than 'a. This is intentional (the array is consumed immediately in every call site) but worth noting: callers cannot outlive the borrow of the RequestEvent. No bug in practice.

  • src/analytics.rs:76-82 — The builder loop (for blob in event.blobs()) materialises the array on every call. Fine for a [&str; 8], just confirming it's not an issue.

  • src/lib.rs:210duration_ms = js_sys::Date::now() - start_ms wraps the await point, so it correctly measures wall-clock latency including the async suspension to the origin. ✅

  • src/lib.rs:327client_ip is sourced from cf-connecting-ip. On Cloudflare Workers accessed through the Cloudflare proxy this header is injected by Cloudflare and cannot be spoofed by clients. If the Worker is ever accessed directly (bypassing Cloudflare), the header could be attacker-controlled. Since this is analytics-only (no access control decisions made on this value), the impact is limited to polluted analytics data, not a security vulnerability.

  • src/analytics.rs:106-115truncate_to_byte_limit is only applied to file_path (blob3). Other blobs (user_id, content_type, client_ip) are passed through untruncated. For the new client_ip field specifically, IPv6 addresses top out at ~45 characters so this is fine in practice.

  • tests/analytics.rs — Good coverage of the schema ordering, truncation, and char-boundary backoff. The #[path] pattern correctly exercises the native-compilable subset of analytics.rs.

@tylere tylere left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed off-thread, suggest moving to storing a hashed client IP (w/ fixed secret salt) rather than the raw client IP.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants