
feat: IPv6 canonicalization to RFC 5952 behind the experimental flag #499

Merged: SJrX merged 3 commits into 242.x from ipv6-canonicalization-v2 on Jul 1, 2026
SJrX (Owner) commented Jun 29, 2026

What

A weak-warning inspection + quick-fix that rewrites a non-canonical IPv6 address to its RFC 5952 form (e.g. 2001:DB8:0:0:0:0:0:1 → 2001:db8::1), anywhere an IPv6 address appears in a grammar-validated value.

  • Ipv6.kt — pure-Kotlin RFC 5952 canonicalization.
  • Ipv6CanonicalFormInspection — walks the value-coloring layer's labeledRegions to find the IPv6-address span and flags it if not canonical (WEAK_WARNING, flag-gated).
  • CanonicalizeIpv6QuickFix — rewrites it.

Why a new PR

This is the same work as #498, which got merged into the grammar-value-coloring branch instead of 242.x — value coloring landed first via #497's squash, so the IPv6 commits never actually reached 242.x. This re-cuts them cleanly against current 242.x (which already has the value-coloring layer + the Role.IDENTIFIER removal it depends on).

Verification

./gradlew test and ./gradlew test -Dsystemd.unit.grammarParseEngine=true both pass.

🤖 Generated with Claude Code

Steve Ramage and others added 2 commits June 29, 2026 05:26
Offers a quick-fix to rewrite a non-canonical IPv6 address to its recommended form.
Behind the experimental flag.

- canonicalizeIpv6 (pure, dependency-free): parses to 8 groups and reformats per RFC
  5952 §4 — lowercase hex, drop leading zeros, compress the longest zero run to "::"
  (leftmost on ties, only for runs of 2+, never a single zero group). Returns null for
  non-IPv6 input and, for now, for embedded-IPv4 (§5 mixed notation) addresses.
- Combinator.labeledRegions(value): the grammar's explicit Labeled spans (e.g. a whole
  IP address) from the first fully-valid parse — lets features act on semantic spans.
- Ipv6CanonicalFormInspection: flag-gated; for grammar-backed options it scans labeled
  spans and registers a WEAK_WARNING + CanonicalizeIpv6QuickFix on any IPv6 that isn't
  already canonical. Reuses the IPV4_ADDR/IPV6_ADDR Labeled(LITERAL) spans we added for
  coloring, so no IPv6-specific engine markers were needed.
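
The canonicalization rules in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code in Ipv6.kt: the names `canonicalizeIpv6Sketch`, `parseGroups` and `parseHextets` are hypothetical, and rejecting zone ids and prefix lengths is an assumption of the sketch.

```kotlin
/** Parses one side of a "::" split into hextet values, or null if any group is malformed. */
private fun parseHextets(side: String): List<Int>? {
    if (side.isEmpty()) return emptyList()
    val out = mutableListOf<Int>()
    for (g in side.split(":")) {
        val isHex = g.all { it in '0'..'9' || it in 'a'..'f' || it in 'A'..'F' }
        if (g.length !in 1..4 || !isHex) return null
        out += g.toInt(16)
    }
    return out
}

/** Expands the input to exactly 8 groups, or returns null if it is not a plain IPv6 literal. */
private fun parseGroups(input: String): List<Int>? {
    val halves = input.split("::")
    val left = parseHextets(halves[0]) ?: return null
    return when (halves.size) {
        1 -> left.takeIf { it.size == 8 }
        2 -> {
            val right = parseHextets(halves[1]) ?: return null
            val missing = 8 - left.size - right.size
            // "::" must stand for at least one zero group.
            if (missing < 1) null else left + List(missing) { 0 } + right
        }
        else -> null // more than one "::"
    }
}

/** Hypothetical sketch of RFC 5952 section 4 formatting; null for out-of-scope input. */
fun canonicalizeIpv6Sketch(input: String): String? {
    // Assumption: mixed IPv4 notation, zone ids and prefixes are out of scope.
    if (input.isEmpty() || '.' in input || '%' in input || '/' in input) return null
    val groups = parseGroups(input) ?: return null

    // Longest zero run; strict '>' keeps the leftmost run on ties.
    var bestStart = -1
    var bestLen = 0
    var i = 0
    while (i < 8) {
        if (groups[i] != 0) { i++; continue }
        var j = i
        while (j < 8 && groups[j] == 0) j++
        if (j - i > bestLen) { bestStart = i; bestLen = j - i }
        i = j
    }
    // A single zero group is never compressed.
    if (bestLen < 2) bestStart = -1

    // Int.toString(16) yields lowercase hex without leading zeros.
    fun hex(range: IntRange) = range.joinToString(":") { groups[it].toString(16) }
    if (bestStart < 0) return hex(0 until 8)
    return hex(0 until bestStart) + "::" + hex(bestStart + bestLen until 8)
}
```

Under these assumptions, `canonicalizeIpv6Sketch("2001:DB8:0:0:0:0:0:1")` yields `2001:db8::1`, a tie such as `2001:db8:0:0:1:0:0:1` compresses the leftmost run, and an address with a dot returns null.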

Tests: canonicalizer cases incl. zero-run/tie/single-zero/idempotence/non-IPv6; e2e
warning + quick-fix rewriting 2001:DB8::1 -> 2001:db8::1, and nothing when canonical or
the flag is off.

Closes #363. Refs #467

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
labeledRegions() was inserted between colorize()'s doc comment and its body;
reorder so each function sits under its own KDoc.
github-actions Bot commented Jun 29, 2026

Unit Test Results (grammar engine false)

1 178 tests   1 178 ✅  47s ⏱️
  308 suites      0 💤
  308 files        0 ❌

Results for commit 421ae7c.

♻️ This comment has been updated with latest results.

github-actions Bot commented Jun 29, 2026

Unit Test Results (grammar engine true)

1 178 tests   1 178 ✅  48s ⏱️
  308 suites      0 💤
  308 files        0 ❌

Results for commit 421ae7c.

♻️ This comment has been updated with latest results.

The IPv6 canonicalization inspection walked every Labeled value span and
ran canonicalizeIpv6() on each, treating "the string happens to parse as
IPv6" as "this is an IPv6 address". That was only correct by luck: the
sole Labeled spans today are IPV4_ADDR and IPV6_ADDR, and IPv4 is excluded
because canonicalizeIpv6() bails on a dot. The invariant ("no other
Labeled span ever matches an 8-hextet shape") lived nowhere in the code,
and its violation wouldn't just false-positive — the quick-fix would
rewrite the span, corrupting whatever it really was.

Give Labeled an optional SemanticTag threaded onto Region, tag IPV6_ADDR
as SemanticTag.IPV6, and have the inspection act only on tagged spans. The
grammar now declares "this span is IPv6" and the inspection trusts it;
canonicalizeIpv6() is demoted from detector to pure formatter (it still
returns null for out-of-scope IPv4-tail forms). Role stays colour-only.
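
The tag-gating idea can be illustrated with a small sketch. The `Region` shape, its fields, and the helper `ipv6Spans` are hypothetical stand-ins, not the plugin's actual types; only the names `SemanticTag` and `SemanticTag.IPV6` come from the description above.

```kotlin
// Hypothetical stand-ins for the plugin's types; field names and shapes are assumptions.
enum class SemanticTag { IPV6 }

data class Region(val start: Int, val end: Int, val tag: SemanticTag? = null)

/** Returns only the spans the grammar has explicitly declared to be IPv6. */
fun ipv6Spans(value: String, regions: List<Region>): List<String> =
    regions
        .filter { it.tag == SemanticTag.IPV6 }
        .map { value.substring(it.start, it.end) }

fun main() {
    val value = "FE80::1 0001:2:3:4:5:6:7:8"
    val regions = listOf(
        Region(0, 7, SemanticTag.IPV6), // tagged by the grammar as IPv6
        Region(8, 26),                  // labeled but untagged, even though it looks like IPv6
    )
    // Only the tagged span is offered to the formatter; the look-alike is left untouched.
    println(ipv6Spans(value, regions))
}
```

The point of the design is that the second span is skipped even though it would parse as IPv6, so a quick-fix cannot rewrite something the grammar never declared to be an address.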

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
SJrX merged commit 8202a40 into 242.x on Jul 1, 2026
5 checks passed