Skip to content

fix!: reject non-ASCII headers at the model layer and resolve pagination follow-ups#207

Merged
OmarAlJarrah merged 3 commits into
mainfrom
pagination-followups-and-ascii-headers
Jul 3, 2026
Merged

fix!: reject non-ASCII headers at the model layer and resolve pagination follow-ups#207
OmarAlJarrah merged 3 commits into
mainfrom
pagination-followups-and-ascii-headers

Conversation

@OmarAlJarrah

@OmarAlJarrah OmarAlJarrah commented Jul 3, 2026

Copy link
Copy Markdown
Member

Summary

Resolves the six open issues in the #167#206 range (the rest were already closed as part of the earlier simplification sweep). Two themes: tightening header validation, and finishing the pagination-unification follow-ups.

Header validation (breaking)

Headers.Builder and HttpHeaderName.fromString previously accepted non-ASCII header names and values, rejecting only control characters. But both reference transports — OkHttp and the JDK HttpClient — reject non-ASCII, so a caller who set a non-ASCII header had it silently dropped from the wire on every transport, with only a per-request WARN log to show for it.

  • Reject non-ASCII at the model layer (HeaderValidation): names and values are now validated to printable ASCII at construction, so a header no transport can send fails fast with IllegalArgumentException instead of vanishing mid-dispatch. Pre-1.0 validation break; no public signatures change in sdk-core, so its apiCheck is unchanged.

  • Make the transport dropped-header log policy configurable. After the tightening, the only header that still reaches the transport's drop path is a model-valid but non-token one (e.g. a name with an interior space, which OkHttp/JDK reject under their stricter RFC 7230 token grammar). How loudly that surfaces is now the caller's choice via a new DroppedHeaderLogging enum, exposed on both transports' Builder and their BYO create(client, …) factory:

    • PER_OCCURRENCE — WARN on every drop (a per-request audit trail).
    • ONCE_PER_HEADER (default) — WARN the first time each header name is dropped by a transport, verbose thereafter: visible at default log levels without flooding a hot path.
    • VERBOSE_ONLY — never WARN; drops log only at verbose.

    The log level remains the operator's to filter via SLF4J; the policy only chooses the level and dedup the transport emits, which a level alone can't express.

Resolves #167, #168.

Pagination follow-ups

Testing

./gradlew :sdk-core:test :sdk-core:ktlintCheck :sdk-core:detekt :sdk-core:apiCheck, plus the equivalents for sdk-transport-okhttp and sdk-transport-jdkhttp, all pass locally. The transports' drop-not-throw tests were repurposed from a non-ASCII header (no longer constructible) to a model-valid non-token name (interior space) that the transports still reject. The DroppedHeaderLogging decision is unit-tested directly on RequestAdapter (the transport test modules run on slf4j-nop, so emitted logs can't be captured), with a Builder wiring smoke test per transport. New public API (DroppedHeaderLogging, the Builder setter, and the create overloads) is captured via apiDump.

Closes #167, #168, #203, #204, #205, #206

…ion follow-ups

The header model previously accepted non-ASCII header names and values, but both
reference transports (OkHttp and the JDK HttpClient) reject them, so such a header
was silently dropped from the wire on every transport. Tighten HeaderValidation to
reject non-ASCII names and values at construction — one fail-fast contract that
matches what the transports can actually emit — and lower the transports' residual
header-drop log (now only a model-valid but non-token header, e.g. one carrying an
interior space) from WARN to verbose so a hot path re-setting such a header no longer
floods the log. Pre-1.0 validation break; no public signature changes.

Also resolves the follow-ups to the pagination unification:

- byPage().stream() no longer strands the held page's connection when a
  short-circuiting terminal (findFirst/limit/anyMatch) abandons the stream: the
  stream registers the view's close() as its close handler, so try-with-resources on
  the Stream releases the held page.
- Dedupe the pagination internals: a shared parseOrClose (the close-on-parse-failure
  invariant, previously copied into both paginators), a single orderedStream
  iterator-to-Stream adapter, and a shared requirePositiveMaxPages guard called from
  all four constructors.
- Fix stale KDoc and reference docs left over from the unification (strategy parse
  returns PageInfo, not Page; a delivered page's body is typically already drained;
  three shipped strategies, not four).
- Add coverage for the response-close branches that had none: the sync parse-failure
  release, the drainPage throwing-sink-over-throwing-close intersection, and both
  closeStagedPageQuietly swallow paths.

Closes #167, #168, #203, #204, #205, #206
Building on the header-drop change, expose how loudly a transport surfaces a
model-valid header it cannot encode (e.g. a name with an interior space, which
OkHttp and the JDK client reject under their stricter RFC 7230 token grammar).

Pure-verbose logging left the loss invisible at default log levels, and warning
on every request floods the log when a hot path re-sets the same header. Neither
extreme suits everyone, so make it the caller's choice via a new
DroppedHeaderLogging enum:

- PER_OCCURRENCE — WARN on every drop (a per-request audit trail).
- ONCE_PER_HEADER — WARN the first time each header name is dropped by a
  transport, verbose thereafter. The default: visible without flooding.
- VERBOSE_ONLY — never WARN; drops log only at verbose.

Both transports expose it on their Builder and on the BYO create(client, ...)
factory; RequestAdapter carries the policy and a per-(transport, header-name)
dedup guard. The log level itself remains the operator's to filter via SLF4J —
this only chooses the level and dedup the transport emits, which a level alone
cannot express.

The per-header decision is unit-tested directly (the transport test modules run
on slf4j-nop and cannot capture emitted logs), with a Builder wiring smoke test
per transport.
…paths

Header value validation was tightened to reject non-ASCII bytes, but the same
validator gates the inbound path, so a response header carrying legal obs-text
(RFC 7230 permits 0x80+ in a field-value, e.g. a Latin-1 Content-Disposition
filename) was silently dropped from the parsed Response. The tightening was also
justified as "no transport can send it", which is false for the JDK HttpClient
(it serialises obs-text on the wire).

- Add Headers.Builder.addUnsafeNonAscii: strict name, lenient value (control
  characters still rejected, non-ASCII permitted). Both response adapters use it
  when copying response headers, mirroring OkHttp's lenient read path, so legal
  obs-text values survive.
- Keep outbound, caller-set value validation strict, matching OkHttp's
  field-value grammar; reword the docs to state that as a deliberate
  most-restrictive-transport policy rather than a claim about all transports.
- Align MediaType's value check with the shared header-value predicate so a
  media type's rendered form always satisfies the outbound grammar.

Also fixes three pagination/transport robustness issues:

- parseOrClose no longer lets a throwing response close() mask the parse failure;
  the close error is attached as suppressed.
- CloseablePages.stream() wraps the checked IOException from close() in
  UncheckedIOException instead of sneaky-throwing it undeclared through
  Stream.close().
- The transport dropped-header WARN dedup is gated on the WARN level being
  enabled (so a disabled level does not consume the one-shot) and bounded so a
  caller emitting unbounded distinct malformed names cannot grow it without limit.
@OmarAlJarrah OmarAlJarrah merged commit b015fd3 into main Jul 3, 2026
1 check passed
@OmarAlJarrah OmarAlJarrah deleted the pagination-followups-and-ascii-headers branch July 3, 2026 16:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment