feat: per-connection backend authentication via OIDC federation #147
alukach wants to merge 13 commits
🚀 Latest commit deployed to https://source-data-proxy-pr-147.source-coop.workers.dev
Add a `BackendAuth` field to `DataConnectionDetails` (the Source API's
data-connection shape): `Unsigned` (default, public bucket) or
`S3WebIdentityRole { role_arn }` (federate the proxy's OIDC identity into a
customer role). `resolve_product` branches on it via `apply_backend_auth`:
- Unsigned -> skip_signature (current behavior). Default, so every existing
connection is unchanged since the API omits `authentication` for now.
- S3WebIdentityRole -> auth_type=oidc + oidc_role_arn + a per-connection subject
(scv1:conn:{id}), leaving signing ON so multistore's OIDC backend-auth
middleware injects the federated temporary credentials.
Replaces the long-standing `// TODO: provide real backend credentials` at the
forced `skip_signature` insert.
Not yet live: the federated branch needs the `MaybeOidcAuth` middleware wired
into dispatch (next step). Until then no connection sends `authentication`.
Note: the proxy lib is `cdylib` + `test = false` (wasm-only deps block native
compilation), so this logic isn't unit-tested; verified via
`cargo check/clippy --target wasm32`.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Three fixes to the per-connection backend-auth wiring, matching the app-side schema (source.coop):
- `authentication` is a SIBLING of `details` on the connection, not nested inside it. The API returns it at the top level of `DataConnection`, so the proxy was silently never seeing the role config. Move it to `DataConnection`.
- Tolerate auth types this build doesn't implement (the app's scaffolded `gcp_workload_identity` / `azure_workload_identity`): add `#[serde(other)] Unsupported` so an unknown `type` deserializes gracefully and is served unsigned with a warning, instead of failing the whole request.
- Hardcode the AWS web-identity audience: set `oidc_audience=sts.amazonaws.com` on the federated branch (a constant, following AWS's web-identity convention).
Still inert until the `MaybeOidcAuth` middleware is wired (multistore also takes the audience at provider construction today), but the deserialization + option set now match what the API emits and what the middleware will consume.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Finalizes the proxy side of federated backend access. Adds a reqwest-backed
`FetchHttpExchange` (the `HttpExchange` impl for the worker) and wires
`MaybeOidcAuth(AwsBackendAuth(OidcCredentialProvider))` into the gateway via
`.with_middleware`, mirroring multistore's cf-workers example.
For a connection resolved with auth_type=oidc, the middleware now mints the
proxy's RS256 assertion (iss = OIDC_PROVIDER_ISSUER, aud = sts.amazonaws.com,
sub = scv1:conn:{id}), exchanges it at AWS STS (AssumeRoleWithWebIdentity) over
fetch, and injects the temporary credentials so the backend request is signed.
A no-op for connections without auth_type=oidc (unsigned/public).
The audience is hardcoded on the provider (sts.amazonaws.com), so the redundant
per-bucket oidc_audience option is dropped.
Still gated end-to-end on the app surfacing the role to the proxy (#327/#329):
the API redacts `authentication`, so the proxy resolves Unsigned in production
until then — but the proxy path is now complete.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
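As a rough illustration of the exchange this commit describes, the sketch below builds the three assertion claims and the form parameters of an `AssumeRoleWithWebIdentity` call. It is not the multistore implementation: the `AssertionClaims` struct, both function names, and the `session_name` argument are hypothetical, and signing (RS256) and URL encoding are left out.

```rust
/// Audience used for AWS web-identity federation, as stated in the commit message.
const AWS_STS_AUDIENCE: &str = "sts.amazonaws.com";

/// The three claims the commit message names. The struct itself is hypothetical.
#[derive(Debug, PartialEq)]
struct AssertionClaims {
    iss: String,
    aud: String,
    sub: String,
}

/// Build the claims for one connection. `issuer` stands in for OIDC_PROVIDER_ISSUER.
fn claims_for(issuer: &str, connection_id: &str) -> AssertionClaims {
    AssertionClaims {
        iss: issuer.to_string(),
        aud: AWS_STS_AUDIENCE.to_string(),
        // Per-connection subject, in the scv1:conn:{id} form.
        sub: format!("scv1:conn:{connection_id}"),
    }
}

/// Unencoded form parameters of an STS AssumeRoleWithWebIdentity request.
/// `session_name` is an assumption; the source does not say what the proxy uses.
fn sts_params(role_arn: &str, session_name: &str, signed_jwt: &str) -> Vec<(&'static str, String)> {
    vec![
        ("Action", "AssumeRoleWithWebIdentity".to_string()),
        ("Version", "2011-06-15".to_string()),
        ("RoleArn", role_arn.to_string()),
        ("RoleSessionName", session_name.to_string()),
        ("WebIdentityToken", signed_jwt.to_string()),
    ]
}
```

The customer-side role's trust policy would then be expected to match on the same `aud` and `sub` values, which is why the subject is scoped per connection.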
The [patch.crates-io] tracked `branch = "main"`, a moving target: `cargo update` would silently float to a newer commit and a force-push upstream could break the build non-reproducibly, and this ships to production via the deploy workflow. Pin all five crates to the exact commit the lockfile already resolved instead, so the source of truth is explicit. (Still temporary: drop on the crates.io release.)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ests
Move `BackendAuth`, `apply_backend_auth`, and `AWS_STS_AUDIENCE` out of registry.rs into a new wasm-free `src/backend_auth.rs`, and add `tests/backend_auth.rs` which includes it via `#[path]`. The lib is `cdylib` with `test = false`, so this is the only way to natively unit-test it (mirrors `tests/pagination.rs`).
The federation-critical logic was previously untested. Now covered: serde round-trips (unknown type -> Unsupported, the s3_web_identity_role variant) and the option-set translation for each variant. Pure move, no behavior change.
Also drops the stale "inert until the middleware is wired (next step)" doc note that this branch already obsoleted.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The proxy parses the entire data-connection list in one `serde_json::from_str`, so a single connection with a malformed `authentication` (null, wrong-typed, or a known type missing required fields like `role_arn`) would fail the whole parse and break resolution for *every* product. `#[serde(default)]` only covers an absent field, not a present-but-invalid one.
Add a lenient `deserialize_with` on the field: a present value that doesn't parse degrades to `Unsupported` (and `null` to `Unsigned`) instead of erroring, so one bad connection can't poison the list. Covered by new tests.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
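The degrade rules in this commit can be modeled without serde. The sketch below is a dependency-free stand-in, not the actual `deserialize_with` function: `RawAuth` is a hypothetical summary of what the JSON value looked like, and the `"unsigned"` type tag is an assumption.

```rust
#[derive(Debug, PartialEq)]
enum BackendAuth {
    Unsigned,
    S3WebIdentityRole { role_arn: String },
    Unsupported,
}

/// Hypothetical summary of the raw `authentication` JSON value.
enum RawAuth {
    /// Field not present at all (the `#[serde(default)]` case).
    Absent,
    /// Field present but `null`.
    Null,
    /// Field is a JSON object; either key may be missing.
    Object { type_tag: Option<String>, role_arn: Option<String> },
    /// Field is some other JSON type (string, number, array, ...).
    Other,
}

/// Never fails: anything that cannot be understood degrades to `Unsupported`,
/// so one bad connection cannot break parsing of the whole list.
fn lenient_backend_auth(raw: RawAuth) -> BackendAuth {
    match raw {
        RawAuth::Absent | RawAuth::Null => BackendAuth::Unsigned,
        RawAuth::Object { type_tag, role_arn } => match (type_tag.as_deref(), role_arn) {
            (Some("s3_web_identity_role"), Some(role_arn)) => {
                BackendAuth::S3WebIdentityRole { role_arn }
            }
            // Assumed tag for the explicit unsigned variant.
            (Some("unsigned"), _) => BackendAuth::Unsigned,
            // Unknown type, missing type, or known type missing `role_arn`.
            _ => BackendAuth::Unsupported,
        },
        RawAuth::Other => BackendAuth::Unsupported,
    }
}
```

In the real code the same outcome would come from attempting the typed parse and mapping a failure to `Unsupported`, rather than from inspecting fields by hand.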
The gateway (and its OIDC backend-auth middleware) are rebuilt on every fetch(), so the provider's credential cache was discarded each request: every federated request would re-mint a JWT and re-run AssumeRoleWithWebIdentity to the same role.
Hold the provider in an isolate-level OnceLock and clone it per request; cloning shares the cache (enabled by the multistore `feat/shareable-credential-cache` change this rev now pins), so repeat requests reuse cached temporary credentials.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
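The caching pattern described here can be sketched with the standard library alone. `CachingProvider` below is a hypothetical stand-in for the real `OidcCredentialProvider`: it fakes the STS exchange, ignores credential expiry, and counts exchanges so the sharing is observable.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex, OnceLock};

/// Hypothetical provider. Cloning copies the `Arc`s, so every clone
/// reads and writes the same underlying cache.
#[derive(Clone, Default)]
struct CachingProvider {
    cache: Arc<Mutex<HashMap<String, String>>>,
    exchanges: Arc<AtomicUsize>,
}

impl CachingProvider {
    /// Return cached credentials for a role, or "exchange" for new ones.
    /// Expiry and refresh are deliberately omitted from this sketch.
    fn credentials_for(&self, role_arn: &str) -> String {
        let mut cache = self.cache.lock().unwrap();
        if let Some(creds) = cache.get(role_arn) {
            return creds.clone();
        }
        // Stand-in for minting a JWT and calling AssumeRoleWithWebIdentity.
        self.exchanges.fetch_add(1, Ordering::SeqCst);
        let creds = format!("temporary-credentials-for:{role_arn}");
        cache.insert(role_arn.to_string(), creds.clone());
        creds
    }

    /// Number of simulated STS exchanges performed so far.
    fn exchange_count(&self) -> usize {
        self.exchanges.load(Ordering::SeqCst)
    }
}

/// One provider per process (per isolate, in the worker's case).
static PROVIDER: OnceLock<CachingProvider> = OnceLock::new();

/// Called on every request: initializes once, then hands out cheap clones.
fn provider_for_request() -> CachingProvider {
    PROVIDER.get_or_init(CachingProvider::default).clone()
}
```

Two requests for the same role then trigger a single exchange, because both clones see the entry the first one inserted.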
Exercises the full federated path against a deployed proxy: a request for an `s3_web_identity_role`-backed product must mint the proxy assertion, assume the role via AWS STS, and serve a signed read.
Auto-discovered by `pytest tests/` and SKIPS unless FEDERATION_TEST_ACCOUNT/PRODUCT/KEY are set, so it's inert in CI until staging is wired with a federated test product + the customer-side IAM OIDC provider/role. No live AWS resources are committed here; that setup is the remaining go-live step.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`apply_backend_auth` previously served `Unsupported` (the app-side GCP/Azure workload-identity variants, or a malformed `authentication`) as unsigned with a per-request warning. Serving unsigned could expose an anonymously-readable backend, and the warning spammed once per request for a misconfigured connection.
Return `ProxyError::BackendAuthError` instead: deny so the misconfiguration surfaces explicitly (a connection that can't be authenticated shouldn't be served at all).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
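A minimal sketch of the fail-closed translation follows. It is an approximation, not the code in `src/backend_auth.rs`: the option map type, the `"true"` value for `skip_signature`, the `oidc_subject` key name, and the payload of `BackendAuthError` are all assumptions.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
enum BackendAuth {
    Unsigned,
    S3WebIdentityRole { role_arn: String },
    Unsupported,
}

/// Only the variant relevant here; the String payload is an assumption.
#[derive(Debug, PartialEq)]
enum ProxyError {
    BackendAuthError(String),
}

/// Translate a connection's auth config into backend options, or deny.
fn apply_backend_auth(
    auth: &BackendAuth,
    connection_id: &str,
    options: &mut BTreeMap<String, String>,
) -> Result<(), ProxyError> {
    match auth {
        BackendAuth::Unsigned => {
            // Public bucket: skip request signing (the pre-existing behavior).
            options.insert("skip_signature".to_string(), "true".to_string());
            Ok(())
        }
        BackendAuth::S3WebIdentityRole { role_arn } => {
            // Signing stays on so the OIDC middleware can inject credentials.
            options.insert("auth_type".to_string(), "oidc".to_string());
            options.insert("oidc_role_arn".to_string(), role_arn.clone());
            // Hypothetical key name for the per-connection subject.
            options.insert("oidc_subject".to_string(), format!("scv1:conn:{connection_id}"));
            Ok(())
        }
        // Fail closed: never fall back to an unsigned request.
        BackendAuth::Unsupported => Err(ProxyError::BackendAuthError(format!(
            "connection {connection_id} has an unsupported or malformed authentication config"
        ))),
    }
}
```

Returning an error for `Unsupported` leaves `options` untouched, so a caller that propagates the error with `?` never reaches the backend request.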
AWS STS returns its <ErrorResponse> document in the body on 4xx/5xx, and multistore's parse_response reads the error from the body, so FetchHttpExchange must return the body regardless of status. Add a comment so a future maintainer doesn't "fix" it with error_for_status(), which would discard the diagnostic.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
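To show why the body has to survive a non-2xx status, here is a small standalone sketch that pulls the error code and message out of an STS-style `<ErrorResponse>` body. It is not multistore's `parse_response`: `HttpResponse`, `tag_text`, and `describe_sts_failure` are hypothetical, and the tag extraction is a naive string search rather than a real XML parser.

```rust
/// Hypothetical response type: status and body are both always kept.
struct HttpResponse {
    status: u16,
    body: String,
}

/// Naive extraction of the text between `<tag>` and `</tag>`.
/// Good enough for a sketch; a real implementation would parse the XML.
fn tag_text<'a>(xml: &'a str, tag: &str) -> Option<&'a str> {
    let open = format!("<{tag}>");
    let close = format!("</{tag}>");
    let start = xml.find(&open)? + open.len();
    let end = xml[start..].find(&close)? + start;
    Some(xml[start..end].trim())
}

/// Describe a failed STS call using the diagnostic carried in the body.
/// Returns `None` for statuses below 400.
fn describe_sts_failure(resp: &HttpResponse) -> Option<String> {
    if resp.status < 400 {
        return None;
    }
    let code = tag_text(&resp.body, "Code").unwrap_or("Unknown");
    let message = tag_text(&resp.body, "Message").unwrap_or("");
    Some(format!("STS {}: {} ({})", resp.status, code, message))
}
```

If the exchange layer turned the 403 into an error before this point, only the status would remain and the `Code` and `Message` that explain the denial would be lost.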
Add a `kind()` label to BackendAuth (unsigned / s3_web_identity_role / unsupported; no secrets) and record it on the resolve_product span, so an operator can see which backend-auth path a request took (and correlate a fail-closed Unsupported or an STS 403) without leaking the role ARN.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
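The label can be pictured as a plain match that returns a static string per variant. This is a sketch assuming the three variants named above, with the enum redefined locally so the snippet stands alone.

```rust
enum BackendAuth {
    Unsigned,
    S3WebIdentityRole { role_arn: String },
    Unsupported,
}

impl BackendAuth {
    /// Low-cardinality label for tracing. The `role_arn` is deliberately
    /// ignored so it never reaches a span or log line.
    fn kind(&self) -> &'static str {
        match self {
            BackendAuth::Unsigned => "unsigned",
            BackendAuth::S3WebIdentityRole { .. } => "s3_web_identity_role",
            BackendAuth::Unsupported => "unsupported",
        }
    }
}
```

Because the return type is `&'static str`, the label cannot be built from the ARN at runtime, which makes the no-leak property easy to check in review.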
apply_backend_auth federates unconditionally, which can read as a missing confused-deputy guard. Document that the guard is the subject-scoped Source API fetch: the caller is authorized for the product/connection before resolution reaches federation, so the proxy never mints a role token for data the caller can't access.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
715c454 to 57068f8
Claude finished @alukach's task in 2m 0s. ✅ No blocking issues, safe to merge.
Adapts the proxy's backend connections to make authenticated requests, replacing the always-unsigned path. The proxy side is complete end-to-end; production activation is gated on the Source API surfacing the role (see Production status).
Model
`DataConnection` gains an `authentication` field, a sibling of `details`, matching the Source API's shape. `resolve_product` branches via `apply_backend_auth`, replacing the long-standing `// TODO: provide real backend credentials` forced `skip_signature` insert:
- `Unsigned` -> `skip_signature` (current behavior).
- `S3WebIdentityRole` -> `auth_type=oidc` + `oidc_role_arn` + a per-connection subject `scv1:conn:{id}`, leaving signing on so the middleware injects the federated credentials.
- `Unsupported` -> `BackendAuthError` rather than silently serving unsigned.

End-to-end federation
`lib.rs` wires `MaybeOidcAuth(AwsBackendAuth(OidcCredentialProvider))` into the gateway via `.with_middleware`, backed by a reqwest `FetchHttpExchange` for the worker. For an `auth_type=oidc` connection the middleware mints the proxy's RS256 assertion (`iss = OIDC_PROVIDER_ISSUER`, `aud = sts.amazonaws.com`, `sub = scv1:conn:{id}`), exchanges it at AWS STS (`AssumeRoleWithWebIdentity`) over fetch, and injects the temporary credentials so the backend request is signed. A no-op for unsigned/public connections.

Resilience
- The proxy parses the entire data-connection list in one `serde_json::from_str`, so a `deserialize_with` degrades a present-but-malformed `authentication` (null, wrong shape, missing `role_arn`, unknown `type`) to `Unsigned`/`Unsupported` instead of failing the entire parse and breaking resolution for every product.
- Fails closed on `Unsupported` (the app-side `gcp_workload_identity` / `azure_workload_identity` variants this build doesn't implement, or malformed config): a connection that can't be authenticated isn't served at all, rather than falling back to an anonymously-readable unsigned request.

Performance
The gateway is rebuilt per `fetch()`, so the credential provider is held in an isolate-level `OnceLock` and cloned per request; cloning shares the cache (enabled by the pinned multistore `feat/shareable-credential-cache`), so repeat federated requests reuse cached temporary credentials instead of re-minting a JWT and re-running `AssumeRoleWithWebIdentity` every call.

Observability
`BackendAuth::kind()` (`unsigned` / `s3_web_identity_role` / `unsupported`; no secrets, no ARN) is recorded on the `resolve_product` span, so an operator can see which backend-auth path a request took and correlate a fail-closed `Unsupported` or an STS 403 without leaking the role ARN.

Testing
`backend_auth` is extracted into a wasm-free module so it can be natively unit-tested despite the lib being `cdylib` + `test = false`: `tests/backend_auth.rs` includes it via `#[path]`, mirroring `tests/pagination.rs`. Covered: serde round-trips (unknown `type` -> `Unsupported`, malformed -> degrade, the `s3_web_identity_role` variant) and the option-set translation for each variant, including fail-closed. An env-gated `tests/test_federation.py` smoke test exercises the full live path and skips unless `FEDERATION_TEST_ACCOUNT`/`PRODUCT`/`KEY` are set.

Temporary multistore pin
The consolidated backend-auth work (oidc-provider owning the credential exchange, the shareable credential cache) isn't on crates.io yet, so `[patch.crates-io]` pins all five multistore crates to an exact rev (not `branch=main`, which `cargo update` could silently float) for reproducible builds. Drop the patch and bump the versions once multistore ships.

Production status
Non-breaking and currently inert: the Source API still redacts `authentication`, so every connection resolves `Unsigned` and behaves exactly as before. Remaining go-live steps (outside this PR):
- The Source API surfaces `role_arn` to the proxy (source.coop #327 / #329).
- A customer-side IAM OIDC provider and role (`aud = sts.amazonaws.com`, `sub = scv1:conn:{id}`) and a federated test product in staging to activate the smoke test.

🤖 Generated with Claude Code