subcommand to summarise gateway state to json #20
Merged
…ntegration
Adds `y-cluster gateway state` -- a JSON snapshot of the
cluster's GatewayClass, Gateway, HTTPRoute, GRPCRoute,
ClientTrafficPolicy, and BackendTrafficPolicy resources --
and wires it into the appliance export pipeline so the
reconciled snapshot ships alongside the qcow2/OVA/gcp-tar
deliverables.
Subcommand:
- `y-cluster gateway state [--context=NAME]` prints JSON to
stdout. Each kind carries spec AND status, so consumers
can answer maintenance-relevant questions deterministically:
Is HTTPS ready? (walk gateways[].status.listeners[] for
port==443, programmed==true, attachedRoutes>0). Is port 80
redirect-only? (walk httpRoutes[].rules for filters of type
RequestRedirect). Are ClientTrafficPolicy settings actually
in effect? (walk clientTrafficPolicies[].status.ancestors[]
for Accepted=True alongside spec.clientIPDetection.xForwardedFor.numTrustedHops).
- `y-cluster gateway clear-dns-hint-ip [--context=NAME]
[--gateway-class=y-cluster]` removes the
yolean.se/dns-hint-ip annotation from the GatewayClass.
Idempotent; used by prepare-export.
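A minimal sketch of the "Is HTTPS ready?" walk from a consumer's side. It assumes the field layout of the v1 payload posted later in this conversation, where `port` sits on `gateways[].listeners[]` while `programmed` and `attachedRoutes` sit on `gateways[].status.listeners[]`, so the two are joined by listener name. The Go type names and `portReady` are hypothetical and not part of y-cluster.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Minimal subset of the gateway state JSON. Field names follow the v1
// payload in this conversation; the Go type names are hypothetical.
type state struct {
	Gateways []gateway `json:"gateways"`
}

type gateway struct {
	Name      string `json:"name"`
	Listeners []struct {
		Name string `json:"name"`
		Port int    `json:"port"`
	} `json:"listeners"`
	Status struct {
		Listeners []struct {
			Name           string `json:"name"`
			AttachedRoutes int    `json:"attachedRoutes"`
			Programmed     bool   `json:"programmed"`
		} `json:"listeners"`
	} `json:"status"`
}

// portReady reports whether any gateway has a listener on the given port
// that is programmed and has at least one attached route.
func portReady(raw []byte, port int) (bool, error) {
	var s state
	if err := json.Unmarshal(raw, &s); err != nil {
		return false, err
	}
	for _, gw := range s.Gateways {
		// Collect the names of the spec listeners bound to the port.
		names := map[string]bool{}
		for _, l := range gw.Listeners {
			if l.Port == port {
				names[l.Name] = true
			}
		}
		// Join against the status listeners by name.
		for _, ls := range gw.Status.Listeners {
			if names[ls.Name] && ls.Programmed && ls.AttachedRoutes > 0 {
				return true, nil
			}
		}
	}
	return false, nil
}

// sample is a trimmed copy of the smoke-test snapshot (port 80 only).
const sample = `{"gateways":[{"name":"y-cluster",
  "listeners":[{"name":"http","port":80}],
  "status":{"listeners":[{"name":"http","attachedRoutes":3,"programmed":true}]}}]}`

func main() {
	for _, port := range []int{80, 443} {
		ok, err := portReady([]byte(sample), port)
		fmt.Println(port, ok, err)
	}
}
```

Against the smoke-test snapshot this reports port 80 as ready and port 443 as not ready, since that cluster has no HTTPS listener.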
The shape is documented as a generated JSON Schema at
pkg/provision/schema/gateway-state.schema.json (added to
schemagen alongside the provider config schemas).
prepare-export reshape:
- Now requires a RUNNING cluster. Earlier behavior (require
stopped cluster, error otherwise) is inverted: the new live
phase runs `gateway clear-dns-hint-ip` (so the per-deploy
LB IP doesn't bake into the customer snapshot) followed by
`gateway state` (dumping to <cacheDir>/<name>-gateway-state.json),
both needing the apiserver up. After the live phase,
prepare-export stops the VM internally before the existing
offline virt-customize phase. The previous explicit
`y-cluster stop` step in callers becomes redundant.
- Preflight reordering: virt-customize + kubectl LookPath
checks fire first, so missing-tool errors surface before
the running-state check.
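The preflight ordering can be sketched as below. This is an illustration under assumptions, not the code in `pkg/provision/qemu`: `preflight`, `lookPathFunc` and `fakePath` are hypothetical names, and `fakePath` stands in for `exec.LookPath` so the sketch runs on hosts without libguestfs-tools.

```go
package main

import (
	"errors"
	"fmt"
)

// lookPathFunc has the same signature as exec.LookPath.
type lookPathFunc func(name string) (string, error)

// fakePath returns a lookPathFunc that only knows the listed tools.
// Hypothetical helper; real code would pass exec.LookPath instead.
func fakePath(installed ...string) lookPathFunc {
	return func(name string) (string, error) {
		for _, t := range installed {
			if t == name {
				return "/usr/bin/" + name, nil
			}
		}
		return "", fmt.Errorf("%s: executable file not found in $PATH", name)
	}
}

// preflight checks tools first and the running state second, so a
// missing binary is reported before the VM state is even inspected.
func preflight(lookPath lookPathFunc, isRunning func() bool) error {
	for _, tool := range []string{"virt-customize", "kubectl"} {
		if _, err := lookPath(tool); err != nil {
			return fmt.Errorf("missing tool: %w", err)
		}
	}
	if !isRunning() {
		return errors.New("cluster is not running; prepare-export needs the apiserver up")
	}
	return nil
}

func main() {
	stopped := func() bool { return false }
	// virt-customize missing: the tool error wins over the state error.
	fmt.Println(preflight(fakePath("kubectl"), stopped))
	// Both tools present: the running-state error surfaces.
	fmt.Println(preflight(fakePath("virt-customize", "kubectl"), stopped))
}
```

Injecting the lookup function is one way to test the later error paths on a stock runner without touching PATH.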
Export changes:
- `pkg/provision/qemu/export.go` copies the gateway-state.json
sibling into BUNDLE_DIR. Best-effort: a build that skipped
prepare-export (or one that ran before this change) won't
have the file -- log + skip rather than fail the export.
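The log-and-skip behaviour could look roughly like this. It is a sketch only: `copyIfPresent` and `demo` are hypothetical names, and the real `export.go` may copy the file differently.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"log"
	"os"
	"path/filepath"
)

// copyIfPresent copies src into dstDir and reports whether it did so.
// A missing src is logged and skipped rather than treated as an error.
func copyIfPresent(src, dstDir string) (bool, error) {
	data, err := os.ReadFile(src)
	if errors.Is(err, fs.ErrNotExist) {
		log.Printf("no %s; skipping gateway state in bundle", src)
		return false, nil
	}
	if err != nil {
		return false, err
	}
	dst := filepath.Join(dstDir, filepath.Base(src))
	if err := os.WriteFile(dst, data, 0o644); err != nil {
		return false, err
	}
	return true, nil
}

// demo exercises both paths inside a throwaway directory.
func demo() (copiedPresent, copiedMissing bool, err error) {
	dir, err := os.MkdirTemp("", "bundle-sketch")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	bundle := filepath.Join(dir, "bundle")
	if err := os.Mkdir(bundle, 0o755); err != nil {
		return false, false, err
	}
	src := filepath.Join(dir, "vm-gateway-state.json")
	if err := os.WriteFile(src, []byte("{}"), 0o644); err != nil {
		return false, false, err
	}
	copiedPresent, err = copyIfPresent(src, bundle)
	if err != nil {
		return false, false, err
	}
	copiedMissing, err = copyIfPresent(filepath.Join(dir, "absent.json"), bundle)
	return copiedPresent, copiedMissing, err
}

func main() {
	fmt.Println(demo())
}
```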
Script update:
- `scripts/appliance-qemu-to-gcp.sh` drops the explicit
`y-cluster stop` line before `y-cluster prepare-export`.
With the new live phase that step is wrong (would bring
down the cluster prepare-export needs up).
Schema generation:
- `cmd/internal/schemagen/main.go` gains a writeOutputSchema
helper for non-provider-config schemas. Generates
`gateway-state.schema.json` from gateway.State{} via the
same invopop reflector, but with FieldNameTag=json (output
is JSON, not YAML) and no provider-narrowing post-process.
- `pkg/gateway.SchemaID` is the canonical $id; a fresh Fetch
embeds it as `$schema` in the produced JSON so consumers
can validate by URL.
Smoke-tested against the live appliance-gcp-build VM:
gateway state returns 1 Gateway (programmed listener on
port 80, attachedRoutes=3), 3 HTTPRoutes (external-http,
keycloak-admin, echo), 1 ClientTrafficPolicy (trust-lb-xff
with numTrustedHops=1, Accepted=True). The currently-set
dns-hint-ip annotation (`127.0.0.1` from the local-provision
default) is what prepare-export will clear before snapshot.
Tests:
- `pkg/gateway/state_test.go` covers the targetRefs-shape
flatten (singular vs plural), the case-insensitive
Programmed-condition check, the SchemaID surfacing in
the JSON output, and the zero-value-no-null-slices
invariant.
- `pkg/provision/qemu/prepare_export_test.go` updates the
VM-state assertion (was: "expects stopped"; now: "expects
running") and trims the unused filepath import.
E2e against /dev/kvm not run in this commit; the existing
qemu e2e suite's prepare-export coverage will now exercise
the live phase automatically when re-run.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a `schemaVersion` field to the gateway state JSON (currently
"1") and constrains it via a single-value enum on the generated
schema. Lets a consumer reading the JSON detect a snapshot shape
they don't recognise -- the enum value at fetch time must match
what the consumer's copy of the schema doc expects, or validation
fails.

The schema URL stays UNVERSIONED (the `$id` /
`https://yolean.se/y-cluster/schema/gateway-state.schema.json`
always points at the current schema doc). Per-document version
comes from the new schemaVersion field.

Backward-incompatible shape changes (renames, removals) require:
bump `gateway.SchemaVersion`, regenerate the schema (so the enum
catches up), update consumers in lockstep. Old snapshots remain
identifiable by their schemaVersion field; they would validate
against an archived copy of the previous schema doc once we need
to serve one. Additive changes (new omitempty fields) do NOT
require a bump.

Implementation:
- `gateway.SchemaVersion = "1"` exported constant.
- `State.SchemaVersion` field (json:"schemaVersion"), populated by
  Fetch() from the constant.
- `cmd/internal/schemagen` gains an `enumPin` post-process helper --
  a small (DefName, PropName, Values) tuple -- plumbed through
  `writeOutputSchema` as variadic. The gateway-state schema is the
  only consumer today; the helper generalises cleanly for future
  single-value enums on other output schemas.
- `pkg/gateway/state.schema.json` regenerated with the
  `schemaVersion: { enum: ["1"], type: "string" }` constraint.

Tests:
- TestStateZeroValueMarshals updated to assert `"schemaVersion":"1"`
  in the marshalled output.
- TestSchemaVersionMatchesEnum reads back the regenerated schema doc
  and asserts its schemaVersion enum equals [SchemaVersion]. Fails
  fast in CI if the constant gets bumped without a `go generate`
  follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
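To illustrate the enum-pin post-process, here is a sketch that works on a decoded schema document using only the standard library. The `(DefName, PropName, Values)` tuple comes from the commit message; everything else is assumed, including the `$defs` layout, the definition name `State`, and the function names. The actual helper presumably operates on the reflector's own schema type rather than on `map[string]any`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// enumPin names one property to constrain to a fixed set of values.
type enumPin struct {
	DefName  string
	PropName string
	Values   []string
}

// applyEnumPins sets "enum" on $defs.<DefName>.properties.<PropName>
// of a decoded JSON Schema document (assumed layout).
func applyEnumPins(schema map[string]any, pins ...enumPin) error {
	defs, _ := schema["$defs"].(map[string]any)
	for _, p := range pins {
		def, _ := defs[p.DefName].(map[string]any)
		props, _ := def["properties"].(map[string]any)
		prop, ok := props[p.PropName].(map[string]any)
		if !ok {
			return fmt.Errorf("schema has no property %s.%s", p.DefName, p.PropName)
		}
		vals := make([]any, len(p.Values))
		for i, v := range p.Values {
			vals[i] = v
		}
		prop["enum"] = vals
	}
	return nil
}

// pinnedSample pins schemaVersion to "1" on a tiny stand-in schema.
func pinnedSample() (string, error) {
	raw := `{"$defs":{"State":{"properties":{"schemaVersion":{"type":"string"}}}}}`
	var schema map[string]any
	if err := json.Unmarshal([]byte(raw), &schema); err != nil {
		return "", err
	}
	pin := enumPin{DefName: "State", PropName: "schemaVersion", Values: []string{"1"}}
	if err := applyEnumPins(schema, pin); err != nil {
		return "", err
	}
	out, err := json.Marshal(schema)
	return string(out), err
}

func main() {
	fmt.Println(pinnedSample())
}
```

Returning an error for an unknown property means a renamed field fails schema generation instead of silently dropping the constraint.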
v1 payload: {
"gatewayClass": {
"name": "y-cluster",
"controllerName": "gateway.envoyproxy.io/gatewayclass-controller",
"conditions": [
{
"type": "Accepted",
"status": "True",
"reason": "Accepted",
"message": "Valid GatewayClass"
}
]
},
"gateways": [
{
"namespace": "y-cluster",
"name": "y-cluster",
"gatewayClassName": "y-cluster",
"listeners": [
{
"name": "http",
"port": 80,
"protocol": "HTTP",
"allowedRoutes": {
"namespaces": {
"from": "All"
}
}
}
],
"status": {
"conditions": [
{
"type": "Accepted",
"status": "True",
"reason": "Accepted",
"message": "The Gateway has been scheduled by Envoy Gateway"
},
{
"type": "Programmed",
"status": "True",
"reason": "Programmed",
"message": "Address assigned to the Gateway, 1/1 envoy replicas available"
}
],
"listeners": [
{
"name": "http",
"attachedRoutes": 3,
"conditions": [
{
"type": "Programmed",
"status": "True",
"reason": "Programmed",
"message": "Sending translated listener configuration to the data plane"
},
{
"type": "Accepted",
"status": "True",
"reason": "Accepted",
"message": "Listener has been successfully translated"
},
{
"type": "ResolvedRefs",
"status": "True",
"reason": "ResolvedRefs",
"message": "Listener references have been resolved"
}
],
"programmed": true
}
]
}
}
],
"httpRoutes": [
{
"namespace": "my-app",
"name": "external-http",
"parentRefs": [
{
"group": "gateway.networking.k8s.io",
"kind": "Gateway",
"name": "y-cluster",
"namespace": "y-cluster"
}
],
"hostnames": [
"my-app.example.net"
],
"rules": [
{
"backendRefs": [
{
"group": "",
"kind": "Service",
"name": "gateway-v4-cluster",
"port": 8080,
"weight": 1
}
],
"matches": [
{
"path": {
"type": "PathPrefix",
"value": "/"
}
}
]
}
],
"status": {
"parents": [
{
"parentRef": {
"group": "gateway.networking.k8s.io",
"kind": "Gateway",
"name": "y-cluster",
"namespace": "y-cluster"
},
"controllerName": "gateway.envoyproxy.io/gatewayclass-controller",
"conditions": [
{
"type": "Accepted",
"status": "True",
"reason": "Accepted",
"message": "Route is accepted"
},
{
"type": "ResolvedRefs",
"status": "True",
"reason": "ResolvedRefs",
"message": "Resolved all the Object references for the Route"
}
]
}
]
}
},
{
"namespace": "keycloak-v3",
"name": "keycloak-admin",
"parentRefs": [
{
"group": "gateway.networking.k8s.io",
"kind": "Gateway",
"name": "y-cluster",
"namespace": "y-cluster"
}
],
"hostnames": [
"keycloak-admin"
],
"rules": [
{
"backendRefs": [
{
"group": "",
"kind": "Service",
"name": "keycloak-proxied",
"port": 8080,
"weight": 1
}
],
"matches": [
{
"path": {
"type": "PathPrefix",
"value": "/"
}
}
],
"timeouts": {
"backendRequest": "120s",
"request": "120s"
}
}
],
"status": {
"parents": [
{
"parentRef": {
"group": "gateway.networking.k8s.io",
"kind": "Gateway",
"name": "y-cluster",
"namespace": "y-cluster"
},
"controllerName": "gateway.envoyproxy.io/gatewayclass-controller",
"conditions": [
{
"type": "Accepted",
"status": "True",
"reason": "Accepted",
"message": "Route is accepted"
},
{
"type": "ResolvedRefs",
"status": "True",
"reason": "ResolvedRefs",
"message": "Resolved all the Object references for the Route"
}
]
}
]
}
},
{
"namespace": "y-cluster",
"name": "echo",
"parentRefs": [
{
"group": "gateway.networking.k8s.io",
"kind": "Gateway",
"name": "y-cluster"
}
],
"rules": [
{
"backendRefs": [
{
"group": "",
"kind": "Service",
"name": "echo",
"port": 80,
"weight": 1
}
],
"matches": [
{
"path": {
"type": "PathPrefix",
"value": "/q/envoy/echo"
}
}
]
}
],
"status": {
"parents": [
{
"parentRef": {
"group": "gateway.networking.k8s.io",
"kind": "Gateway",
"name": "y-cluster"
},
"controllerName": "gateway.envoyproxy.io/gatewayclass-controller",
"conditions": [
{
"type": "Accepted",
"status": "True",
"reason": "Accepted",
"message": "Route is accepted"
},
{
"type": "ResolvedRefs",
"status": "True",
"reason": "ResolvedRefs",
"message": "Resolved all the Object references for the Route"
}
]
}
]
}
}
],
"grpcRoutes": [],
"clientTrafficPolicies": [
{
"namespace": "y-cluster",
"name": "trust-lb-xff",
"targetRefs": [
{
"group": "gateway.networking.k8s.io",
"kind": "Gateway",
"name": "y-cluster"
}
],
"spec": {
"clientIPDetection": {
"xForwardedFor": {
"numTrustedHops": 1
}
},
"targetRefs": [
{
"group": "gateway.networking.k8s.io",
"kind": "Gateway",
"name": "y-cluster"
}
]
},
"status": {
"ancestors": [
{
"ancestorRef": {
"group": "gateway.networking.k8s.io",
"kind": "Gateway",
"name": "y-cluster",
"namespace": "y-cluster"
},
"controllerName": "gateway.envoyproxy.io/gatewayclass-controller",
"conditions": [
{
"type": "Accepted",
"status": "True",
"reason": "Accepted",
"message": "Policy has been accepted."
}
]
}
]
}
}
],
"backendTrafficPolicies": [],
"fetchedAt": "2026-05-06T11:21:21Z",
"$schema": "https://yolean.se/y-cluster/schema/gateway-state.schema.json",
"schemaVersion": "1"
}
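As a consumer-side sketch of the "are ClientTrafficPolicy settings actually in effect?" question, the snippet below reads the payload shape above and returns `numTrustedHops` only for policies with an ancestor condition `Accepted=True`. The type and function names are hypothetical; only the JSON field names are taken from the payload.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// policy is a minimal subset of one clientTrafficPolicies[] entry.
type policy struct {
	Namespace string `json:"namespace"`
	Name      string `json:"name"`
	Spec      struct {
		ClientIPDetection struct {
			XForwardedFor struct {
				NumTrustedHops int `json:"numTrustedHops"`
			} `json:"xForwardedFor"`
		} `json:"clientIPDetection"`
	} `json:"spec"`
	Status struct {
		Ancestors []struct {
			Conditions []struct {
				Type   string `json:"type"`
				Status string `json:"status"`
			} `json:"conditions"`
		} `json:"ancestors"`
	} `json:"status"`
}

// acceptedTrustedHops maps "namespace/name" to numTrustedHops for every
// policy that at least one ancestor reports as Accepted=True.
func acceptedTrustedHops(raw []byte) (map[string]int, error) {
	var s struct {
		Policies []policy `json:"clientTrafficPolicies"`
	}
	if err := json.Unmarshal(raw, &s); err != nil {
		return nil, err
	}
	out := map[string]int{}
	for _, p := range s.Policies {
		for _, a := range p.Status.Ancestors {
			accepted := false
			for _, c := range a.Conditions {
				if c.Type == "Accepted" && c.Status == "True" {
					accepted = true
				}
			}
			if accepted {
				hops := p.Spec.ClientIPDetection.XForwardedFor.NumTrustedHops
				out[p.Namespace+"/"+p.Name] = hops
				break
			}
		}
	}
	return out, nil
}

// sample is a trimmed copy of the trust-lb-xff entry from the payload.
const sample = `{"clientTrafficPolicies":[{"namespace":"y-cluster","name":"trust-lb-xff",
  "spec":{"clientIPDetection":{"xForwardedFor":{"numTrustedHops":1}}},
  "status":{"ancestors":[{"conditions":[{"type":"Accepted","status":"True"}]}]}}]}`

func main() {
	fmt.Println(acceptedTrustedHops([]byte(sample)))
}
```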
Two CI failures on this PR (run 25433079702):
- lint (staticcheck S1016) on pkg/gateway/fetch.go: toConditions and
  the listener projection in fetchGateways used full struct literals
  to copy from raw* types into the exported types they shadow
  field-for-field. For identical-shape types a direct type conversion
  is clearer, and it is what S1016 asks for. Replaced with
  `Condition(c)` and `Listener(l)`.
- pkg/provision/qemu test failures on ubuntu-latest:
  TestPrepareExport_NoSavedState and TestPrepareExport_VMNotRunning
  passed locally on hosts with libguestfs-tools installed but failed
  on stock GHA runners because PrepareExport's first step is a
  virt-customize LookPath guard. The tests want to assert the
  saved-state and not-running error paths that come AFTER the
  LookPath guards, so we stub virt-customize + kubectl on PATH (empty
  shims; the assertion-target branches return before invoking either
  binary). The existing TestPrepareExport_MissingVirtCustomize keeps
  its explicit PATH="" override and still proves the LookPath hint
  fires when the binary is genuinely absent.

Other notes from reviewing the PR (no changes needed, just flagging
things I confirmed are sound):
- The pkg/gateway split between rawCondition / Condition (and the
  parallel Listener / rawListener pair) is intentional -- rawCondition
  is the kubectl-JSON unmarshal target, Condition is the public
  output type. They happen to be identical today; keeping them
  separate gives room to project / filter without breaking consumers
  when the kubectl shape evolves.
- schemagen now writes two distinct kinds of schema: provision-config
  schemas under pkg/provision/schema/ and output schemas alongside
  the Go type that produces them (e.g. pkg/gateway/state.schema.json).
  The split is documented in the package doc and the gen path is
  symmetric with the per-provider one.
- The unversioned $id + enum-of-one schemaVersion pattern on
  gateway.State (SchemaVersion = "1") is the right shape for forward
  compatibility: the canonical URL stays stable, the version stamp
  identifies any given snapshot, and a future bump means versioning
  the schema doc URL while leaving the unversioned pointer at the
  latest.
- gateway.Fetch issues one kubectl invocation per kind. That's fine
  for the volume here (~6 kinds) but a single
  `kubectl get gatewayclass,gateway,httproute,... -A -o json` is a
  non-blocking follow-up if the round-trip count ever matters.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
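For reference, the S1016 fix pattern in isolation. The field set here is an assumption taken from the conditions in the posted payload (type, status, reason, message); the real rawCondition / Condition types may carry more.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rawCondition is the unmarshal target for kubectl JSON.
type rawCondition struct {
	Type    string `json:"type"`
	Status  string `json:"status"`
	Reason  string `json:"reason"`
	Message string `json:"message"`
}

// Condition is the exported output type. It is field-for-field
// identical to rawCondition today, which is what makes the direct
// conversion below legal.
type Condition struct {
	Type    string `json:"type"`
	Status  string `json:"status"`
	Reason  string `json:"reason"`
	Message string `json:"message"`
}

// toConditions converts with Condition(c) instead of a struct literal.
// The slice is allocated up front so an empty input marshals as []
// rather than null.
func toConditions(raw []rawCondition) []Condition {
	out := make([]Condition, 0, len(raw))
	for _, c := range raw {
		out = append(out, Condition(c))
	}
	return out
}

func main() {
	in := `[{"type":"Accepted","status":"True","reason":"Accepted","message":"Valid GatewayClass"}]`
	var raw []rawCondition
	if err := json.Unmarshal([]byte(in), &raw); err != nil {
		fmt.Println("decode:", err)
		return
	}
	out, err := json.Marshal(toConditions(raw))
	fmt.Println(string(out), err)
}
```

If the two types ever diverge, the conversion stops compiling, which is a useful signal that the projection needs real mapping code again.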
The raw reconciled-resource dump is hard to consume directly. Add
two top-level fields on the state JSON:
- summary: a fully-typed routing-tree projection in industry-neutral
  terms (listener -> host -> route -> match/backend). numTrustedHops
  + trustedCIDRs surface at listener level, where ClientTrafficPolicy
  actually attaches. Routes without a hostname bucket under "*"
  (sorted last). GRPC method matches render as
  "Method=Type:Service/Method" in the same Path field as HTTP path
  matches.
- envoy: a sample of dataplane state (version + verbatim
  /config_dump) from any one envoy-gateway proxy pod. envoy admin
  binds 127.0.0.1:19000 inside a distroless container, so we kubectl
  port-forward (kubelet's apiserver /pods/<n>:19000/proxy can't reach
  localhost-bound ports). Best-effort: skipped silently when no proxy
  pod runs yet.

Summary is unit-tested with Gateway API payloads + an empty envoy
object as input. Envoy.config is schema-typed as type=object via a
jsonschema struct tag.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
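The host bucketing rule can be sketched as follows. Only the "*" catch-all and its sorted-last position come from the commit message; alphabetical order for the remaining hosts is an assumption, and `hostBuckets` / `sortHosts` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// catchAll is the bucket for routes that declare no hostnames.
const catchAll = "*"

// hostBuckets returns the buckets a route belongs to: its declared
// hostnames, or the catch-all when it declares none.
func hostBuckets(hostnames []string) []string {
	if len(hostnames) == 0 {
		return []string{catchAll}
	}
	return hostnames
}

// sortHosts returns a sorted copy with the catch-all bucket last.
// Alphabetical order among the other hosts is an assumption.
func sortHosts(hosts []string) []string {
	out := append([]string(nil), hosts...)
	sort.SliceStable(out, func(i, j int) bool {
		iAll, jAll := out[i] == catchAll, out[j] == catchAll
		if iAll != jAll {
			return jAll // the non-catch-all entry sorts first
		}
		return out[i] < out[j]
	})
	return out
}

func main() {
	var hosts []string
	// Hostname lists of the three routes in the posted payload.
	for _, route := range [][]string{{"my-app.example.net"}, {"keycloak-admin"}, nil} {
		hosts = append(hosts, hostBuckets(route)...)
	}
	fmt.Println(sortHosts(hosts))
}
```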
prepare-export's live phase writes the reconciled Gateway snapshot
to <cacheDir>/<name>-gateway-state.json, but the teardown artefact
list didn't include it. The JSON survived teardown, and the next
prepare-export bundle picked up a stale dump from the prior cluster.

Add the path to perVMArtefacts and update both the explicit-list
teardown test and the TestPerVMArtefacts pin.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
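A small sketch of the path construction and of adding it to a teardown list. Only the `<cacheDir>/<name>-gateway-state.json` pattern comes from the commit message; `gatewayStatePath` and `withGatewayState` are hypothetical names, and the other entries of the real perVMArtefacts list are deliberately not guessed at.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// gatewayStatePath builds <cacheDir>/<name>-gateway-state.json.
func gatewayStatePath(cacheDir, name string) string {
	return filepath.Join(cacheDir, name+"-gateway-state.json")
}

// withGatewayState returns the artefact list extended by the snapshot
// path, unless the list already contains it.
func withGatewayState(artefacts []string, cacheDir, name string) []string {
	p := gatewayStatePath(cacheDir, name)
	for _, a := range artefacts {
		if a == p {
			return artefacts
		}
	}
	return append(append([]string(nil), artefacts...), p)
}

func main() {
	// The starting list is empty here; the real list has other entries.
	list := withGatewayState(nil, "cache", "appliance")
	fmt.Println(list)
}
```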
The appliance build flow's external HTTPS LoadBalancer stage needs a
SAN list for its self-signed cert. Today the operator declares it
twice -- once in HTTPRoute manifests, once in TLS_DOMAINS=foo,bar --
and drift between the two means either the cert covers hostnames the
cluster doesn't serve, or the cluster serves hostnames the cert
doesn't cover.

Add `y-cluster gateway hostnames` that reads the existing
`gateway state` snapshot and projects unique non-wildcard hostnames
from the typed Summary (.summary.listeners[].hosts[].hostname).
Default output is one hostname per line; --csv joins with `,` --
exactly the format TLS_DOMAINS / do_tls_frontend expect.

Implementation is a small pure-Go helper
(`Hostnames(*State) []string`) plus cobra wiring next to
`gateway state`. The filter logic (skip "" and "*", dedupe across
listeners) is unit-tested.

The "*" sentinel from Summary is the catch-all bucket for routes
that declare no `.spec.hostnames` -- not a hostname suitable for a
cert SAN. Wildcard support (e.g. *.example.com literal SANs) is out
of scope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
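The filter described above, as a self-contained sketch. The `Hostnames(*State) []string` signature and the `.summary.listeners[].hosts[].hostname` path come from the commit message; the surrounding type definitions are reduced stand-ins, and first-seen output order is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// Reduced stand-ins for the typed Summary; only the fields on the
// .summary.listeners[].hosts[].hostname path are modelled.
type State struct {
	Summary Summary `json:"summary"`
}

type Summary struct {
	Listeners []Listener `json:"listeners"`
}

type Listener struct {
	Hosts []Host `json:"hosts"`
}

type Host struct {
	Hostname string `json:"hostname"`
}

// Hostnames returns unique hostnames across all listeners, skipping
// the empty string and the "*" catch-all bucket.
func Hostnames(s *State) []string {
	out := []string{}
	if s == nil {
		return out
	}
	seen := map[string]bool{}
	for _, l := range s.Summary.Listeners {
		for _, h := range l.Hosts {
			if h.Hostname == "" || h.Hostname == "*" || seen[h.Hostname] {
				continue
			}
			seen[h.Hostname] = true
			out = append(out, h.Hostname)
		}
	}
	return out
}

func main() {
	s := &State{Summary: Summary{Listeners: []Listener{
		{Hosts: []Host{{"my-app.example.net"}, {"keycloak-admin"}, {"*"}}},
		{Hosts: []Host{{"my-app.example.net"}}},
	}}}
	hosts := Hostnames(s)
	fmt.Println(strings.Join(hosts, "\n")) // default: one per line
	fmt.Println(strings.Join(hosts, ","))  // --csv form
}
```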
Two loose props at the listener root (numTrustedHops, trustedCIDRs)
didn't carry their context: a consumer reading the JSON saw the
values but had to know they were XFF settings, not generic listener
tuning. Group them under a `xForwardedFor` wrapper that mirrors the
source CRD shape (`spec.clientIPDetection.xForwardedFor` on a
ClientTrafficPolicy), at one wrapping level.

Single-level wrap matches the only currently-defined detection
mechanism in envoy-gateway; if `customHeader` (the alternate
clientIPDetection mechanism) becomes relevant, it lands as a sibling
at the same level.

Schema regenerated; tests updated to walk the new path
(`l.XForwardedFor.NumTrustedHops` / `.TrustedCIDRs`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
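A sketch of what the regrouped listener might look like in Go. The `XForwardedFor.NumTrustedHops` / `.TrustedCIDRs` path is from the commit message; the `name` and `port` fields, the pointer type, and the `omitempty` tags are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// XForwardedFor groups the XFF-related settings that previously sat
// loose at the listener root.
type XForwardedFor struct {
	NumTrustedHops int      `json:"numTrustedHops,omitempty"`
	TrustedCIDRs   []string `json:"trustedCIDRs,omitempty"`
}

// Listener is a reduced stand-in for the summary listener. Name and
// Port are assumed fields; the pointer lets listeners without a
// ClientTrafficPolicy omit the wrapper entirely.
type Listener struct {
	Name          string         `json:"name"`
	Port          int            `json:"port"`
	XForwardedFor *XForwardedFor `json:"xForwardedFor,omitempty"`
}

// render marshals a listener, returning "" if marshalling fails.
func render(l Listener) string {
	b, err := json.Marshal(l)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	withPolicy := Listener{Name: "http", Port: 80, XForwardedFor: &XForwardedFor{NumTrustedHops: 1}}
	fmt.Println(render(withPolicy))
	fmt.Println(render(Listener{Name: "http", Port: 80}))
}
```

A future `customHeader` mechanism would then be one more optional field next to `XForwardedFor`, without touching existing consumers.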