Skip to content

Harden Cloud Run deployment with GCLB, dry-run, and shell tests#298

Open
luis5tb wants to merge 14 commits into
RHEcosystemAppEng:mainfrom
luis5tb:worktree-cloud-run-harden
Open

Harden Cloud Run deployment with GCLB, dry-run, and shell tests#298
luis5tb wants to merge 14 commits into
RHEcosystemAppEng:mainfrom
luis5tb:worktree-cloud-run-harden

Conversation

@luis5tb

@luis5tb luis5tb commented Jun 15, 2026

Copy link
Copy Markdown
Collaborator

No description provided.

@luis5tb

luis5tb commented Jun 15, 2026

Copy link
Copy Markdown
Collaborator Author

This is a new version of #250, where the things that has been already merged as removed/not-included

@luis5tb luis5tb force-pushed the worktree-cloud-run-harden branch from e94148f to cdb31da Compare June 15, 2026 13:59
@luis5tb luis5tb force-pushed the worktree-cloud-run-harden branch 2 times, most recently from 8d9a9a6 to 75fded2 Compare June 23, 2026 12:10
IlonaShishov
IlonaShishov previously approved these changes Jun 23, 2026

@IlonaShishov IlonaShishov left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM,
One thing to keep in mind: CI only validates that existing shell tests still pass — if deploy.sh is changed without updating tests/shell/, the new code paths won't be tested and CI will still pass.

@luis5tb

luis5tb commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator Author

LGTM, One thing to keep in mind: CI only validates that existing shell tests still pass — if deploy.sh is changed without updating tests/shell/, the new code paths won't be tested and CI will still pass.

nice catch, I'll add documentation about this, and perhaps we can also add a coverage script or something similar

luis5tb and others added 8 commits June 23, 2026 16:15
Add --dry-run flag that intercepts all gcloud calls: mutations are logged
with [DRY-RUN] prefix but not executed, describe/list commands return
failure so create-if-not-exists paths run end-to-end. Prerequisite checks
(SSL cert, service existence) are skipped in dry-run mode since they
can't validate without real GCP.

Move show_service_info() and update_agentcard_urls() above a source guard
so the script can be sourced for unit testing without executing the main
deployment body.

Usage: ./deploy/cloudrun/deploy.sh --dry-run --service all

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Yuval Kashtan <yuvalkashtan@gmail.com>
Add 15 bats tests covering argument parsing, variable validation,
update_agentcard_urls logic, dry-run end-to-end, and ingress behavior.
Tests use a mock gcloud script that returns configurable canned responses.

- tests/shell/deploy.bats: test suite
- tests/shell/mock_gcloud.sh: configurable gcloud stub
- Makefile: add test-shell target (bats tests/shell/)
- CI: add test-shell job parallel to lint/test/build, wired into CI gate

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Yuval Kashtan <yuvalkashtan@gmail.com>
- Add dry-run guards to show_service_info() and update_agentcard_urls()
  so post-deployment output is clean instead of showing confusing
  "Could not retrieve" warnings
- Add --dry-run to the deploy flags table in README.md
- Add shellcheck to the test-shell CI job for static analysis of deploy
  scripts
- Add 5 new bats tests: dry-run with setup_service_lb (NEG/backend/
  forwarding rule), dry-run with Cloud Armor (WAF rules), dry-run with
  configure_pubsub_push, and dry-run guards in show_service_info and
  update_agentcard_urls

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Yuval Kashtan <yuvalkashtan@gmail.com>
The dry-run gcloud override returns 1 for describe calls, which caused
misleading warnings in unguarded post-deployment sections: "Agent service
not found" in the handler case and "SSL certificate is not yet active"
in the LB info blocks. Add DRY_RUN guards consistent with the existing
show_service_info and update_agentcard_urls pattern. Also fix missing
comma and inaccurate skip description in CI gate comment.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Yuval Kashtan <yuvalkashtan@gmail.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Makes lock-check output easier to read by showing +/- context lines
instead of raw diff, helping spot sync errors quickly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Document in CONTRIBUTING.md that changes to deploy/cloudrun/deploy.sh
must include corresponding test updates in tests/shell/. Covers what to
test (new functions, renamed functions, new flags, new code paths) and
how to run shell tests locally.

Also adds the Shell Tests job to the CI pipeline listing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove build_agent_image(), build_handler_image(), and the --build flag
from deploy.sh. Images are built externally by Konflux and pulled from
Quay.io via cloudbuild.yaml — the local build path was unused.

Add tests/shell/check_coverage.sh which extracts function names from
deploy.sh and verifies each one is referenced in the bats test suite.
This catches new functions added without corresponding tests.

Add a deploy_handler test to close the remaining coverage gap.
Add the coverage check as a CI step in the test-shell job.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
yuvalk and others added 5 commits June 23, 2026 15:00
…y.sh

The --build flag and build_agent_image/build_handler_image functions were
removed when --dry-run was added, but the section header remained as an
empty orphan between the config output and the deploy functions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…erage.sh

grep -oP uses Perl regex (lookahead), which is a GNU extension unavailable
on macOS BSD grep. Replace with grep -oE plus sed to strip the trailing
parentheses, making the script portable across Linux and macOS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…check

The CI test-shell job runs shellcheck, bats, and check_coverage.sh, but
the Makefile target only ran bats. Developers using make test-shell locally
would miss shellcheck and coverage failures that CI would catch. Now the
local target matches CI exactly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The existing deploy_handler test only verified gcloud run services replace
was called, but did not test the IAM binding path when
--allow-unauthenticated is set. Add a test that verifies the
add-iam-policy-binding call with allUsers member.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The only existing test for configure_pubsub_push covered the dry-run
failure case. Add two tests: one verifying early return when
ENABLE_MARKETPLACE is not true, and one verifying the subscription
creation path when the handler URL is available via mock gcloud.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@yuvalk

yuvalk commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator

see my PR review
#246
feel free to accept it fully or partially (IMHO commits for the 2 first issues are not important)

@luis5tb

luis5tb commented Jun 24, 2026

Copy link
Copy Markdown
Collaborator Author

see my PR review #246 feel free to accept it fully or partially (IMHO commits for the 2 first issues are not important)

I suppose you mean luis5tb#7

Worktree cloud run harden PR#298 review
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants