ci: pre-deploy smoke gate (Option C) so a JS-dead render can't reach isamples.org#225

Merged
rdhyee merged 2 commits into
isamplesorg:mainfrom
rdhyee:smoke-gate-option-c
May 15, 2026
@rdhyee
Contributor

@rdhyee rdhyee commented May 15, 2026

Problem

The deploy workflow runs quarto render and ships whatever docs/ it produces. Neither code review nor pytest --collect-only ever loads the rendered page in a browser, so a render that "succeeds" but yields a JS-dead explorer (DuckDB-WASM never inits, Cesium blank, search returns nothing) deploys to isamples.org anyway. This is the failure class behind past "we reviewed it and it still broke" incidents.

What this adds

  • tests/test_smoke.py — fundamental-liveness gate. Single fresh context, one navigation, poll-for-readiness (deliberately no reload loop; rapid reloads exhaust the DuckDB-WASM worker and produce false failures). Four unambiguous assertions:
    1. DuckDB-WASM initialized (SESAR facet count populated)
    2. Cesium canvas attached (globe actually drew)
    3. A world search via the visible #searchSubmitBtn returns results
    4. No uncaught JS exception / regression-fingerprint console error
  • quarto-pages.yml — smoke step inserted between quarto render and Deploy, serving the rendered docs/ locally. Fail-closed: a smoke failure fails the job, so Deploy is skipped and a broken render never reaches production. A trap reaps the background static server on exit, including when GitHub's bash -e aborts the step early.
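
The "one navigation, poll-for-readiness" idea can be sketched without a browser. This is a minimal illustration, not the code in tests/test_smoke.py: `poll_until`, its parameters, and the use of the built-in `TimeoutError` are assumptions made for the example (a Playwright-based test would more likely rely on Playwright's own wait helpers and timeout exception).

```python
import time
from typing import Callable, Optional


def poll_until(probe: Callable[[], Optional[str]],
               timeout_s: float,
               interval_s: float = 0.5) -> str:
    """Call probe() repeatedly until it returns a truthy value.

    Hypothetical helper: the caller navigates once, and this function only
    observes the page. It never reloads, so it cannot exhaust a worker by
    restarting it. A missed deadline raises TimeoutError, which is what
    turns "page never became ready" into a failing test.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        value = probe()
        if value:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(f"not ready after {timeout_s}s")
        time.sleep(interval_s)
```

In a real test the probe would read something like the SESAR facet text from the page. Because the probe is an ordinary callable, the timeout path can be exercised without a browser.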

Validation

  • Passes the current known-good build in ~15s (fast; no false-closed).
  • Raises TimeoutError on a rendered-but-JS-dead page → pytest fails → step fails → Deploy skipped (fail-closed confirmed).

Notes

  • The gate executes on push-to-main (the deploy trigger), so it self-validates on its first real deploy after merge. Worst case of a false-fail is a blocked deploy, not a broken site.
  • Follow-up (Option A, separate): post-deploy check vs live isamples.org (cache-busted) as a backstop for prod-only data/CDN issues.

🤖 Generated with Claude Code

rdhyee and others added 2 commits May 15, 2026 16:06
The deploy workflow runs `quarto render` and ships whatever docs/ it
produces; nothing ever loads the rendered page in a browser, so a render
that "succeeds" but yields a page where DuckDB-WASM never inits, Cesium
never draws, or search returns nothing has historically deployed anyway
(the failure class behind past "reviewed and still broke" incidents).

Adds tests/test_smoke.py: single fresh context, one navigation,
poll-for-readiness (no reload loop — rapid reloads exhaust the
DuckDB-WASM worker and false-fail). Asserts four unambiguous liveness
signals: facet query populated, Cesium canvas attached, a world search
returns results, no uncaught JS exception / regression-fingerprint
console error.

Wires it into quarto-pages.yml between render and Deploy, serving the
rendered docs/ locally. Fail-closed: smoke failure fails the job and the
Deploy step is skipped. trap-reaps the static server under `bash -e`.

Validated both directions: passes the known-good build in ~15s; raises
TimeoutError on a rendered-but-JS-dead page.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…g#225

- Scope _FATAL_CONSOLE to same-origin scripts: a third-party
  console.error (Cesium CDN, injected extension) can no longer block a
  deploy; pageerror stays the unconditional hard signal for uncaught
  app exceptions.
- Cesium check now also asserts non-zero canvas dimensions, catching
  the "widget mounted but globe never sized" case without flaky pixel
  readback.
- Search-result wait 60s -> 90s, aligned with the perf test budget, so
  a slow CI cold DuckDB-WASM query + remote parquet fetch doesn't
  false-fail a healthy build.

Re-validated: passes the known-good build (~20s).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@rdhyee
Contributor Author

rdhyee commented May 15, 2026

How to test this gate / what to look for

Run it exactly as CI will

git fetch origin smoke-gate-option-c && git checkout smoke-gate-option-c
python3 -m http.server 8080 --directory docs &
ISAMPLES_BASE_URL=http://localhost:8080 pytest tests/test_smoke.py -s -q
kill %1

Expected (healthy build): SMOKE OK — search result: '50+ results for "pottery"' then 1 passed in ~15–20s.

Prove it's fail-closed (catches a break)

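python3 -m http.server 8080 --directory docs &   # restart the server; the previous block killed it
cp docs/explorer.html /tmp/explorer.bak
sed -i '' "s/data-value='SESAR'/data-value='BROKEN'/g" docs/explorer.html   # BSD/macOS sed; with GNU sed (Linux) drop the ''
ISAMPLES_BASE_URL=http://localhost:8080 pytest tests/test_smoke.py -s -q   # -> 1 failed (TimeoutError on readiness)
cp /tmp/explorer.bak docs/explorer.html
kill %1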

What each assertion proves

| Signal | Proves | Pass criterion |
| --- | --- | --- |
| SESAR facet count | DuckDB-WASM initialized & ran a query | text like (4,389,231) within 90s |
| Cesium canvas | Globe actually drew, not just a container | canvas attached + non-zero clientWidth/Height |
| World search | Visible #searchSubmitBtn → query path alive | #searchResults shows …results… with a digit, ≤90s |
| pageerror / same-origin fatal console | No uncaught JS exception | zero captured |
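
The pass criteria above can be written as plain predicates over values already read from the page. This is a sketch under assumptions: the function names and the regular expression are hypothetical, inferred from the criteria listed, and are not taken from tests/test_smoke.py.

```python
import re

# Assumed pattern for a populated facet count such as "(4,389,231)".
FACET_COUNT_RE = re.compile(r"\(\d[\d,]*\)")


def facet_populated(text: str) -> bool:
    """True once the facet label carries a parenthesised number."""
    return FACET_COUNT_RE.search(text) is not None


def canvas_sized(client_width: int, client_height: int) -> bool:
    """True when the canvas has non-zero layout dimensions."""
    return client_width > 0 and client_height > 0


def search_has_results(text: str) -> bool:
    """True when the results line mentions 'results' and contains a digit."""
    return "results" in text.lower() and any(ch.isdigit() for ch in text)


def no_fatal_errors(page_errors: list, fatal_console: list) -> bool:
    """True when nothing was captured on either error channel."""
    return not page_errors and not fatal_console
```

Keeping the checks as pure functions means each criterion can be unit-tested without a browser, while the smoke test itself only has to fetch the text and dimensions.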

Review sanity-checks

  • Fail-closed wiring: in quarto-pages.yml the smoke step sits before Deploy 🚀; pytest is the last command, so a non-zero exit aborts the job (GitHub's bash -eo pipefail) and Deploy never runs.
  • No false-fails: console check is scoped to same-origin scripts (Cesium CDN / browser extensions can't block a deploy); pageerror is the unconditional hard signal.
  • Self-validation: the gate only executes on push-to-main (the deploy trigger), so its first real run is the post-merge deploy. A false-fail there blocks a deploy but does not break the live site.
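
The same-origin scoping in the second bullet can be illustrated as a small filter. This is a hedged sketch: `is_fatal_console` and `origin` are hypothetical names, and the real `_FATAL_CONSOLE` check also matches regression fingerprints, which this example leaves out.

```python
from urllib.parse import urlsplit

_DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Return (scheme, host, port), filling in the default port if omitted."""
    parts = urlsplit(url)
    port = parts.port or _DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def is_fatal_console(msg_type: str, source_url: str, base_url: str) -> bool:
    """Treat a console message as fatal only if it is an error that was
    emitted by a script served from the same origin as the page under test.
    Errors from CDNs or injected extensions are ignored here; uncaught
    exceptions are assumed to be handled separately via pageerror."""
    if msg_type != "error" or not source_url:
        return False
    return origin(source_url) == origin(base_url)
```

With `base_url` set to `http://localhost:8080`, an error from `http://localhost:8080/explorer.js` counts as fatal, while one from a CDN host or a `chrome-extension://` URL does not.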

CI-only difference from the local run: CI first does pip install pytest playwright && playwright install --with-deps chromium and trap-reaps the background server. Otherwise identical.
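
For readers who prefer to see the serve, test, and reap sequence outside of shell, here is a Python analogue of the trap-based cleanup. It is an illustration only: the workflow itself uses bash, and `run_with_server`, `SERVER_CMD`, and `TEST_CMD` are hypothetical names.

```python
import subprocess
import sys

# Assumed commands, mirroring the local run shown in this comment.
SERVER_CMD = [sys.executable, "-m", "http.server", "8080", "--directory", "docs"]
TEST_CMD = [sys.executable, "-m", "pytest", "tests/test_smoke.py", "-s", "-q"]


def run_with_server(server_cmd, test_cmd, env=None) -> int:
    """Start the server, run the tests, and always stop the server.

    Returns the test command's exit code so the caller can propagate it;
    a non-zero code is what keeps the gate fail-closed.
    """
    server = subprocess.Popen(server_cmd,
                              stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL)
    try:
        return subprocess.run(test_cmd, env=env).returncode
    finally:
        # Equivalent in spirit to a shell trap: runs on success and failure.
        server.terminate()
        try:
            server.wait(timeout=5)
        except subprocess.TimeoutExpired:
            server.kill()
            server.wait()
```

A caller would typically finish with `sys.exit(run_with_server(SERVER_CMD, TEST_CMD, env=...))`, passing an environment that sets `ISAMPLES_BASE_URL`. The sketch does not wait for the server to start listening; it assumes the test's own readiness polling absorbs that delay.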

@rdhyee rdhyee merged commit d5091df into isamplesorg:main May 15, 2026
1 check passed
rdhyee added a commit that referenced this pull request May 21, 2026
Adds the floating in-map detail card from Hana Figma 222:456 / 225:1700,
anchored near the clicked sample dot with viewport-edge collision
avoidance. Same data path as the side-panel sample card; the floating
card is the new map-anchored surface companion.

Unifies click semantics across map dot and samples table (#226): click
= open detail card; the external link to the source record (OpenContext,
SESAR, etc.) lives inside the card title, never as the row's default
action. Previously the table title rendered as <a href={sourceUrl}> and
the row-click handler bailed on anchor targets, so clicking the title
opened the external site while clicking elsewhere updated the panel —
the most-attractive click target did the least-useful thing.

Lazy-load now resolves vocabulary URIs to human labels via
vocab_labels.parquet, so Material / Specimen Type render as e.g.
"Biogenic non-organic material" / "Architectural element" instead of
the raw w3id.org URI strings.

Wide-parquet only — no narrow-format references.

Codex-reviewed iteration:
* XSS-hardened: escapeHtml() on all interpolated values; thumbnail
  image built via DOM construction, not innerHTML, so the onerror
  fallback is a property rather than an inline-JS attribute string.
* Removed the table-row card-vs-detail race: card now shows immediately
  at canvas centre (where the sample lands after the camera flyTo),
  rather than deferring to flyTo.complete and racing the detail query.
* z-index bumped to 1001 so the card stays above the search overlay,
  results line, and color legend (all at 1000).
* Table title is now a role="button" tabindex="0" span with aria-label,
  not an <a>. Keeps the blue-underline visual styling; screen readers
  announce as a button (correct) rather than a link (misleading).
  Keyboard parity via Enter/Space handler on .table-link.

Pre-deploy smoke gate (#225) passes locally: 1 passed in 12.75s.

Followup: #227 tracks the same click-semantics fix for the
nearby-samples and search-results panels (separate PR to keep this
diff focused).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

1 participant