explorer: architectural direction — make filter semantics coherent across all surfaces #234

## Purpose

Capture the architectural direction for the explorer's filter semantics, sequenced as a roadmap, with explicit decision rationale. Filed for review (Codex, Gemini, and human collaborators). The goal is alignment *before* implementation begins on the substantive steps.

## Context

The explorer has six surfaces that show numbers about samples: map dots, the "Samples in View" stat, the "Samples Rendered" stat, the samples table count, the facet-legend counts, and the search-results line. Today these surfaces fall into three groups, each applying a different combination of constraints:

| Constraint | Map / Stats / Table | Facet counts | Search results line |
|---|---|---|---|
| Source checkboxes | ✅ | ✅ | ✅ |
| Material / Sampled Feature / Specimen Type | ✅ | ✅ | ✅ |
| Bbox (viewport) | ✅ | ❌ | optional (area-scope) |
| Search text | ❌ | ❌ | ✅ |

Result: three different "filter" semantics on one page. The 2026-05-22 investigation session (see `#229` closure note, and the design briefing in `~/dev-journal/projects/isamples-facets.md`) hit three concrete confusions stemming from this:

- "I have `pottery Cyprus` in the search box but the facet counts and 'Samples in View' don't reflect it."
- "I filtered to `material=soil` but the cluster dots include non-SESAR colors even though most soil is SESAR."
- "What does '5,451 samples match the current filters' actually mean?"
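
The constraint table above can also be written down as data, which makes the disagreement mechanically checkable. The following is a minimal sketch only: the type names, the `current` object, and the `disagreements` helper are hypothetical and do not exist in the explorer codebase.

```typescript
// Hypothetical encoding of the constraint table; all names are illustrative.
type Constraint = "source" | "facets" | "bbox" | "search";
type Surface = "mapStatsTable" | "facetCounts" | "searchResults";
type Applies = boolean | "optional";
type Matrix = Record<Surface, Record<Constraint, Applies>>;

const CONSTRAINTS: Constraint[] = ["source", "facets", "bbox", "search"];
const SURFACES: Surface[] = ["mapStatsTable", "facetCounts", "searchResults"];

// Today's behavior, transcribed row by row from the table.
const current: Matrix = {
  mapStatsTable: { source: true, facets: true, bbox: true, search: false },
  facetCounts: { source: true, facets: true, bbox: false, search: false },
  searchResults: { source: true, facets: true, bbox: "optional", search: true },
};

// Returns the constraints that the surfaces do not all treat the same way.
function disagreements(m: Matrix): Constraint[] {
  return CONSTRAINTS.filter((c) => {
    const baseline = m.mapStatsTable[c];
    return SURFACES.some((s) => m[s][c] !== baseline);
  });
}

console.log(disagreements(current)); // the constraints on which surfaces differ
```

For the table as transcribed, the helper reports `bbox` and `search`, the two rows where the columns differ.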

## The decision space

Three orthogonal axes (named for cross-reference):

- **A1**: search is a global filter — restricts map, table, stats, facet counts.
  **A2**: search is a side-panel lookup — restricts only the search-results list. *(current)*
- **B1**: facet counts reflect viewport bbox — pan, counts change.
  **B2**: facet counts stay global regardless of viewport. *(current)*
- **C1**: cluster mode honestly reflects facet filter — H3 dots per filtered subset *(expensive — pre-bake per facet, or live aggregate)*.
  **C2**: cluster mode ignores filter, surface this loudly with `#facetNote`. *(current — but the note is bugged on URL-load)*.
  **C3**: auto-switch to point mode when any facet is active — no cluster dishonesty, but point density problems (see #231).

## Direction picked

**A1 + B1 + (C3-when-feasible, C2-with-prominent-warning-when-too-dense)**, with **progressive refinement** (sampled-fast then full-when-idle) underlying every dynamic surface, and **issue #233's progressive heatmap** as the eventual unifier that retires the cluster-vs-point dichotomy.

### Mental model the user gets

> "The explorer is a single coherent answer to: *what samples match my current intent?* Every number on the page tells me the size of that intent. Every dot tells me where one of those samples is. If there are too many dots to draw individually, the page tells me so and falls back to cluster mode with a visible warning that it's an approximation."

### Why this combination

- **A1 (search global)**: the search box stops being decorative. Users naturally assume what they type restricts what they see; the page should honor that.
- **B1 (counts viewport-aware)**: legend becomes "what's in front of me" — agreeing with the table and the stats. The legend stops being a global pivot tool (which is conceptually clean but in practice confused users in the 2026-05-22 session).
- **C3-then-C2 fallback**: cluster mode is treated as a perf optimization, not a feature. When it's feasible to draw individual dots, draw them. When the count exceeds a threshold (still TBD), keep cluster but warn loudly that what you see isn't the filtered set.
- **Progressive refinement**: addresses the "want both snappy *and* honest" tension. Counts and dots show stale or coarse values while the user is panning, then refine to exact values once the user sits still (roughly 250 ms of idle for the coarse pass and 1-2 s for the full scan, per the progressive refinement pattern below). Any new pan cancels in-flight work. The `facetCountsReqId` and `requestId` patterns already in the codebase generalize directly.
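
As an illustration of the C3-then-C2 fallback, the mode decision could be a small pure function. This is a sketch under assumptions, not the existing mode-selection code: `chooseRenderMode`, `pointZoomThreshold`, and `densityCap` are hypothetical names, the density cap is still TBD, and the numbers in the usage line are placeholders only.

```typescript
type RenderMode = "point" | "cluster";

interface ModeInput {
  zoom: number;          // current map zoom level
  facetActive: boolean;  // true when any facet filter is set
  filteredCount: number; // samples matching the filters in the viewport
}

interface ModeConfig {
  pointZoomThreshold: number; // hypothetical: zoom at which points are drawn today
  densityCap: number;         // hypothetical: max dots to draw individually (TBD)
}

interface ModeDecision {
  mode: RenderMode;
  warn: boolean; // true when cluster mode is shown despite an active filter
}

function chooseRenderMode(input: ModeInput, cfg: ModeConfig): ModeDecision {
  // Assumed existing behavior: zoomed in far enough means point mode.
  if (input.zoom >= cfg.pointZoomThreshold) {
    return { mode: "point", warn: false };
  }
  // C3: a facet is active and the filtered set is small enough to draw.
  if (input.facetActive && input.filteredCount <= cfg.densityCap) {
    return { mode: "point", warn: false };
  }
  // C2 fallback: stay in cluster mode, warning only when a filter is active.
  return { mode: "cluster", warn: input.facetActive };
}

// Placeholder numbers; the real threshold is an open question.
const cfg: ModeConfig = { pointZoomThreshold: 10, densityCap: 5000 };
console.log(chooseRenderMode({ zoom: 3, facetActive: true, filteredCount: 60000 }, cfg));
```

Keeping the decision pure (no DOM or map access) means the threshold can be tuned and tested without loading the explorer.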

### Why NOT the "cleanest" earlier framing (A1 + B2)

An earlier version of this briefing recommended A1 + B2: keep the legend global as a *pivot tool* ("what could I navigate to"). We chose B1 instead because:

- The explorer's primary user is studying data, not navigating around. "What's here" matters more than "where could I go."
- All other numbers on the page reflect the viewport; making the legend the lone exception causes silent disagreement.
- Progressive refinement makes B1's perf cost (100-300 ms recompute per pan) feel acceptable — the user sees stale counts go italic instantly, then update.

## Sequenced roadmap

| # | Step | Effort | Architectural change | Unblocks |
|---|---|---|---|---|
| 1 | Fix `#facetNote`-on-URL-load bug | 1-2 hours | None — pure bug fix | C2 honesty when arriving via shared URL |
| 2 | #232: "50+" → real count | ½ day | None — adds a COUNT query | Honest disclosure of search-result size |
| 3 | **B1**: facet counts viewport-aware, with `.recomputing` italic state during query | 1-2 days | Add bbox predicate to `updateCrossFilteredCounts` live-query path; cube fast-path falls back to the live query when the bbox is non-global | Legend agrees with table and stats |
| 4 | **A1**: search as global filter — add `ILIKE` search predicate to facet counts, table query, and `loadViewportSamples` | 2-3 days | Touches every count surface; biggest behavior change | Search box becomes a real filter |
| 5 | **C3**: auto-promote to point mode when any facet active, with density-cap fallback to cluster + prominent "showing cluster — too dense for individual dots" warning | 2-3 days | Mode-selection logic now considers facet state, not just zoom | Map dots honestly reflect filter |
| 6 | **#233**: progressive heatmap spike — third visualization that replaces the cluster apology with an actually-filter-honest density layer | ~1 week | New visualization mode; reuses DuckDB-WASM + wide-parquet stack | Retires the cluster-vs-point tradeoff for high-density filtered views |

Steps 1-2 are quick wins, independent of the architectural direction. Steps 3-5 are the substantive coherence work. Step 6 is the long-term answer that makes the cluster-mode apology obsolete.
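
One way to keep steps 3 and 4 from drifting apart again is to have every count surface build its `WHERE` clause from one shared function. The sketch below is an assumption about how that could look, not the current implementation: `buildPredicate`, the column names (`source`, `longitude`, `latitude`, and the three search columns), and the rule that every whitespace-separated search term must match are all hypothetical. Facet predicates would follow the same pattern as `sources` and are omitted for brevity. It does not handle viewports that cross the antimeridian, and it does not escape `%` or `_` in search terms.

```typescript
interface Bbox { west: number; south: number; east: number; north: number; }

interface FilterState {
  sources: string[];   // checked source checkboxes
  bbox?: Bbox;         // omitted means "global"
  searchText?: string; // contents of the search box
}

interface Predicate { sql: string; params: (string | number)[]; }

// Hypothetical names for the three text columns the search runs against.
const SEARCH_COLUMNS = ["label", "description", "place_name"];

function buildPredicate(f: FilterState): Predicate {
  const clauses: string[] = [];
  const params: (string | number)[] = [];

  if (f.sources.length > 0) {
    clauses.push(`source IN (${f.sources.map(() => "?").join(", ")})`);
    params.push(...f.sources);
  }
  if (f.bbox) {
    // Step 3 (B1): the viewport restricts every count surface.
    clauses.push("longitude BETWEEN ? AND ? AND latitude BETWEEN ? AND ?");
    params.push(f.bbox.west, f.bbox.east, f.bbox.south, f.bbox.north);
  }
  const text = f.searchText?.trim();
  if (text) {
    // Step 4 (A1): assumed semantics, each term must match some column.
    for (const term of text.split(/\s+/)) {
      clauses.push(`(${SEARCH_COLUMNS.map((c) => `${c} ILIKE ?`).join(" OR ")})`);
      for (const _col of SEARCH_COLUMNS) params.push(`%${term}%`);
    }
  }
  return { sql: clauses.length > 0 ? clauses.join(" AND ") : "TRUE", params };
}

const p = buildPredicate({ sources: ["SESAR"], searchText: "pottery Cyprus" });
console.log(p.sql, p.params);
```

The facet-count query, the table query, and `loadViewportSamples` would each append `p.sql` to their own `SELECT`, so a new constraint added here reaches all of them at once.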

## Progressive refinement pattern (applies to steps 3, 5, 6)

A single debounce, cancel, and progressive-refine scaffold, reused across surfaces:

```
moveStart:
  - snapshot current values
  - apply `.recomputing` italic state

moveEnd + 250 ms (debounce — cancels if another move comes):
  - kick off coarse-pass query (10% TABLESAMPLE for counts; cube for legend single-axis case)
  - apply result; keep `.recomputing` if there's a refine pass pending

moveEnd + 1-2 s still idle:
  - run full-scan query
  - apply result; drop `.recomputing`

any new move / filter change:
  - bump request token; in-flight queries discard their result via stale guard
```

The codebase already has the cancellation primitives (`facetCountsReqId`, `requestId`, `freshSelectionToken`) and the `.recomputing` CSS class.
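
The timeline above could be packaged as one reusable class. This is a minimal sketch under assumptions: `ProgressiveRefiner` and its option names are hypothetical, it uses its own integer token in place of the existing `facetCountsReqId`-style counters, and the defaults (250 ms, 1500 ms) are taken from the timings above with 1500 ms chosen arbitrarily from the 1-2 s range.

```typescript
type Pass = "coarse" | "full";

interface RefinerOptions<T> {
  runQuery: (pass: Pass) => Promise<T>;    // e.g. sampled vs full-scan query
  apply: (value: T, pass: Pass) => void;   // write the result into the surface
  setRecomputing: (on: boolean) => void;   // toggle the `.recomputing` state
  debounceMs?: number;                     // delay before the coarse pass
  idleMs?: number;                         // delay before the full pass
}

class ProgressiveRefiner<T> {
  private readonly opts: RefinerOptions<T>;
  private token = 0;            // bumped on every move or filter change
  private fullAppliedFor = -1;  // token whose full result is already shown
  private timers: ReturnType<typeof setTimeout>[] = [];

  constructor(opts: RefinerOptions<T>) {
    this.opts = opts;
  }

  get currentToken(): number {
    return this.token;
  }

  // moveStart: old values stay on screen, marked as stale.
  onMoveStart(): void {
    this.invalidate();
    this.opts.setRecomputing(true);
  }

  // moveEnd: schedule the coarse pass, then the full pass if still idle.
  onMoveEnd(): void {
    this.invalidate();
    this.opts.setRecomputing(true);
    const token = this.token;
    const debounceMs = this.opts.debounceMs ?? 250;
    const idleMs = this.opts.idleMs ?? 1500;
    this.timers.push(setTimeout(() => void this.run("coarse", token), debounceMs));
    this.timers.push(setTimeout(() => void this.run("full", token), idleMs));
  }

  // Any new move or filter change: cancel timers and orphan in-flight queries.
  invalidate(): void {
    this.token += 1;
    for (const t of this.timers) clearTimeout(t);
    this.timers = [];
  }

  private async run(pass: Pass, token: number): Promise<void> {
    try {
      const value = await this.opts.runQuery(pass);
      if (token !== this.token) return; // stale guard
      // A slow coarse query must not overwrite an already-applied full result.
      if (pass === "coarse" && this.fullAppliedFor === token) return;
      this.opts.apply(value, pass);
      if (pass === "full") {
        this.fullAppliedFor = token;
        this.opts.setRecomputing(false);
      }
    } catch (err) {
      console.error(`progressive ${pass} pass failed`, err);
    }
  }
}
```

A surface would create one refiner, call `onMoveStart()` and `onMoveEnd()` from the map's move events, and call `invalidate()` followed by `onMoveEnd()` when a filter changes.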

## Open questions for review

1. **Is A1 + B1 actually the right call?** The earlier framing argued B2 keeps the legend stable and avoids per-pan jitter. We chose B1 because it makes the page coherent and the user is studying data. But B2 + A1 is also defensible — does the review prefer it?
2. **Density-cap threshold for C3 → C2 fallback?** When should auto-point-mode give up and revert to cluster? 5,000 dots? 50,000? Empirically test, or pick a number?
3. **"Snappy vs accurate" explicit toggle?** Or is progressive refinement enough that the user doesn't need to choose? The current lean is no toggle — the page just behaves fast-while-moving and honest-when-still.
4. **Does step 4 (A1) require the FTS work in #168-#172 to land first?** The current search uses `ILIKE` against three text columns. Performance-acceptable for the search-results list at LIMIT 50, but folding it into every count query means scanning the same columns much more often. Might need the BM25 index to be ready before A1 is shippable at scale.
5. **C3's interaction with #231 (point saturation).** Auto-promoting to point mode reveals the yellow-saturation bug at high densities. Should #231's fix (sub-options A/B/C/D in that issue) land before C3, or alongside it?
6. **Should step 6 (#233 heatmap) actually come earlier?** If the spike works, it might supplant the C3 work entirely — no point auto-promoting to point if a filter-honest heatmap is the better visualization. Risk vs. value of trying the spike before committing to C3.

## What's NOT in this issue

- Implementation details for any step (file separate PRs as steps land)
- Refactoring `explorer.qmd` (#163 territory)
- Changes to the parquet generation pipeline (separate repo)
- The C1 path (pre-baked per-facet H3 cubes) — possible but expensive; deferred unless the live-query path proves too slow

## Cross-refs

- #163 — explorer rethink umbrella
- #164 — state contract / search-semantics decision (A1 lands the search-semantics piece)
- #185 — depth-test-distance / point-rendering quirks
- #226 — most recent point-mode UX work
- #230 — remaining facet-counts work (steps 3-4 above subsume / supersede it; will close once those land)
- #231 — point-overlap saturation (may gate step 5; see open question 5)
- #232 — "50+" → real count (step 2)
- #233 — progressive heatmap spike (step 6)
- Design briefing: `~/dev-journal/projects/isamples-facets.md` (private to rdhyee, contains this same content with rougher framing)

## Acceptance for this issue (not the implementation)

- Sign-off from at least one reviewer (Codex, Gemini, or human) on the A1 + B1 + C3/C2 direction (or explicit pushback with rationale).
- Density-cap threshold for step 5 picked or punted to "TBD during implementation."
- Open question 4 (FTS dependency) resolved — does step 4 require #168-#172 to land first?
- Open question 6 (heatmap-first vs C3-first) decided.

Once those are settled, individual PRs follow against the existing tracking issues (#230, #231, #232, #233 plus the `#facetNote` bug to be filed).
