explorer: spike a progressive heatmap layer as filter-honest alternative to cluster mode #233

@rdhyee

Idea

Add a third visualization mode alongside cluster (H3 dots) and point (individual samples): a progressively-refined heatmap that's regenerated per filter change, giving filter-honest density visualization at any zoom.

Surfaced during the 2026-05-22 session that produced #230 / #231 / #232. The motivating problem: the cluster-mode H3 dominant_source lies about facet filters. A cell with 1,200 SESAR-soil samples + 8,000 OpenContext-not-soil samples shows as a red OpenContext dot even when the user has filtered to material=soil. A heatmap regenerated per filter change naturally avoids this, because its density is computed from the filtered data rather than pre-baked.

Why this is interesting

Three problems converge:

  1. Cluster-mode filter dishonesty — see briefing for material=soil example. Cluster H3 summaries don't account for facet filters; the #facetNote is meant to apologize for this, and is bugged on URL load too.
  2. Point-mode density artifacts (#231, "explorer: dense point overlap saturates to yellow, looks like Smithsonian dots"): at high zoom on dense sites, hundreds of translucent dots stack and saturate to spurious yellow/orange.
  3. No "where's the density of my filtered data" view exists. Cluster shows where data is. Point shows individual records. Neither answers "where is the soil I'm filtering for?"

A progressive heatmap, derived per pan/zoom/filter change from a DuckDB-WASM query against the wide parquet, addresses all three.

Approach

Cesium has no built-in heatmap primitive. Three implementation paths:

| Approach | Mechanism | Effort | Quality |
| --- | --- | --- | --- |
| (1) heatmap.js + SingleTileImageryProvider | Render to 2D canvas via heatmap.js; wrap as imagery; drape on globe | 1-2 days MVP | Decent; needs recompute on pan |
| (2) Pre-baked density tile pyramid | Offline TMS/XYZ tiles per zoom level | 1-2 weeks; needs pipeline | Great but stale relative to filters |
| (3) Custom WebGL primitive | GLSL shader doing additive blending with Gaussian kernel | 2-3 weeks | Highest quality + fully dynamic |

For an MVP spike: (1) is the right starting point. It composes with the existing DuckDB-WASM + wide-parquet stack with minimal new infrastructure.
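
For path (1), the query rows have to become pixel-space points before heatmap.js can draw them. Below is a minimal sketch of that step, assuming rows shaped like `{latitude, longitude}` and a viewport given as west/south/east/north degrees. `toHeatmapPoints` is a hypothetical helper, not existing explorer code, and the mapping is linear in degrees on the assumption that the single-tile image is draped over a geographic rectangle.

```javascript
// Hypothetical helper: bin query rows onto canvas pixels for heatmap.js.
// rows:   array of { latitude, longitude } in degrees (assumed row shape)
// bounds: { west, south, east, north } in degrees (assumed viewport shape)
function toHeatmapPoints(rows, bounds, width, height) {
  const { west, south, east, north } = bounds;
  const counts = new Map(); // "x,y" -> number of samples landing on that pixel

  for (const { latitude, longitude } of rows) {
    // Skip anything outside the viewport.
    if (latitude < south || latitude > north) continue;
    if (longitude < west || longitude > east) continue;

    // Linear mapping in degrees; canvas y grows downward, so north maps to y = 0.
    const x = Math.min(width - 1, Math.floor(((longitude - west) / (east - west)) * width));
    const y = Math.min(height - 1, Math.floor(((north - latitude) / (north - south)) * height));

    const key = `${x},${y}`;
    counts.set(key, (counts.get(key) || 0) + 1);
  }

  let max = 0;
  const data = [];
  for (const [key, value] of counts) {
    const [x, y] = key.split(",").map(Number);
    data.push({ x, y, value });
    if (value > max) max = value;
  }
  return { max, data };
}
```

The returned `{max, data}` object has the shape that heatmap.js's `setData` accepts. The rendered canvas would then be handed to a SingleTileImageryProvider whose rectangle matches `bounds`; how that provider is constructed varies by Cesium version, so it is left out here.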

Progressive refinement pattern

The "progressive" part is the interesting trick — show something in <100 ms, then keep improving while the user sits still:

| Time after moveEnd | What renders |
| --- | --- |
| 0 ms | Cached coarse layer for this view (or stale) |
| ~50 ms | H3 res 4 dots (already loaded) |
| ~250 ms | Heatmap from 1% TABLESAMPLE of in-viewport filtered points |
| ~1 s | Heatmap from 10% sample + tighter kernel |
| ~3 s | Heatmap from 100% in-viewport filtered points + sharpest kernel |
| Indefinite | Cache by (viewport-hash, filter-hash); reuse on return |
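
The cache in the last row could be as small as a bounded LRU over a `Map`. This is a sketch under assumptions: `HeatmapCache`, its key format, and the default of 32 entries are all hypothetical, and how the two hashes are computed is left to the caller.

```javascript
// Hypothetical sketch: bounded LRU keyed by (viewportHash, filterHash).
class HeatmapCache {
  constructor(maxEntries = 32) {
    this.maxEntries = maxEntries;
    this.entries = new Map(); // a Map iterates in insertion order, oldest first
  }

  static key(viewportHash, filterHash) {
    return `${viewportHash}|${filterHash}`;
  }

  get(viewportHash, filterHash) {
    const k = HeatmapCache.key(viewportHash, filterHash);
    if (!this.entries.has(k)) return undefined;
    const value = this.entries.get(k);
    // Delete and re-insert so this entry becomes the most recently used.
    this.entries.delete(k);
    this.entries.set(k, value);
    return value;
  }

  set(viewportHash, filterHash, value) {
    const k = HeatmapCache.key(viewportHash, filterHash);
    this.entries.delete(k);
    this.entries.set(k, value);
    if (this.entries.size > this.maxEntries) {
      // Evict the least recently used entry (first key in iteration order).
      const oldest = this.entries.keys().next().value;
      this.entries.delete(oldest);
    }
  }
}
```

Because the filter hash is part of the key, a filter change simply misses the cache. Entries for the old filter age out through eviction and do not need an explicit flush.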

DuckDB-WASM TABLESAMPLE N PERCENT makes the sampling trivial. Cancellation reuses the existing facetCountsReqId / requestId patterns.

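-- Pass N: in-viewport points matching the active filters.
-- Assumes the coordinate columns are named latitude / longitude.
-- Per the DuckDB docs the sample is applied before WHERE, so the percentage
-- is of the scanned rows, not of the filtered set.
SELECT latitude, longitude
FROM read_parquet('${wide_url}')
  TABLESAMPLE ${pctForPassN} PERCENT
WHERE latitude BETWEEN ${s} AND ${n}
  AND longitude BETWEEN ${w} AND ${e}  -- a viewport crossing the antimeridian (w > e) needs two ranges
  ${sourceFilterSQL()}
  ${facetFilterSQL()}
  -- optionally: ${searchFilterSQL()} once #164 lands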
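
The pass sequence and its cancellation could look roughly like the following. This is a sketch, not the explorer's code: `runQuery(pct)` and `render(rows, pct)` are hypothetical stand-ins for the DuckDB-WASM call and the heatmap redraw, and `heatmapReqId` only mirrors the idea behind facetCountsReqId (latest request wins).

```javascript
// Hypothetical request counter: each new pan/filter change bumps it,
// and older runs notice they are stale when their query resolves.
let heatmapReqId = 0;

// Runs the passes in order; returns true if all passes rendered,
// false if a newer request superseded this one along the way.
async function refineHeatmap(runQuery, render, passes = [1, 10, 100]) {
  const myId = ++heatmapReqId;
  for (const pct of passes) {
    const rows = await runQuery(pct);
    if (myId !== heatmapReqId) return false; // stale: drop the result, stop refining
    render(rows, pct);
  }
  return true;
}
```

This pattern discards stale results but does not abort a query that is already running. Whether the in-flight scan can also be interrupted depends on what the DuckDB-WASM connection exposes.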

What you'd get

A new toggle alongside cluster/point: "Heatmap." When active:

  • Cluster's #facetNote apology goes away — heatmap IS the filter-aware density view.
  • Point mode's saturation bug (#231) becomes moot at world/regional zoom, where heatmap supplants point-mode at high counts.
  • A clean visual answer to "where is the soil I'm filtering for?" / "where is the pottery search matching?"
  • Works as the natural "honest cluster" path (option C1 in the design briefing).

What's hard

  1. Kernel sigma calibration. Screen-pixel-stable kernel vs world-meter-stable kernel give very different impressions; need to pick one (or expose as a toggle).
  2. Color ramp. Avoid re-creating #231's saturation bug: heatmap.js gradients are well-tested but need tuning per dataset density.
  3. Cache invalidation. When the user changes filters, every heatmap cached under the previous filter-hash stops matching the current view. A reasonable LRU with bounded memory should be enough.
  4. First-pass coldness. Even the 1% sample is a fresh parquet scan. If users pan fast, every pan triggers a fresh scan. Debounce 250-500 ms before kicking off the first pass.
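
The two kernel options in item 1 differ only in how the pixel radius is derived. The sketch below is illustrative: both function names are hypothetical, and the meters-per-degree constant is the usual approximation for latitude, which ignores projection effects.

```javascript
// Approximate length of one degree of latitude, in meters (assumption for illustration).
const METERS_PER_DEGREE_LAT = 111320;

// World-meter-stable kernel: a fixed ground radius, converted to pixels
// from the viewport's latitude span and the canvas height.
function worldStableRadiusPx(radiusMeters, south, north, heightPx) {
  const metersPerPixel = ((north - south) * METERS_PER_DEGREE_LAT) / heightPx;
  return radiusMeters / metersPerPixel;
}

// Screen-pixel-stable kernel: the radius ignores the viewport entirely.
function screenStableRadiusPx(radiusPx) {
  return radiusPx;
}
```

With a world-stable kernel, zooming in by 10× makes each blob 10× wider on screen. A screen-stable kernel keeps blobs the same size and lets them separate as the points spread apart. Either value could be passed as the heatmap.js `radius` option.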

Effort estimate

| Phase | Work | Time |
| --- | --- | --- |
| Spike: 1-pass heatmap.js on moveEnd | Wire up SingleTileImageryProvider, basic query, basic styling | 2-3 days |
| Add progressive (1% → 10% → 100%) | Cancellation + 3 passes + incremental setData | 1-2 days |
| Polish: cache, kernel adaptation, color ramp | Iteration; bug-fix passes | 1-2 days |
| MVP total | | ~1 week |

A custom WebGL primitive (option 3) would be a separate multi-week project, deferred until the MVP proves the value.

Out of scope (for this issue)

  • Replacing cluster or point mode (heatmap is a third mode, not a replacement)
  • The H3 generation pipeline (separate repo)
  • Anything that requires server-side compute

Acceptance for the spike

  • Toggleable "Heatmap" mode in the explorer UI.
  • On moveEnd + any filter change, renders a filter-aware density heatmap of in-viewport samples.
  • Visible within ~250 ms (coarse), refines for ~3 s, cached after.
  • Cancels cleanly when user pans/types again.

Cross-refs

Disposition

This is a possible-idea / spike request, not yet a commitment. File it, sit with it, decide later whether to allocate the week. Closing as "won't do" is a fine outcome if the existing cluster+point model is good enough once #231 and the #facetNote-on-URL-load bugs are fixed.

Labels: enhancement (New feature or request), explorer (Interactive Explorer features), needs-discussion (Requires team input before implementing)