
Restructuring: top-level layout + workflow/papers/scratch division#197

Merged
cailmdaley merged 116 commits into develop from cleanup/restructuring on Jun 20, 2026

@cailmdaley cailmdaley commented Jun 5, 2026

Collaborator

Restructuring: top-level layout + workflow / papers / scratch split

Reorganizes sp_validation around one principle — the things you run live at the top — with a clean split between analysis, paper figures, and personal scratch, plus a modular Snakemake workflow built for more than one person. Base branch: develop (untouched).

Layout

sp_validation/
├── src/sp_validation/   library code (incl. the glass_mock core)
├── cosmo_val/           shear / cosmology validation — code + config
├── cosmo_inference/     inference — code + config (cosmosis / cosmocov)
├── workflow/            ALL analysis — modular Snakemake, multi-person → results/
├── papers/              final-figure assembly only — bmodes, catalog, cosmo_val, harmonic
├── scripts/             reduction scripts (+ examples/, glass_mock/)
├── scratch/             per-person ad hoc work, tracked (cdaley/, guerrini/)
├── results/             analysis products + diagnostic plots (contents gitignored)
└── docs/  config/  + tests under src/sp_validation/tests/

Previously cosmo_val was buried inside notebooks/ while cosmo_inference/ sat at the top, so you had to hunt for where each piece lived. Now the things you actually run sit side by side, sharing src/ underneath.

Division of labor

The boundary is the inputs to a paper figure: everything up to that point is analysis, the figure itself is presentation.

  • workflow/ — all analysis. Generic, reusable, modular. Produces analysis products and diagnostic plots into results/. The bulk of the work lives here.
  • papers/{name}/ — final-figure assembly only. Figure PDFs, colour, layout. Tied to one paper; may never touch Snakemake.
  • scratch/{person}/ — personal, ad hoc, tracked. One-off experiments and custom workflows; tracked because seeing each other's scratch is useful.

Modular workflow

Nothing here is computed once — the catalog changed ~20× in the first release, and every paper varies the data vector, covariance, and inference. So the workflow is parameterized: the rules are shared, the config changes per run. Snakemake's module directive imports the rules under your own config and an output prefix, with per-rule override:

module analysis:
    snakefile: "../../workflow/Snakefile"
    config:    config           # this run's catalog, cuts, blind
    prefix:    "results/bmodes"  # products land here — no clobbering
use rule * from analysis

Each run namespaces under results/{run}/; a --dry-run on every composition guards against silent breakage as the structure grows.
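
A dry-run guard of the kind described above could be sketched as follows. This is only an illustration, not the repository's guard suite: the helper names `dry_run_command` and `check_compositions` and the example path are hypothetical, and the commands are assumed to be launched from the repository root. Only the `snakemake --snakefile ... --dry-run` invocation itself is a real CLI form.

```python
import subprocess
from pathlib import Path


def dry_run_command(snakefile: Path) -> list[str]:
    """Build the snakemake dry-run command for one composition (hypothetical helper)."""
    return ["snakemake", "--snakefile", str(snakefile), "--dry-run"]


def check_compositions(snakefiles: list[Path]) -> dict[str, bool]:
    """Dry-run every composition and record whether its DAG still resolves."""
    results = {}
    for snakefile in snakefiles:
        proc = subprocess.run(
            dry_run_command(snakefile),
            capture_output=True,
            text=True,
        )
        # A non-zero exit code means a rule, input, or config path no longer resolves.
        results[str(snakefile)] = proc.returncode == 0
    return results


# Hypothetical composition path, used here only to show the command that would run.
example_command = dry_run_command(Path("papers/bmodes/Snakefile"))
```

Because a dry run only builds the job graph, such a check is cheap enough to run on every composition each time files move.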

Notebooks

The analysis tree is now notebook-free — the top-level notebooks/ directory is gone, and every notebook was moved to a proper home, converted, or deleted:

  • Reusable code → library / scripts. Reduction-notebook logic lives in src/sp_validation/; runnable workflows in scripts/examples/ (e.g. extract_info.py, calibrate_comprehensive_cat.py, leakage_minimal.py).
  • Tutorial → docs. tutorial_UNIONS_SP_v1.0 is now the live Sphinx page "Using the weak-lensing catalogues".
  • Paper-plot notebooks → papers/. The harmonic plot set and the catalog check_gaia plot moved under papers/{harmonic,catalog}/ (kept as notebooks — final-figure assembly).
  • cosmo_val/ resolved. Generic helpers lifted into src/sp_validation/basic.py; the five working notebooks converted to jupytext percent-light scripts under scratch/guerrini/. cosmo_val/ now holds only code + config + README.
  • Obsolete deleted. The exploratory reduction notebooks, glass_mock/validate_glass_mock, and defunct/ (quarantined since 2024) — all recoverable from develop history.
  • Discipline via tooling. nbstripout strips notebook outputs on commit, plus a large-file pre-commit hook (pre-commit install; see CONTRIBUTING).
Full per-notebook ledger (every .ipynb moved or deleted vs develop)

Moved → papers/harmonic/ (from cosmo_inference/notebooks/2D_harmonic_space_cosmic_shear_plots/, preserved as notebooks):
2025_09_26_plot_contours + its variants (_NL_modelling, _blind, _cl_vs_xi, _covariance, _glass_mock, _iNKA_vs_OneCov, _leakage, _scale_cut, _small_vs_large_scales, _weak_lensing), 2025_10_28_plot_whisker, S8_whisker

Moved → papers/catalog/ (from notebooks/cosmo_val/catalog_paper_plot/):
check_gaia

Promoted → docs (from notebooks/analyse_shear_cat/):
tutorial_UNIONS_SP_v1.0 → docs/source/using_the_catalogues.md

Converted → scripts in scratch/guerrini/ (from notebooks/cosmo_val/; reusable helpers chi2_and_pte, corr_from_cov, cov_from_one_covariance lifted to src/sp_validation/basic.py):
compute_pte_cell.py, one_covariance.py, plot_comparison.py, get_prior_leakage.py, exploration.py (+ its namaster_utils.py helper)

Deleted (recoverable from develop history):

  • notebooks/: main_set_up, metacal_global, metacal_local, psf_leakage, maps, maps_local, match_stats, correlation, cosmology, write_cat, frac_error_local_calib, analyse_matched_stars_UNIONS_HSC, analyse_shear_cat/m2_SP_LF_alpha
  • glass_mock/: validate_glass_mock
  • defunct/: TD_WL_cycle2_2021, validation_local_cal

glass_mock

The generation core folded into src/sp_validation/glass_mock.py; the runner scripts moved to scripts/glass_mock/. The top-level glass_mock/ directory is gone.

Still open

Intentionally left for follow-up passes:

  • cosmo_inference/ notebooks still need the same notebook-free treatment as cosmo_val/: the 2D cosmic-shear paper plots belong in papers/ (parallel to papers/harmonic/), and the working notebooks in scratch/ or scripts/. This is deferred to a careful pass on Candide, where the inference stack runs, so each migration can be executed and verified rather than transformed blind.
  • scratch/ promotion candidates — some scratch material may still deserve a home in src/ or workflow/, most notably scratch/guerrini/namaster_utils.py, a NaMaster covariance toolkit that overlaps cosmo_val/harmonic_covariance_gaussian_sims.py. Consolidating the two (with tests) is a covariance-API decision best left to Sacha.

Safety net

A back-pressure guard suite holds structural invariants as files move — imports + standalone-scripts/ resolution, snakemake -n dry-runs, config-path existence, symlink integrity, and a dangling-reference / move-map guard. Sacha's develop foundation (paper plots, harmonic configs, library changes) is folded in, with the cosmology.py mnu KeyError fixed.
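
As an illustration of one of these invariants, a symlink-integrity check might look like the sketch below. The function name `dangling_symlinks` and the temporary files are hypothetical; the actual guard suite is not shown in this thread and may work differently.

```python
import os
import tempfile
from pathlib import Path


def dangling_symlinks(root: Path) -> list[Path]:
    """Return symlinks under `root` whose targets do not exist."""
    broken = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            # is_symlink() inspects the link itself; exists() follows it to the target.
            if path.is_symlink() and not path.exists():
                broken.append(path)
    return sorted(broken)


# Small self-contained demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "real.txt").write_text("data")
    (root / "good_link").symlink_to(root / "real.txt")
    (root / "bad_link").symlink_to(root / "missing.txt")
    broken_names = [p.name for p in dangling_symlinks(root)]
```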

— Claude on behalf of Cail

sachaguer and others added 30 commits March 3, 2026 15:55
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fold Sacha's pending foundation (PR #192 head, sachaguer:develop @ c22f075)
onto current develop so the restructuring builds on his foundation without
racing his merge gesture (Cail's direction, 2026-06-05).

.gitignore conflict resolved in favour of develop: kept the .felt tracking
block, rejected sacha's broad cluster bans (*.png *.sh *.fits *.out *.err) —
those get narrowed during the restructuring gitignore pass, not adopted
wholesale. cosmo_val.py / cat_config.yaml auto-merged cleanly (origin's
docstring-RST polish + sacha's functional changes did not collide).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
cosmology.py get_cosmo read planck_defaults["mnu"] but the dict never
defined the key, so every bare get_cosmo() call (no ccl_params, no mnu
arg) raised KeyError: 'mnu'. Add "mnu": PLANCK18["m_nu"] (0.06 eV).

Verified: test_cosmology.py 26/26 pass (was immediate KeyError before).
This is the one blocker that kept Sacha's foundation from running clean.
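
The failure mode and the fix can be reproduced in miniature. In the sketch below, `get_mnu` is a simplified, hypothetical stand-in for the lookup inside get_cosmo, and the `h` entry is an illustrative value; only the `"mnu"` key, `PLANCK18["m_nu"]`, and the 0.06 eV value come from the commit message.

```python
# PLANCK18 and planck_defaults are named in the commit message; their full
# contents are not shown there, so everything except "m_nu"/"mnu" is illustrative.
PLANCK18 = {"m_nu": 0.06}  # eV, as stated in the commit message

defaults_before = {"h": 0.6766}                            # no "mnu" key: the bug
defaults_after = {"h": 0.6766, "mnu": PLANCK18["m_nu"]}    # the fix


def get_mnu(planck_defaults, mnu=None):
    """Hypothetical stand-in: fall back to the defaults dict when mnu is not given."""
    return planck_defaults["mnu"] if mnu is None else mnu


try:
    get_mnu(defaults_before)
    raised = False
except KeyError as err:
    raised = True
    missing_key = err.args[0]

fixed_value = get_mnu(defaults_after)
```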

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
docs/source/sp_validation.*.rst are regenerated on every docs build by
sphinx-apidoc (deploy-docs.yml: `sphinx-apidoc -feTMo docs/source
src/sp_validation`), matching the already-ignored fortuna.*/scripts.*
stubs — they should never be committed.

uv.lock: the container is the canonical runtime (CLAUDE.md), the lockfile
has never been tracked, so ignore it rather than make an unowned
pinned-dep commitment. One-line flip to track if we decide to pin.

Establishes a clean base for the restructuring branch.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Sacha's branch removed the cosmosis_pipeline_glass_mock_0*.ini and
_v0*.ini ignore patterns, which un-ignored ~700 generated glass-mock
pipeline configs in cosmo_inference/cosmosis_config/. Restore the two
specific patterns (not broad bans) so the tree returns to develop's
clean state. These are generated artifacts, never tracked.
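
The difference between a specific pattern and a broad ban can be checked with Python's `fnmatch`, which approximates gitignore globbing for simple basename patterns like these. The file names and the broad `*.ini` pattern below are hypothetical and used only for contrast; the specific pattern is the one quoted in the commit message.

```python
from fnmatch import fnmatchcase

specific = "cosmosis_pipeline_glass_mock_0*.ini"  # pattern quoted in the commit message
broad = "*.ini"                                   # hypothetical broad ban, for contrast

files = [
    "cosmosis_pipeline_glass_mock_00012.ini",  # hypothetical generated config
    "cosmosis_pipeline_fiducial.ini",          # hypothetical hand-written config
]

# The specific pattern hides only the generated file; the broad one hides both.
ignored_specific = [f for f in files if fnmatchcase(f, specific)]
ignored_broad = [f for f in files if fnmatchcase(f, broad)]
```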

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
claude added 9 commits June 16, 2026 14:31
Repoint params.py config references (CLAUDE.md, quickstart.rst,
run_validation.md, post_processing.md, prepare_patch_for_spval.sh) at
scripts/examples/params.py, and the extract_information.* reference at the
moved scripts/examples/extract_info.py. Register notebooks/params.py ->
scripts/examples/params.py in the dangling-move-references guard.
Convert the user-facing tutorial_UNIONS_SP_v1.0 notebook (removed in the
notebooks/ cleanup) into a Sphinx User Guide page rather than deleting it.
Updated for the HDF5 catalogue format (>= v1.4.1) and cross-referenced to
sp_validation.calibration.get_calibrated_m_c / get_calibrate_e_from_cat,
which now automate the hand-rolled metacalibration steps.

https://claude.ai/code/session_011QuJMSPvnpsBkr7PBZVcPQ
The cosmo_val/ module was promoted from notebooks/cosmo_val/ during phase-2
with its notebooks riding along untouched. Resolve them per "reasonably
reusable code -> library, the rest -> scratch scripts":

- Lift three generic helpers into src/sp_validation/basic.py:
  chi2_and_pte, corr_from_cov, cov_from_one_covariance.
- Convert the five notebooks to jupytext percent-light .py under
  scratch/guerrini/ (magics guarded like run_cosmo_val.py, hardcoded
  paths preserved verbatim). compute_pte_cell and one_covariance now
  import the lifted helpers and SquareRootScale from sp_validation.rho_tau.
- Move the investigation namaster/ bundle (utils.py -> namaster_utils.py,
  exploration.ipynb -> exploration.py) and repair a latent broken import
  (sp_validation.utils_cosmo_val -> rho_tau) the phase-2 move missed.

cosmo_val/ is now notebook-free.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011QuJMSPvnpsBkr7PBZVcPQ
Conversion shipped in e3147b5; record outcome, closed status, and handoff.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011QuJMSPvnpsBkr7PBZVcPQ
claude added 2 commits June 17, 2026 14:50
Four .tldr/status + .tldrignore files (auto-generated by the TLDR tool,
status 'stopped') were committed by accident during the restructuring.
Remove them and add the patterns to .gitignore so they don't return.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011QuJMSPvnpsBkr7PBZVcPQ
Capture the cosmo_inference notebook reorg (make it notebook-free like
cosmo_val) as a TODO fiber, to be done carefully on Candide where the
inference stack runs. Also gitignore felt's SQLite WAL/shm sidecars.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011QuJMSPvnpsBkr7PBZVcPQ
@martinkilbinger

Contributor

Can we cluster the scripts that are used for calibration? At the moment (this might change for v2.0) these would be:

  • extract_info.py (to extract relevant metacal and other information from final shapepipe output)
  • create_joint_comprehensive_cat.py (create joined comprehensive catalogue merging all patches; v2.0 no longer has patches)
  • demo_apply_hsp_masks.py (add structural masks)
  • calibrate_comprehensive_cat.py (performs galaxy selection and calibrates)
    In the last step, config files from config/calibration are used.

@martinkilbinger

Contributor

Please move to scratch:

  • analyse_matched_stars_UNIONS_HSC.ipynb (-> as py script)
  • demo_binned_mask.py
  • plot_binned_quantities.py

These can be deleted:

  • leakage_minimal.py
  • tests_bump.py
  • create_shear_mb_empty.py
  • demo_add_bands.py
  • demo_add_bands_to_empty.py
  • des_y3_cat.py
  • star_response.py

@sachaguer sachaguer left a comment

Contributor

The PR looks fine to me.

Regarding the namaster_utils script being redundant with other scripts, I will handle that when I upgrade the cosmo_val harmonic-space scripts to tomography.
One objective is to move namaster_utils into the source so that it can be used directly, with cosmo_val used to iterate on the basic functions. I will use it to refactor the script that computes the Gaussian simulations with less code duplication. I would keep this for the next PR.

Some paper scripts from the configuration-space paper remain to be moved to the papers directory. The folder 2D_consistency_check can be moved there as well; it contains figures on the consistency between harmonic and configuration space.

I am sure we will find bugs when we try to run some of the scripts but, as far as I am concerned, the objective of this PR, refactoring the global structure of the repository, has been met.

cailmdaley and others added 2 commits June 20, 2026 02:40
…cv_init

These per-version rules called cv_params(version_list=["{version}"]) — a Python
call evaluated at parse time, so the literal string "{version}" reached
CosmologyValidation and raised KeyError('Version string {version} not found').
xi/rho_tau worked (string-param substitution); cv_pseudo_cl worked (iterates the
real CV_VERSIONS). Now cv_init is a lazy 'lambda w: cv_init_params(config,
[w.version])' so snakemake passes the resolved wildcard. Surfaced by the first
real cosmo_val run (job 791399/791400).
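
The parse-time versus lazy distinction can be shown without Snakemake. In this sketch `cv_init_params` is a simplified stand-in for the real function (only its name and the error text come from the commit message), the version names are hypothetical, and a `SimpleNamespace` plays the role of the wildcards object that Snakemake passes to a params function.

```python
from types import SimpleNamespace

config = {"versions": {"v_a", "v_b"}}  # hypothetical version names


def cv_init_params(config, version_list):
    """Simplified stand-in: reject any version that the config does not know."""
    for version in version_list:
        if version not in config["versions"]:
            raise KeyError(f"Version string {version} not found")
    return {"version_list": version_list}


# Eager: the call runs when the rule is parsed, so the placeholder is still literal.
try:
    cv_init_params(config, ["{version}"])
except KeyError as err:
    eager_error = err.args[0]

# Lazy: the callable is invoked later with the resolved wildcards.
lazy_params = lambda w: cv_init_params(config, [w.version])
resolved = lazy_params(SimpleNamespace(version="v_b"))
```

Plain string params such as `"{version}"` are formatted by Snakemake itself, which is consistent with the xi/rho_tau rules working while the Python call did not.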

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…er, consistency move

Martin's review:
- Cluster the calibration pipeline into scripts/calibration/: extract_info →
  create_joint_comprehensive_cat → demo_apply_hsp_masks → calibrate_comprehensive_cat
  (+ the shared params.py template + a README documenting the 4-step order).
- Delete dead example scripts: leakage_minimal, create_shear_mb_empty,
  demo_add_bands{,_to_empty}, des_y3_cat, star_response (tests_bump already gone).
- Move to scratch/kilbinger/: demo_binned_mask, plot_binned_quantities, and
  analyse_matched_stars_UNIONS_HSC (recovered from history, converted .ipynb→.py).

Sacha's review (procedural):
- Relocate the config-vs-harmonic consistency folder
  cosmo_inference/notebooks/2D_cosmic_shear_consistency → papers/consistency.

Docs + guards: post_processing/run_validation/quickstart/CLAUDE.md/prepare_patch
updated to the new scripts/calibration/ paths; MOVE_MAP gains the consistency
retirement and the params.py target; gitignore the cosmo_val SLURM run logs.
Structural guards green (10/10 package-free); test_basic + test_calibration
value-drift pins green in-container.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
cailmdaley added a commit that referenced this pull request Jun 20, 2026
scratch/guerrini/ and the namaster_utils→source / Gaussian-sims work he
reserved for his next PR in the #197 review. So future workers don't touch it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The image rebuild pulled scipy 1.18.0 (py3.12+), which changed interpolators
to return arrays for scalar input. That breaks camb 1.6.x's BBN Y_He predictor
inside CAMBparams.set_cosmology — `self.YHe = Y_He(...)` raises
"TypeError: only 0-dimensional arrays can be converted to Python scalars".
Every get_cosmo-backed test (test_cosmology, test_cosmo_val, test_glass_mock —
27 failures) funnelled through it; none are code regressions.

Confirmed by isolated repro: scipy 1.18.0 + camb 1.6.6 fails the exact call,
scipy 1.17.x passes. Cap until camb ships a scipy-1.18-compatible BBN predictor.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@cailmdaley cailmdaley marked this pull request as ready for review June 20, 2026 11:47
@cailmdaley

Collaborator Author

Thanks for the comments both of you! @martinkilbinger I have started addressing your requests in #199. Will merge this now so we can keep moving and do targeted bugfix/improvement PRs directly to develop.

@cailmdaley cailmdaley merged commit 4b7e0c6 into develop Jun 20, 2026
2 checks passed
@cailmdaley cailmdaley deleted the cleanup/restructuring branch June 20, 2026 13:04
cailmdaley added a commit that referenced this pull request Jun 23, 2026
)

* refactor(src): dissolve basic.py into calibration + statistics

The module named "basic" was a grab-bag: the 546-line `metacal` response
class (the heart of shear calibration) plus galaxy-selection masks and a
handful of cosmology-independent statistics helpers — none of which "basic"
described. Its symbols now live where they belong, and basic.py is deleted.

- `metacal` class + `mask_gal_size`/`mask_gal_SNR` (galaxy selection) →
  calibration.py, joining the m/c routines that already consumed a
  `gal_metacal` instance. One subsystem, one module.
- `jackknif_weighted_average2`, `corr_from_cov`, `chi2_and_pte`,
  `cov_from_one_covariance` → new statistics.py (a clean leaf: numpy/scipy
  only; calibration imports the jackknife from it).
- Every importer repointed (papers, scripts, the two scratch/guerrini
  import lines — path-only, his logic untouched); dead `from sp_validation
  import basic` lines removed from calibration.py and cat.py; `__all__` and
  the architecture docs updated.
- Tests split: metacal + mask pins → test_calibration.py, jackknife pin →
  test_statistics.py; test_basic.py removed.

All moved code is byte-identical to the original (md5-verified); value-drift
pins (metacal R-matrix rtol 1e-12) and the full suite pass in-container,
except the pre-existing galaxy/cs_util.size old-sandbox gap. No circular
imports. Verified by an adversarial multi-agent pass (byte-identity,
no-stale-refs, value-pins, no-cycles).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* felt: reserved-sasha — document Sasha's hands-off zones

scratch/guerrini/ and the namaster_utils→source / Gaussian-sims work he
reserved for his next PR in the #197 review. So future workers don't touch it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(statistics): pin the extracted helpers; drop dead code in calibration

Follow-up polish on the basic.py dissolution (#199).

Tests — characterization (value-drift) coverage for the three statistics.py
helpers that had none, with literals generated by running the real functions
in-container and teeth on each:
- corr_from_cov: unit diagonal + reconstruction from cov/outer(std,std)
- chi2_and_pte: diagonal reduces to sum((d/sigma)^2) with matching scipy PTE,
  plus a non-diagonal case exercising the full d^T C^-1 d path
- cov_from_one_covariance: gaussian(col 10) vs non-gaussian(col 9) selection
  and a row-major-layout check (a transpose would be caught)
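
The two properties pinned for corr_from_cov and the diagonal case of chi2_and_pte can be written out in a dependency-free sketch. The signatures of the real helpers are not shown in this thread, and statistics.py is described as numpy/scipy based, so the functions below (including the name `chi2_diagonal`) are illustrative re-implementations, not the library code; the PTE step is left out.

```python
import math


def corr_from_cov(cov):
    """Correlation matrix: cov[i][j] / (std[i] * std[j])."""
    n = len(cov)
    std = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (std[i] * std[j]) for j in range(n)] for i in range(n)]


def chi2_diagonal(data, sigma):
    """chi^2 for a diagonal covariance: sum((d / sigma)^2)."""
    return sum((d / s) ** 2 for d, s in zip(data, sigma))


cov = [[4.0, 1.0], [1.0, 9.0]]                 # standard deviations 2 and 3
corr = corr_from_cov(cov)                      # unit diagonal, off-diagonal 1/6
chi2 = chi2_diagonal([2.0, -3.0], [2.0, 3.0])  # 1 + 1
```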

Calibration — strictly behavior-preserving dead-code removal:
- 3 unused module imports (util, io, get_footprint — verified unreferenced)
- an unused local (col_noshear) in metacal._read_data
- the uncallable metacal._return method (defined without self, references
  self.* in its body — would NameError if ever invoked; referenced nowhere)
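
Why such a method is uncallable can be seen on a toy class. `Demo` below is hypothetical and only reproduces the shape described (no `self` parameter, `self.*` used in the body). Note that the error depends on how the method is reached: through an instance it fails earlier with a TypeError, and through the class it fails with the NameError mentioned above.

```python
class Demo:
    """Hypothetical class reproducing the shape of the dead method."""

    def __init__(self):
        self.value = 1

    def _return():  # no `self` parameter
        return self.value


# Through an instance, Python passes the instance as an unexpected positional argument.
try:
    Demo()._return()
except TypeError:
    instance_call = "TypeError"

# Through the class, the body runs and `self` is an undefined name.
try:
    Demo._return()
except NameError:
    class_call = "NameError"
```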

Value pins (metacal R-matrix, m/c bias) stay green; conservatively skipped any
change that would reorder float ops or restructure an estimator. Verified by an
adversarial behavior-preservation review.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(src): delete dead info.py, fix cat.py version import

info.py had zero importers; its only content was a redundant
__name__ = 'sp_validation'. cat.py imported __version__ and __name__
from the package root but used only __version__ (line 607); the
software name at line 606 is already hardcoded. Repoint to the
canonical home: from sp_validation.version import __version__.

Register the retired import path in the dangling-move guard.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(src): make package __all__ honest

The old __all__ listed modules nobody imports through the package
(io, plot_style, cosmo_val) and omitted the two genuinely public
diagnostic modules rho_tau and b_modes. Replace it with the real
public surface, alphabetised, and drop the stale commented-out
explicit-import block. Nothing does `from sp_validation import *`,
so this is purely a documentation fix.

(util and run_joint_cat are renamed in the Tier-2 commits.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(cosmo_val): hoist b_modes imports to module top level

Five methods each imported from .b_modes inside their bodies. b_modes
is import-time side-effect-light (it pulls only .cosmology) and has no
back-edge to cosmo_val, so the locals were defensive, not necessary.
Consolidate the union of imported names into one top-level block next
to the existing cosmology/rho_tau imports and drop the five inline
imports. test_cosmo_val: 11 passed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(run_joint_cat): drop shadowed confusion_matrix def

Two module-level defs named confusion_matrix existed; the first
(mask, confidence_level=0.9) was a near-duplicate of correlation_matrix
and was unconditionally shadowed by the second
(prediction, observation) ~40 lines later. Every caller — in
scripts/calibration and scripts/examples — uses the
(prediction, observation) signature. Remove the dead first def.
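
Shadowing of this kind is easy to demonstrate: a second module-level `def` with the same name simply rebinds it. The function bodies below are placeholders; only the two signatures come from the commit message.

```python
import inspect


def confusion_matrix(mask, confidence_level=0.9):
    return "first definition"


def confusion_matrix(prediction, observation):  # rebinds the module-level name
    return "second definition"


# Only the later definition is reachable through the name.
active_params = list(inspect.signature(confusion_matrix).parameters)
result = confusion_matrix(1, 2)
```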

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor: remove dead cluster/convergence-map helpers

Five functions had zero callers anywhere in the repo (src, scripts,
papers, scratch, notebooks), including across the star-imports of
plots:
  cosmology.py: get_clusters, stack_mm3, gamma_T_tc, xi_gal_gal_tc
  plots.py:     plot_map_stacked
Removing the cosmology block also orphaned the imports that existed
solely for it (treecorr, fits, canfar, radec2xy, cKDTree, tqdm,
get_footprint); drop those too. test_cosmology + test_plots: 29 passed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(src): rename util -> format; drop dead transform_nan import

util.py held only millify / print_millified — number formatting, not a
grab-bag. Rename it to format.py and sweep all importers:
  internal: cat.py, run_joint_cat.py (util.millify -> format.millify)
  scripts:  apply_alpha.py, examples/demo_calibrate_minimal_cat.py,
            calibration/extract_info.py (star-import x2)
  papers:   catalog/hist_mag.py
Register the rename in the dangling-move guard; update package __all__.

B1: scripts/plot_leakage.py imported transform_nan from the old util
module — a symbol removed from the library long ago and never used in
the script. Drop the dead import.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(src): move SquareRootScale to plots, re-export from rho_tau

SquareRootScale is a matplotlib ScaleBase subclass — plotting
infrastructure, not rho/tau logic. Move the class and its
register_scale call into plots.py (which now carries the
matplotlib.scale/ticker/transforms imports it needs). rho_tau.py keeps
a compat re-export so existing
`from sp_validation.rho_tau import SquareRootScale` callers — in
cosmo_inference, scratch/guerrini, papers/harmonic — still resolve.
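
A compat re-export means the old import path resolves to the very same class object as the new one. The sketch below simulates this with two throwaway in-memory modules; the `demo_plots` and `demo_rho_tau` names are hypothetical stand-ins for plots.py and rho_tau.py, and the class body is a placeholder.

```python
import sys
import types

# New home of the class (stand-in for plots.py).
demo_plots = types.ModuleType("demo_plots")
exec("class SquareRootScale:\n    pass", demo_plots.__dict__)
sys.modules["demo_plots"] = demo_plots

# Old home keeps a one-line re-export (stand-in for rho_tau.py).
demo_rho_tau = types.ModuleType("demo_rho_tau")
exec("from demo_plots import SquareRootScale", demo_rho_tau.__dict__)
sys.modules["demo_rho_tau"] = demo_rho_tau

from demo_plots import SquareRootScale as via_new_path
from demo_rho_tau import SquareRootScale as via_old_path

same_object = via_old_path is via_new_path
```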

workflow/scripts/plotting_utils.py holds a near-duplicate that
diverges (ScalarFormatter(useMathText=True); inverted transform method
named transform_non_affine not transform), so it is left in place.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(papers/harmonic): repoint SquareRootScale off dead utils_cosmo_val

Six harmonic-space scripts imported SquareRootScale from
sp_validation.utils_cosmo_val, a module that no longer exists. Repoint
to sp_validation.rho_tau (the compat re-export), matching the sibling
2026_03_17 script. get_params_rho_tau was already correctly imported
from rho_tau. No other utils_cosmo_val imports remain (the two
scratch/guerrini mentions are prose noting the module's removal).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(src): rename run_joint_cat -> catalog_builders

The module holds the catalogue-builder runner classes (JointCat,
ApplyHspMasks, CalibrateCat) and their run_* entry-point functions;
catalog_builders names that role. Sweep all importers (the
`as sp_joint` alias is preserved, only the module name changes):
  scripts/apply_hsp_masks.py
  scripts/examples/{demo_check_footprint, create_binned_mask_comprehensive,
    demo_comprehensive_to_minimal_cat, demo_create_footprint_mask,
    demo_calibrate_minimal_cat}.py
  scripts/calibration/{create_joint_comprehensive_cat (direct symbol),
    demo_apply_hsp_masks, calibrate_comprehensive_cat}.py
  scratch/kilbinger/demo_binned_mask.py
  papers/catalog/hist_mag.py
Also update the two prose references in docs and update __all__ and
the dangling-move guard.

The OPTIONAL masks.py extraction (Mask + mask-algebra fns) is deferred:
those symbols are reached externally through the sp_joint.* module
alias (Mask, get_masks_from_config, print_mask_stats across 4 scripts +
papers/catalog), so splitting them out would require either re-exports
or sweeping the public call surface — beyond a behavior-preserving move.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Extract masks.py from catalog_builders.py

Move the healsparse-backed spatial-masking cluster out of
catalog_builders.py into a dedicated masks.py: the Mask class plus
get_masks_from_config, print_mask_stats, correlation_matrix, and
confusion_matrix. Bodies are byte-identical; the move carries the
numexpr/scipy.stats imports those helpers need (now removed from
catalog_builders, which no longer references them).

catalog_builders.py re-exports the five symbols from sp_validation.masks
so external code using `from sp_validation import catalog_builders as
sp_joint` keeps resolving sp_joint.Mask, sp_joint.get_masks_from_config,
etc. The *Cat runner classes (ApplyHspMasks, ReadCat, run_* entry points)
stay; ApplyHspMasks uses healsparse directly, not the Mask class.

masks added to __init__.__all__. No MOVE_MAP entry: this is an
extraction-in-place (catalog_builders survives), not a retired path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Rename cat.py -> catalog.py: catalogue data layer

The catalogue module pair now reads by role: catalog.py is the data
layer (read/write/column-access/matching free functions), catalog_builders.py
is the construction pipeline (runner classes built on it). Module docstrings
state this hierarchy explicitly.

Behaviour-preserving. Every importer of the local sp_validation.cat module is
swept to sp_validation.catalog, each preserving its local binding (bare
`import cat` forms gain `as cat` so function bodies are unchanged). The
cs_util `cat` import is a different module and is left untouched throughout.

The dangling-reference guard registers the retired flat-import form
`sp_validation.cat import` rather than the bare `sp_validation.cat` token,
which would false-positive on the live `sp_validation.catalog` /
`sp_validation.catalog_builders` modules (same prefix trap as glass_mock).
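
The prefix trap can be reproduced with a plain substring search. This is a sketch of the idea only, not the guard's actual implementation; the example lines and the `read_cat` symbol are hypothetical.

```python
import re

lines = [
    "from sp_validation.cat import read_cat",             # retired form
    "from sp_validation.catalog import read_cat",         # live module
    "import sp_validation.catalog_builders as sp_joint",  # live module
]

bare_token = re.compile(re.escape("sp_validation.cat"))
import_form = re.compile(re.escape("sp_validation.cat import"))

# The bare token also matches the live modules that share the prefix.
bare_hits = [line for line in lines if bare_token.search(line)]
# The flat-import form matches only the retired path.
import_hits = [line for line in lines if import_form.search(line)]
```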

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Deleted obsolete script cosmo_val/match_LF_SP.py

* Deleted old scripts/create_joint_shape_cat.py

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Martin Kilbinger <martin.kilbinger@cea.fr>
cailmdaley added a commit that referenced this pull request Jun 30, 2026
* refactor(src): dissolve basic.py into calibration + statistics

The module named "basic" was a grab-bag: the 546-line `metacal` response
class (the heart of shear calibration) plus galaxy-selection masks and a
handful of cosmology-independent statistics helpers — none of which "basic"
described. Its symbols now live where they belong, and basic.py is deleted.

- `metacal` class + `mask_gal_size`/`mask_gal_SNR` (galaxy selection) →
  calibration.py, joining the m/c routines that already consumed a
  `gal_metacal` instance. One subsystem, one module.
- `jackknif_weighted_average2`, `corr_from_cov`, `chi2_and_pte`,
  `cov_from_one_covariance` → new statistics.py (a clean leaf: numpy/scipy
  only; calibration imports the jackknife from it).
- Every importer repointed (papers, scripts, the two scratch/guerrini
  import lines — path-only, his logic untouched); dead `from sp_validation
  import basic` lines removed from calibration.py and cat.py; `__all__` and
  the architecture docs updated.
- Tests split: metacal + mask pins → test_calibration.py, jackknife pin →
  test_statistics.py; test_basic.py removed.

All moved code is byte-identical to the original (md5-verified); value-drift
pins (metacal R-matrix rtol 1e-12) and the full suite pass in-container,
except the pre-existing galaxy/cs_util.size old-sandbox gap. No circular
imports. Verified by an adversarial multi-agent pass (byte-identity,
no-stale-refs, value-pins, no-cycles).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* felt: reserved-sasha — document Sasha's hands-off zones

scratch/guerrini/ and the namaster_utils→source / Gaussian-sims work he
reserved for his next PR in the #197 review. So future workers don't touch it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(statistics): pin the extracted helpers; drop dead code in calibration

Follow-up polish on the basic.py dissolution (#199).

Tests — characterization (value-drift) coverage for the three statistics.py
helpers that had none, with literals generated by running the real functions
in-container and teeth on each:
- corr_from_cov: unit diagonal + reconstruction from cov/outer(std,std)
- chi2_and_pte: diagonal reduces to sum((d/sigma)^2) with matching scipy PTE,
  plus a non-diagonal case exercising the full d^T C^-1 d path
- cov_from_one_covariance: gaussian(col 10) vs non-gaussian(col 9) selection
  and a row-major-layout check (a transpose would be caught)

Calibration — strictly behavior-preserving dead-code removal:
- 3 unused module imports (util, io, get_footprint — verified unreferenced)
- an unused local (col_noshear) in metacal._read_data
- the uncallable metacal._return method (defined without self, references
  self.* in its body — would NameError if ever invoked; referenced nowhere)

Value pins (metacal R-matrix, m/c bias) stay green; conservatively skipped any
change that would reorder float ops or restructure an estimator. Verified by an
adversarial behavior-preservation review.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(src): delete dead info.py, fix cat.py version import

info.py had zero importers; its only content was a redundant
__name__ = 'sp_validation'. cat.py imported __version__ and __name__
from the package root but used only __version__ (line 607); the
software name at line 606 is already hardcoded. Repoint to the
canonical home: from sp_validation.version import __version__.

Register the retired import path in the dangling-move guard.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(src): make package __all__ honest

The old __all__ listed modules nobody imports through the package
(io, plot_style, cosmo_val) and omitted the two genuinely public
diagnostic modules rho_tau and b_modes. Replace it with the real
public surface, alphabetised, and drop the stale commented-out
explicit-import block. Nothing does `from sp_validation import *`,
so this is purely a documentation fix.

(util and run_joint_cat are renamed in the Tier-2 commits.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(cosmo_val): hoist b_modes imports to module top level

Five methods each imported from .b_modes inside their bodies. b_modes
is import-time side-effect-light (it pulls only .cosmology) and has no
back-edge to cosmo_val, so the locals were defensive, not necessary.
Consolidate the union of imported names into one top-level block next
to the existing cosmology/rho_tau imports and drop the five inline
imports. test_cosmo_val: 11 passed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(run_joint_cat): drop shadowed confusion_matrix def

Two module-level defs named confusion_matrix existed; the first
(mask, confidence_level=0.9) was a near-duplicate of correlation_matrix
and was unconditionally shadowed by the second
(prediction, observation) ~40 lines later. Every caller — in
scripts/calibration and scripts/examples — uses the
(prediction, observation) signature. Remove the dead first def.
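
A small illustration of the shadowing described above. The two signatures are taken from the commit message; the function bodies are placeholders, not the real implementations.

```python
import inspect


def confusion_matrix(mask, confidence_level=0.9):
    # First definition: rebound by the def below before anyone can call it.
    return "first"


def confusion_matrix(prediction, observation):
    # Second definition: the only one reachable by name after import.
    return "second"


# The module-level name now refers to the second function only.
params = list(inspect.signature(confusion_matrix).parameters)
print(params)
```

Because a later `def` simply rebinds the module-level name, deleting the first definition cannot change what any caller sees.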

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor: remove dead cluster/convergence-map helpers

Five functions had zero callers anywhere in the repo (src, scripts,
papers, scratch, notebooks), including across the star-imports of
plots:
  cosmology.py: get_clusters, stack_mm3, gamma_T_tc, xi_gal_gal_tc
  plots.py:     plot_map_stacked
Removing the cosmology block also orphaned the imports that existed
solely for it (treecorr, fits, canfar, radec2xy, cKDTree, tqdm,
get_footprint); drop those too. test_cosmology + test_plots: 29 passed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(src): rename util -> format; drop dead transform_nan import

util.py held only millify / print_millified — number formatting, not a
grab-bag. Rename it to format.py and sweep all importers:
  internal: cat.py, run_joint_cat.py (util.millify -> format.millify)
  scripts:  apply_alpha.py, examples/demo_calibrate_minimal_cat.py,
            calibration/extract_info.py (star-import x2)
  papers:   catalog/hist_mag.py
Register the rename in the dangling-move guard; update package __all__.

B1: scripts/plot_leakage.py imported transform_nan from the old util
module — a symbol removed from the library long ago and never used in
the script. Drop the dead import.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(src): move SquareRootScale to plots, re-export from rho_tau

SquareRootScale is a matplotlib ScaleBase subclass — plotting
infrastructure, not rho/tau logic. Move the class and its
register_scale call into plots.py (which now carries the
matplotlib.scale/ticker/transforms imports it needs). rho_tau.py keeps
a compat re-export so existing
`from sp_validation.rho_tau import SquareRootScale` callers — in
cosmo_inference, scratch/guerrini, papers/harmonic — still resolve.
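
The compat re-export pattern can be sketched with in-memory stand-in modules. Everything here (`demo_pkg` and the empty class body) is hypothetical; it only shows that the old and new import paths resolve to the same object.

```python
import sys
import types


class SquareRootScale:
    """Placeholder for the real matplotlib scale class."""


# Stand-ins for the package, the new home (plots) and the old home (rho_tau).
pkg = types.ModuleType("demo_pkg")
pkg.__path__ = []  # mark the stand-in as a package
plots = types.ModuleType("demo_pkg.plots")
rho_tau = types.ModuleType("demo_pkg.rho_tau")

plots.SquareRootScale = SquareRootScale
# Equivalent in effect to `from .plots import SquareRootScale` in rho_tau.py.
rho_tau.SquareRootScale = plots.SquareRootScale

sys.modules.update(
    {"demo_pkg": pkg, "demo_pkg.plots": plots, "demo_pkg.rho_tau": rho_tau}
)

from demo_pkg.plots import SquareRootScale as new_path
from demo_pkg.rho_tau import SquareRootScale as old_path

same_object = old_path is new_path
print(same_object)
```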

workflow/scripts/plotting_utils.py holds a near-duplicate that
diverges (ScalarFormatter(useMathText=True); inverted transform method
named transform_non_affine not transform), so it is left in place.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(papers/harmonic): repoint SquareRootScale off dead utils_cosmo_val

Six harmonic-space scripts imported SquareRootScale from
sp_validation.utils_cosmo_val, a module that no longer exists. Repoint
to sp_validation.rho_tau (the compat re-export), matching the sibling
2026_03_17 script. get_params_rho_tau was already correctly imported
from rho_tau. No other utils_cosmo_val imports remain (the two
scratch/guerrini mentions are prose noting the module's removal).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(src): rename run_joint_cat -> catalog_builders

The module holds the catalogue-builder runner classes (JointCat,
ApplyHspMasks, CalibrateCat) and their run_* entry-point functions;
catalog_builders names that role. Sweep all importers (the
`as sp_joint` alias is preserved, only the module name changes):
  scripts/apply_hsp_masks.py
  scripts/examples/{demo_check_footprint, create_binned_mask_comprehensive,
    demo_comprehensive_to_minimal_cat, demo_create_footprint_mask,
    demo_calibrate_minimal_cat}.py
  scripts/calibration/{create_joint_comprehensive_cat (direct symbol),
    demo_apply_hsp_masks, calibrate_comprehensive_cat}.py
  scratch/kilbinger/demo_binned_mask.py
  papers/catalog/hist_mag.py
Also update the two prose references in docs and update __all__ and
the dangling-move guard.

The OPTIONAL masks.py extraction (Mask + mask-algebra fns) is deferred:
those symbols are reached externally through the sp_joint.* module
alias (Mask, get_masks_from_config, print_mask_stats across 4 scripts +
papers/catalog), so splitting them out would require either re-exports
or sweeping the public call surface — beyond a behavior-preserving move.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Extract masks.py from catalog_builders.py

Move the healsparse-backed spatial-masking cluster out of
catalog_builders.py into a dedicated masks.py: the Mask class plus
get_masks_from_config, print_mask_stats, correlation_matrix, and
confusion_matrix. Bodies are byte-identical; the move carries the
numexpr/scipy.stats imports those helpers need (now removed from
catalog_builders, which no longer references them).

catalog_builders.py re-exports the five symbols from sp_validation.masks
so external code using `from sp_validation import catalog_builders as
sp_joint` keeps resolving sp_joint.Mask, sp_joint.get_masks_from_config,
etc. The *Cat runner classes (ApplyHspMasks, ReadCat, run_* entry points)
stay; ApplyHspMasks uses healsparse directly, not the Mask class.

masks added to __init__.__all__. No MOVE_MAP entry: this is an
extraction-in-place (catalog_builders survives), not a retired path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Rename cat.py -> catalog.py: catalogue data layer

The catalogue module pair now reads by role: catalog.py is the data
layer (read/write/column-access/matching free functions), catalog_builders.py
is the construction pipeline (runner classes built on it). Module docstrings
state this hierarchy explicitly.

Behaviour-preserving. Every importer of the local sp_validation.cat module is
swept to sp_validation.catalog, each preserving its local binding (bare
`import cat` forms gain `as cat` so function bodies are unchanged). The
cs_util `cat` import is a different module and is left untouched throughout.

The dangling-reference guard registers the retired flat-import form
`sp_validation.cat import` rather than the bare `sp_validation.cat` token,
which would false-positive on the live `sp_validation.catalog` /
`sp_validation.catalog_builders` modules (same prefix trap as glass_mock).
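
The prefix trap can be illustrated with plain substring matching. This assumes the guard matches registered tokens as substrings, which the commit message does not spell out; the imported name `read_catalog` is hypothetical.

```python
# Registered forms: the bare token versus the flat-import form.
RETIRED_BARE = "sp_validation.cat"
RETIRED_FLAT = "sp_validation.cat import"

lines = [
    "from sp_validation.cat import read_catalog",      # retired path
    "from sp_validation.catalog import read_catalog",  # live module
    "import sp_validation.catalog_builders as sp_joint",  # live module
]


def hits(token, source_lines):
    """Return the lines that contain the token as a substring."""
    return [line for line in source_lines if token in line]


bare_hits = hits(RETIRED_BARE, lines)  # also flags the two live modules
flat_hits = hits(RETIRED_FLAT, lines)  # flags only the retired import
print(len(bare_hits), len(flat_hits))
```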

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* style(ruff): repo-wide ruff format pass (mechanical, no semantic change)

* chore(ruff): wire ruff (pre-commit + CI) + region-aware lint policy

* fix(ruff): region-aware lint fixes (behavior-preserving, adversarially verified)

* chore(ruff): scope CI lint to library; broaden E402 ignores for per-paper analysis scripts

The gate surfaced residual violations in peripheral code that the per-file-ignore
globs didn't reach (analysis scripts under papers/<paper>/*.py, cosmo_inference
notebooks), plus the Snakemake-injected `snakemake` global (F821). Region-aware
decision: enforce lint on src/ only (matching the pre-commit hook), keep formatting
repo-wide, and run the repo-wide lint as advisory only. src/ stays pristine
(ruff check src/ => clean).

* chore(ruff): pivot lint gate to warn-locally / account-on-develop

Flip the lint discipline from "block" to "warn while you work, account
when it lands."

Pre-commit now auto-applies ruff's SAFE fixes (`ruff format` + `ruff
check --fix`: import sorting, unused imports) — a one-time re-stage, not
a block — and only WARNS on what ruff won't safely fix (undefined names
F821, unused variables F841), which never blocks a commit. That split is
ruff's own safe/unsafe-fix taxonomy, not a hand-curated list. Bloat
guards (nbstripout, large-files) still block.

CI lint now runs only on push to `develop` and PRs targeting `develop`.
On failure it goes red (blocks the merge) AND opens/updates a single
auto-closing lint-debt issue per committer, @-mentioning + assigning
them; a clean run closes it. Fork PRs are covered via
pull_request_target, hardened with `uvx --no-config` so an untrusted PR
can't redirect ruff's download to a trojaned index. Issue management
runs continue-on-error so it can't red a clean run, and a ruff/uvx
tooling error reds the gate without filing a committer issue.

Settled with Sacha, 2026-06-23.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SkTwLDgicfeoK8Np2bgwp3

* chore(ruff): clear lint baseline to the 5 genuine undefined names

Bring the repo-wide ruff baseline from 125 errors down to 5, so the
develop gate's first run is honest rather than a wall of false positives:

- builtins = ["snakemake"] silences ~110 false F821s — Snakemake injects
  a `snakemake` object into rule scripts at runtime, so ruff can't see
  where it's defined. Every other undefined name in those scripts is
  still caught.
- broaden the cosmo_inference E402 ignore from `*.py` to `**` so it also
  covers `.ipynb` cells (same intentional sys.path / mpl-backend-before-
  import pattern the .py scripts already get).
- normalize mixed tabs/spaces in download_gaia_catalogues.py's SQL
  f-string to spaces (E101; also unbreaks the surrounding dedent()).

Remaining are 5 genuine undefined names (cov_sim_gaussian in
papers/harmonic/...validation_cov_cell.py; index/n_contours in the
papers/consistency glass-mocks notebook), triaged with their authors.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SkTwLDgicfeoK8Np2bgwp3

* Fix ruff hooks. Move the consistency notebook to my personal scratch and remove the lines with undefined variables in the harmonic-space script.

* chore(ruff): PR feedback as a PR comment, issue only for develop pushes

Refine the lint gate's feedback surface to fit the event (Cail's call: a
separate issue for a PR is disconnected — comment where the author is
already looking; issues are for things going straight into develop):

- PR against develop → red check (still blocks the merge) + a single
  auto-updating comment ON the PR with the full violation list, turning
  green when ruff passes. ruff annotations (--output-format=github) are
  emitted too (they surface in the run's Checks view; pull_request_target
  anchors them to the base commit, so not inline on the diff). No issue.
- direct push to develop → unchanged: one per-committer auto-closing
  lint-debt issue (no PR exists to comment on).

Adds pull-requests: write for the comment path. Fresh-eyes reviewed
(SHIP); the over-promised "inline on Files changed" wording was corrected
since the comment's file:line list is the reliable surface.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SkTwLDgicfeoK8Np2bgwp3

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Sacha Guerrini <sacha.guerrini@cea.fr>
cailmdaley added a commit that referenced this pull request Jun 30, 2026
* refactor(src): dissolve basic.py into calibration + statistics

The module named "basic" was a grab-bag: the 546-line `metacal` response
class (the heart of shear calibration) plus galaxy-selection masks and a
handful of cosmology-independent statistics helpers — none of which "basic"
described. Its symbols now live where they belong, and basic.py is deleted.

- `metacal` class + `mask_gal_size`/`mask_gal_SNR` (galaxy selection) →
  calibration.py, joining the m/c routines that already consumed a
  `gal_metacal` instance. One subsystem, one module.
- `jackknif_weighted_average2`, `corr_from_cov`, `chi2_and_pte`,
  `cov_from_one_covariance` → new statistics.py (a clean leaf: numpy/scipy
  only; calibration imports the jackknife from it).
- Every importer repointed (papers, scripts, the two scratch/guerrini
  import lines — path-only, his logic untouched); dead `from sp_validation
  import basic` lines removed from calibration.py and cat.py; `__all__` and
  the architecture docs updated.
- Tests split: metacal + mask pins → test_calibration.py, jackknife pin →
  test_statistics.py; test_basic.py removed.

All moved code is byte-identical to the original (md5-verified); value-drift
pins (metacal R-matrix rtol 1e-12) and the full suite pass in-container,
except the pre-existing galaxy/cs_util.size old-sandbox gap. No circular
imports. Verified by an adversarial multi-agent pass (byte-identity,
no-stale-refs, value-pins, no-cycles).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* felt: reserved-sasha — document Sasha's hands-off zones

scratch/guerrini/ and the namaster_utils→source / Gaussian-sims work he
reserved for his next PR in the #197 review. So future workers don't touch it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(statistics): pin the extracted helpers; drop dead code in calibration

Follow-up polish on the basic.py dissolution (#199).

Tests — characterization (value-drift) coverage for the three statistics.py
helpers that had none, with literals generated by running the real functions
in-container and teeth on each:
- corr_from_cov: unit diagonal + reconstruction from cov/outer(std,std)
- chi2_and_pte: diagonal reduces to sum((d/sigma)^2) with matching scipy PTE,
  plus a non-diagonal case exercising the full d^T C^-1 d path
- cov_from_one_covariance: gaussian(col 10) vs non-gaussian(col 9) selection
  and a row-major-layout check (a transpose would be caught)

Calibration — strictly behavior-preserving dead-code removal:
- 3 unused module imports (util, io, get_footprint — verified unreferenced)
- an unused local (col_noshear) in metacal._read_data
- the uncallable metacal._return method (defined without self, references
  self.* in its body — would NameError if ever invoked; referenced nowhere)

Value pins (metacal R-matrix, m/c bias) stay green; conservatively skipped any
change that would reorder float ops or restructure an estimator. Verified by an
adversarial behavior-preservation review.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(src): delete dead info.py, fix cat.py version import

info.py had zero importers; its only content was a redundant
__name__ = 'sp_validation'. cat.py imported __version__ and __name__
from the package root but used only __version__ (line 607); the
software name at line 606 is already hardcoded. Repoint to the
canonical home: from sp_validation.version import __version__.

Register the retired import path in the dangling-move guard.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(src): make package __all__ honest

The old __all__ listed modules nobody imports through the package
(io, plot_style, cosmo_val) and omitted the two genuinely public
diagnostic modules rho_tau and b_modes. Replace it with the real
public surface, alphabetised, and drop the stale commented-out
explicit-import block. Nothing does `from sp_validation import *`,
so this is purely a documentation fix.

(util and run_joint_cat are renamed in the Tier-2 commits.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(cosmo_val): hoist b_modes imports to module top level

Five methods each imported from .b_modes inside their bodies. b_modes
is import-time side-effect-light (it pulls only .cosmology) and has no
back-edge to cosmo_val, so the locals were defensive, not necessary.
Consolidate the union of imported names into one top-level block next
to the existing cosmology/rho_tau imports and drop the five inline
imports. test_cosmo_val: 11 passed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(run_joint_cat): drop shadowed confusion_matrix def

Two module-level defs named confusion_matrix existed; the first
(mask, confidence_level=0.9) was a near-duplicate of correlation_matrix
and was unconditionally shadowed by the second
(prediction, observation) ~40 lines later. Every caller — in
scripts/calibration and scripts/examples — uses the
(prediction, observation) signature. Remove the dead first def.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor: remove dead cluster/convergence-map helpers

Five functions had zero callers anywhere in the repo (src, scripts,
papers, scratch, notebooks), including across the star-imports of
plots:
  cosmology.py: get_clusters, stack_mm3, gamma_T_tc, xi_gal_gal_tc
  plots.py:     plot_map_stacked
Removing the cosmology block also orphaned the imports that existed
solely for it (treecorr, fits, canfar, radec2xy, cKDTree, tqdm,
get_footprint); drop those too. test_cosmology + test_plots: 29 passed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(src): rename util -> format; drop dead transform_nan import

util.py held only millify / print_millified — number formatting, not a
grab-bag. Rename it to format.py and sweep all importers:
  internal: cat.py, run_joint_cat.py (util.millify -> format.millify)
  scripts:  apply_alpha.py, examples/demo_calibrate_minimal_cat.py,
            calibration/extract_info.py (star-import x2)
  papers:   catalog/hist_mag.py
Register the rename in the dangling-move guard; update package __all__.

B1: scripts/plot_leakage.py imported transform_nan from the old util
module — a symbol removed from the library long ago and never used in
the script. Drop the dead import.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(src): move SquareRootScale to plots, re-export from rho_tau

SquareRootScale is a matplotlib ScaleBase subclass — plotting
infrastructure, not rho/tau logic. Move the class and its
register_scale call into plots.py (which now carries the
matplotlib.scale/ticker/transforms imports it needs). rho_tau.py keeps
a compat re-export so existing
`from sp_validation.rho_tau import SquareRootScale` callers — in
cosmo_inference, scratch/guerrini, papers/harmonic — still resolve.

workflow/scripts/plotting_utils.py holds a near-duplicate that
diverges (ScalarFormatter(useMathText=True); inverted transform method
named transform_non_affine not transform), so it is left in place.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(papers/harmonic): repoint SquareRootScale off dead utils_cosmo_val

Six harmonic-space scripts imported SquareRootScale from
sp_validation.utils_cosmo_val, a module that no longer exists. Repoint
to sp_validation.rho_tau (the compat re-export), matching the sibling
2026_03_17 script. get_params_rho_tau was already correctly imported
from rho_tau. No other utils_cosmo_val imports remain (the two
scratch/guerrini mentions are prose noting the module's removal).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(src): rename run_joint_cat -> catalog_builders

The module holds the catalogue-builder runner classes (JointCat,
ApplyHspMasks, CalibrateCat) and their run_* entry-point functions;
catalog_builders names that role. Sweep all importers (the
`as sp_joint` alias is preserved, only the module name changes):
  scripts/apply_hsp_masks.py
  scripts/examples/{demo_check_footprint, create_binned_mask_comprehensive,
    demo_comprehensive_to_minimal_cat, demo_create_footprint_mask,
    demo_calibrate_minimal_cat}.py
  scripts/calibration/{create_joint_comprehensive_cat (direct symbol),
    demo_apply_hsp_masks, calibrate_comprehensive_cat}.py
  scratch/kilbinger/demo_binned_mask.py
  papers/catalog/hist_mag.py
Also update the two prose references in docs and update __all__ and
the dangling-move guard.

The OPTIONAL masks.py extraction (Mask + mask-algebra fns) is deferred:
those symbols are reached externally through the sp_joint.* module
alias (Mask, get_masks_from_config, print_mask_stats across 4 scripts +
papers/catalog), so splitting them out would require either re-exports
or sweeping the public call surface — beyond a behavior-preserving move.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Extract masks.py from catalog_builders.py

Move the healsparse-backed spatial-masking cluster out of
catalog_builders.py into a dedicated masks.py: the Mask class plus
get_masks_from_config, print_mask_stats, correlation_matrix, and
confusion_matrix. Bodies are byte-identical; the move carries the
numexpr/scipy.stats imports those helpers need (now removed from
catalog_builders, which no longer references them).

catalog_builders.py re-exports the five symbols from sp_validation.masks
so external code using `from sp_validation import catalog_builders as
sp_joint` keeps resolving sp_joint.Mask, sp_joint.get_masks_from_config,
etc. The *Cat runner classes (ApplyHspMasks, ReadCat, run_* entry points)
stay; ApplyHspMasks uses healsparse directly, not the Mask class.

masks added to __init__.__all__. No MOVE_MAP entry: this is an
extraction-in-place (catalog_builders survives), not a retired path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Rename cat.py -> catalog.py: catalogue data layer

The catalogue module pair now reads by role: catalog.py is the data
layer (read/write/column-access/matching free functions), catalog_builders.py
is the construction pipeline (runner classes built on it). Module docstrings
state this hierarchy explicitly.

Behaviour-preserving. Every importer of the local sp_validation.cat module is
swept to sp_validation.catalog, each preserving its local binding (bare
`import cat` forms gain `as cat` so function bodies are unchanged). The
cs_util `cat` import is a different module and is left untouched throughout.

The dangling-reference guard registers the retired flat-import form
`sp_validation.cat import` rather than the bare `sp_validation.cat` token,
which would false-positive on the live `sp_validation.catalog` /
`sp_validation.catalog_builders` modules (same prefix trap as glass_mock).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* style(ruff): repo-wide ruff format pass (mechanical, no semantic change)

* chore(ruff): wire ruff (pre-commit + CI) + region-aware lint policy

* fix(ruff): region-aware lint fixes (behavior-preserving, adversarially verified)

* chore(ruff): scope CI lint to library; broaden E402 ignores for per-paper analysis scripts

The gate surfaced residual violations in peripheral code that the per-file-ignore
globs didn't reach (analysis scripts under papers/<paper>/*.py, cosmo_inference
notebooks), plus the Snakemake-injected `snakemake` global (F821). Region-aware
decision: enforce lint on src/ only (matching the pre-commit hook), keep formatting
repo-wide, and run the repo-wide lint as advisory only. src/ stays pristine
(ruff check src/ => clean).

* refactor(cosmo_val): convert module to package (core.py + __init__ re-export)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(cosmo_val): extract PseudoClMixin (pseudo_cl)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(cosmo_val): extract CatalogCharacterizationMixin (catalog_characterization)

* refactor(cosmo_val): extract PSFSystematicsMixin (psf_systematics)

* refactor(cosmo_val): extract RealSpaceMixin (real_space)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(cosmo_val): extract PureEBMixin (pure_eb)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(cosmo_val): extract CosebisMixin (cosebis)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(cosmo_val): fix sibling-module imports for package depth

Converting cosmo_val from a module to a package moved every file one level
deeper, so single-dot relative imports to top-level siblings now resolve
inside the package. Point them up a level:
- ..b_modes (core, pure_eb, cosebis)
- ..cosmology (core, pseudo_cl)
- ..rho_tau (psf_systematics, pseudo_cl)
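
The depth change can be checked without importing the package, using `importlib.util.resolve_name`. The package names come from the commit message; the rest is an illustrative sketch.

```python
from importlib.util import resolve_name

# cosmo_val.py as a module: its __package__ is "sp_validation".
before = resolve_name(".b_modes", "sp_validation")

# cosmo_val/core.py inside the new package: __package__ is one level deeper,
# so the unchanged single-dot import now points inside cosmo_val.
after_stale = resolve_name(".b_modes", "sp_validation.cosmo_val")

# Two dots climb back to the top-level sibling.
after_fixed = resolve_name("..b_modes", "sp_validation.cosmo_val")

print(before)
print(after_stale)
print(after_fixed)
```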

Re-export cs_plots in __init__ from its origin (cs_util.plots) rather than
from core, which no longer needs the alias after the plotting methods moved
to their mixins. Preserves the sp_validation.cosmo_val.cs_plots attribute path.

Package now imports cleanly; test_cosmo_val + test_b_modes: 18 passed, 1 skipped.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(examples): call hsp_map_logical_or from plots, not the cosmo_val module

The demo called cosmo_val.hsp_map_logical_or, but that function has only ever
lived in plots.py — the call raised AttributeError. Import it from its home.
(Pre-existing bug surfaced by the cosmo_val package split.)

* refactor(cosmo_val): tighten + de-dupe the mixin modules (behavior-preserving)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(cosmo_val): drop get_cov_from_onecov, use statistics.cov_from_one_covariance directly

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(cosmo_val): _output_path helper for output-path construction

Add a private CosmologyValidation._output_path(*parts) that returns
os.path.abspath(os.path.join(self.cc["paths"]["output"], *parts)), and
route the 29 abspath-wrapped output-path constructions across the mixin
modules (catalog_characterization, pseudo_cl, psf_systematics,
real_space) through it.

Behavior-preserving: only the os.path.abspath(f"{output}/...") sites are
converted, where abspath(join(base, suffix)) is byte-identical to
abspath(f"{base}/{suffix}") for every suffix shape in the package
(verified, incl. embedded subdirs and trailing slashes). Raw f-strings
that are not abspath'd (psf_systematics output_dir handoffs to
shear_psf_leakage) are intentionally left as-is to avoid changing the
string they pass to external tools.
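
A sketch of the equivalence the commit relies on, written as a hypothetical free function `output_path` instead of the private method, with made-up suffixes.

```python
import os


def output_path(base, *parts):
    """Hypothetical free-function version of the helper described above."""
    return os.path.abspath(os.path.join(base, *parts))


base = "results/out"
suffixes = ["plot.png", "sub/dir/plot.png", "sub/"]

# abspath normalises separators and trailing slashes, so joining with
# os.path.join or with an f-string gives the same absolute path.
matches = [
    output_path(base, suffix) == os.path.abspath(f"{base}/{suffix}")
    for suffix in suffixes
]
print(matches)
```

One caveat with `os.path.join`: an absolute later component discards everything before it, so the equivalence only covers relative suffixes.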

Value-drift gate green: 18 passed, 1 skipped.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(cosmo_val): _binning helper for treecorr-config overrides

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(cosmo_val): collapse print_* color helpers onto _cprint

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(stats): chi2_and_pte uses solve+sf; route B-mode PTE through it (re-bless pins ~ULP)

chi2_and_pte now computes chi2 via np.linalg.solve (not forming C^-1) and
PTE via stats.chi2.sf (not 1 - cdf), the numerically proper forms: solve is
more stable/cheaper than an explicit inverse, and the survival function
avoids catastrophic cancellation in the low-PTE tail.

The two hand-inlined C_l^BB chi2/PTE sites in cosmo_val (core.summarize_bmodes,
pseudo_cl.plot_pseudo_cl) already used solve+sf, so they now route through the
single chi2_and_pte primitive with no numerical change; their local scipy
imports are dropped. No pinned value shifted (the value-drift pins flow through
the independent b_modes Hartlap path), so no re-blessing was required.
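
A minimal sketch of the solve + sf form, assuming NumPy and SciPy are available. `chi2_and_pte_sketch` is an illustrative stand-in, not the package's function, and taking the degrees of freedom as the data-vector length is an assumption.

```python
import numpy as np
from scipy import stats


def chi2_and_pte_sketch(data, cov):
    """Illustrative chi2 = d^T C^-1 d and its probability to exceed."""
    data = np.asarray(data, dtype=float)
    # Solve C x = d instead of forming the explicit inverse of C.
    chi2 = float(data @ np.linalg.solve(cov, data))
    # Survival function instead of 1 - cdf (dof assumed = len(data)).
    pte = float(stats.chi2.sf(chi2, df=data.size))
    return chi2, pte


sigma = np.array([0.5, 2.0, 1.0])
d = np.array([1.0, -2.0, 0.5])
# Diagonal covariance: chi2 reduces to sum((d / sigma) ** 2) = 5.25.
chi2, pte = chi2_and_pte_sketch(d, np.diag(sigma**2))

# In the far tail, 1 - cdf loses the result to rounding while sf keeps it.
tail_sf = stats.chi2.sf(200.0, df=3)
tail_one_minus_cdf = 1.0 - stats.chi2.cdf(200.0, df=3)
print(chi2, pte, tail_sf, tail_one_minus_cdf)
```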

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(cosmo_val): _calibrated_g / _read_shear_cols for shared shear calibration

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(cosmo_val): keep _calibrated_g strictly behavior-preserving per caller

The shared helper applied the DES R11/R22 branch to both callers, but
calculate_aperture_mass_dispersion historically used scalar R for every
version. Add a des_branch flag so each caller keeps its exact prior behavior
(2pcf: DES branch; aperture-mass: scalar R). The asymmetry is flagged in the
docstring as a likely latent bug for a deliberate, separate fix.

* fix(cosmo_val): aperture-mass uses the DES R11/R22 branch like 2pcf

calculate_aperture_mass_dispersion used scalar R for every version, including
DES, while calculate_2pcf used the catalog-averaged R11/R22 for DES. That
asymmetry was a latent bug; _calibrated_g now applies the DES branch
identically for both callers. Untested path (no DES fixture) — value-drift
suite unaffected and green.

* refactor(cosmo_val): extract survey-stat primitives to survey.py, thin the mixin

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(pseudo_cl): pin the pseudo-Cl estimator outputs (golden values)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(pseudo_cl): extract estimator primitives to top-level pseudo_cl.py, thin the mixin

Move the stateless pseudo-Cl computation out of PseudoClMixin into named free
functions in a new top-level sp_validation/pseudo_cl.py (mirrors b_modes.py /
rho_tau.py): make_namaster_bin, get_n_gal_map (weights kwarg, subsumes the
glass_mock unweighted twin), apply_random_rotation (rng injectable; default
preserves the no-arg np.random.seed() noise-debiasing behavior), and the
map-/catalog-based estimators get_pseudo_cls_map / get_pseudo_cls_catalog.
pseudo_cl_geometry centralizes the (lmin, lmax, b_lmax) triple.

The mixin methods are now thin wrappers (load via self -> call primitive),
public names/signatures unchanged so call sites do not move. glass_mock's
get_n_gal_map delegates to the primitive via a lazy import, preserving its
CAMB-only import-time independence from the harmonic stack.

Pins (test_pseudo_cl.py) green; value-drift gate (test_cosmo_val + test_b_modes)
18 passed / 1 skipped; glass_mock tests green; actual cross-module import
verified in the container.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(glass_mock): use shared pseudo_cl primitives, drop drifted copies

Repoint the GLASS-mock harmonic stats at the new src/sp_validation/pseudo_cl.py
primitives so the bandpower binning can no longer drift from the estimator the
rest of the package uses.

- powspace_bins: deleted. Its square-root bandpower spacing is exactly
  pseudo_cl.make_namaster_bin(..., "powspace", power=0.5) on the
  pseudo_cl_geometry(nside) range. Replaced by a thin _mock_powspace_bin wrapper
  that returns the same (bin, ell_eff, lmax, b_lmax) 4-tuple the mock helpers
  consume, now sourced from the shared primitive. Verified bitwise-identical
  binning (ell_eff and per-bin ell membership) across nside/lmin/n_bins configs.
- get_n_gal_map: already delegates to the weighted primitive with weights=None
  (unweighted counts); left as the count-default twin used by the map estimator.
- compute_two_point_cl / compute_two_point_cl_map: kept their own NaMaster field
  construction (documented) — the mock field is UNWEIGHTED and these return the
  COUPLED pseudo-Cl alongside the decoupled one with the ell axis prepended,
  which the primitives' (ell_eff, cl_all, wsp) contract does not expose. They now
  take their binning and galaxy-count map from the shared primitives.

Value-drift gate green (18 passed, 1 skipped); full non-slow suite: 136 passed,
1 skipped, 1 xpassed, with only the two known environment-dependent failures
(test_bmodes_workflow_dry_runs, test_configured_paths_exist_on_candide).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(pseudo_cl): make noise-debiasing reproducible (seed the rng)

apply_random_rotation called np.random.seed() with no argument, re-seeding
from OS entropy on every call, so the pseudo-Cl noise-debiasing realizations
were non-reproducible run-to-run. Drop the bare seed; the primitive now uses
its rng argument (default_rng() when None), and the debiasing + covariance
sampling paths thread a generator seeded from a new CosmologyValidation
cell_seed (default 8192). Same numerical scheme, now reproducible. The
test_pseudo_cl reproducibility test replaces the old non-determinism guard.
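
The injectable-generator pattern can be sketched as follows, assuming NumPy. `random_rotation_sketch` is a hypothetical stand-in: treating the rotation as a spin-2 rotation of each ellipticity by an independent random angle is an assumption, not the package's confirmed implementation. The seed value 8192 is the default quoted above.

```python
import numpy as np


def random_rotation_sketch(e1, e2, rng=None):
    """Rotate each ellipticity by an independent random angle (assumed spin-2)."""
    # Fall back to an unseeded generator only when the caller passes none.
    rng = np.random.default_rng() if rng is None else rng
    phi = rng.uniform(0.0, 2.0 * np.pi, size=np.shape(e1))
    e = (np.asarray(e1) + 1j * np.asarray(e2)) * np.exp(2j * phi)
    return e.real, e.imag


e1 = np.array([0.1, -0.2, 0.05])
e2 = np.array([0.0, 0.1, -0.3])

# Two generators built from the same seed yield identical realizations.
run_a = random_rotation_sketch(e1, e2, rng=np.random.default_rng(8192))
run_b = random_rotation_sketch(e1, e2, rng=np.random.default_rng(8192))
reproducible = np.array_equal(run_a[0], run_b[0]) and np.array_equal(run_a[1], run_b[1])
print(reproducible)
```

Threading one generator through the call chain, rather than re-seeding global state inside the primitive, is what makes repeated runs comparable.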

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

4 participants