refactor(src): dissolve basic.py, give every module an honest home#199
Merged
Conversation
The module named "basic" was a grab-bag: the 546-line `metacal` response class (the heart of shear calibration) plus galaxy-selection masks and a handful of cosmology-independent statistics helpers — none of which "basic" described. Its symbols now live where they belong, and basic.py is deleted. - `metacal` class + `mask_gal_size`/`mask_gal_SNR` (galaxy selection) → calibration.py, joining the m/c routines that already consumed a `gal_metacal` instance. One subsystem, one module. - `jackknif_weighted_average2`, `corr_from_cov`, `chi2_and_pte`, `cov_from_one_covariance` → new statistics.py (a clean leaf: numpy/scipy only; calibration imports the jackknife from it). - Every importer repointed (papers, scripts, the two scratch/guerrini import lines — path-only, his logic untouched); dead `from sp_validation import basic` lines removed from calibration.py and cat.py; `__all__` and the architecture docs updated. - Tests split: metacal + mask pins → test_calibration.py, jackknife pin → test_statistics.py; test_basic.py removed. All moved code is byte-identical to the original (md5-verified); value-drift pins (metacal R-matrix rtol 1e-12) and the full suite pass in-container, except the pre-existing galaxy/cs_util.size old-sandbox gap. No circular imports. Verified by an adversarial multi-agent pass (byte-identity, no-stale-refs, value-pins, no-cycles). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
scratch/guerrini/ and the namaster_utils→source / Gaussian-sims work he reserved for his next PR in the #197 review. So future workers don't touch it. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
902377f to
1ff913c
Compare
…ation Follow-up polish on the basic.py dissolution (#199). Tests — characterization (value-drift) coverage for the three statistics.py helpers that had none, with literals generated by running the real functions in-container and teeth on each: - corr_from_cov: unit diagonal + reconstruction from cov/outer(std,std) - chi2_and_pte: diagonal reduces to sum((d/sigma)^2) with matching scipy PTE, plus a non-diagonal case exercising the full d^T C^-1 d path - cov_from_one_covariance: gaussian(col 10) vs non-gaussian(col 9) selection and a row-major-layout check (a transpose would be caught) Calibration — strictly behavior-preserving dead-code removal: - 3 unused module imports (util, io, get_footprint — verified unreferenced) - an unused local (col_noshear) in metacal._read_data - the uncallable metacal._return method (defined without self, references self.* in its body — would NameError if ever invoked; referenced nowhere) Value pins (metacal R-matrix, m/c bias) stay green; conservatively skipped any change that would reorder float ops or restructure an estimator. Verified by an adversarial behavior-preservation review. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
info.py had zero importers; its only content was a redundant __name__ = 'sp_validation'. cat.py imported __version__ and __name__ from the package root but used only __version__ (line 607); the software name at line 606 is already hardcoded. Repoint to the canonical home: from sp_validation.version import __version__. Register the retired import path in the dangling-move guard. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The old __all__ listed modules nobody imports through the package (io, plot_style, cosmo_val) and omitted the two genuinely public diagnostic modules rho_tau and b_modes. Replace it with the real public surface, alphabetised, and drop the stale commented-out explicit-import block. Nothing does `from sp_validation import *`, so this is purely a documentation fix. (util and run_joint_cat are renamed in the Tier-2 commits.) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Five methods each imported from .b_modes inside their bodies. b_modes is import-time side-effect-light (it pulls only .cosmology) and has no back-edge to cosmo_val, so the locals were defensive, not necessary. Consolidate the union of imported names into one top-level block next to the existing cosmology/rho_tau imports and drop the five inline imports. test_cosmo_val: 11 passed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Two module-level defs named confusion_matrix existed; the first (mask, confidence_level=0.9) was a near-duplicate of correlation_matrix and was unconditionally shadowed by the second (prediction, observation) ~40 lines later. Every caller — in scripts/calibration and scripts/examples — uses the (prediction, observation) signature. Remove the dead first def. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Five functions had zero callers anywhere in the repo (src, scripts, papers, scratch, notebooks), including across the star-imports of plots: cosmology.py: get_clusters, stack_mm3, gamma_T_tc, xi_gal_gal_tc plots.py: plot_map_stacked Removing the cosmology block also orphaned the imports that existed solely for it (treecorr, fits, canfar, radec2xy, cKDTree, tqdm, get_footprint); drop those too. test_cosmology + test_plots: 29 passed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
util.py held only millify / print_millified — number formatting, not a
grab-bag. Rename it to format.py and sweep all importers:
internal: cat.py, run_joint_cat.py (util.millify -> format.millify)
scripts: apply_alpha.py, examples/demo_calibrate_minimal_cat.py,
calibration/extract_info.py (star-import x2)
papers: catalog/hist_mag.py
Register the rename in the dangling-move guard; update package __all__.
B1: scripts/plot_leakage.py imported transform_nan from the old util
module — a symbol removed from the library long ago and never used in
the script. Drop the dead import.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
SquareRootScale is a matplotlib ScaleBase subclass — plotting infrastructure, not rho/tau logic. Move the class and its register_scale call into plots.py (which now carries the matplotlib.scale/ticker/transforms imports it needs). rho_tau.py keeps a compat re-export so existing `from sp_validation.rho_tau import SquareRootScale` callers — in cosmo_inference, scratch/guerrini, papers/harmonic — still resolve. workflow/scripts/plotting_utils.py holds a near-duplicate that diverges (ScalarFormatter(useMathText=True); inverted transform method named transform_non_affine not transform), so it is left in place. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Six harmonic-space scripts imported SquareRootScale from sp_validation.utils_cosmo_val, a module that no longer exists. Repoint to sp_validation.rho_tau (the compat re-export), matching the sibling 2026_03_17 script. get_params_rho_tau was already correctly imported from rho_tau. No other utils_cosmo_val imports remain (the two scratch/guerrini mentions are prose noting the module's removal). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The module holds the catalogue-builder runner classes (JointCat,
ApplyHspMasks, CalibrateCat) and their run_* entry-point functions;
catalog_builders names that role. Sweep all importers (the
`as sp_joint` alias is preserved, only the module name changes):
scripts/apply_hsp_masks.py
scripts/examples/{demo_check_footprint, create_binned_mask_comprehensive,
demo_comprehensive_to_minimal_cat, demo_create_footprint_mask,
demo_calibrate_minimal_cat}.py
scripts/calibration/{create_joint_comprehensive_cat (direct symbol),
demo_apply_hsp_masks, calibrate_comprehensive_cat}.py
scratch/kilbinger/demo_binned_mask.py
papers/catalog/hist_mag.py
Also update the two prose references in docs and update __all__ and
the dangling-move guard.
The OPTIONAL masks.py extraction (Mask + mask-algebra fns) is deferred:
those symbols are reached externally through the sp_joint.* module
alias (Mask, get_masks_from_config, print_mask_stats across 4 scripts +
papers/catalog), so splitting them out would require either re-exports
or sweeping the public call surface — beyond a behavior-preserving move.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Move the healsparse-backed spatial-masking cluster out of catalog_builders.py into a dedicated masks.py: the Mask class plus get_masks_from_config, print_mask_stats, correlation_matrix, and confusion_matrix. Bodies are byte-identical; the move carries the numexpr/scipy.stats imports those helpers need (now removed from catalog_builders, which no longer references them). catalog_builders.py re-exports the five symbols from sp_validation.masks so external code using `from sp_validation import catalog_builders as sp_joint` keeps resolving sp_joint.Mask, sp_joint.get_masks_from_config, etc. The *Cat runner classes (ApplyHspMasks, ReadCat, run_* entry points) stay; ApplyHspMasks uses healsparse directly, not the Mask class. masks added to __init__.__all__. No MOVE_MAP entry: this is an extraction-in-place (catalog_builders survives), not a retired path. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The catalogue module pair now reads by role: catalog.py is the data layer (read/write/column-access/matching free functions), catalog_builders.py is the construction pipeline (runner classes built on it). Module docstrings state this hierarchy explicitly. Behaviour-preserving. Every importer of the local sp_validation.cat module is swept to sp_validation.catalog, each preserving its local binding (bare `import cat` forms gain `as cat` so function bodies are unchanged). The cs_util `cat` import is a different module and is left untouched throughout. The dangling-reference guard registers the retired flat-import form `sp_validation.cat import` rather than the bare `sp_validation.cat` token, which would false-positive on the live `sp_validation.catalog` / `sp_validation.catalog_builders` modules (same prefix trap as glass_mock). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This was referenced Jun 20, 2026
Contributor
|
I am going through the flies; some are (my old ones and) obsolete, I will delete them here. |
martinkilbinger
approved these changes
Jun 23, 2026
cailmdaley
added a commit
that referenced
this pull request
Jun 30, 2026
#199 (the old PR base, refactor/src-module-layout) was squash-merged into develop (5b8ffdc), so the old base is dead — this brings the ruff work onto current develop instead. Conflicts resolved toward develop's content (the canonical post-#199 code), then `ruff format` + the region-aware policy re-applied on top, so chore/ruff's net contribution over develop stays exactly: the repo-wide format pass, the lint gate (pre-commit + CI), and the behavior-preserving bug fixes. develop's #200 dependency pins are carried in, and the two obsolete scripts #199 deleted (cosmo_val/match_LF_SP.py, scripts/create_joint_shape_cat.py) stay deleted. Verified: `ruff check .` clean except the 5 known undefined names (@sachaguer's, flagged on the PR), `ruff format --check .` clean, every tracked .py compiles. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01SkTwLDgicfeoK8Np2bgwp3
cailmdaley
added a commit
that referenced
this pull request
Jun 30, 2026
* refactor(src): dissolve basic.py into calibration + statistics The module named "basic" was a grab-bag: the 546-line `metacal` response class (the heart of shear calibration) plus galaxy-selection masks and a handful of cosmology-independent statistics helpers — none of which "basic" described. Its symbols now live where they belong, and basic.py is deleted. - `metacal` class + `mask_gal_size`/`mask_gal_SNR` (galaxy selection) → calibration.py, joining the m/c routines that already consumed a `gal_metacal` instance. One subsystem, one module. - `jackknif_weighted_average2`, `corr_from_cov`, `chi2_and_pte`, `cov_from_one_covariance` → new statistics.py (a clean leaf: numpy/scipy only; calibration imports the jackknife from it). - Every importer repointed (papers, scripts, the two scratch/guerrini import lines — path-only, his logic untouched); dead `from sp_validation import basic` lines removed from calibration.py and cat.py; `__all__` and the architecture docs updated. - Tests split: metacal + mask pins → test_calibration.py, jackknife pin → test_statistics.py; test_basic.py removed. All moved code is byte-identical to the original (md5-verified); value-drift pins (metacal R-matrix rtol 1e-12) and the full suite pass in-container, except the pre-existing galaxy/cs_util.size old-sandbox gap. No circular imports. Verified by an adversarial multi-agent pass (byte-identity, no-stale-refs, value-pins, no-cycles). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * felt: reserved-sasha — document Sasha's hands-off zones scratch/guerrini/ and the namaster_utils→source / Gaussian-sims work he reserved for his next PR in the #197 review. So future workers don't touch it. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * test(statistics): pin the extracted helpers; drop dead code in calibration Follow-up polish on the basic.py dissolution (#199). Tests — characterization (value-drift) coverage for the three statistics.py helpers that had none, with literals generated by running the real functions in-container and teeth on each: - corr_from_cov: unit diagonal + reconstruction from cov/outer(std,std) - chi2_and_pte: diagonal reduces to sum((d/sigma)^2) with matching scipy PTE, plus a non-diagonal case exercising the full d^T C^-1 d path - cov_from_one_covariance: gaussian(col 10) vs non-gaussian(col 9) selection and a row-major-layout check (a transpose would be caught) Calibration — strictly behavior-preserving dead-code removal: - 3 unused module imports (util, io, get_footprint — verified unreferenced) - an unused local (col_noshear) in metacal._read_data - the uncallable metacal._return method (defined without self, references self.* in its body — would NameError if ever invoked; referenced nowhere) Value pins (metacal R-matrix, m/c bias) stay green; conservatively skipped any change that would reorder float ops or restructure an estimator. Verified by an adversarial behavior-preservation review. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(src): delete dead info.py, fix cat.py version import info.py had zero importers; its only content was a redundant __name__ = 'sp_validation'. cat.py imported __version__ and __name__ from the package root but used only __version__ (line 607); the software name at line 606 is already hardcoded. Repoint to the canonical home: from sp_validation.version import __version__. Register the retired import path in the dangling-move guard. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(src): make package __all__ honest The old __all__ listed modules nobody imports through the package (io, plot_style, cosmo_val) and omitted the two genuinely public diagnostic modules rho_tau and b_modes. Replace it with the real public surface, alphabetised, and drop the stale commented-out explicit-import block. Nothing does `from sp_validation import *`, so this is purely a documentation fix. (util and run_joint_cat are renamed in the Tier-2 commits.) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(cosmo_val): hoist b_modes imports to module top level Five methods each imported from .b_modes inside their bodies. b_modes is import-time side-effect-light (it pulls only .cosmology) and has no back-edge to cosmo_val, so the locals were defensive, not necessary. Consolidate the union of imported names into one top-level block next to the existing cosmology/rho_tau imports and drop the five inline imports. test_cosmo_val: 11 passed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(run_joint_cat): drop shadowed confusion_matrix def Two module-level defs named confusion_matrix existed; the first (mask, confidence_level=0.9) was a near-duplicate of correlation_matrix and was unconditionally shadowed by the second (prediction, observation) ~40 lines later. Every caller — in scripts/calibration and scripts/examples — uses the (prediction, observation) signature. Remove the dead first def. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor: remove dead cluster/convergence-map helpers Five functions had zero callers anywhere in the repo (src, scripts, papers, scratch, notebooks), including across the star-imports of plots: cosmology.py: get_clusters, stack_mm3, gamma_T_tc, xi_gal_gal_tc plots.py: plot_map_stacked Removing the cosmology block also orphaned the imports that existed solely for it (treecorr, fits, canfar, radec2xy, cKDTree, tqdm, get_footprint); drop those too. test_cosmology + test_plots: 29 passed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(src): rename util -> format; drop dead transform_nan import util.py held only millify / print_millified — number formatting, not a grab-bag. Rename it to format.py and sweep all importers: internal: cat.py, run_joint_cat.py (util.millify -> format.millify) scripts: apply_alpha.py, examples/demo_calibrate_minimal_cat.py, calibration/extract_info.py (star-import x2) papers: catalog/hist_mag.py Register the rename in the dangling-move guard; update package __all__. B1: scripts/plot_leakage.py imported transform_nan from the old util module — a symbol removed from the library long ago and never used in the script. Drop the dead import. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(src): move SquareRootScale to plots, re-export from rho_tau SquareRootScale is a matplotlib ScaleBase subclass — plotting infrastructure, not rho/tau logic. Move the class and its register_scale call into plots.py (which now carries the matplotlib.scale/ticker/transforms imports it needs). rho_tau.py keeps a compat re-export so existing `from sp_validation.rho_tau import SquareRootScale` callers — in cosmo_inference, scratch/guerrini, papers/harmonic — still resolve. workflow/scripts/plotting_utils.py holds a near-duplicate that diverges (ScalarFormatter(useMathText=True); inverted transform method named transform_non_affine not transform), so it is left in place. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(papers/harmonic): repoint SquareRootScale off dead utils_cosmo_val Six harmonic-space scripts imported SquareRootScale from sp_validation.utils_cosmo_val, a module that no longer exists. Repoint to sp_validation.rho_tau (the compat re-export), matching the sibling 2026_03_17 script. get_params_rho_tau was already correctly imported from rho_tau. No other utils_cosmo_val imports remain (the two scratch/guerrini mentions are prose noting the module's removal). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(src): rename run_joint_cat -> catalog_builders The module holds the catalogue-builder runner classes (JointCat, ApplyHspMasks, CalibrateCat) and their run_* entry-point functions; catalog_builders names that role. Sweep all importers (the `as sp_joint` alias is preserved, only the module name changes): scripts/apply_hsp_masks.py scripts/examples/{demo_check_footprint, create_binned_mask_comprehensive, demo_comprehensive_to_minimal_cat, demo_create_footprint_mask, demo_calibrate_minimal_cat}.py scripts/calibration/{create_joint_comprehensive_cat (direct symbol), demo_apply_hsp_masks, calibrate_comprehensive_cat}.py scratch/kilbinger/demo_binned_mask.py papers/catalog/hist_mag.py Also update the two prose references in docs and update __all__ and the dangling-move guard. The OPTIONAL masks.py extraction (Mask + mask-algebra fns) is deferred: those symbols are reached externally through the sp_joint.* module alias (Mask, get_masks_from_config, print_mask_stats across 4 scripts + papers/catalog), so splitting them out would require either re-exports or sweeping the public call surface — beyond a behavior-preserving move. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Extract masks.py from catalog_builders.py Move the healsparse-backed spatial-masking cluster out of catalog_builders.py into a dedicated masks.py: the Mask class plus get_masks_from_config, print_mask_stats, correlation_matrix, and confusion_matrix. Bodies are byte-identical; the move carries the numexpr/scipy.stats imports those helpers need (now removed from catalog_builders, which no longer references them). catalog_builders.py re-exports the five symbols from sp_validation.masks so external code using `from sp_validation import catalog_builders as sp_joint` keeps resolving sp_joint.Mask, sp_joint.get_masks_from_config, etc. The *Cat runner classes (ApplyHspMasks, ReadCat, run_* entry points) stay; ApplyHspMasks uses healsparse directly, not the Mask class. masks added to __init__.__all__. No MOVE_MAP entry: this is an extraction-in-place (catalog_builders survives), not a retired path. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Rename cat.py -> catalog.py: catalogue data layer The catalogue module pair now reads by role: catalog.py is the data layer (read/write/column-access/matching free functions), catalog_builders.py is the construction pipeline (runner classes built on it). Module docstrings state this hierarchy explicitly. Behaviour-preserving. Every importer of the local sp_validation.cat module is swept to sp_validation.catalog, each preserving its local binding (bare `import cat` forms gain `as cat` so function bodies are unchanged). The cs_util `cat` import is a different module and is left untouched throughout. The dangling-reference guard registers the retired flat-import form `sp_validation.cat import` rather than the bare `sp_validation.cat` token, which would false-positive on the live `sp_validation.catalog` / `sp_validation.catalog_builders` modules (same prefix trap as glass_mock). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * style(ruff): repo-wide ruff format pass (mechanical, no semantic change) * chore(ruff): wire ruff (pre-commit + CI) + region-aware lint policy * fix(ruff): region-aware lint fixes (behavior-preserving, adversarially verified) * chore(ruff): scope CI lint to library; broaden E402 ignores for per-paper analysis scripts The gate revealed peripheral residual the per-file-ignore globs didn't reach (analysis scripts under papers/<paper>/*.py, cosmo_inference notebooks) plus the Snakemake-injected `snakemake` global (F821). Region-aware decision: enforce lint on src/ only (matching the pre-commit hook), keep format repo-wide, run the repo-wide lint advisory. src/ stays pristine (ruff check src/ => clean). * chore(ruff): pivot lint gate to warn-locally / account-on-develop Flip the lint discipline from "block" to "warn while you work, account when it lands." Pre-commit now auto-applies ruff's SAFE fixes (`ruff format` + `ruff check --fix`: import sorting, unused imports) — a one-time re-stage, not a block — and only WARNS on what ruff won't safely fix (undefined names F821, unused variables F841), which never blocks a commit. That split is ruff's own safe/unsafe-fix taxonomy, not a hand-curated list. Bloat guards (nbstripout, large-files) still block. CI lint now runs only on push to `develop` and PRs targeting `develop`. On failure it goes red (blocks the merge) AND opens/updates a single auto-closing lint-debt issue per committer, @-mentioning + assigning them; a clean run closes it. Fork PRs are covered via pull_request_target, hardened with `uvx --no-config` so an untrusted PR can't redirect ruff's download to a trojaned index. Issue management runs continue-on-error so it can't red a clean run, and a ruff/uvx tooling error reds the gate without filing a committer issue. Settled with Sacha, 2026-06-23. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01SkTwLDgicfeoK8Np2bgwp3 * chore(ruff): clear lint baseline to the 5 genuine undefined names Bring the repo-wide ruff baseline from 125 errors down to 5, so the develop gate's first run is honest rather than a wall of false positives: - builtins = ["snakemake"] silences ~110 false F821s — Snakemake injects a `snakemake` object into rule scripts at runtime, so ruff can't see where it's defined. Every other undefined name in those scripts is still caught. - broaden the cosmo_inference E402 ignore from `*.py` to `**` so it also covers `.ipynb` cells (same intentional sys.path / mpl-backend-before- import pattern the .py scripts already get). - normalize mixed tabs/spaces in download_gaia_catalogues.py's SQL f-string to spaces (E101; also unbreaks the surrounding dedent()). Remaining are 5 genuine undefined names (cov_sim_gaussian in papers/harmonic/...validation_cov_cell.py; index/n_contours in the papers/consistency glass-mocks notebook), triaged with their authors. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01SkTwLDgicfeoK8Np2bgwp3 * Fix ruff hooks. Removed consistency not book and moved it to my personal scratch. Removed the lines with undefined variable in the harmonic space script. * chore(ruff): PR feedback as a PR comment, issue only for develop pushes Refine the lint gate's feedback surface to fit the event (Cail's call: a separate issue for a PR is disconnected — comment where the author is already looking; issues are for things going straight into develop): - PR against develop → red check (still blocks the merge) + a single auto-updating comment ON the PR with the full violation list, turning green when ruff passes. ruff annotations (--output-format=github) are emitted too (they surface in the run's Checks view; pull_request_target anchors them to the base commit, so not inline on the diff). No issue. - direct push to develop → unchanged: one per-committer auto-closing lint-debt issue (no PR exists to comment on). Adds pull-requests: write for the comment path. Fresh-eyes reviewed (SHIP); the over-promised "inline on Files changed" wording was corrected since the comment's file:line list is the reliable surface. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01SkTwLDgicfeoK8Np2bgwp3 --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: Sacha Guerrini <sacha.guerrini@cea.fr>
cailmdaley
added a commit
that referenced
this pull request
Jun 30, 2026
…licts) #206 (ruff gate) and #199 (basic.py dissolve) squash-merged to develop while #207 still carried their pre-squash commits, so this merge collapses the stack. Resolved by category: - develop's side where it carried fixes #207 lacked: Sacha's cov_sim_gaussian removal in papers/harmonic/2025_09_11_validation_cov_cell.py (21c7d96), the consistency-notebook deletion (moved to scratch), the canonical #206 lint.yml / pre-commit enforcement model, the broader cosmo_inference/** E402 per-file-ignore. - #207's side for its genuine work: the cosmo_val/ package (flat cosmo_val.py deleted), the pseudo_cl primitives, survey.area_from_coords, chi2_and_pte solve+sf (pins re-blessed for it), the glass_mock dedup, the demo hsp_map_logical_or fix; dropped develop's duplicate SquareRootScale. Verified on the fresh container: ruff format+check clean repo-wide; CosmologyValidation imports (75 methods); value-drift + golden pins 32 passed / 1 skipped. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Jmk58xivawGCVFgejJJeDp
cailmdaley
added a commit
that referenced
this pull request
Jun 30, 2026
* refactor(src): dissolve basic.py into calibration + statistics The module named "basic" was a grab-bag: the 546-line `metacal` response class (the heart of shear calibration) plus galaxy-selection masks and a handful of cosmology-independent statistics helpers — none of which "basic" described. Its symbols now live where they belong, and basic.py is deleted. - `metacal` class + `mask_gal_size`/`mask_gal_SNR` (galaxy selection) → calibration.py, joining the m/c routines that already consumed a `gal_metacal` instance. One subsystem, one module. - `jackknif_weighted_average2`, `corr_from_cov`, `chi2_and_pte`, `cov_from_one_covariance` → new statistics.py (a clean leaf: numpy/scipy only; calibration imports the jackknife from it). - Every importer repointed (papers, scripts, the two scratch/guerrini import lines — path-only, his logic untouched); dead `from sp_validation import basic` lines removed from calibration.py and cat.py; `__all__` and the architecture docs updated. - Tests split: metacal + mask pins → test_calibration.py, jackknife pin → test_statistics.py; test_basic.py removed. All moved code is byte-identical to the original (md5-verified); value-drift pins (metacal R-matrix rtol 1e-12) and the full suite pass in-container, except the pre-existing galaxy/cs_util.size old-sandbox gap. No circular imports. Verified by an adversarial multi-agent pass (byte-identity, no-stale-refs, value-pins, no-cycles). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * felt: reserved-sasha — document Sasha's hands-off zones scratch/guerrini/ and the namaster_utils→source / Gaussian-sims work he reserved for his next PR in the #197 review. So future workers don't touch it. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * test(statistics): pin the extracted helpers; drop dead code in calibration Follow-up polish on the basic.py dissolution (#199). Tests — characterization (value-drift) coverage for the three statistics.py helpers that had none, with literals generated by running the real functions in-container and teeth on each: - corr_from_cov: unit diagonal + reconstruction from cov/outer(std,std) - chi2_and_pte: diagonal reduces to sum((d/sigma)^2) with matching scipy PTE, plus a non-diagonal case exercising the full d^T C^-1 d path - cov_from_one_covariance: gaussian(col 10) vs non-gaussian(col 9) selection and a row-major-layout check (a transpose would be caught) Calibration — strictly behavior-preserving dead-code removal: - 3 unused module imports (util, io, get_footprint — verified unreferenced) - an unused local (col_noshear) in metacal._read_data - the uncallable metacal._return method (defined without self, references self.* in its body — would NameError if ever invoked; referenced nowhere) Value pins (metacal R-matrix, m/c bias) stay green; conservatively skipped any change that would reorder float ops or restructure an estimator. Verified by an adversarial behavior-preservation review. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(src): delete dead info.py, fix cat.py version import info.py had zero importers; its only content was a redundant __name__ = 'sp_validation'. cat.py imported __version__ and __name__ from the package root but used only __version__ (line 607); the software name at line 606 is already hardcoded. Repoint to the canonical home: from sp_validation.version import __version__. Register the retired import path in the dangling-move guard. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(src): make package __all__ honest The old __all__ listed modules nobody imports through the package (io, plot_style, cosmo_val) and omitted the two genuinely public diagnostic modules rho_tau and b_modes. Replace it with the real public surface, alphabetised, and drop the stale commented-out explicit-import block. Nothing does `from sp_validation import *`, so this is purely a documentation fix. (util and run_joint_cat are renamed in the Tier-2 commits.) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(cosmo_val): hoist b_modes imports to module top level Five methods each imported from .b_modes inside their bodies. b_modes is import-time side-effect-light (it pulls only .cosmology) and has no back-edge to cosmo_val, so the locals were defensive, not necessary. Consolidate the union of imported names into one top-level block next to the existing cosmology/rho_tau imports and drop the five inline imports. test_cosmo_val: 11 passed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(run_joint_cat): drop shadowed confusion_matrix def Two module-level defs named confusion_matrix existed; the first (mask, confidence_level=0.9) was a near-duplicate of correlation_matrix and was unconditionally shadowed by the second (prediction, observation) ~40 lines later. Every caller — in scripts/calibration and scripts/examples — uses the (prediction, observation) signature. Remove the dead first def. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor: remove dead cluster/convergence-map helpers Five functions had zero callers anywhere in the repo (src, scripts, papers, scratch, notebooks), including across the star-imports of plots: cosmology.py: get_clusters, stack_mm3, gamma_T_tc, xi_gal_gal_tc plots.py: plot_map_stacked Removing the cosmology block also orphaned the imports that existed solely for it (treecorr, fits, canfar, radec2xy, cKDTree, tqdm, get_footprint); drop those too. test_cosmology + test_plots: 29 passed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(src): rename util -> format; drop dead transform_nan import util.py held only millify / print_millified — number formatting, not a grab-bag. Rename it to format.py and sweep all importers: internal: cat.py, run_joint_cat.py (util.millify -> format.millify) scripts: apply_alpha.py, examples/demo_calibrate_minimal_cat.py, calibration/extract_info.py (star-import x2) papers: catalog/hist_mag.py Register the rename in the dangling-move guard; update package __all__. B1: scripts/plot_leakage.py imported transform_nan from the old util module — a symbol removed from the library long ago and never used in the script. Drop the dead import. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(src): move SquareRootScale to plots, re-export from rho_tau SquareRootScale is a matplotlib ScaleBase subclass — plotting infrastructure, not rho/tau logic. Move the class and its register_scale call into plots.py (which now carries the matplotlib.scale/ticker/transforms imports it needs). rho_tau.py keeps a compat re-export so existing `from sp_validation.rho_tau import SquareRootScale` callers — in cosmo_inference, scratch/guerrini, papers/harmonic — still resolve. workflow/scripts/plotting_utils.py holds a near-duplicate that diverges (ScalarFormatter(useMathText=True); inverted transform method named transform_non_affine not transform), so it is left in place. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(papers/harmonic): repoint SquareRootScale off dead utils_cosmo_val Six harmonic-space scripts imported SquareRootScale from sp_validation.utils_cosmo_val, a module that no longer exists. Repoint to sp_validation.rho_tau (the compat re-export), matching the sibling 2026_03_17 script. get_params_rho_tau was already correctly imported from rho_tau. No other utils_cosmo_val imports remain (the two scratch/guerrini mentions are prose noting the module's removal). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(src): rename run_joint_cat -> catalog_builders The module holds the catalogue-builder runner classes (JointCat, ApplyHspMasks, CalibrateCat) and their run_* entry-point functions; catalog_builders names that role. Sweep all importers (the `as sp_joint` alias is preserved, only the module name changes): scripts/apply_hsp_masks.py scripts/examples/{demo_check_footprint, create_binned_mask_comprehensive, demo_comprehensive_to_minimal_cat, demo_create_footprint_mask, demo_calibrate_minimal_cat}.py scripts/calibration/{create_joint_comprehensive_cat (direct symbol), demo_apply_hsp_masks, calibrate_comprehensive_cat}.py scratch/kilbinger/demo_binned_mask.py papers/catalog/hist_mag.py Also update the two prose references in docs and update __all__ and the dangling-move guard. The OPTIONAL masks.py extraction (Mask + mask-algebra fns) is deferred: those symbols are reached externally through the sp_joint.* module alias (Mask, get_masks_from_config, print_mask_stats across 4 scripts + papers/catalog), so splitting them out would require either re-exports or sweeping the public call surface — beyond a behavior-preserving move. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Extract masks.py from catalog_builders.py Move the healsparse-backed spatial-masking cluster out of catalog_builders.py into a dedicated masks.py: the Mask class plus get_masks_from_config, print_mask_stats, correlation_matrix, and confusion_matrix. Bodies are byte-identical; the move carries the numexpr/scipy.stats imports those helpers need (now removed from catalog_builders, which no longer references them). catalog_builders.py re-exports the five symbols from sp_validation.masks so external code using `from sp_validation import catalog_builders as sp_joint` keeps resolving sp_joint.Mask, sp_joint.get_masks_from_config, etc. The *Cat runner classes (ApplyHspMasks, ReadCat, run_* entry points) stay; ApplyHspMasks uses healsparse directly, not the Mask class. masks added to __init__.__all__. No MOVE_MAP entry: this is an extraction-in-place (catalog_builders survives), not a retired path. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Rename cat.py -> catalog.py: catalogue data layer The catalogue module pair now reads by role: catalog.py is the data layer (read/write/column-access/matching free functions), catalog_builders.py is the construction pipeline (runner classes built on it). Module docstrings state this hierarchy explicitly. Behaviour-preserving. Every importer of the local sp_validation.cat module is swept to sp_validation.catalog, each preserving its local binding (bare `import cat` forms gain `as cat` so function bodies are unchanged). The cs_util `cat` import is a different module and is left untouched throughout. The dangling-reference guard registers the retired flat-import form `sp_validation.cat import` rather than the bare `sp_validation.cat` token, which would false-positive on the live `sp_validation.catalog` / `sp_validation.catalog_builders` modules (same prefix trap as glass_mock). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * style(ruff): repo-wide ruff format pass (mechanical, no semantic change) * chore(ruff): wire ruff (pre-commit + CI) + region-aware lint policy * fix(ruff): region-aware lint fixes (behavior-preserving, adversarially verified) * chore(ruff): scope CI lint to library; broaden E402 ignores for per-paper analysis scripts The gate revealed peripheral residual the per-file-ignore globs didn't reach (analysis scripts under papers/<paper>/*.py, cosmo_inference notebooks) plus the Snakemake-injected `snakemake` global (F821). Region-aware decision: enforce lint on src/ only (matching the pre-commit hook), keep format repo-wide, run the repo-wide lint advisory. src/ stays pristine (ruff check src/ => clean). * refactor(cosmo_val): convert module to package (core.py + __init__ re-export) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(cosmo_val): extract PseudoClMixin (pseudo_cl) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(cosmo_val): extract CatalogCharacterizationMixin (catalog_characterization) * refactor(cosmo_val): extract PSFSystematicsMixin (psf_systematics) * refactor(cosmo_val): extract RealSpaceMixin (real_space) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(cosmo_val): extract PureEBMixin (pure_eb) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(cosmo_val): extract CosebisMixin (cosebis) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(cosmo_val): fix sibling-module imports for package depth Converting cosmo_val from a module to a package moved every file one level deeper, so single-dot relative imports to top-level siblings now resolve inside the package. Point them up a level: - ..b_modes (core, pure_eb, cosebis) - ..cosmology (core, pseudo_cl) - ..rho_tau (psf_systematics, pseudo_cl) Re-export cs_plots in __init__ from its origin (cs_util.plots) rather than from core, which no longer needs the alias after the plotting methods moved to their mixins. Preserves the sp_validation.cosmo_val.cs_plots attribute path. Package now imports cleanly; test_cosmo_val + test_b_modes: 18 passed, 1 skipped. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(examples): call hsp_map_logical_or from plots, not the cosmo_val module The demo called cosmo_val.hsp_map_logical_or, but that function has only ever lived in plots.py — the call raised AttributeError. Import it from its home. (Pre-existing bug surfaced by the cosmo_val package split.) * refactor(cosmo_val): tighten + de-dupe the mixin modules (behavior-preserving) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(cosmo_val): drop get_cov_from_onecov, use statistics.cov_from_one_covariance directly Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(cosmo_val): _output_path helper for output-path construction Add a private CosmologyValidation._output_path(*parts) that returns os.path.abspath(os.path.join(self.cc["paths"]["output"], *parts)), and route the 29 abspath-wrapped output-path constructions across the mixin modules (catalog_characterization, pseudo_cl, psf_systematics, real_space) through it. Behavior-preserving: only the os.path.abspath(f"{output}/...") sites are converted, where abspath(join(base, suffix)) is byte-identical to abspath(f"{base}/{suffix}") for every suffix shape in the package (verified, incl. embedded subdirs and trailing slashes). Raw f-strings that are not abspath'd (psf_systematics output_dir handoffs to shear_psf_leakage) are intentionally left as-is to avoid changing the string they pass to external tools. Value-drift gate green: 18 passed, 1 skipped. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(cosmo_val): _binning helper for treecorr-config overrides Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(cosmo_val): collapse print_* color helpers onto _cprint Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(stats): chi2_and_pte uses solve+sf; route B-mode PTE through it (re-bless pins ~ULP) chi2_and_pte now computes chi2 via np.linalg.solve (not forming C^-1) and PTE via stats.chi2.sf (not 1 - cdf), the numerically proper forms: solve is more stable/cheaper than an explicit inverse, and the survival function avoids catastrophic cancellation in the low-PTE tail. The two hand-inlined C_l^BB chi2/PTE sites in cosmo_val (core.summarize_bmodes, pseudo_cl.plot_pseudo_cl) already used solve+sf, so they now route through the single chi2_and_pte primitive with no numerical change; their local scipy imports are dropped. No pinned value shifted (the value-drift pins flow through the independent b_modes Hartlap path), so no re-blessing was required. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(cosmo_val): _calibrated_g / _read_shear_cols for shared shear calibration Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(cosmo_val): keep _calibrated_g strictly behavior-preserving per caller The shared helper applied the DES R11/R22 branch to both callers, but calculate_aperture_mass_dispersion historically used scalar R for every version. Add a des_branch flag so each caller keeps its exact prior behavior (2pcf: DES branch; aperture-mass: scalar R). The asymmetry is flagged in the docstring as a likely latent bug for a deliberate, separate fix. * fix(cosmo_val): aperture-mass uses the DES R11/R22 branch like 2pcf calculate_aperture_mass_dispersion used scalar R for every version, including DES, while calculate_2pcf used the catalog-averaged R11/R22 for DES. That asymmetry was a latent bug; _calibrated_g now applies the DES branch identically for both callers. Untested path (no DES fixture) — value-drift suite unaffected and green. * refactor(cosmo_val): extract survey-stat primitives to survey.py, thin the mixin Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * test(pseudo_cl): pin the pseudo-Cl estimator outputs (golden values) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(pseudo_cl): extract estimator primitives to top-level pseudo_cl.py, thin the mixin Move the stateless pseudo-Cl computation out of PseudoClMixin into named free functions in a new top-level sp_validation/pseudo_cl.py (mirrors b_modes.py / rho_tau.py): make_namaster_bin, get_n_gal_map (weights kwarg, subsumes the glass_mock unweighted twin), apply_random_rotation (rng injectable; default preserves the no-arg np.random.seed() noise-debiasing behavior), and the map-/catalog-based estimators get_pseudo_cls_map / get_pseudo_cls_catalog. pseudo_cl_geometry centralizes the (lmin, lmax, b_lmax) triple. The mixin methods are now thin wrappers (load via self -> call primitive), public names/signatures unchanged so call sites do not move. glass_mock's get_n_gal_map delegates to the primitive via a lazy import, preserving its CAMB-only import-time independence from the harmonic stack. Pins (test_pseudo_cl.py) green; value-drift gate (test_cosmo_val + test_b_modes) 18 passed / 1 skipped; glass_mock tests green; actual cross-module import verified in the container. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(glass_mock): use shared pseudo_cl primitives, drop drifted copies Repoint the GLASS-mock harmonic stats at the new src/sp_validation/pseudo_cl.py primitives so the bandpower binning can no longer drift from the estimator the rest of the package uses. - powspace_bins: deleted. Its square-root bandpower spacing is exactly pseudo_cl.make_namaster_bin(..., "powspace", power=0.5) on the pseudo_cl_geometry(nside) range. Replaced by a thin _mock_powspace_bin wrapper that returns the same (bin, ell_eff, lmax, b_lmax) 4-tuple the mock helpers consume, now sourced from the shared primitive. Verified bitwise-identical binning (ell_eff and per-bin ell membership) across nside/lmin/n_bins configs. - get_n_gal_map: already delegates to the weighted primitive with weights=None (unweighted counts); left as the count-default twin used by the map estimator. - compute_two_point_cl / compute_two_point_cl_map: kept their own NaMaster field construction (documented) — the mock field is UNWEIGHTED and these return the COUPLED pseudo-Cl alongside the decoupled one with the ell axis prepended, which the primitives' (ell_eff, cl_all, wsp) contract does not expose. They now take their binning and galaxy-count map from the shared primitives. Value-drift gate green (18 passed, 1 skipped); full non-slow suite 136 passed, 1 skipped, 1 xpassed, only the two known-environmental failures (test_bmodes_workflow_dry_runs, test_configured_paths_exist_on_candide). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(pseudo_cl): make noise-debiasing reproducible (seed the rng) apply_random_rotation called np.random.seed() with no argument, re-seeding from OS entropy on every call, so the pseudo-Cl noise-debiasing realizations were non-reproducible run-to-run. Drop the bare seed; the primitive now uses its rng argument (default_rng() when None), and the debiasing + covariance sampling paths thread a generator seeded from a new CosmologyValidation cell_seed (default 8192). Same numerical scheme, now reproducible. The test_pseudo_cl reproducibility test replaces the old non-determinism guard. --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
refactor(src): dissolve
basic.py, give every module an honest homeA behavior-preserving reshape of
src/sp_validation/— no logic changed, only moves, renames, and dead-code removal. The grab-bagbasic.pyis dissolved, spatial masking gets its own module, and misnamed/dead files are cleaned up. Base:develop.flowchart LR classDef new fill:#d6f5e0,stroke:#2a9d52,color:#0b3d20 classDef gone fill:#fadbdd,stroke:#c0392b,color:#5b1a1f,stroke-dasharray:5 4 basic["basic.py"]:::gone util["util.py"] rjc["run_joint_cat.py"] rho["rho_tau.py"] cat["cat.py"] calib["calibration.py"] stats["statistics.py"]:::new masks["masks.py"]:::new fmt["format.py"] cb["catalog_builders.py"] plots["plots.py"] catalog["catalog.py"] basic -->|"metacal + size/SNR cuts"| calib basic -->|"math helpers"| stats util -->|rename| fmt rjc -->|rename| cb cb -->|"spatial Mask + algebra"| masks rho -->|"SquareRootScale"| plots cat -->|rename| catalog cb -.->|builds on| catalogstatistics.pyandmasks.pyare new;basic.pyis deleted. The catalogue layer now reads as a clear pair:catalog.py(data primitives — read/write/access/match) withcatalog_builders.py(the runner-class pipeline built on it). Two re-exports keep shared importers working (SquareRootScaleviarho_tau, the mask primitives viacatalog_builders); the renames preserve each caller's local binding, so no call sites change.Also folded in: deleted dead
info.py, five unused functions, and a shadowed duplicateconfusion_matrix; made__init__.__all__honest and fixed a phantom version-import; and repaired two pre-existing broken imports — a vanishedtransform_nan, and 6papers/harmonicfiles still pointing at the long-goneutils_cosmo_val.Martin's #197 requests — all ✅
Calibration scripts clustered under
scripts/calibration/·analyse_matched_stars→ scratch as a.py·demo_binned_mask/plot_binned_quantities→ scratch · all 7 deletions done. (Sacha'snamaster_utilsand config-space items are reserved for his own next PR.)— Claude on behalf of Cail