Ngmix v2.0 (CI mirror of #740)#741
Conversation
…to ngmix_update
- bin/ scripts were untracked, causing Docker build to fail - Fix license field to use SPDX string format (MIT) to resolve SetuptoolsDeprecationWarning Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- make_cat: read ngmix_mcal_flags/ngmix_id outside the moments branch; they were assigned only in the non-moments (else) branch but used unconditionally, so _save_ngmix_data(moments=True) raised NameError. - make_cat: remove a leftover debug print loop over the ELL columns. - runners (mccd_interp, read_ext_sexcat, psfex_interp, vignetmaker): the WCS log / exp-numbers files are positional inputs supplied via the config FILE_PATTERN in MULTI-EPOCH / post-process modes, but the @module_runner decorators only declare the base inputs (the optional ones cannot be encoded statically without breaking CLASSIC/base mode). Replace the bare input_file_list[1]/[2] reads with a clear ValueError naming the missing FILE_PATTERN entry instead of a cryptic IndexError. - ngmix compile_results: tie the hardcoded metacal-type list to METACAL_TYPES via a divergence check (order kept stable for byte-reproducible output). - ngmix: drop a stray marker comment and a dead commented-out vignet-path block; fold two debug prints into the ValueError message and fix a missing f-string prefix in an adjacent error message.
The fast-tests workflow runs 'uv sync --frozen --extra test' then pytest
on the GitHub runner (outside the dev image). Without a [build-system]
table, uv resolved shapepipe as a virtual project (uv.lock recorded
source = { virtual = "." }), so 'uv sync' installed only the
dependencies and not shapepipe itself — every test module failed
collection with 'No module named shapepipe'.
Declare the setuptools build backend (already the de-facto backend via
[tool.setuptools]); uv.lock now records source = { editable = "." } and
'uv sync' puts shapepipe on the path. Verified: 'uv sync --dry-run' now
plans the editable install, and the package builds cleanly with
setuptools.build_meta (wheel contains shapepipe/ + entry points).
test_mpirun_launch_imports_mpi forces an mpirun-style env, which makes
shapepipe.run import mpi4py.MPI. On the bare fast-tests runner mpi4py the
package imports but the MPI runtime library (libmpi.so, provided by the
dev image) is missing, so mpi4py.MPI raises RuntimeError — not the
ImportError that pytest.importorskip('mpi4py') guards against. Guard on
importing mpi4py.MPI directly and skip on any failure, matching the
workflow's self-skip pattern for system-dependent tests. The dev-image CI
still exercises the real MPI path. (run.py's launcher gate is unchanged
and correct; test_bare_launch_skips_mpi already passes everywhere.)
|
CI is green — we believe this PR is ready to merge. ✅
A final pass since the last review round: Cleanups & review fixes
CI fix (
Deferred, intentionally out of scope for this PR
— Claude, on behalf of Cail Generated by Claude Code |
|
Hello again Claude/Cail, I am sorry to insist but the changes to |
The test-suite constitution is explicit: tests run inside the dev image via deploy-image.yml's pytest step — 'the image is the environment, so the suite exercises exactly what ships. There is no second test workflow.' The fast-tests job ran a subset (-m 'not slow and not candide') in a bare uv venv with no system tools, so the binary/MPI tests only self-skipped there; the in-image 'pytest -rX' run is a strict superset and is the real gate. Removing the second workflow eliminates the environment divergence (and the skips that only existed to paper over it). The [build-system] declaration stays — it is correct independent of CI (makes 'uv sync' / 'pip install -e .' install the package, per the documented dev loop); only the comment's fast-tests reference is dropped.
|
Update — consolidated to a single in-image CI path. Following a quick discussion: rather than keep the separate Kept from the CI work: the All checks green on — Claude, on behalf of Cail Generated by Claude Code |
|
@cailmdaley we should maybe propagate both original and reconvolved PSF quantities Tpsf and Tpsf_o, following @aguinot's comment. |
|
Okay @aguinot I dug into the history of this and will try to express my understanding below for the sake of educating myself, tell me if this sounds right to you. Before this branch was opened both the original and reconvolved PSFs were fit, with the original obs.psf fit quantities appropriately assigned to keys with no o: shapepipe/src/shapepipe/modules/ngmix_package/ngmix.py Lines 888 to 896 in 0cfcf4f shapepipe/src/shapepipe/modules/ngmix_package/ngmix.py Lines 907 to 908 in 0cfcf4f psf_res_gT and tres get merged into a single dictionary).
Then, in d01503d the switch to ngmix 2.4 happened and shapepipe/src/shapepipe/modules/ngmix_package/ngmix.py Lines 1187 to 1189 in 60962ff *o keys that were no longer being calculated got assigned the reconvolved quantities:Thus, both the *psf and psfo keys carried the same reconvolved PSF, and I agree that this is very dangerous. It already caused a lot of confusion for example below, where two T_psfo keys were defined:which I then deduplicated in e57544b leaving the clobbered key as you noticed above. This didn't change any behavior, but it was a shortsighted change that didn't address the underlying issue and compounded confusion. Anyway, it sounds like the solution is to correctly record both the original and reconvolved PSF as you suggest, and clean up these original-reconvolved clobbers. This would probably mean adding back an ngmix fit to the original PSF in However I'm a bit confused because there are also per-epoch moment fits to the original PSF e.g. Here's a concrete proposal, which may be stupid: make available inverse-variance averaged moment — Cail (written by himself!) |
|
So I got confused by looking at the diff on github with this line: output_dict[name]["Tpsf"].append(results[idx]["T_PSFo"])Which is technically fine because before, here, you do: res['T_PSFo'] = psf_res['T_psf']Which is extemely misleading to say the least... Now, regaring your other questions. By computing the original PSF here we get the "coadded" PSF at the location of every detections in the catalogue, stars AND galaxies. The other PSF measurement you are refering too comes from the the star ONLY catalogue. So it cannot be used to compute the leakage at galaxy location, used for your object-wise PSF correction if I am not mistaken. In addition, the per-epoch with HSM is done in image coordinate while NGMix works in sky coordinates, which is a lot better (see my note at the end). If we wanted to "coadd" those PSF we would need to read the weight image at the star locations to get the weight values. Technically possible but a bookeeping nightmare and not useful. Probably require a specific module to do that, with matching, etc.
I am sorry but don't understand your last proposal, I am not sure what usage we would have of this. NOTE about HSM: |
|
Yes extremely confusing/misleading, it took me quite a while to trace this. I totally agree there should be a single source of truth for each quantity that is consistent with its column name and docstrings, although I'll note that all the non-human-readable elements were authored by humans :) I actually just discovered that the HSM moment fit is already calculated and saved at galaxy positions: shapepipe/src/shapepipe/modules/make_cat_package/make_cat.py Lines 631 to 635 in 216f407 So I will leave the HSM fits at galaxy positions alone (unless there are performance reasons to turn off the route altogether, since it doesn't seem to be used anywhere), clean up the original/convolved ngmix PSF keying, and open a separate PR for switching HSM fits to sky coordinates. Tell me if I'm still missing anything and I'll tag you when I have an implementation. Cheers |
# Conflicts: # uv.lock
# Conflicts: # uv.lock
|
Hello, in the previous ngmix version the way we dealt with neighbors was to fill in a noise image were SExtractor gave the segmentation of a neighbor. now we do not know if this strategy is perfect (for one thing it could vary a lot between exposures what the "neighbor" is) and we do have the undiagnosed B-modes when including FLAGS=2 galaxies, yet the idea was to propagate the same strategy in the new version for now, both because we lack a better alternative (since we are not going to implement uberseg) and because we fundamentally believe this should work. I discussed this with Eric Huff and he said that in Euclid the approach was to do a noise interpolation, so a very similar approach. The function get_noise is in ngmix.py but never used. Please let me know if I missed it. Maybe we could build a mask in sky coordinates to make sure all exposures have the same area masked and then apply a 1 or 2 pixel broadening? Since discarding FLAGS=2 objects lead to a strong selection not-metaclibratable selection bias I think this deserves some thought. EDIT: Sorry about that, I think this is okay for now, it actually appears as: if (~mask).any(): |
|
@cailmdaley propagating to the coadd the single epoch HSM quantites will be a lot more difficult than you think and I don't see a good reason to do so. |
|
@fabianhervaspeters I don't know why the All that being said, I don't think we should mask and replace by noise neighboring objects on the image. A better startegy would be:
For uberseg, it is really not difficult to implement. Here is a piece of code I made for another project to branch to the implementation of Erin Sheldon used in DES: import numpy as np
from meds import _uberseg
def fast_uberseg(
seg,
weight,
object_number,
):
obj_inds = np.where(seg != 0)
Nx, Ny = seg.shape
Ninds = len(obj_inds[0])
seg = seg.astype(np.int32)
weight = weight.astype(np.float32, copy=False)
obj_inds_x = obj_inds[0].astype(np.int32, copy=False)
obj_inds_y = obj_inds[1].astype(np.int32, copy=False)
_uberseg.uberseg_tree(
seg, weight, Nx, Ny, object_number, obj_inds_x, obj_inds_y, Ninds
)
return weightFor the symmetrization, it is implemented here. (This is obvious but everything above need testing) |
Fabian flagged the HSM adaptive-moment centroid as wrong for the metacal galaxy Jacobian origin; the catalog sky position projected through the WCS is the right choice (#767). Flip the production default: - ngmix_runner CENTROID_SOURCE fallback "hsm" -> "wcs" - example config template CENTROID_SOURCE = wcs, so the CI example pipeline exercises the WCS path on a real tile The WCS branch is already wired: make_ngmix_observation receives ra/dec/wcs_full per epoch from the stamp. The low-level function defaults (do_ngmix_metacal / make_ngmix_observation) stay "hsm" so the unit tests, which call them directly without astrometry, are unchanged; the full removal of the HSM centroid path is a separate PR (#767). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01DwHw42NeHoEWwsykwwDooP
conftest's on_candide() matches the hostname against ^(c\d|n\d{2}) and the
comment explicitly says to override it "for CI or odd hostnames" — but the
workflow never did. CI runs pytest via `docker run` with no --hostname, so
the container's hostname is its hex ID; ~4% of the time that ID starts with
c+digit, on_candide() false-positives, the candide-marked cluster tests run
instead of skipping, and they ERROR on the missing candide data
(FileNotFoundError) — a flaky red on an otherwise-green suite (275 passed,
2 should-be-skipped). Set the override the conftest already supports.
(Same one-liner should land on develop; the flake exists there too.)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01DwHw42NeHoEWwsykwwDooP
What this is
A same-repo mirror of #740 (@martinkilbinger's "Ngmix v2.0"), pushed to a branch on
CosmoStat/shapepipeso that CI actually runs. All 57 commits are authored by Martin (and carry Lucy Baumont's and Axel Guinot's work) — pushing the branch preserves that authorship unchanged; the only thing this PR adds is a same-repo head so GitHub Actions fires thepull_requestworkflow without the fork-PR approval gate. #740 received no CI runs at all for this reason.Substance is identical to #740 — see that PR for the full description. In short: upgrade ngmix to 2.4.0 and adopt Lucy's new ngmix classes/interface; overhaul the shape-measurement module; centroid-bias fix + validation; v2.0 production-run plumbing.
Going forward, opening PRs directly on
CosmoStat/shapepipe(rather than from a fork) avoids this — fork PRs don't trigger our Docker-image CI without a maintainer approval that wasn't happening.Closes/supersedes #740 once CI is green (leaving that call to Martin).
Review
A detailed review is on its way (read against Martin's checklist plus a science-quality pass). Headline from exercising the new fitter against real ngmix 2.4.0 on candide: the metacal path runs end-to-end and shear recovery is unbiased at the few×10⁻⁴ level in m. Full notes to follow.
— Claude on behalf of Cail