Skip to content

Working PR for ADR-191: Refactor DVC pipeline into proper OOP design patterns#202

Draft
jodavis wants to merge 27 commits into
mainfrom
feature/ADR-191-oop-ml-pipeline
Draft

Working PR for ADR-191: Refactor DVC pipeline into proper OOP design patterns#202
jodavis wants to merge 27 commits into
mainfrom
feature/ADR-191-oop-ml-pipeline

Conversation

@jodavis

@jodavis jodavis commented May 29, 2026

Copy link
Copy Markdown
Owner

@github-actions

github-actions Bot commented May 29, 2026

Copy link
Copy Markdown

Test Results

401 tests  ±0   400 ✅  - 1   2m 2s ⏱️ -3s
  5 suites ±0     0 💤 ±0 
  5 files   ±0     1 ❌ +1 

For more details on these failures, see this check.

Results for commit 9ab9062. ± Comparison against base commit 8f1b8b7.

This pull request removes 3 and adds 2 tests. Note that renamed tests count towards both.
,False)
AdaptiveRemote.Services.ProgrammaticSettings.PersistSettingsTests ‑ PersistSettings_Set_ValidatesKeyNameAsync (Hello
AdaptiveRemote.Services.ProgrammaticSettings.PersistSettingsTests ‑ PersistSettings_Set_ValidatesValueAsync (Invalid
AdaptiveRemote.Services.ProgrammaticSettings.PersistSettingsTests ‑ PersistSettings_Set_ValidatesKeyNameAsync (Hello
,False)
AdaptiveRemote.Services.ProgrammaticSettings.PersistSettingsTests ‑ PersistSettings_Set_ValidatesValueAsync (Invalid
,False)

♻️ This comment has been updated with latest results.

@jodavis

jodavis commented May 30, 2026

Copy link
Copy Markdown
Owner Author

@copilot rebase this PR branch on main so it gets the latest fixes. Make sure scripts/validate passes with no warnings, errors, or test failures.

Copilot AI commented May 30, 2026

Copy link
Copy Markdown
Contributor

@jodavis I'm unable to start working on this because of repository rules that prevent me from pushing to the branch:

  • Changes must be made through a pull request due to repository rules

See the documentation for more details.

ElwoodMoves and others added 7 commits June 2, 2026 22:50
…core/

- sample.py: Sample(ABC), SampleWithPath(Sample,ABC), TextSample, AudioSample,
  SampleSpectrogram, SampleTokens dataclasses with all spec fields.
- manifest.py: Manifest[S] generic with O(1) by_id/by_content_hash lookups;
  ManifestStore.read() dispatches on sample_type; write() emits version-1 JSON.
  TextSample JSON omits path/parent_content_hash/applied_values; Spectrogram and
  Tokens JSON omits applied_values and includes parent_id per spec.
- pyproject.toml: minimal pytest config so plain `pytest` from ml/ works.
- Ambiguity resolutions: mutable dataclasses (no frozen=True); seed not a default
  field (callers pass seed=0 explicitly, matching spec's "PhraseVariator sets seed=0").

https://claude.ai/code/session_01DLVaMxWuhm75L3YaMdVTFT
1. Manifest.__init__: raise ValueError on duplicate sample ids (UUIDs must
   be unique); keep first occurrence for duplicate content_hash (documented
   contract — deterministic stages can produce identical hashes).

2. ManifestStore.read(): use data.get("sample_type") and raise ValueError
   explicitly when the key is absent, matching the defensive pattern used
   for "version".

3. ManifestStore.write(): validate that all samples share the same type
   before writing; a mixed-type Manifest would produce a JSON file whose
   sample_type header mismatches later entries, causing read() to KeyError.

Adds four new tests covering each guard.

https://claude.ai/code/session_01DLVaMxWuhm75L3YaMdVTFT
TextSample now derives its id from content_hash via __post_init__,
so callers no longer pass id= explicitly. This means two TextSamples
with the same content always share the same id, enforcing the
deduplication invariant at the type level.

Update ManifestStore._deserialise to omit id= when constructing
TextSample, and update tests to match: duplicate-id and
duplicate-content-hash tests now use AudioSample (where id and
content_hash are independent) to exercise those code paths.

https://claude.ai/code/session_01DLVaMxWuhm75L3YaMdVTFT
@jodavis jodavis force-pushed the feature/ADR-191-oop-ml-pipeline branch from b835493 to cb8b515 Compare June 3, 2026 05:51
jodavis-claude and others added 18 commits June 3, 2026 09:40
* ADR-222: Implement seed-based randomisation engine (PassFilter, VariationGenerator)

Adds ml/pipeline/core/randomization.py with:
- PassFilter ABC (density, sample_domain)
- MinMaxFilter: uniform rejection over [min, max]
- NormalFilter: Gaussian density normalised to peak 1.0; rejects std_dev <= 0
- VariationGenerator: should_vary, generate (float rejection-sampling),
  generate_int (bitmask rejection), choose (direct index-0 hash)

All values derived from sha256 with deterministic key patterns; no random module.
generate_int special-cases range==0 to return min_val immediately.

46 unit tests covering determinism, exact values, boundary conditions,
ValueError after 1000 attempts, probability convergence, and stability
across MinMaxFilter range widening/narrowing using concrete seed values.

* Address code review: add input validation and strengthen independence tests

* VariationGenerator.generate() uses a power-of-2 modulo algorithm to improve variable value stability as parameters change

* Add Python pytest to CI and validate-tests scripts

* Add error handling for cd ml in validate-tests scripts

---------

Co-authored-by: Joe Davis <ElwoodMoves@hotmail.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
* Use dev-team plugin from feature branch

* ADR-223: Implement ModifierStage abstract base class

Adds ModifierStage[T_in, T_out] with the three-case transform() algorithm
(skip / regen-with-stored-seed / new sample), _compute_content_hash() static
helper, GC of orphaned output files, and manifest read/write integration.

Also adds randomization.py (ADR-222 prerequisite) containing PassFilter,
MinMaxFilter, NormalFilter, and VariationGenerator — required for
modifier_stage.py to compile and for the regen path to construct generators
from stored seeds.

Key decisions:
- ValueError from ManifestStore.read() propagates (hard error); no fallback
  to "no previous manifest" for unreadable/unsupported manifests.
- output_dir creation is caller's responsibility; transform() does not call
  mkdir (consistent with DVC/entry-point owning directory setup per spec).
- GC guards with output_dir.exists() so an empty first run does not raise.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ADR-223: address review feedback

Bound T_in TypeVar to Sample so mypy can verify content_hash access at all
four call sites in transform(); removes four type: ignore[arg-type] suppressions.

---------

Co-authored-by: Joe Davis <ElwoodMoves@hotmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…entry-points) (#217)

* ADR-224: uncommitted changes at validation

* #217: Add ml/params.yaml with pipeline and stage parameters as DVC variables

Moves variations_per_phrase, subsample_rate, and the five probability floats
out of source code and into a central params.yaml so they can be tracked and
overridden as DVC parameters.

* #217: Refactor PhraseVariator to use VariationGenerator for stable reproducible randomisation

Replace injectable rng: random.Random with vgen_factory: Callable[[int], VariationGenerator]
and five probability float constructor params. Remove module-level _*_CHANCE constants.

In generate(), use random.Random(42) to draw a phrase_seed per base phrase, then
random.Random(phrase_seed) to draw a variant_seed per variant, and pass each seed to
vgen_factory to get a VariationGenerator for that variant's _create_variation() call.

In _create_variation(), replace all self._rng.* calls with vgen.should_vary(),
vgen.choose(), and vgen.generate_int() using the stable variable name strings
specified in ADR-224.

Update tests to use the new constructor signature and a factory lambda.

* #217: Load stage params from params.yaml; remove CLI args for DVC variables

Remove --variations-per-phrase and --subsample-rate from argparse. Load all five
probability floats, variations_per_phrase, and subsample_rate from ml/params.yaml
(resolved relative to __file__). Construct PhraseVariator with the factory lambda
and probability params from the YAML.

* Minor changes during code review

* Addition of task to start wiring the DVC pipeline, added as a result of code review.

* Small changes  before pushing

* Address PR #217 review comments: params path convention, typed params object, PhraseVariator params object, move TestConventions

- conventions.params_path(project_root) centralises the params.yaml path
- PipelineParams/GeneratePhraseParams dataclasses deserialise params.yaml into a typed object
- PhraseVariator.__init__ now takes GeneratePhraseParams instead of five individual floats
- TestConventions moved from test_vocab_computer.py to its own test_conventions.py file

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix CI ModuleNotFoundError: add pyyaml to ml/requirements.txt and install from it in CI

pyyaml was used in ml/pipeline/stages/params.py and ml/test/pipeline/stages/test_params.py
but was not listed as a dependency. The CI workflow only ran `pip install pytest`, so pyyaml
was never installed. Added ml/requirements.txt with pytest and pyyaml>=6.0, and updated the
python-tests job to install from that file instead.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Joe Davis <ElwoodMoves@hotmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
---
updated-dependencies:
- dependency-name: Microsoft.Extensions.Hosting
  dependency-version: 10.0.8
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
---
updated-dependencies:
- dependency-name: Azure.Monitor.OpenTelemetry.Exporter
  dependency-version: 1.8.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
---
updated-dependencies:
- dependency-name: Microsoft.NET.Test.Sdk
  dependency-version: 18.6.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
* ADR-278: uncommitted changes at validation

* Remove unneeded DVC files from the root of the repo

* DVC init

* Fix behavior of VocabComputer to read "content" instead of "label"

* Lock file after first `dvc repro` run

* Add a link to the Epic in the spec

* Address PR review: move download script, add --output-dir arg, prefix stage names

- Move ml/download_phoneme_dictionary.py to ml/pipeline/download_phoneme_dictionary.py
  so it lives alongside other pipeline scripts
- Add --output-dir argument to the download script instead of hardcoding the path
- Prefix all dvc.yaml stage names with intent_NN_ to match the script naming convention
- Update dvc.lock to reflect renamed stages and updated cmd

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Joe Davis <ElwoodMoves@hotmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* ADR-225: Implement text-to-speech sample generation using edge-tts

* ADR-225: Add soundfile and edge-tts to ml/requirements.txt

* Revert unrelated extraKnownMarketplaces change from ADR-225

The ADR-225 implementation commit accidentally cleared
extraKnownMarketplaces; restore the jodavis-agent-plugins marketplace
entry that was present before.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ADR-225: Add _doc_speech.md for ml/pipeline/speech package

CONTRIBUTING.md requires a documentation file for new subsystems.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ADR-225: Remove multi-line docstring from EdgeTtsProvider

CLAUDE.md: never write multi-paragraph docstrings. The retry strategy is
self-evident from the constants and loop structure.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ADR-225: Remove 10-line module docstring from speech_03_generate_samples.py

CLAUDE.md: never write multi-paragraph docstrings. The usage block
duplicates what argparse --help already surfaces.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ADR-225: Rename speech_03_generate_samples to speech_01_generate_samples

Speech and intent pipelines have separate number spaces. The speech
pipeline starts at 01, so this stage is speech_01, not speech_03.

---------

Co-authored-by: Joe Davis <ElwoodMoves@hotmail.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* ADR-226: Implement audio I/O protocols and DelayAugmentor stage

- ml/pipeline/io/: new AudioReader/AudioWriter Protocol classes with
  LibrosaAudioReader (mono conversion via librosa.load) and
  SoundfileAudioWriter (PCM_16 WAV write)
- ml/pipeline/speech/delay_stage.py: DelayAugmentor(ModifierStage)
  with independent prefix/suffix should_vary checks; both keys always
  in applied_values (0.0 when not applied) for hash stability;
  _derive_id format: "{id}_pre{ms}_suf{ms}"
- ml/pipeline/stages/speech_04_add_delays.py: thin DVC entry-point
- ml/pipeline/stages/params.py: AddDelaysParams dataclass + load()
- ml/params.yaml: add stages.add_delays section (6 float keys)
- ml/requirements.txt: add librosa>=0.10
- Tests: 28 delay stage tests, 9 audio I/O tests, 5 new params tests

* Fix a script error in validate-build.sh

* Fix build and test validation scripts for agentic workflows

* ADR-226: Fix blocking I/O in LibrosaAudioReader and SoundfileAudioWriter

Both async methods called synchronous disk I/O directly (librosa.load and
sf.write), blocking the event loop for the full duration of each file
operation. Offload each to a thread pool executor via run_in_executor so
the event loop remains free during I/O, matching the async contract.

* ADR-226: Move asyncio import to module top level in test_audio_io.py

import asyncio was placed inside each individual test method. Move it to
the top-level imports per standard Python convention.

* ADR-226: Add _doc_io.md for ml/pipeline/io subsystem

New subsystems require a _doc_*.md per CONTRIBUTING.md. Documents the
AudioReader/AudioWriter Protocol contracts, default implementations
(LibrosaAudioReader, SoundfileAudioWriter), algorithm choices, thread
pool offload rationale, and deferred import rationale.

* ADR-226: Update _doc_speech.md to reflect DelayAugmentor is implemented

delay_stage.py / DelayAugmentor was listed as a planned (not yet
implemented) stage. It is now implemented; add it to the Stages table
and remove it from the planned list.

* ADR-226: Replace tuple[np.ndarray, int] return type with AudioData named type

AudioReader.read() now returns AudioData(samples, sample_rate) instead of a raw
tuple. This improves call-site readability and makes intent explicit at the
protocol boundary. LibrosaAudioReader updated to construct AudioData; tests
updated to access result.samples and result.sample_rate instead of unpacking.

* ADR-226: Pass AddDelaysParams to DelayAugmentor; reduce MinMaxFilter precision; omit zero-delay ID parts

- DelayAugmentor constructor now accepts a single AddDelaysParams instead of six
  individual float arguments. The entry-point passes params.add_delays directly.
- MinMaxFilter precision reduced from 6 to 1 (100ms resolution is sufficient for
  audio delay augmentation; sub-millisecond granularity is unnecessary).
- _derive_id now omits _pre and _suf segments when the respective delay is 0.0,
  producing cleaner IDs (e.g. "TV_ON_Jenny_r77" instead of "TV_ON_Jenny_r77_pre0_suf0").
- Tests updated: _make_stage accepts AddDelaysParams; TestDeriveId updated for new
  format; TestGenerateOutput/TestAppliedValues use precision=1-valid delay values (0.1s).

* ADR-226: Rename speech_04_add_delays to speech_02; update params.yaml defaults for precision=1

Renamed speech_04_add_delays.py to speech_02_add_delays.py because only speech_01
exists; the delay stage is the second speech pipeline stage, not the fourth.

Updated params.yaml default delay range from 0.02-0.15s to 0.1-0.5s so that the
defaults are valid at MinMaxFilter precision=1 (100ms resolution). The previous
defaults (20ms, 150ms) would round to 0.0 and 0.1 at precision=1, making the
effective range degenerate.

* ADR-226: Update _doc_io.md for AudioData return type

Document AudioData dataclass in the Protocols table and update the AudioReader
contract description to reflect the new return type.

* ADR-226: Use AudioData in SoundfileAudioWriter.write() and callers

SoundfileAudioWriter.write() was still taking data/sample_rate as
separate arguments rather than AudioData, inconsistent with the
AudioWriter Protocol definition (line 44) that was updated in the
previous fix iteration.

Updated SoundfileAudioWriter.write() to accept audio: AudioData and
unpack audio.samples / audio.sample_rate internally. Updated
delay_stage._generate_output() to pass AudioData to the writer.
Updated _RecordingAudioWriter stub and all test call sites in
test_audio_io.py and test_delay_stage.py.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ADR-226: Update _doc_io.md AudioWriter contract to AudioData

The AudioWriter contract description still showed the old
data/sample_rate signature. Updated to reflect the current
audio: AudioData parameter.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Joe Davis <ElwoodMoves@hotmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…ages (#231)

* ADR-227: Add BackgroundNoiseAugmentor and MicrophoneNoiseAugmentor

- background_noise_stage.py: BackgroundNoiseAugmentor with NoiseProvider protocol;
  noise_file always chosen for hash stability; noise_start_s bounds derived from file
  durations at runtime; all three keys always present in applied_values
- mic_noise_stage.py: MicrophoneNoiseAugmentor with Gaussian noise seeded from output_seed
- speech_05_add_background_noise.py, speech_06_add_mic_noise.py: entry-point scripts
- params.py: AddBackgroundNoiseParams, AddMicNoiseParams; PipelineParams.load() extended
- params.yaml: add_background_noise and add_mic_noise sections
- _doc_speech.md: updated stages table and key design decisions

* ADR-227: Fix executor.submit coroutine form; store _noise_dir to remove redundant list_files() calls

Issue 1: Changed executor.submit(asyncio.run, _read()) to executor.submit(lambda: asyncio.run(_read()))
so the coroutine is created inside the worker thread. The old form passed a pre-created coroutine
object across threads, which is fragile (the coroutine is bound to the originating thread's context).
The lambda form creates a fresh coroutine in the worker thread per asyncio.run() semantics.

Issues 2 & 3: Added noise_dir: Path constructor parameter to BackgroundNoiseAugmentor and stored it
as _noise_dir. Both _get_applied_values (line 106) and _generate_output (line 151) previously called
self._noise_provider.list_files()[0].parent to recover the noise directory — re-globbing the filesystem
on each call and raising an opaque IndexError on an empty directory. _noise_dir eliminates both calls.

* ADR-227: Remove unused imports from test_background_noise_stage.py and test_mic_noise_stage.py

Issues 4 & 5: 'from typing import Any' and 'import pytest' were imported but never referenced
in either test file. Also updates test_background_noise_stage.py to pass the new noise_dir
parameter required by the BackgroundNoiseAugmentor constructor.

* ADR-227: Pass noise_dir to BackgroundNoiseAugmentor; add validation for empty noise directory

Issue 6: _DirectoryNoiseProvider.list_files() now raises ValueError with the directory path
if no WAV files are found. Previously it returned [] which caused VariationGenerator.choose()
to raise ValueError("options must be non-empty") — an opaque message that didn't identify
the misconfigured path. The new guard surfaces the root cause immediately.

Also passes args.noise_dir as the new noise_dir constructor parameter added to
BackgroundNoiseAugmentor.

* ADR-227: Remove noise_start_s variation and _read_duration_s

Per user decision (PRRT_kwDOQYcoLM6KbINz, PRRT_kwDOQYcoLM6KbIOC, PRRT_kwDOQYcoLM6KjMkD):
noise_start_s is no longer varied — it is always 0.0. This eliminates the hacky
synchronous _read_duration_s helper that called asyncio.run inside a ThreadPoolExecutor
to avoid blocking the event loop. noise_start_s is still stored in applied_values for
hash stability. Update _doc_speech.md to remove the now-obsolete bounds-derivation
design note.

* ADR-227: Avoid output_samples copy when amplitude is 0 in MicrophoneNoiseAugmentor

Per Copilot review (PRRT_kwDOQYcoLM6KbINm) and user decision (PRRT_kwDOQYcoLM6KjhqB):
only add Gaussian noise if amplitude > 0. Use audio.samples directly on the no-noise
path instead of creating an unnecessary copy. New array is only allocated when mixing occurs.

* ADR-227: Redesign NoiseProvider; return input when noise not applied

- NoiseProvider.list_files() now returns list[tuple[str, AudioData]] so that
  implementations can load and resample files once at construction time,
  avoiding per-sample I/O and ensuring sample rate consistency.
- _DirectoryNoiseProvider gains a sample_rate: int constructor parameter;
  it loads all WAV files eagerly via librosa (resampling to sample_rate),
  satisfying the sample-rate equality contract.
- BackgroundNoiseAugmentor removes noise_dir from its constructor (no longer
  needed; noise audio is retrieved directly from the provider).
- _generate_output in both BackgroundNoiseAugmentor and MicrophoneNoiseAugmentor
  returns input_sample unchanged when noise_volume / amplitude == 0.0 —
  no file is written and no I/O occurs, eliminating unnecessary file copies.
- MicrophoneNoiseAugmentor._derive_id returns input_sample.id when amplitude
  == 0.0 (no _mic### suffix added for unmodified samples).
- A sample_rate: int field is added to PipelineParams (pipeline-level param)
  with a sample_rate: 16000 default in params.yaml.
- Added ValueError guard in _generate_output when noise and audio sample
  rates differ, with a diagnostic message explaining the fix.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Joe Davis <ElwoodMoves@hotmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…try-points (#233)

* ADR-228: Add SpectrogramStage, TokenStage, SetManifestSplitter and DVC entry-points

- SpectrogramStage: deterministic stage producing (n_mels, time_steps) .npy files via librosa
- TokenStage: deterministic stage producing padded phoneme token .json files
- SetManifestSplitter: splits augmented manifest into train/val/test by percentage
- DVC entry-points: speech_07_compute_tokens, speech_08_compute_spectrograms, speech_09_create_set_manifests
- PipelineParams extended with ComputeSpectrogramsParams, ComputeTokensParams, CreateSetManifestsParams
- params.yaml updated with compute_spectrograms, compute_tokens, create_set_manifests sections

* ADR-228: Offload TokenStage JSON write to run_in_executor for consistency with SpectrogramStage

* ADR-228: Update _doc_speech.md to cover featurisation stages and SetManifestSplitter

* ADR-228: Replace round() with int() truncation in SetManifestSplitter for predictable split boundaries

Python's round() uses bankers rounding, which can produce unintuitive results at
.5 boundaries. Integer truncation for train/val with remainder assigned to test is
simpler and avoids unexpected off-by-one behaviour on small manifests.

* ADR-228: Enforce float32 dtype on SpectrogramStage output

librosa.power_to_db may return float64; adding .astype(np.float32) ensures the
saved .npy always matches the documented contract. Added test_output_npy_dtype_is_float32
to verify.

* ADR-228: Include config params in _get_applied_values for SpectrogramStage and TokenStage

Reviewer (jodavis) confirmed that per-file content_hash is used to determine
whether individual outputs are up-to-date between pipeline runs, not just the
DVC stage-level dependency. Returning {} from _get_applied_values means that
changing n_mels/time_steps/input_token_length/phoneme_list does not invalidate
previously-cached outputs on a per-file basis.

SpectrogramStage now returns {"n_mels": ..., "time_steps": ...}.
TokenStage now returns {"input_token_length": ..., "phoneme_list": [...]}.

Replaced test_get_applied_values_returns_empty_dict with three tests each that
assert the new config keys are present and reflect constructed values. Added
test_changing_n_mels_triggers_regen, test_changing_time_steps_triggers_regen,
test_changing_input_token_length_triggers_regen, and
test_changing_phoneme_list_triggers_regen to verify end-to-end regen behaviour.

---------

Co-authored-by: Joe Davis <ElwoodMoves@hotmail.com>
librosa -> resampy -> numba, and numba <0.59.0 does not support Python 3.12.3
(Ubuntu 24.04 system Python). numba 0.59.0 added Python 3.12 support.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
numpy==2.5.0 conflicts with numba (latest numba supports numpy<2.5).
Librosa brings in resampy which requires numba>=0.59.0, but numba
0.59.0 requires numpy<1.27, and no numba version supports numpy 2.5.0.

Use numpy>=1.22 to let pip find a compatible version that satisfies
both tensorflow 2.21.0 and numba/librosa.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants