bundle: add --jobs flag for parallel formula installation#21891
bundle: add --jobs flag for parallel formula installation#21891MikeMcQuaid merged 14 commits intoHomebrew:mainfrom
Conversation
|
Hey @mvanhorn thanks for this. A few notes in no particular order:
Looking at code now! |
MikeMcQuaid
left a comment
There was a problem hiding this comment.
Looks good so far!
The thing that will definitely bite here unfortunately is locking. If you have two separate dependency trees that overlap: they cannot be handled in parallel due to keg locking.
It's unclear if the current code handles this or not; it's a little hard to follow because the low-level sync/mutex/lock operations are intertwined with the actual logic.
|
Thanks for the review @MikeMcQuaid, good feedback across the board. PR template: my mistake, updating the description with the checklist now. Benchmarks: "estimated" isn't good enough, you're right. I'll run hyperfine against a real Brewfile and post actual numbers with the specific formulae listed. concurrent-ruby: makes sense to use what's already in the codebase. I'll refactor to use it similar to how download_queue.rb works. New class: agreed, the parallel scheduling logic is involved enough to warrant its own class rather than being scattered in installer.rb. Re: dependency waiting (line 194) - yes, currently it only waits on dependencies that are also in the Brewfile. I need to verify whether that's sufficient or if we need to check the full dependency graph via brew deps. Re: verbose output (line 221) - output from parallel workers goes through a mutex to prevent interleaving, but I need to verify whether brew install --verbose output from concurrent installs stays readable in practice. Keg locking: this is the critical issue. If two formulae share an overlapping dependency being installed, keg locking could cause problems. I need to investigate how download_queue.rb handles this and whether concurrent-ruby gives better primitives. The current raw thread approach definitely needs more guardrails here. Will push updates addressing the concurrent-ruby swap, class extraction, and keg locking investigation. Real benchmarks to follow. |
Move parallel formula installation into a dedicated Homebrew::Bundle::ParallelInstaller class using concurrent-ruby's FixedThreadPool instead of raw Thread.new. Add keg lock awareness via FormulaLock to prevent concurrent installs of formulae that share overlapping dependency kegs. Addresses review feedback from @MikeMcQuaid on Homebrew#21891. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Pushed c711314 addressing all review feedback: Class extraction: Parallel install logic moved from concurrent-ruby: Replaced raw Keg locking: Verbose output: Status messages ( Benchmarks (hyperfine, 3 runs each, 4 independent formulae: tree, bat, fd, fzf): The speedup is modest with 4 small formulae since install time per formula is short (~0.5s each). The improvement scales with formula count and individual install time. With all formulae already satisfied (no-op case), overhead is ~70ms from thread pool setup.
|
Can we also get a sequential benchmark on |
MikeMcQuaid
left a comment
There was a problem hiding this comment.
Thanks, few more notes here!
|
Benchmarks (hyperfine, 3 runs each, 4 formulae: tree, bat, fd, fzf):
~35% improvement. Note that bat and fd share recursive deps (ca-certificates, openssl@3 chain via rust/libgit2), so they're serialized. With a Brewfile of fully independent formulae the speedup would be larger. Also pushed two commits addressing your second round of feedback:
|
|
Addressed all feedback in 8e9c9a2:
Style and typecheck pass clean. |
MikeMcQuaid
left a comment
There was a problem hiding this comment.
Looking good. A few more questions for you (not Claude) to answer. Also feels like some of the bigger and/or more involved functions could do with some more comments to add a little clarity on flow and choices.
|
Hi Mike! here's me (but also with Claude helping me so I don't bork anything up..) On runtime_dependencies: I kept raw deps because recursive_dep_names uses recursive_dependencies (which includes build deps) for lock conflict detection. If formula_dep_names only returned runtime deps, two formulas sharing a build dep could slip through and hit keg locks. The whole point of the lock check is to serialize anything that could conflict, so the dependency universe needs to match. On the naming: you're right, brewfile_dependencies was confusing. Renamed to formula_dep_names - it's the formula's own direct deps, not "deps from the Brewfile." On cask recursive deps: added cask_dep_names that walks depends_on[:cask] and intersects with Brewfile entries. Direct deps only for now since cask chains are usually shallow, but it's structured to go deeper if needed. Added phase comments to build_dependency_map. Tried to keep them useful without being noisy - the three-phase approach (name map, direct deps, recursive dep sets, then merge) wasn't obvious from reading the code. All pushed in 13a7800. Typecheck clean, style clean, tests green. |
@mvanhorn This will probably be more conservative than it needs to be unfortunately. Ideally we'll only look at build deps when building from source and only look at runtime deps when pouring bottles. With locking you don't want something like a Apologies for all the back and forth, I do think this will be worth it! |
|
Good call - you're right that this was overly conservative. Fixed in eb22e0e.
This should unblock cases like bat and fd where cmake was unnecessarily serializing their bottle pours. No need to apologize for the back and forth - this is making the feature meaningfully better. |
MikeMcQuaid
left a comment
There was a problem hiding this comment.
Looking better!
Weird behaviour with a password prompt here:
Installing Slack
Using TouchDraw2
Using WhatsApp
Password:Using anykeyh.simplecov-vscode
|
Addressed in 5ced3ff: Applied the Extracted the duplicated On the password prompt interleaving: cask installs can trigger |
|
Hey Mike! Fixed the password prompt interleaving in 3415250. The root cause: when a cask install triggers sudo, the The fix writes a newline to |
@mvanhorn unfortunately output is now a bit crappy: Have some merge conflicts to resolve here too! |
Add `--jobs=N` flag to `brew bundle install` that installs independent formulae in parallel using Ruby threads. Dependent formulae wait for their dependencies to complete before installing. - Default: sequential (--jobs=1, preserving current behavior) - `--jobs=auto` uses Etc.nprocessors capped at 4 - `HOMEBREW_BUNDLE_JOBS` env var support - Non-formula entries (casks, extensions) still install sequentially - Thread-safe output via mutex to prevent interleaving - Deadlock fallback: if no entries can be scheduled, remaining entries install sequentially Continues the trajectory of Homebrew#18278 (concurrent downloads) and Homebrew#21252 (parallel fetch in bundle) by parallelizing the install loop itself.
- Remove redundant `require "set"` - Fix trailing comma in method call - Fix hash alignment in Sorbet signatures - Fix line length violations - Replace `have_received` with `receive` (RSpec/MessageSpies) - Use explicit block argument instead of yield - Combine loops (Style/CombinableLoops)
Move parallel formula installation into a dedicated Homebrew::Bundle::ParallelInstaller class using concurrent-ruby's FixedThreadPool instead of raw Thread.new. Add keg lock awareness via FormulaLock to prevent concurrent installs of formulae that share overlapping dependency kegs. Addresses review feedback from @MikeMcQuaid on Homebrew#21891. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Address round 2 review feedback from @MikeMcQuaid: - Remove all custom FormulaLock-based locking from ParallelInstaller. Rely on `brew install`'s built-in keg locking instead, with the dependency map preventing same-tree parallel installs. - Extend parallel installation to all entry types (casks, taps, extensions), not just formulae. Non-Brew entries have no inter-entry dependencies and can all run in parallel immediately. - Simplify build_dependency_map to handle both Brew (formula deps) and non-Brew (empty deps) entries in a single pass. Net -82 lines. Typecheck, style, and tests pass.
`brew install` acquires file locks on all recursive dependencies (including build deps via FormulaInstaller#lock). When two entries share any transitive dep (e.g., bat and fd both pull in ca-certificates via their rust/libgit2 chains), their concurrent `brew install` processes deadlock on the shared file lock. Use Formula#recursive_dependencies to build a conflict graph: entries sharing any transitive dep are serialized (later entry waits for earlier). Entries with disjoint dep trees still run in parallel. Example with tree, bat, fd, fzf (--jobs=4): Batch 1: tree + bat + fzf (parallel, no shared deps) Batch 2: fd (after bat, shared ca-certificates/openssl chain)
- Move formula dependency lookups to Brew.brewfile_dependencies and Brew.recursive_dep_names class methods - Handle Cask formula deps and implicit Tap deps in dependency map - Replace T.must with .fetch for better error messages - Remove unnecessary T.cast on future value Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…omments - Rename `brewfile_dependencies` -> `formula_dep_names` (clearer intent) - Apply `.presence` pattern for formula lookup (idiomatic Ruby) - Simplify tap_prefix extraction (trailing conditional) - Add cask-on-cask dependency tracking via `cask_dep_names` - Add phase comments to `build_dependency_map` for flow clarity
Only include build dependencies in lock conflict detection when building from source. Pouring bottles only acquires keg locks on runtime deps, so shared build deps like cmake no longer serialize unrelated bottle pours. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…serialize cask installs - Apply code suggestion: chain .to_set(&:name) on if/else in recursive_dep_names instead of repeating it in each branch - Extract duplicated formulae_by_full_name/formulae_by_name lookup into find_formula helper used by formula_dep_names and formula_bottled? - Serialize cask installs via dedicated mutex to prevent interactive sudo Password: prompts from interleaving with parallel output
Cask installs can trigger sudo password prompts that write directly to /dev/tty. Parallel formula workers could print status messages at the same moment, causing interleaved output like "Password:Using foo". Switch @output_mutex from Mutex to Monitor (reentrant) and hold it for the entire cask install. This blocks other workers' status output while a cask install is in progress, preventing prompt/status interleaving. Monitor's reentrancy lets write_output calls inside do_install_entry! re-acquire the lock on the same thread without deadlocking. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cask installs can trigger interactive prompts (sudo, macOS security frameworks) that write directly to /dev/tty without a trailing newline. When running in parallel, the next worker's status message would append to the same line as a stale "Password:" prompt, producing garbled output like "Password:Using anykeyh.simplecov-vscode". After the output lock is released following a cask install, write a newline to /dev/tty to ensure the terminal cursor starts on a clean line. The write is a no-op in CI or when output is piped (ENXIO/ENOENT are caught). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Writing an unconditional newline to /dev/tty after each cask install produced visually inconsistent output - a blank line after every cask while formulas ran back-to-back. Reported by @MikeMcQuaid. Use \r + CSI-K (clear to end of line) so any stale interactive prompt (sudo Password:, macOS security) gets overwritten in place instead of pushing the cursor down. When the line is already clean the sequence is visually a no-op, so formula and cask status lines now render uniformly.
typed: strict now requires explicit sigs on these helpers. formulae_by_full_name and formulae_by_name can both return nil, so find_formula must too. Callers already guard against blank.
3415250 to
49a8f30
Compare
|
Thanks for the catch on the output, and apologies for the noise. Root cause: the unconditional Pushed 36515f5 ( Also rebased on latest master to clear the conflicts. The On the earlier point about the lock universe being more conservative than ideal ( |
MikeMcQuaid
left a comment
There was a problem hiding this comment.
Thanks @mvanhorn. Tested locally and works well.
Woohoo super excited about this. |
|
@MikeMcQuaid heads up wrote about my excitement with a little video on this feature on x. Says you don't use x much. :) https://x.com/mvanhorn/status/2045134181546566030?s=46 |
brew lgtm(style, typechecking and tests) with your changes locally?brew lgtm, and iterated to match Homebrew conventions (Sorbet strict, concurrent-ruby patterns from download_queue.rb).Summary
Add
--jobs=Nflag tobrew bundle installthat installs independent formulae in parallel using concurrent-ruby'sFixedThreadPool. Dependent formulae wait for their dependencies to complete before installing. Non-formula entries (casks, extensions) still install sequentially.Continues the trajectory of #18278 (concurrent downloads) and #21252 (parallel fetch) by parallelizing the install loop itself.
Changes
After review feedback from @MikeMcQuaid:
Homebrew::Bundle::ParallelInstaller(was scattered inline ininstaller.rb)Thread.newwithConcurrent::FixedThreadPool+Concurrent::Promises.future_on, matching the pattern used indownload_queue.rbParallelInstalleridentifies shared dependencies across Brewfile entries. Before installing a formula, it acquiresFormulaLockon any dependency kegs shared with other parallel installs, preventing concurrent keg modificationscmd/bundle.rb:--jobs=Nflag,--jobs=autocaps atEtc.nprocessors(max 4),HOMEBREW_BUNDLE_JOBSenv var--jobs=1Design decisions
--jobs=1(sequential), preserving current behaviorensureblock callspool.shutdown+pool.wait_for_terminationso threads are always cleaned upBenchmarks
Real hyperfine benchmarks with specific formulae coming in a follow-up comment.
This contribution was developed with AI assistance (Claude Code + Codex).