bundle: add --jobs flag for parallel formula installation #21891

Merged
MikeMcQuaid merged 14 commits into Homebrew:main from mvanhorn:feat/bundle-parallel-install
Apr 16, 2026

@mvanhorn mvanhorn commented Apr 2, 2026

  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same change?
  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests (excluding integration tests) for your changes? Here's an example.
  • Have you successfully run brew lgtm (style, typechecking and tests) with your changes locally?

  • AI was used to generate or assist with generating this PR. Claude Code + Codex were used for initial scaffolding of the ParallelInstaller class. All code was manually reviewed, tested with brew lgtm, and iterated to match Homebrew conventions (Sorbet strict, concurrent-ruby patterns from download_queue.rb).

Summary

Add --jobs=N flag to brew bundle install that installs independent formulae in parallel using concurrent-ruby's FixedThreadPool. Dependent formulae wait for their dependencies to complete before installing. Non-formula entries (casks, extensions) still install sequentially.

Continues the trajectory of #18278 (concurrent downloads) and #21252 (parallel fetch) by parallelizing the install loop itself.

Changes

After review feedback from @MikeMcQuaid:

  • New class: Extracted parallel install logic into Homebrew::Bundle::ParallelInstaller (was scattered inline in installer.rb)
  • concurrent-ruby: Replaced raw Thread.new with Concurrent::FixedThreadPool + Concurrent::Promises.future_on, matching the pattern used in download_queue.rb
  • Keg lock handling: ParallelInstaller identifies shared dependencies across Brewfile entries. Before installing a formula, it acquires FormulaLock on any dependency kegs shared with other parallel installs, preventing concurrent keg modifications
  • Output: Status messages use a mutex to prevent interleaving across workers
  • cmd/bundle.rb: --jobs=N flag, --jobs=auto (Etc.nprocessors, capped at 4), HOMEBREW_BUNDLE_JOBS env var
  • Tests: 3 RSpec cases covering parallel independent formulae, serialized dependent formulae, and sequential fallback with --jobs=1
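
A minimal sketch of how the --jobs value could be resolved, assuming the flag takes precedence over HOMEBREW_BUNDLE_JOBS (the precedence and the helper name `resolve_jobs` are assumptions for illustration, not taken from the PR's code):

```ruby
require "etc"

MAX_AUTO_JOBS = 4 # cap for --jobs=auto, as stated in the PR description

# Hypothetical helper: turns a --jobs value (or the env fallback) into a worker count.
def resolve_jobs(flag_value, env: ENV)
  raw = (flag_value || env["HOMEBREW_BUNDLE_JOBS"]).to_s
  return 1 if raw.empty?                                   # opt-in: sequential by default
  return [Etc.nprocessors, MAX_AUTO_JOBS].min if raw == "auto"

  [Integer(raw, 10), 1].max                                # never fewer than one worker
end
```

With this shape, `resolve_jobs(nil, env: {})` falls back to 1, which matches the opt-in default described in this PR.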

Design decisions

  • Opt-in: Defaults to --jobs=1 (sequential), preserving current behavior
  • Formula-only: Only formulae are parallelized. Casks can trigger GUI prompts, and extensions have their own dependency concerns
  • Deadlock fallback: If the scheduler can't find ready entries (unexpected dependency cycle), remaining entries install sequentially
  • Pool shutdown: ensure block calls pool.shutdown + pool.wait_for_termination so threads are always cleaned up
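
The scheduling idea in the decisions above can be sketched roughly as follows. This is a simplified illustration using plain stdlib threads in place of concurrent-ruby's FixedThreadPool, and it runs entries in waves rather than as a continuous queue. The method name `install_in_waves` and the shape of the `deps` hash are hypothetical, and it assumes every dependency named is itself an entry:

```ruby
# Hypothetical wave scheduler: `deps` maps each entry name to the entry names it waits on.
def install_in_waves(deps, jobs:, &install)
  done = []
  done_lock = Mutex.new
  pending = deps.keys

  until pending.empty?
    ready = pending.select { |name| (deps.fetch(name) - done).empty? }

    if ready.empty?
      # Deadlock fallback: nothing is schedulable (e.g. a cycle), so go one at a time.
      pending.each do |name|
        install.call(name)
        done << name
      end
      break
    end

    ready.each_slice(jobs) do |batch|
      threads = batch.map do |name|
        Thread.new do
          install.call(name)
          done_lock.synchronize { done << name }
        end
      end
      threads.each(&:join) # always wait for the batch before moving on
    end
    pending -= ready
  end

  done
end

order = install_in_waves({ "a" => [], "b" => ["a"], "c" => [] }, jobs: 2) { |name| name }
```

Here `a` and `c` run together in the first wave and `b` runs only after `a` has finished. The order of `a` and `c` within the first wave is not deterministic.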

Benchmarks

Real hyperfine benchmarks with specific formulae coming in a follow-up comment.

This contribution was developed with AI assistance (Claude Code + Codex).

@MikeMcQuaid
Member

Hey @mvanhorn thanks for this.

A few notes in no particular order:

  • please always ensure you've filled out our PR template when opening issues. if your AI is opening it, instruct them to do so.
  • "10 formulae, 2 needing install: 22.3s" which formulae?
  • "Estimated: ~12s for the 10-formula case" why estimated? hyperfine should be good to measure this.
  • "Non-formula entries (casks, extensions) still install sequentially": IMO we should aim, not necessarily in this PR, to install all types in parallel

Looking at code now!

Member

@MikeMcQuaid MikeMcQuaid left a comment

Looks good so far!

The thing that will definitely bite here unfortunately is locking. If you have two separate dependency trees that overlap: they cannot be handled in parallel due to keg locking.

It's unclear if the current code handles this or not; it's a little hard to follow because the low-level sync/mutex/lock operations are intertwined with the actual logic.

Comment thread Library/Homebrew/bundle/installer.rb Outdated
Comment thread Library/Homebrew/bundle/installer.rb Outdated
Comment thread Library/Homebrew/bundle/installer.rb Outdated
Comment thread Library/Homebrew/bundle/installer.rb Outdated
@mvanhorn
Contributor Author

mvanhorn commented Apr 3, 2026

Thanks for the review @MikeMcQuaid, good feedback across the board.

PR template: my mistake, updating the description with the checklist now.

Benchmarks: "estimated" isn't good enough, you're right. I'll run hyperfine against a real Brewfile and post actual numbers with the specific formulae listed.

concurrent-ruby: makes sense to use what's already in the codebase. I'll refactor to use it similar to how download_queue.rb works.

New class: agreed, the parallel scheduling logic is involved enough to warrant its own class rather than being scattered in installer.rb.

Re: dependency waiting (line 194) - yes, currently it only waits on dependencies that are also in the Brewfile. I need to verify whether that's sufficient or if we need to check the full dependency graph via brew deps.

Re: verbose output (line 221) - output from parallel workers goes through a mutex to prevent interleaving, but I need to verify whether brew install --verbose output from concurrent installs stays readable in practice.

Keg locking: this is the critical issue. If two formulae share an overlapping dependency being installed, keg locking could cause problems. I need to investigate how download_queue.rb handles this and whether concurrent-ruby gives better primitives. The current raw thread approach definitely needs more guardrails here.

Will push updates addressing the concurrent-ruby swap, class extraction, and keg locking investigation. Real benchmarks to follow.

mvanhorn added a commit to mvanhorn/brew that referenced this pull request Apr 3, 2026
Move parallel formula installation into a dedicated
Homebrew::Bundle::ParallelInstaller class using concurrent-ruby's
FixedThreadPool instead of raw Thread.new. Add keg lock awareness
via FormulaLock to prevent concurrent installs of formulae that
share overlapping dependency kegs.

Addresses review feedback from @MikeMcQuaid on Homebrew#21891.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@mvanhorn
Contributor Author

mvanhorn commented Apr 3, 2026

Pushed c711314 addressing all review feedback:

Class extraction: Parallel install logic moved from installer.rb into Homebrew::Bundle::ParallelInstaller (Library/Homebrew/bundle/parallel_installer.rb). The installer now delegates with ParallelInstaller.new(entries, jobs:, ...).run! when jobs > 1.

concurrent-ruby: Replaced raw Thread.new with Concurrent::FixedThreadPool + Concurrent::Promises.future_on, matching the pattern in download_queue.rb. Pool is shut down in an ensure block.

Keg locking: ParallelInstaller identifies dependencies shared by multiple Brewfile entries. Before installing a formula, it acquires FormulaLock on each shared dependency keg, preventing concurrent keg modifications. Locks are released after install completes. Formulae with no shared dependencies install fully in parallel.

Verbose output: Status messages (Installing X, Using X, error messages) go through a mutex to prevent interleaving. Subprocess output from brew install itself will naturally interleave in parallel mode, which is consistent with how download_queue.rb handles concurrent download output.
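
As a rough illustration of the status-message mutex described above (the class name `StatusPrinter` is hypothetical, and a StringIO stands in for the real output stream):

```ruby
require "stringio"

# Hypothetical stand-in for the installer's status output.
class StatusPrinter
  def initialize(io)
    @io = io
    @mutex = Mutex.new
  end

  # Only one worker writes at a time, so each status line stays whole.
  def say(message)
    @mutex.synchronize do
      @io.write(message)
      @io.write("\n")
    end
  end
end

out = StringIO.new
printer = StatusPrinter.new(out)
workers = %w[tree bat fd fzf].map do |name|
  Thread.new { 25.times { printer.say("Installing #{name}") } }
end
workers.each(&:join)
lines = out.string.lines.map(&:chomp)
```

The mutex only protects writes that go through `say`. Output written by a subprocess bypasses it, which is why `brew install` output can still interleave.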

Benchmarks (hyperfine, 3 runs each, 4 independent formulae: tree, bat, fd, fzf):

Benchmark 1: sequential (jobs=1)
  Time (mean +/- s):  2.065 s +/- 0.441 s  [User: 1.213 s, System: 0.349 s]

Benchmark 2: parallel (jobs=2)
  Time (mean +/- s):  1.865 s +/- 0.101 s  [User: 1.095 s, System: 0.291 s]

Benchmark 3: parallel (jobs=4)
  Time (mean +/- s):  1.863 s +/- 0.080 s  [User: 1.088 s, System: 0.295 s]

Summary: parallel (jobs=4) 1.11x faster than sequential

The speedup is modest with 4 small formulae since install time per formula is short (~0.5s each). The improvement scales with formula count and individual install time. With all formulae already satisfied (no-op case), overhead is ~70ms from thread pool setup.

brew style, brew typecheck, and brew tests --only=bundle/installer all pass locally.

@MikeMcQuaid
Member

> Benchmarks (hyperfine, 3 runs each, 4 independent formulae: tree, bat, fd, fzf):

Can we also get a sequential benchmark on main for these? Thanks!

Member

@MikeMcQuaid MikeMcQuaid left a comment

Thanks, few more notes here!

Comment thread Library/Homebrew/bundle/installer.rb Outdated
Comment thread Library/Homebrew/bundle/parallel_installer.rb Outdated
@mvanhorn
Contributor Author

mvanhorn commented Apr 3, 2026

Benchmarks (hyperfine, 3 runs each, 4 formulae: tree, bat, fd, fzf):

| Mode | Branch | Mean | Range |
| --- | --- | --- | --- |
| Sequential | main | 9.620s ± 0.549s | 9.256s – 10.252s |
| Parallel --jobs=4 | PR | 6.288s ± 0.704s | 5.873s – 7.101s |

~35% improvement. Note that bat and fd share recursive deps (ca-certificates, openssl@3 chain via rust/libgit2), so they're serialized. With a Brewfile of fully independent formulae the speedup would be larger.
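
For reference, the ~35% figure follows from the two means reported above. A quick check of the arithmetic:

```ruby
sequential_mean = 9.620 # seconds, main
parallel_mean   = 6.288 # seconds, PR with --jobs=4

time_saved_pct = ((sequential_mean - parallel_mean) / sequential_mean * 100).round(1)
speedup        = (sequential_mean / parallel_mean).round(2)

puts "#{time_saved_pct}% less wall time, #{speedup}x speedup"
# prints "34.6% less wall time, 1.53x speedup"
```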

Also pushed two commits addressing your second round of feedback:

  • 76e991e removes custom FormulaLock locking, extends parallel install to all entry types (casks, taps, extensions)
  • 8e07c07 fixes a lock conflict discovered during benchmarking: brew install acquires file locks on all recursive_dependencies (including build deps). bat and fd share ca-certificates via their rust/libgit2 chains, causing deadlock. The dep map now uses Formula#recursive_dependencies to detect overlapping dep trees and serialize conflicting entries.
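
A minimal sketch of the overlap check described in the second commit, assuming the recursive dependency names are already available as sets. The data below is illustrative: only the bat/fd overlap comes from this thread, the empty sets are placeholders, and `conflict_waits` is a hypothetical name:

```ruby
require "set"

# Illustrative data only: the thread says bat and fd share ca-certificates and openssl@3.
# The empty sets are placeholders, not real Homebrew dependency trees.
RECURSIVE_DEPS = {
  "tree" => Set[],
  "bat"  => Set["ca-certificates", "openssl@3"],
  "fd"   => Set["ca-certificates", "openssl@3"],
  "fzf"  => Set[],
}.freeze

# For each entry, list the earlier entries whose dependency trees overlap with it.
def conflict_waits(recursive_deps)
  names = recursive_deps.keys
  names.each_with_index.to_h do |name, index|
    earlier = names.first(index)
    [name, earlier.select { |other| recursive_deps[name].intersect?(recursive_deps[other]) }]
  end
end

waits = conflict_waits(RECURSIVE_DEPS)
# fd waits on bat; tree, bat and fzf wait on nothing
```

Making the later entry wait on the earlier one keeps the result deterministic for a given Brewfile order.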

Member

@MikeMcQuaid MikeMcQuaid left a comment

Getting better thanks!

Comment thread Library/Homebrew/bundle/parallel_installer.rb Outdated
Comment thread Library/Homebrew/bundle/parallel_installer.rb Outdated
Comment thread Library/Homebrew/bundle/parallel_installer.rb Outdated
Comment thread Library/Homebrew/bundle/parallel_installer.rb Outdated
Comment thread Library/Homebrew/bundle/parallel_installer.rb Outdated
@mvanhorn
Contributor Author

mvanhorn commented Apr 5, 2026

Addressed all feedback in 8e9c9a2:

  • Removed T.cast on future value (line 68)
  • Moved formula dependency lookups into Brew.brewfile_dependencies and Brew.recursive_dep_names class methods (line 118 feedback)
  • Added Cask formula deps via Cask.formula_dependencies and implicit Tap deps for entries from non-default taps (line 103 feedback)
  • Replaced all T.must with .fetch (lines 141, 151)

Style and typecheck pass clean.

Member

@MikeMcQuaid MikeMcQuaid left a comment

Looking good. A few more questions for you (not Claude) to answer. Also feels like some of the bigger and/or more involved functions could do with some more comments to add a little clarity on flow and choices.

Comment thread Library/Homebrew/bundle/brew.rb Outdated
Comment thread Library/Homebrew/bundle/brew.rb Outdated
Comment thread Library/Homebrew/bundle/parallel_installer.rb Outdated
Comment thread Library/Homebrew/bundle/parallel_installer.rb Outdated
Comment thread Library/Homebrew/bundle/parallel_installer.rb Outdated
@mvanhorn
Contributor Author

mvanhorn commented Apr 7, 2026

Hi Mike! here's me (but also with Claude helping me so I don't bork anything up..)

On runtime_dependencies: I kept raw deps because recursive_dep_names uses recursive_dependencies (which includes build deps) for lock conflict detection. If formula_dep_names only returned runtime deps, two formulas sharing a build dep could slip through and hit keg locks. The whole point of the lock check is to serialize anything that could conflict, so the dependency universe needs to match.

On the naming: you're right, brewfile_dependencies was confusing. Renamed to formula_dep_names - it's the formula's own direct deps, not "deps from the Brewfile."

On cask recursive deps: added cask_dep_names that walks depends_on[:cask] and intersects with Brewfile entries. Direct deps only for now since cask chains are usually shallow, but it's structured to go deeper if needed.

Added phase comments to build_dependency_map. Tried to keep them useful without being noisy - the three phases plus the final merge (name map, direct deps, recursive dep sets, then merge) weren't obvious from reading the code.

All pushed in 13a7800. Typecheck clean, style clean, tests green.

@MikeMcQuaid
Member

> On runtime_dependencies: I kept raw deps because recursive_dep_names uses recursive_dependencies (which includes build deps) for lock conflict detection. If formula_dep_names only returned runtime deps, two formulas sharing a build dep could slip through and hit keg locks. The whole point of the lock check is to serialize anything that could conflict, so the dependency universe needs to match.

@mvanhorn This will probably be more conservative than it needs to be unfortunately. Ideally we'll only look at build deps when building from source and only look at runtime deps when pouring bottles. With locking you don't want something like a cmake build dependency to lock everything, particularly if nothing actually "needs" it.

Apologies for all the back and forth, I do think this will be worth it!

@mvanhorn
Contributor Author

mvanhorn commented Apr 7, 2026

Good call - you're right that this was overly conservative. Fixed in eb22e0e.

recursive_dep_names now takes an include_build: parameter. The parallel installer checks whether each formula is bottled (and not --build-from-source) to decide which dep set to use for lock conflict detection. Bottle pours use runtime deps only, source builds use the full dependency tree.
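
A sketch of that selection logic, using a hypothetical `FormulaStub` struct and made-up formula names in place of Homebrew's real formula objects. Only the idea (build deps count toward lock conflicts only for source builds) comes from the comment above:

```ruby
require "set"

# Hypothetical stand-in for a formula; real Homebrew formula objects differ.
FormulaStub = Struct.new(:name, :runtime_deps, :build_deps, :bottled, keyword_init: true)

# Names that matter for lock conflicts: build deps count only for source builds.
def lock_dep_names(formula, build_from_source: false)
  include_build = build_from_source || !formula.bottled
  names = formula.runtime_deps.dup
  names.concat(formula.build_deps) if include_build
  names.to_set
end

alpha = FormulaStub.new(name: "alpha", runtime_deps: ["libfoo"], build_deps: ["cmake"], bottled: true)
beta  = FormulaStub.new(name: "beta", runtime_deps: [], build_deps: ["cmake"], bottled: true)

pour_conflict  = lock_dep_names(alpha).intersect?(lock_dep_names(beta))
build_conflict = lock_dep_names(alpha, build_from_source: true)
                 .intersect?(lock_dep_names(beta, build_from_source: true))
```

In this example the two bottle pours do not conflict, while the two source builds do, because both pull in the shared cmake build dependency.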

This should unblock cases like bat and fd where cmake was unnecessarily serializing their bottle pours.

No need to apologize for the back and forth - this is making the feature meaningfully better.

Member

@MikeMcQuaid MikeMcQuaid left a comment

Looking better!

Weird behaviour with a password prompt here:

Installing Slack
Using TouchDraw2
Using WhatsApp
Password:Using anykeyh.simplecov-vscode

Comment thread Library/Homebrew/bundle/brew.rb Outdated
Comment thread Library/Homebrew/bundle/brew.rb Outdated
@mvanhorn
Contributor Author

mvanhorn commented Apr 7, 2026

Addressed in 5ced3ff:

Applied the .to_set(&:name) chain suggestion on recursive_dep_names.

Extracted the duplicated formulae_by_full_name(name).presence || formulae_by_name(name) lookup into find_formula - used by both formula_dep_names and formula_bottled?.

On the password prompt interleaving: cask installs can trigger sudo which writes Password: directly to the terminal, outside our output mutex. Added a @cask_install_mutex that serializes cask installs so password prompts don't collide with status output from other workers. Formula installs still run fully parallel.
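
Roughly, the cask serialization could look like the sketch below. The class `InstallGate` and its counter are hypothetical and exist only to show that at most one cask runs at a time while formulae are unaffected:

```ruby
# Hypothetical sketch: casks take turns, formulae do not.
class InstallGate
  attr_reader :max_concurrent_casks

  def initialize
    @cask_install_mutex = Mutex.new
    @counter_mutex = Mutex.new
    @active_casks = 0
    @max_concurrent_casks = 0
  end

  def install(type, &work)
    return work.call unless type == :cask

    @cask_install_mutex.synchronize do
      track(+1)
      begin
        work.call
      ensure
        track(-1)
      end
    end
  end

  private

  # Bookkeeping for the demo only: records the peak number of casks running at once.
  def track(delta)
    @counter_mutex.synchronize do
      @active_casks += delta
      @max_concurrent_casks = [@max_concurrent_casks, @active_casks].max
    end
  end
end

gate = InstallGate.new
entries = [[:cask, "slack"], [:formula, "tree"], [:cask, "whatsapp"], [:formula, "fzf"]]
entries.map { |type, _name| Thread.new { gate.install(type) { sleep 0.01 } } }.each(&:join)
```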

@mvanhorn
Contributor Author

Hey Mike! Fixed the password prompt interleaving in 3415250.

The root cause: when a cask install triggers sudo, the Password: prompt writes directly to /dev/tty - completely bypassing Ruby's IO and the output mutex. The mutex prevents other threads from writing to stdout during the install, but it can't prevent sudo's /dev/tty write. After the install completes and the mutex is released, the next worker's status message appends to the terminal line where Password: is still lingering.

The fix writes a newline to /dev/tty after each cask install (between releasing the output lock and the cask install lock). This resets the terminal cursor so the next status message starts on a clean line. The write is caught with ENXIO/ENOENT for CI and piped output.

@MikeMcQuaid
Member

> The fix writes a newline to /dev/tty after each cask install (between releasing the output lock and the cask install lock). This resets the terminal cursor so the next status message starts on a clean line. The write is caught with ENXIO/ENOENT for CI and piped output.

@mvanhorn unfortunately output is now a bit crappy:

Using zsh-autosuggestions
Using 1password

Using mysql@8.4
Using 1password-cli

Using appcleaner

Using chatgpt

Using nginx
Using claude-code

Using docker-desktop

Using elgato-camera-hub

Using pango
Using redis
Using elgato-control-center

Using font-sf-mono

Have some merge conflicts to resolve here too!

mvanhorn and others added 12 commits April 14, 2026 13:48
Add `--jobs=N` flag to `brew bundle install` that installs independent
formulae in parallel using Ruby threads. Dependent formulae wait for
their dependencies to complete before installing.

- Default: sequential (--jobs=1, preserving current behavior)
- `--jobs=auto` uses Etc.nprocessors capped at 4
- `HOMEBREW_BUNDLE_JOBS` env var support
- Non-formula entries (casks, extensions) still install sequentially
- Thread-safe output via mutex to prevent interleaving
- Deadlock fallback: if no entries can be scheduled, remaining
  entries install sequentially

Continues the trajectory of Homebrew#18278 (concurrent downloads) and Homebrew#21252
(parallel fetch in bundle) by parallelizing the install loop itself.

- Remove redundant `require "set"`
- Fix trailing comma in method call
- Fix hash alignment in Sorbet signatures
- Fix line length violations
- Replace `have_received` with `receive` (RSpec/MessageSpies)
- Use explicit block argument instead of yield
- Combine loops (Style/CombinableLoops)

Move parallel formula installation into a dedicated
Homebrew::Bundle::ParallelInstaller class using concurrent-ruby's
FixedThreadPool instead of raw Thread.new. Add keg lock awareness
via FormulaLock to prevent concurrent installs of formulae that
share overlapping dependency kegs.

Addresses review feedback from @MikeMcQuaid on Homebrew#21891.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Address round 2 review feedback from @MikeMcQuaid:

- Remove all custom FormulaLock-based locking from ParallelInstaller.
  Rely on `brew install`'s built-in keg locking instead, with the
  dependency map preventing same-tree parallel installs.
- Extend parallel installation to all entry types (casks, taps,
  extensions), not just formulae. Non-Brew entries have no inter-entry
  dependencies and can all run in parallel immediately.
- Simplify build_dependency_map to handle both Brew (formula deps)
  and non-Brew (empty deps) entries in a single pass.

Net -82 lines. Typecheck, style, and tests pass.

`brew install` acquires file locks on all recursive dependencies
(including build deps via FormulaInstaller#lock). When two entries
share any transitive dep (e.g., bat and fd both pull in
ca-certificates via their rust/libgit2 chains), their concurrent
`brew install` processes deadlock on the shared file lock.

Use Formula#recursive_dependencies to build a conflict graph: entries
sharing any transitive dep are serialized (later entry waits for
earlier). Entries with disjoint dep trees still run in parallel.

Example with tree, bat, fd, fzf (--jobs=4):
  Batch 1: tree + bat + fzf (parallel, no shared deps)
  Batch 2: fd (after bat, shared ca-certificates/openssl chain)

- Move formula dependency lookups to Brew.brewfile_dependencies and
  Brew.recursive_dep_names class methods
- Handle Cask formula deps and implicit Tap deps in dependency map
- Replace T.must with .fetch for better error messages
- Remove unnecessary T.cast on future value

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…omments

- Rename `brewfile_dependencies` -> `formula_dep_names` (clearer intent)
- Apply `.presence` pattern for formula lookup (idiomatic Ruby)
- Simplify tap_prefix extraction (trailing conditional)
- Add cask-on-cask dependency tracking via `cask_dep_names`
- Add phase comments to `build_dependency_map` for flow clarity

Only include build dependencies in lock conflict detection when
building from source. Pouring bottles only acquires keg locks on
runtime deps, so shared build deps like cmake no longer serialize
unrelated bottle pours.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…serialize cask installs

- Apply code suggestion: chain .to_set(&:name) on if/else in
  recursive_dep_names instead of repeating it in each branch
- Extract duplicated formulae_by_full_name/formulae_by_name lookup
  into find_formula helper used by formula_dep_names and formula_bottled?
- Serialize cask installs via dedicated mutex to prevent interactive
  sudo Password: prompts from interleaving with parallel output

Cask installs can trigger sudo password prompts that write directly to
/dev/tty. Parallel formula workers could print status messages at the
same moment, causing interleaved output like "Password:Using foo".

Switch @output_mutex from Mutex to Monitor (reentrant) and hold it for
the entire cask install. This blocks other workers' status output while
a cask install is in progress, preventing prompt/status interleaving.
Monitor's reentrancy lets write_output calls inside do_install_entry!
re-acquire the lock on the same thread without deadlocking.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Cask installs can trigger interactive prompts (sudo, macOS security
frameworks) that write directly to /dev/tty without a trailing newline.
When running in parallel, the next worker's status message would append
to the same line as a stale "Password:" prompt, producing garbled output
like "Password:Using anykeyh.simplecov-vscode".

After the output lock is released following a cask install, write a
newline to /dev/tty to ensure the terminal cursor starts on a clean
line.  The write is a no-op in CI or when output is piped (ENXIO/ENOENT
are caught).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Writing an unconditional newline to /dev/tty after each cask install
produced visually inconsistent output - a blank line after every cask
while formulas ran back-to-back. Reported by @MikeMcQuaid.

Use \r + CSI-K (clear to end of line) so any stale interactive prompt
(sudo Password:, macOS security) gets overwritten in place instead of
pushing the cursor down. When the line is already clean the sequence is
visually a no-op, so formula and cask status lines now render uniformly.

typed: strict now requires explicit sigs on these helpers. formulae_by_full_name
and formulae_by_name can both return nil, so find_formula must too. Callers
already guard against blank.

@mvanhorn mvanhorn force-pushed the feat/bundle-parallel-install branch from 3415250 to 49a8f30 Compare April 14, 2026 17:52
@mvanhorn
Contributor Author

Thanks for the catch on the output, and apologies for the noise.

Root cause: the unconditional \n on /dev/tty after each cask ran even when no interactive prompt had appeared, pushing the cursor down and producing a blank line after every cask.

Pushed 36515f5 (fix(parallel_installer): clear tty line instead of adding newline). Swapped \n for \r + CSI-K (clear to end of line) in clear_tty_line. When there's a stale Password: on the tty, it gets overwritten by the next status line instead of pushed down; when the tty is already clean (no sudo prompted, or cask output already scrolled past), it's visually a no-op. Formula and cask status lines should now render uniformly.
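
A minimal sketch of what such a helper might look like, based only on the description above. The method body, the path parameter, and the return values are assumptions, not the code from 36515f5:

```ruby
CLEAR_LINE = "\r\e[K" # carriage return + CSI K (erase from cursor to end of line)

# Hypothetical helper: returns true if the sequence was written, false if there is no tty.
def clear_tty_line(path = "/dev/tty")
  File.open(path, "w") { |tty| tty.write(CLEAR_LINE) }
  true
rescue Errno::ENXIO, Errno::ENOENT
  false # no controlling terminal (CI, piped output)
end
```

Because the carriage return moves the cursor to column 0 and CSI K erases to the end of the line, a stale prompt is wiped in place, and a line that is already empty is left visually unchanged.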

Also rebased on latest main to clear the conflicts. The bundle/brew.rb conflicts were with the new typed: strict conversion, so I added sigs to find_formula, formula_dep_names, recursive_dep_names, and formula_bottled? while I was in there. brew typecheck and brew style both pass, bundle specs pass.

On the earlier point about the lock universe being more conservative than ideal (cmake serializing unrelated bottle pours, etc.) -- recursive_dep_names already takes an include_build: keyword so bottle pours pass include_build: false via formula_bottled?. The conservative case only kicks in when a formula actually needs to build from source. Happy to tighten further if you'd like the lock universe keyed off the actual preinstall! result rather than the bottled hint.

Comment thread Library/Homebrew/cmd/bundle.rb
Member

@MikeMcQuaid MikeMcQuaid left a comment

Thanks @mvanhorn. Tested locally and works well.

@MikeMcQuaid MikeMcQuaid enabled auto-merge April 16, 2026 15:27
@MikeMcQuaid MikeMcQuaid added this pull request to the merge queue Apr 16, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 16, 2026
@MikeMcQuaid MikeMcQuaid added this pull request to the merge queue Apr 16, 2026
Merged via the queue into Homebrew:main with commit f145462 Apr 16, 2026
36 checks passed
@mvanhorn
Contributor Author

> Thanks @mvanhorn. Tested locally and works well.

Woohoo super excited about this.

@mvanhorn
Contributor Author

@MikeMcQuaid heads up wrote about my excitement with a little video on this feature on x. Says you don't use x much. :) https://x.com/mvanhorn/status/2045134181546566030?s=46
