V4#3807

Open
DrJosh9000 wants to merge 33 commits into main from v4
@DrJosh9000 (Contributor) commented Apr 8, 2026

Description

Make v4 happen.

The plan is:

  • Merge this branch into main and delete the branch.
  • Cut a new long-running v3 branch from the commit prior, and publish stable releases from it for a while
  • (One day) merge a new PR bumping VERSION to 4.0.0 to do a stable v4 release 🎉

Context

It's about time.

Fixes #1391
Fixes #1594
Fixes #1623
Fixes #1646
Closes #1593

Changes

  • Rename v3 to v4 in package names and a few other places.
  • Promote or rip out various experiments, but note a few are left in.
    • allow-artifact-path-traversal: removed, the insecure behaviour is no longer supported
    • normalised-upload-paths: is now default behaviour
    • override-zero-exit-on-cancel: is now default behaviour
    • resolve-commit-after-checkout: is now default behaviour
    • propagate-agent-config-vars: is now default behaviour
    • descending-spawn-priority: removed, with the --spawn-with-priority flag now taking a string value (one of static, ascending, or descending)
  • Rip out the deprecated Docker integration.
  • Remove deprecated CLI flags:
    • trace-context-encoding
    • kubernetes-log-collection-grace-period
    • no-automatic-ssh-fingerprint-verification (use no-ssh-keyscan instead)
    • meta-data (use tags instead)
    • meta-data-ec2 (use tags-from-ec2-meta-data instead)
    • meta-data-ec2-tags (use tags-from-ec2-tags instead)
    • meta-data-gcp (use tags-from-gcp-meta-data instead)
    • tags-from-ec2 (use tags-from-ec2-meta-data instead)
    • tags-from-gcp (use tags-from-gcp-meta-data instead)
    • disconnect-after-job-timeout (use disconnect-after-idle-timeout instead)
    • follow-symlinks (use glob-resolve-follow-symlinks instead)
  • Remove deprecated env vars generated for plugin configuration.
  • Run post-checkout, post-command, and pre-exit hooks in "reverse" order
  • Output a trailing newline in buildkite-agent meta-data get
  • Replace cancel-grace-period and signal-grace-period-seconds flags with cancel-signal-timeout and cancel-cleanup-timeout, and adjust the timeouts (10s signal timeout and 5s cleanup timeout)
  • Remove OpenTracing and various DataDog-specific workarounds
  • Pipeline uploads containing secrets are now rejected by default (reject-secrets is replaced with allow-secrets)
  • Upgrade urfave/cli to v3
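
For anyone auditing start-up scripts before upgrading, the removed flags listed above can be turned into a small lookup. This is only a sketch: `replacements` and `checkFlag` are hypothetical names, not part of the agent, and the table simply restates the list in this description. Flags with no stated replacement map to an empty string. A replacement may not accept the same value format as the flag it supersedes, so treat a hit as a prompt to read the docs rather than a mechanical rename.

```go
package main

import (
	"fmt"
	"strings"
)

// replacements restates the "Remove deprecated CLI flags" list from the PR
// description. An empty value means the description names no replacement.
var replacements = map[string]string{
	"trace-context-encoding":                    "",
	"kubernetes-log-collection-grace-period":    "",
	"no-automatic-ssh-fingerprint-verification": "no-ssh-keyscan",
	"meta-data":                                 "tags",
	"meta-data-ec2":                             "tags-from-ec2-meta-data",
	"meta-data-ec2-tags":                        "tags-from-ec2-tags",
	"meta-data-gcp":                             "tags-from-gcp-meta-data",
	"tags-from-ec2":                             "tags-from-ec2-meta-data",
	"tags-from-gcp":                             "tags-from-gcp-meta-data",
	"disconnect-after-job-timeout":              "disconnect-after-idle-timeout",
	"follow-symlinks":                           "glob-resolve-follow-symlinks",
}

// checkFlag (hypothetical helper) reports whether a command-line argument
// names a flag removed in v4 and, if so, which flag the description suggests.
func checkFlag(arg string) (removed bool, replacement string) {
	name := strings.TrimLeft(arg, "-")  // drop leading dashes
	name, _, _ = strings.Cut(name, "=") // drop any "=value" suffix
	replacement, removed = replacements[name]
	return removed, replacement
}

func main() {
	args := []string{"--meta-data=queue=default", "--tags=queue=default", "--trace-context-encoding"}
	for _, arg := range args {
		removed, replacement := checkFlag(arg)
		switch {
		case !removed:
			fmt.Printf("%s: not in the removed list\n", arg)
		case replacement == "":
			fmt.Printf("%s: removed, no replacement named\n", arg)
		default:
			fmt.Printf("%s: removed, use --%s instead\n", arg, replacement)
		}
	}
}
```

The same table could be applied to config-file keys or environment variables, but that mapping is not spelled out in this description, so it is left out here.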

Testing

  • Tests have run locally (with go test ./...). Buildkite employees may check this if the pipeline has run automatically.
  • Code is formatted (with go tool gofumpt -extra -w .)

Disclosures / Credits

  • jj rebase
  • Multi-file regex search and replace
  • All the previous thought put into future breaking changes

@DrJosh9000 DrJosh9000 force-pushed the v4 branch 13 times, most recently from f8066e1 to d28fd70 Compare April 14, 2026 02:04
@DrJosh9000 DrJosh9000 force-pushed the v4 branch 9 times, most recently from 8b3d172 to 7686fe7 Compare April 23, 2026 02:07
@DrJosh9000 DrJosh9000 force-pushed the v4 branch 7 times, most recently from 638c3b4 to a52917e Compare May 5, 2026 23:52
@buildsworth-bk-app

On it! Follow along with my review here: https://buildkite.com/buildkite/buildsworth-review-pull-request/builds/1625

@buildsworth-bk-app buildsworth-bk-app Bot left a comment

I found a few blocking issues in the CLI v3 and OpenTelemetry changes. The PR is L3, and the Buildkite agent check was still pending while I reviewed.

Want to dig deeper? The full session log is attached to this Buildkite build. Download the session file and open a new pi session with it:

Download the buildsworth logs from build 1625, then answer my questions about the findings.

Comment thread clicommand/global.go
Comment thread clicommand/agent_start.go Outdated
Comment thread clicommand/agent_start.go
@buildsworth-bk-app

On it! Follow along with my review here: https://buildkite.com/buildkite/buildsworth-review-pull-request/builds/1656

@buildsworth-bk-app buildsworth-bk-app Bot dismissed their stale review June 30, 2026 02:19

Previous concerns have been addressed.

@buildsworth-bk-app buildsworth-bk-app Bot left a comment

Previous buildsworth blockers are resolved: OTel provider init now follows opentelemetry-tracing / TelemetryServiceName, GCP metadata tags are a bool flag, and the cloud tag wait timeouts are time.Duration. I found one new blocking SSH host-key fallback issue, plus a config-template cleanup.

Want to dig deeper? The full session log is attached to this Buildkite build. Download the session file and open a new pi session with it:

Download the buildsworth logs from build 1656, then answer my questions about the findings.

Comment thread internal/job/ssh_host_key_checking.go Outdated
Comment thread packaging/linux/root/usr/share/buildkite-agent/buildkite-agent.cfg
@buildsworth-bk-app

On it! Follow along with my review here: https://buildkite.com/buildkite/buildsworth-review-pull-request/builds/1703

@buildsworth-bk-app buildsworth-bk-app Bot left a comment

Previous blocking SSH fallback concern is resolved by failing closed when the version can't be determined. The packaged config-template cleanup is still partly open, and I found one new blocking SSH setup regression. The PR remains L3 because it changes agent checkout/security/config behaviour; the Buildkite agent check was still pending while I reviewed.

Want to dig deeper? The full session log is attached to this Buildkite build. Download the session file and open a new pi session with it:

Download the buildsworth logs from build 1703, then answer my questions about the findings.

Comment thread internal/job/executor.go Outdated
Comment thread packaging/linux/root/usr/share/buildkite-agent/buildkite-agent.cfg
@buildsworth-bk-app

On it! Follow along with my review here: https://buildkite.com/buildkite/buildsworth-review-pull-request/builds/1726

@buildsworth-bk-app

On it! Follow along with my review here: https://buildkite.com/buildkite/buildsworth-review-pull-request/builds/1728

@buildsworth-bk-app

On it! Follow along with my review here: https://buildkite.com/buildkite/buildsworth-review-pull-request/builds/1731

@buildsworth-bk-app buildsworth-bk-app Bot left a comment

The latest non-SSH setup failure and packaged config-template cleanup are addressed. The follow-up reintroduces the SSH-version fallback blocker from the prior round, so I'm requesting changes; the Buildkite agent check is currently failing.

Want to dig deeper? The full session log is attached to this Buildkite build. Download the session file and open a new pi session with it:

Download the buildsworth logs from build 1728, then answer my questions about the findings.

Comment thread internal/job/ssh_host_key_checking.go Outdated

@buildsworth-bk-app buildsworth-bk-app Bot left a comment

The active buildsworth SSH host-key blocker from the previous pass is still unresolved; I didn't find anything else beyond that on this pass. The PR remains L3 because it changes agent checkout/security/config behavior, and the Buildkite agent check is currently failing.

Want to dig deeper? The full session log is attached to this Buildkite build. Download the session file and open a new pi session with it:

Download the buildsworth logs from build 1731, then answer my questions about the findings.

Comment thread internal/job/ssh_host_key_checking.go Outdated
@buildsworth-bk-app

On it! Follow along with my review here: https://buildkite.com/buildkite/buildsworth-review-pull-request/builds/1743

@buildsworth-bk-app buildsworth-bk-app Bot left a comment

Previous buildsworth concerns are addressed in the current branch: the SSH host-key fallback now avoids StrictHostKeyChecking=no when the SSH version cannot be determined, and the older packaged config-template cleanup is also resolved. I don't have new inline findings from this pass; I'm leaving this as a comment rather than an approval because the PR is L3 (agent checkout/security/config behavior changes), and the Buildkite agent check is still pending.
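
For readers following the SSH host-key thread, the "fail closed" idea can be sketched roughly as below. This is an illustration under stated assumptions, not the code in internal/job/ssh_host_key_checking.go: the function name `strictHostKeyChecking`, the version regular expression, and the choice of `yes` as the conservative fallback are all hypothetical. The one external fact relied on is that OpenSSH added `StrictHostKeyChecking=accept-new` in version 7.6.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// versionRE pulls major.minor out of output such as "OpenSSH_8.9p1 ...".
// Assumption: the version string comes from something like `ssh -V`.
var versionRE = regexp.MustCompile(`OpenSSH_(\d+)\.(\d+)`)

// strictHostKeyChecking (hypothetical) picks a StrictHostKeyChecking value.
// It never returns "no": if the version is unknown or too old for
// accept-new, it falls back to the strictest setting, "yes".
func strictHostKeyChecking(sshVersionOutput string) string {
	m := versionRE.FindStringSubmatch(sshVersionOutput)
	if m == nil {
		return "yes" // version could not be determined: fail closed
	}
	major, errMajor := strconv.Atoi(m[1])
	minor, errMinor := strconv.Atoi(m[2])
	if errMajor != nil || errMinor != nil {
		return "yes" // unparseable version: fail closed
	}
	if major > 7 || (major == 7 && minor >= 6) {
		return "accept-new" // supported from OpenSSH 7.6 onwards
	}
	return "yes"
}

func main() {
	for _, out := range []string{"OpenSSH_8.9p1, OpenSSL 3.0.2", "OpenSSH_7.4p1", "garbled output"} {
		fmt.Printf("%q -> StrictHostKeyChecking=%s\n", out, strictHostKeyChecking(out))
	}
}
```

The point of the sketch is the shape of the decision: every path that cannot positively confirm support for the safer option ends at a strict setting, so an unexpected `ssh -V` format degrades to a failed connection rather than an unverified one.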

Want to dig deeper? The full session log is attached to this Buildkite build. Download the session file and open a new pi session with it:

Download the buildsworth logs from build 1743, then answer my questions about the findings.

Labels

  • breaking: Changes to existing behaviour users might rely on
  • change: Not a new feature, but a user observable non-breaking behavior change.
  • v4: Breaking changes that will be included in Agent v4

6 participants