Fix Ubuntu 26.04 Azure build: drop stale Microsoft 18.04 apt source #3

Open
zrk02 wants to merge 60 commits into main from fix/ubuntu-2604-azure-stale-ms-repo

zrk02 commented Jun 10, 2026

Problem

The Azure SIG builds for Ubuntu 26.04 (sig-ubuntu-2604 and sig-ubuntu-2604-gen2, added in kubernetes-sigs#1987) fail in the pull-azure-sigs job at the "kubernetes : Add the Kubernetes repo" task, which runs the first apt update of the build:

    Failed to update apt cache after 5 retries: ... OpenPGP signature
    verification failed: https://packages.microsoft.com/ubuntu/18.04/prod
    bionic InRelease: NO_PUBKEY EB3E94ADBE1229CF ... is not signed

Root cause

The Canonical Ubuntu 26.04 (resolute) Azure Marketplace image ships a legacy Microsoft "prod" apt source pinned to ubuntu/18.04 whose signing key is not present in the keyring. Ubuntu 26.04 removed apt-key and enforces signed-by, so the stale source breaks the first apt cache update. The 24.04 image is unaffected, which is why only 2604 fails.

Fix

image-builder does not rely on this repo for node images (azure-cli is only installed when debug_tools is set, from its own signed-by repo), so the stale source is removed in the Azure provider tasks before any apt cache update runs. The providers role runs before the kubernetes role, and the removal task sits at the top of azure.yml (ahead of the azurecli.yml import), so it always executes before the first update_cache. The task is a no-op on images that don't ship the file.
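
As a rough illustration only, the removal could look like the shell sketch below. The actual change is an Ansible task in azure.yml. The function name remove_stale_ms_source, the default directory /etc/apt/sources.list.d, and matching files by grepping for the repo URL are all assumptions made for this sketch, since the description does not name the file.

```shell
# Hypothetical helper (not the PR's Ansible task): delete any apt source
# file under the given directory that references the stale Microsoft
# 18.04 "prod" repo. The default directory is an assumption.
remove_stale_ms_source() {
    sources_dir="${1:-/etc/apt/sources.list.d}"
    stale_url="packages.microsoft.com/ubuntu/18.04/prod"

    # No sources directory at all: nothing to do.
    [ -d "$sources_dir" ] || return 0

    for f in "$sources_dir"/*; do
        # Skip non-files (also covers the unexpanded glob in an empty dir).
        [ -f "$f" ] || continue
        # -F matches the URL as a fixed string, so dots are literal.
        if grep -qF "$stale_url" "$f"; then
            rm -f -- "$f"
            echo "removed $f"
        fi
    done
    return 0
}
```

Run before the first apt update, a second call prints nothing and changes nothing, which mirrors the no-op behavior described above for images that do not ship the file.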

The base image is otherwise valid: 26.04 went GA on 2026-04-23 and the ubuntu-26_04-lts offer is live in Azure, so the existing packer/azure/ubuntu-2604*.json config is correct.

Testing

  • ansible-lint --project-dir . ansible/ passes on the changed file (production profile, 0 failures).
  • Removal is guarded with os_family == "Debian" and is idempotent: it is a no-op when the stale source is absent.

joshfrench and others added 30 commits April 27, 2026 11:08
…onflict

iptables-1.8.13-2.ph5 and ebtables-2.0.11-4.ph5 now require the
alternatives package (introduced by Broadcom on April 9, 2026).

The Photon 5 minimal installation ships chkconfig which conflicts with
the new alternatives package, so it cannot be installed directly. The
fix installs alternatives via Ansible with --allowerasing before
distro-sync runs. This atomically replaces chkconfig with alternatives,
satisfying the iptables/ebtables dependency and allowing distro-sync to
complete successfully.

Validated with a live vSphere build against the testbed.

Ref: vmware/photon#1646
Made-with: Cursor
The setup role fix covers firstboot.yml, but vmware-photon.yml runs
in node.yml (the second playbook) via the providers role. Add the same
alternatives --allowerasing step before the cloud-init install so the
fix is present even when node.yml is run independently.

Suggested-by: bhllamoreaux
Made-with: Cursor
…oudbuild

chore(ci): updating the cloudbuild gcb-docker-gcloud image to the latest release
docs: Update docs for image-builder v0.1.52
The MicrosoftWindowsServer/windows-cvm Marketplace offer no longer
publishes any image versions, breaking pull-azure-sigs builds for
sig-windows-2019-containerd-cvm and sig-windows-2022-containerd-cvm.

Switch the source images to the corresponding Gen2 SKUs in the
WindowsServer offer. The SIG image definition created by init-sig.sh
already sets SecurityType=ConfidentialVmSupported, so the resulting
captured images remain CVM-capable for downstream consumers.

Refs: kubernetes-sigs#1996
These references target RHEL/CentOS 7 and 8, which are no longer build
targets in image-builder (the README lists only RHEL 9, Rocky 9, AlmaLinux
9, CentOS 9). The python2-pip install task in vmware-redhat.yml was gated
on distribution_major_version <= 8, and the pip<21.0 upgrade was gated on
== 7; both conditions are now unreachable. The python2-pip entries in
goss-vars.yaml were similarly unused.

Refs: kubernetes-sigs#578
…it files

The repo no longer builds RHEL/CentOS 7 or 8 (only -9 variants are in
the Makefile and README matrix), so several conditional tasks and a
pair of orphaned Nutanix cloud-init templates are unreachable:

- roles/providers/tasks/main.yml: 'Set cloudinit feature flags for
  redhat 8' (gated on RHEL 8).
- roles/sysprep/tasks/main.yml: 'not (RedHat and major_version <= 7)'
  guard around journalctl rotate.
- packer/nutanix/linux/cloud-init/{rhel,rockylinux}/8/user-data.tmpl:
  no rhel-8.json or rockylinux-8.json consumes them.
…ernatives

ova: fix Photon 5 distro-sync failure due to alternatives/chkconfig conflict
…-source-sku

Fix windows-cvm source SKUs after Marketplace removal
…thon2-refs

Remove dead python2 references
….8-1.35.5

Bump CAPG nightly Kubernetes versions to 1.34.8 and 1.35.5
…belet-cleanup

Windows: improve StartKubelet.ps1 kubelet bootstrap
These distro_version: "8" blocks (and a couple of related python2-pip
entries) are unreachable: image-builder no longer ships any *-8 build
targets, only the *-9 variants in the README matrix and Makefile.

Also drops a few empty 'os_version:' keys left behind after removing
the only child entry. The 'rh8_rpms' YAML alias is kept because Oracle
Linux 9's OCI block still references it (and 'curl', 'yum-utils',
'nftables', 'python3-netifaces', 'python3-requests' are all valid on
OL 9, so behavior is unchanged).
- Add packer/powervs/centos-10.json with CentOS Stream 10 base image config
- Add powervs-centos-10 build and validate targets to Makefile
- Exclude lsvpd from dnf update (IBM RHEL9 repo provides incompatible version on CentOS 10)
- Skip ifcfg network reset for CentOS 10 (network-scripts dir removed in CentOS 10)
Add CentOS Stream 10 support for PowerVS image builds
…ers to add this file on demand as required.

We keep the original parameters and add this option as an OR condition so it can be used in a wider scope; the option defaults to false.
Also adds goss checks for it.
feat: adding containerd_enable_limit_no_file as an option to allow users to add this file on demand as required.
…rs-rhel8

Remove dead RHEL/CentOS 8 entries from goss-vars
…d-rhel8-photon3-dirs

Remove dead RHEL 7/8 ansible code paths and orphaned nutanix cloud-init files
…-public-repos-ubuntu24

fix(ansible): disable .sources apt repos when disable_public_repos=true
mboersma and others added 29 commits May 26, 2026 09:59
Azure has retired unmanaged disks. Remove the VHD builder and the
build-azure-vhd-* / validate-azure-vhd-* make targets, leaving only
the managed SIG image builders.
…d-2.3.1-runc-1.4.2

⬆️ Bump containerd to LTS v2.3.1 and runc to v1.4.2
The top-level Makefile's .DEFAULT rule forwarded unknown targets to
images/capi via:

    .DEFAULT:
            $(MAKE) -C images/capi $@

Because $@ was substituted unquoted into the recipe, an attacker who
could control the make argument could execute arbitrary commands, e.g.:

    make '`touch /tmp/pwned`'

This commit hardens .DEFAULT with two layers of validation:

  1. A make-level $(findstring) check rejects target names containing
     single quotes or backslashes, both of which could be used to break
     out of shell single-quoting.
  2. A shell-level case statement enforces an allowlist of safe
     characters ([a-zA-Z0-9_./-]) on the validated target name, which
     is now passed to the recursive make call inside single quotes.

Reported privately by the Kubernetes Security Response Committee from
a third-party audit.
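
For illustration, the shell-level allowlist from step 2 could be sketched as below. This is not the Makefile's actual recipe, and the function name is_safe_target is hypothetical. It covers only the case-statement layer, not the make-level $(findstring) check.

```shell
# Hypothetical sketch of the allowlist check: succeed only if the target
# name is non-empty and made up solely of [a-zA-Z0-9_./-].
is_safe_target() {
    case "$1" in
        # Empty name, or any character outside the allowlist: reject.
        # The trailing '-' inside the bracket is a literal hyphen.
        ''|*[!a-zA-Z0-9_./-]*) return 1 ;;
        *) return 0 ;;
    esac
}
```

A caller would forward the target to the recursive make only when is_safe_target succeeds, so a name such as '`touch /tmp/pwned`' is rejected because it contains backticks and spaces.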
The sysprep Debian playbook disables apt-daily timers and
unattended-upgrades, but does so after running 'apt-mark hold' on all
installed packages. On freshly booted Azure VMs those services are
often still in flight and hold the dpkg frontend lock, causing
intermittent failures like:

    dpkg: error: dpkg frontend lock was locked by another process
    E: Executing dpkg failed. Are you root?

Move the disable tasks ahead of the apt-mark tasks, and add a bounded
retry loop on the apt-mark shell tasks as a guard against concurrent
apt activity that may still be running when sysprep begins.
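
A bounded retry loop of the kind described above might look like this in plain shell. It is a sketch, not the playbook's implementation; the helper name retry and its argument order (maximum attempts, delay in seconds, then the command) are assumptions.

```shell
# Hypothetical helper: run a command up to MAX times, sleeping DELAY
# seconds between attempts. Returns 0 on the first success and 1 once
# the attempts are exhausted.
retry() {
    max="$1"
    delay="$2"
    shift 2
    attempt=1
    while :; do
        # Stop as soon as the command succeeds.
        "$@" && return 0
        # Give up after the final allowed attempt.
        [ "$attempt" -ge "$max" ] && return 1
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}
```

Under these assumptions, something like `retry 10 5 apt-mark hold <package>` would tolerate a dpkg frontend lock that is released within a few attempts, while still failing the build if the lock never clears.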
…mmand-injection

Validate target names in top-level Makefile
…d-builder

Retire Azure VHD (unmanaged disk) builder
…ld-race

Fix apt-mark hold race with apt-daily/unattended-upgrades
Signed-off-by: Kevin Reeuwijk <kevin.reeuwijk@spectrocloud.com>
Signed-off-by: reasonofsky <marc.hulin@gmail.com>
Resolve latest Flatcar version from release server version.txt
…s_architecture

migrate missing ansible_* facts to ansible_facts[]
- Update Makefile, azure_targets.sh, init-sig.sh with ubuntu-2604 targets
- Disable Azure Ubuntu 26 CVM target (not available yet)
- Replace deprecated apt_key with signed-by keyring approach (all roles)
- Map Azure CLI codename resolute->noble (no MS repo for 26.04 yet)
- Set QEMU memory to 4GB for Ubuntu 26.04
- Update README.md and all provider docs
- Fix sysctl path for Ubuntu 26.04 (systemd 259 no longer loads /etc/sysctl.conf)
- Stop background apt services to prevent race conditions during build
✨ Init Ubuntu 26 - New release Resolute
Build an image for Oxide, supporting Ubuntu 24.04.
…kubernetes-sigs#1947)

* move kubelet --system-reserved deprecated flag to kubelet-config file

Signed-off-by: ffais <ffais@fbk.eu>

* fix resource sizing script & add version check on main.yaml and 10-kubeadm.conf

Signed-off-by: ffais <ffais@fbk.eu>

* improve systemReserved already set checks

Signed-off-by: ffais <ffais@fbk.eu>

* fix wrong kubelet config extension & use safer procedure to save KUBELET_CONFIG & move the drop-in configuration in main.yml

Signed-off-by: ffais <ffais@fbk.eu>

* fix: version range for KUBELET_CONFIG_DROPIN_DIR_ALPHA variable & remove debug leftovers

Signed-off-by: ffais <ffais@fbk.eu>

---------

Signed-off-by: ffais <ffais@fbk.eu>
Bumps the all-github-actions group with 1 update: [actions/checkout](https://github.com/actions/checkout).


Updates `actions/checkout` from 6.0.2 to 6.0.3
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](actions/checkout@de0fac2...df4cb1c)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: 6.0.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: all-github-actions
...

Signed-off-by: dependabot[bot] <support@github.com>
…ot/github_actions/all-github-actions-6a98abd9ac

dependabot(deps): bump actions/checkout from 6.0.2 to 6.0.3 in the all-github-actions group
The ubuntu-2604 vSphere build hangs because its autoinstall never
completes, so the VM never reboots and packer waits on SSH until the 2h
job timeout, failing all of pull-ova-all. Exclude it like photon-4 until
the autoinstall is fixed. See kubernetes-sigs#2035.
The old WindowsServer offer's WS2022 .NET 6 images are deprecated on
9 June 2026. The same SKUs are published under the new windowsserver2022
offer, so just switch image_offer.
The apt-daily/apt-daily-upgrade timers and unattended-upgrades service
fire shortly after first boot and intermittently hold the dpkg/apt lock
while the containerd, kubernetes, and sysprep roles install packages,
causing flaky Ubuntu build failures (E: Could not get lock).

sysprep already disables these units, but that runs at the end of the
build, too late to protect the install steps. Stop and mask them at the
start of the node role, before any package installs, and wait for any
in-flight apt process to release the dpkg frontend lock.

This complements the sysprep fix in kubernetes-sigs#2024 by closing the earlier,
install-time instances of the same race.
…ace-early

Mask apt-daily and unattended-upgrades early to fix dpkg lock race
…ntu-2604

Exclude Ubuntu 26.04 OVA build from CI
…-2022-marketplace-offer

Move Azure WS2022 builds to new windowsserver2022 offer
The Azure SIG builds for Ubuntu 26.04 (sig-ubuntu-2604 and
sig-ubuntu-2604-gen2, added in kubernetes-sigs#1987) fail in the pull-azure-sigs job
at the "kubernetes : Add the Kubernetes repo" task, which runs the
first `apt update` of the build:

  Failed to update apt cache after 5 retries: ... OpenPGP signature
  verification failed: https://packages.microsoft.com/ubuntu/18.04/prod
  bionic InRelease: NO_PUBKEY EB3E94ADBE1229CF ... is not signed

The Canonical Ubuntu 26.04 (resolute) Azure Marketplace image ships a
legacy Microsoft "prod" apt source pinned to ubuntu/18.04 whose signing
key is not present in the keyring. Ubuntu 26.04 removed apt-key and
enforces signed-by, so the stale source breaks the first apt cache
update. The 24.04 image is unaffected, which is why only 2604 fails.

image-builder does not use this repo for node images (azure-cli is only
installed when debug_tools is set, from its own signed-by repo), so
remove the stale source in the Azure provider tasks before any apt
update runs. The task is a no-op on images that don't ship it.

The base image is otherwise valid: 26.04 went GA on 2026-04-23 and the
ubuntu-26_04-lts offer (server-gen1 / server SKUs) is live in Azure, so
the existing packer/azure/ubuntu-2604*.json config is correct.

Signed-off-by: zrk02 <johan.suurkula@elastx.se>
zrk02 self-assigned this Jun 10, 2026