This guide walks you through the practical scenarios and workflows for using the docker-tools infrastructure. The eng/docker-tools directory is a shared infrastructure layer used across all .NET Docker repositories (dotnet-docker, dotnet-buildtools-prereqs-docker, dotnet-framework-docker). It solves a fundamental challenge: building, testing, and publishing Docker images across multiple operating systems (Alpine, Ubuntu, Azure Linux, Windows Server variants), multiple CPU architectures (amd64, arm64, arm32), and multiple .NET versions—all while maintaining consistency and reliability.
At its core, the infrastructure provides:
- PowerShell scripts for local image building and Docker operations—so you can test Dockerfile changes on your machine before committing
- Azure Pipelines templates for CI/CD (build, test, publish)—a composable template system that orchestrates builds across dozens of OS/architecture combinations in parallel
- ImageBuilder orchestration—a specialized .NET tool that understands manifest files, manages image dependencies, handles multi-arch manifest creation, and coordinates the entire build process
- Caching and optimization—intelligent systems that skip unchanged images and minimize redundant work
- SBOM generation—automatic Software Bill of Materials creation for supply chain security
The infrastructure handles complexity that would otherwise be overwhelming: a single commit to a repo can trigger builds of hundreds of image variants across Linux and Windows agents, each requiring proper build sequencing, testing, and eventual publication to Microsoft Artifact Registry (MAR).
Important: Files in eng/docker-tools/ are synchronized across repositories by automation in the dotnet/docker-tools repository. If you need to make changes to this infrastructure, submit them there—changes made directly in consuming repos will be overwritten.
The most common local task is building images to test Dockerfile changes before pushing.
Quick Build - All Images:

```powershell
./eng/docker-tools/build.ps1
```

Filter by OS:

```powershell
# Build only Alpine images
./eng/docker-tools/build.ps1 -OS "alpine"

# Build Ubuntu 24.04 images
./eng/docker-tools/build.ps1 -OS "noble"
```

Filter by Architecture:

```powershell
# Build arm64 images only
./eng/docker-tools/build.ps1 -Architecture "arm64"
```

Filter by Path:

```powershell
# Build images from a specific directory
./eng/docker-tools/build.ps1 -Paths "src/runtime/8.0/alpine3.20"

# Build all 8.0 runtime images using a glob pattern
./eng/docker-tools/build.ps1 -Paths "*runtime*8.0*"
```

Combine Filters:

```powershell
# Build .NET 8.0 Alpine arm64 images
./eng/docker-tools/build.ps1 -Version "8.0" -OS "alpine" -Architecture "arm64"
```

Filter by Product Version (if applicable):

```powershell
# Build only .NET 8.0 images
./eng/docker-tools/build.ps1 -Version "8.0"

# Build .NET 6.0 and 8.0 images
./eng/docker-tools/build.ps1 -Version "6.0","8.0"
```

When you run build.ps1, here's the chain of execution:
```
build.ps1
│
├── Translates your filter parameters into ImageBuilder CLI args
│
└── Calls Invoke-ImageBuilder.ps1 "build --version X --os-version Y ..."
    │
    ├── On Linux: Runs ImageBuilder in a Docker container
    │     └── Builds image: microsoft-dotnet-imagebuilder-withrepo
    │     └── Mounts Docker socket and repo contents
    │
    └── On Windows: Extracts ImageBuilder locally (due to Docker-in-Docker limitations)
          └── Runs Microsoft.DotNet.ImageBuilder.exe directly
```
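The filter-to-CLI-argument translation in the first step can be sketched in a few lines. This is a hypothetical illustration only: the parameter and option names (`--version`, `--os-version`, `--architecture`, `--path`) mirror the examples in this guide, but the actual mapping lives in build.ps1.

```python
def to_imagebuilder_args(version=None, os=None, architecture=None, paths=None):
    # Hypothetical translation of build.ps1-style filters into an
    # ImageBuilder CLI string; option names are illustrative only.
    args = ["build"]
    if version:
        args += ["--version", version]
    if os:
        args += ["--os-version", os]
    if architecture:
        args += ["--architecture", architecture]
    for path in paths or []:
        args += ["--path", path]
    return " ".join(args)

print(to_imagebuilder_args(version="8.0", os="alpine", architecture="arm64"))
```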
For advanced scenarios, you may want to invoke ImageBuilder with specific commands:
```powershell
# Run any ImageBuilder command
./eng/docker-tools/Invoke-ImageBuilder.ps1 "build --help"

# Generate the build matrix (useful for debugging pipeline behavior)
./eng/docker-tools/Invoke-ImageBuilder.ps1 "generateBuildMatrix --manifest manifest.json --type platformDependencyGraph"

# Validate manifest syntax
./eng/docker-tools/Invoke-ImageBuilder.ps1 "validateManifest --manifest manifest.json"
```

The pipeline behaves differently depending on the build context:
Public PR Builds:
```
Build Stage
├── PreBuildValidation
├── GenerateBuildMatrix
└── Build Jobs (dry-run, no push)
    └── Inline tests after each build
  │
  ▼
Post_Build Stage
└── Merge artifacts
  │
  ▼
Publish Stage (dry-run)
└── All publish operations run but skip actual pushes
  │
  ▼
(end)
```
- Images are built but not pushed to any registry
- Tests run inline within each build job
- Publish stage runs in dry-run mode (validates publish logic without pushing)
- Validates that Dockerfiles build successfully
Internal Official Builds:
```
Build Stage
├── PreBuildValidation
├── CopyBaseImages → staging ACR
├── GenerateBuildMatrix
└── Build Jobs (push to staging ACR)
  │
  ▼
Post_Build Stage
├── Merge image info files
└── Consolidate SBOMs
  │
  ▼
Test Stage
├── GenerateTestMatrix
└── Test Jobs
  │
  ▼
Publish Stage
├── Copy images to production ACR
├── Create multi-arch manifests
├── Wait for MAR ingestion
├── Update READMEs
├── Publish image info to versions repo
└── Apply EOL annotations
```
- Full pipeline with all stages
- Images flow: `BuildRegistry` → `PublishRegistry` → MAR (see `publish-config-prod.yml` for ACR definitions)
- Tests run against staged images
- Only successful builds get published
The generateBuildMatrix command is key to understanding how builds are parallelized. It:
- Reads the manifest.json - Understands which images exist
- Builds a dependency graph - Knows that `runtime-deps` must build before `runtime`
- Groups by platform - Creates jobs for each OS/Architecture combo
- Optimizes with caching - Can detect and exclude unchanged images (see Image Caching below)
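The dependency-graph step can be pictured as a topological sort. A minimal sketch, assuming the edges come from FROM relationships in the manifest (the image names here are illustrative):

```python
from graphlib import TopologicalSorter

# Illustrative FROM-relationship edges: each image maps to the images it
# depends on, so dependencies always sort before dependents.
deps = {
    "runtime-deps": set(),
    "runtime": {"runtime-deps"},
    "aspnet": {"runtime"},
    "sdk": {"aspnet"},
}

build_order = list(TopologicalSorter(deps).static_order())
print(build_order)  # runtime-deps comes before runtime, and so on
```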
The stages variable is a comma-separated string that controls which pipeline stages execute:
```yaml
variables:
- name: stages
  value: "build,test,sign,publish" # Run all stages
```

Common patterns:

- `"build"` - Build only; no tests, signing, or publishing
- `"build,test"` - Build and test, but don't sign or publish
- `"build,test,sign"` - Build, test, and sign, but don't publish
- `"sign"` - Sign only (when re-running failed signing from a previous build)
- `"publish"` - Publish only (when re-running a failed publish from a previous build)
- `"build,test,sign,publish"` - Full pipeline
Note: The Post_Build stage is implicitly included whenever build is in the stages list. You don't need to specify it separately—it automatically runs after Build to merge image info files and consolidate SBOMs.
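The implicit Post_Build rule amounts to a small expansion step. A sketch of that rule (the stage identifier `post_build` is illustrative, not the pipeline's actual internal name):

```python
def stages_to_run(stages_value):
    # Split the comma-separated stages variable into a set of stage names.
    stages = {s.strip() for s in stages_value.split(",") if s.strip()}
    # Post_Build is implicitly included whenever "build" is requested.
    if "build" in stages:
        stages.add("post_build")
    return stages

print(stages_to_run("build,test"))
```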
The stages variable is useful for:
- Re-running just the publish stage after fixing a transient failure
- Skipping tests during initial development
- Running isolated stages for debugging
Image info files (defined by ImageArtifactDetails) are the mechanism that tracks what was built:
```json
{
  "repos": [{
    "repo": "dotnet/runtime",
    "images": [{
      "platforms": [{
        "dockerfile": "src/runtime/8.0/alpine3.20/amd64/Dockerfile",
        "digest": "sha256:abc123...",
        "created": "2024-01-15T10:30:00Z",
        "commitUrl": "https://github.com/dotnet/dotnet-docker/commit/..."
      }]
    }]
  }]
}
```

How they flow through the pipeline:

- Build stage: Each build job produces an image-info fragment
- Post_Build stage: Fragments are merged into a single `image-info.json`
- Test stage: Uses merged info to know which images to test
- Publish stage: Uses info to know which images to copy/publish
- Versions repo: Final info is committed to the versions repo
The versions repo stores the "source of truth" image info. Future builds compare against this to determine what's changed and skip unchanged images.
Using Image Info for Investigations
Image info files are invaluable when you need to track down information about a specific image, particularly when starting from a digest reported by a customer or security scan.
Scenario: "Which commit produced this image?"
Given a digest like `sha256:abc123...`, you can trace it back to its source:

1. Check the versions repo history - The `dotnet/versions` repo contains historical image info committed after each publish. Use `git log -p --all -S 'sha256:abc123'` to find the commit that introduced this digest.
2. From the image info entry, you'll find:
   - `commitUrl` - The exact source commit that built this image
   - `dockerfile` - Which Dockerfile produced it
   - `created` - When it was built
   - `simpleTags` - The tags applied to this image
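If you have an image info file locally (for example, downloaded from a pipeline run), a short script can do the digest lookup for you. A sketch using the nested repos → images → platforms shape shown above:

```python
import json

def find_platform_by_digest(image_info, digest):
    # Walk repos -> images -> platforms and return the first entry whose
    # digest matches, along with the repo name it belongs to.
    for repo in image_info.get("repos", []):
        for image in repo.get("images", []):
            for platform in image.get("platforms", []):
                if platform.get("digest") == digest:
                    return repo["repo"], platform
    return None

info = json.loads("""
{
  "repos": [{
    "repo": "dotnet/runtime",
    "images": [{
      "platforms": [{
        "dockerfile": "src/runtime/8.0/alpine3.20/amd64/Dockerfile",
        "digest": "sha256:abc123",
        "commitUrl": "https://github.com/dotnet/dotnet-docker/commit/0000000"
      }]
    }]
  }]
}
""")

repo, platform = find_platform_by_digest(info, "sha256:abc123")
print(repo, platform["commitUrl"])
```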
Scenario: "What was in the last successful build?"
Download the image-info artifact from a pipeline run in Azure DevOps:
- Navigate to the pipeline run
- Go to the "Published" artifacts section
- Download `image-info` (merged) or individual `*-image-info-*` fragments
Scenario: "When did we last publish updates to a specific image?"
Use the versions repo git history:
```shell
# In the dotnet/versions repo
git log --oneline -- build-info/docker/image-info.dotnet-dotnet-docker-main.json
```

Each commit corresponds to a publish operation and includes the full image info at that point in time.
Scenario: "Compare what changed between two publishes"
```shell
git diff <commit1> <commit2> -- build-info/docker/image-info.dotnet-dotnet-docker-main.json
```

This shows which images were added, removed, or rebuilt (new digests) between the two publishes.
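The same comparison can be done programmatically against two downloaded image-info snapshots. A sketch, assuming the schema shown earlier:

```python
def digests_by_dockerfile(image_info):
    # Flatten one image-info snapshot into dockerfile -> digest.
    result = {}
    for repo in image_info["repos"]:
        for image in repo["images"]:
            for platform in image["platforms"]:
                result[platform["dockerfile"]] = platform["digest"]
    return result

def diff_publishes(old_info, new_info):
    # Classify images as added, removed, or rebuilt (digest changed).
    old, new = digests_by_dockerfile(old_info), digests_by_dockerfile(new_info)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "rebuilt": sorted(d for d in old.keys() & new.keys() if old[d] != new[d]),
    }
```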
The publish stage does more than just push images. Here's the sequence:
1. Copy Images — `copyAcrImages` copies from build ACR to publish ACR
2. Publish Manifest — `publishManifest` creates multi-arch manifest lists
3. Wait for MAR Ingestion — Polls MAR until images are available (timeout configurable)
4. Publish READMEs — Updates documentation in the registry
5. Wait for Doc Ingestion — Ensures README changes are live
6. Merge & Publish Image Info — Updates the versions repo with new image metadata
7. Ingest Kusto Image Info — Sends telemetry to Kusto for analytics
8. Generate & Apply EOL Annotations — Marks images with end-of-life dates
9. Post Publish Notification — Creates GitHub issues/notifications about the publish
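The "Wait for MAR Ingestion" step's poll-until-available behavior is a plain timeout loop. A generic sketch of that pattern (not ImageBuilder's actual implementation; parameter names and defaults are illustrative):

```python
import time

def wait_for_ingestion(is_available, timeout_seconds=600, poll_interval=10,
                       sleep=time.sleep, clock=time.monotonic):
    # Poll until is_available() reports success or the timeout elapses.
    deadline = clock() + timeout_seconds
    while True:
        if is_available():
            return True
        if clock() >= deadline:
            return False  # caller decides whether this fails the stage
        sleep(poll_interval)
```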
For testing pipeline changes without actually publishing:
```yaml
# In pipeline variables or at runtime
variables:
- name: dryRunArg
  value: "--dry-run"
```

Or the infrastructure automatically enables dry-run for:
- Pull request builds
- Builds from non-official branches
- Public project builds
The set-dry-run.yml step template determines this automatically based on context.
The infrastructure includes automation that monitors for base image updates and triggers rebuilds when dependencies change.
A scheduled pipeline (check-base-image-updates.yml) runs every 4 hours and:
- Checks for stale images — Compares the base image digests used in our published images against the current digests in upstream registries
- Identifies affected images — Determines which of our images need rebuilding because their base image changed
- Queues targeted builds — Automatically triggers builds for only the affected images, not the entire repo
This ensures that security patches and updates in base images (like alpine, ubuntu, mcr.microsoft.com/windows/nanoserver) flow through to images without manual intervention.
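The staleness check described above boils down to comparing recorded base-image digests against current ones. A simplified sketch (the data shapes are assumptions for illustration, not the tool's real representation):

```python
def find_stale_images(published, current_digests):
    # published: dockerfile path -> (base image name, digest recorded at last publish)
    # current_digests: base image name -> digest currently in the upstream registry
    return sorted(
        dockerfile
        for dockerfile, (base_image, recorded) in published.items()
        if current_digests.get(base_image, recorded) != recorded
    )

published = {
    "src/runtime/8.0/alpine3.20/amd64/Dockerfile": ("alpine:3.20", "sha256:old"),
    "src/sdk/8.0/noble/amd64/Dockerfile": ("ubuntu:24.04", "sha256:same"),
}
current = {"alpine:3.20": "sha256:new", "ubuntu:24.04": "sha256:same"}

print(find_stale_images(published, current))
```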
The system has built-in retry logic but requires manual intervention after repeated failures:
Automatic retry behavior:
- If a triggered build fails, the system will attempt to rebuild every 4 hours
- After 3 unsuccessful attempts, the system stops queuing new builds for that image
- This prevents endless rebuild loops when there's a genuine issue requiring human attention
After fixing the issue:
Once you've fixed the underlying problem (Dockerfile change, test fix, etc.) and have a successful build:
- Navigate to the successful pipeline run in Azure DevOps
- Add the `autobuilder` label to that run
- This signals to the infrastructure that a successful build has occurred
- The system will resume automatic rebuilds for that image as needed
The autobuilder label is how the infrastructure tracks that the failure cycle has been broken and normal operations can resume.
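The gate described above reduces to a failure counter that the autobuilder label resets. A sketch of the policy (not the actual implementation; names are illustrative):

```python
MAX_FAILED_ATTEMPTS = 3  # per the retry policy described above

def should_queue_rebuild(consecutive_failures, labeled_success_since_failures):
    # A successful run marked with the autobuilder label breaks the
    # failure cycle and resumes normal automatic rebuilds.
    if labeled_success_since_failures:
        return True
    return consecutive_failures < MAX_FAILED_ATTEMPTS
```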
The infrastructure includes caching to avoid rebuilding images that haven't changed. Caching operates at two levels:
1. Matrix Trimming (job-level caching)
When trimCachedImagesForMatrix is enabled, the generateBuildMatrix command excludes platforms from the build matrix if they would result in cache hits. This means no build job is even created for those platforms—they're completely skipped.
2. Build-time Caching
Even if a platform isn't trimmed from the matrix, the build command checks each image against the cache before building. If the image is cached, it outputs CACHE HIT, pulls the previously-built image from the registry, and skips the actual Docker build.
An image is considered cached when both of the following conditions are true:
1. Base image digest is unchanged — The digest of the base image (the `FROM` image) matches the digest recorded in the image info file from the last successful publish. If the upstream base image has been updated, this condition fails and the image will be rebuilt.
2. Dockerfile commit is unchanged — The git commit URL for the Dockerfile matches the commit URL recorded in the image info file. If you've modified the Dockerfile, this condition fails and the image will be rebuilt.
Caching compares against the published image info stored in the versions repo. This means caching compares against what's been officially published, not what's in your current branch.
To force a rebuild regardless of cache state, set the noCache parameter to true when queuing the build. This disables both matrix trimming and build-time caching.
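The two conditions plus the noCache override combine into a simple predicate. A sketch of the rule described above, assuming a field name like `baseImageDigest` (hypothetical; check the actual image info schema):

```python
def is_cache_hit(current_base_digest, current_commit_url, published_entry, no_cache=False):
    # noCache disables caching entirely, forcing a rebuild.
    if no_cache:
        return False
    # Both conditions must hold against the published image info.
    return (current_base_digest == published_entry["baseImageDigest"]
            and current_commit_url == published_entry["commitUrl"])

entry = {"baseImageDigest": "sha256:base", "commitUrl": "https://example.test/commit/abc/Dockerfile"}
print(is_cache_hit("sha256:base", "https://example.test/commit/abc/Dockerfile", entry))
```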
Pass Dockerfile ARG values via ImageBuilder:
```yaml
customBuildInitSteps:
- powershell: |
    $args = "--build-arg MY_VAR=value"
    echo "##vso[task.setvariable variable=imageBuilderBuildArgs]$args"
```

To pass raw options directly to `docker build`, use `--build-option`. Quote values that contain spaces:

```yaml
customBuildInitSteps:
- powershell: |
    $args = '--build-option "--ulimit nofile=65536:65536"'
    echo "##vso[task.setvariable variable=imageBuilderBuildArgs]$args"
```

A powerful pattern is combining the `stages` variable with the `sourceBuildPipelineRunId` pipeline parameter to run specific stages using artifacts from a previous build. This is useful for:
- Skipping stages you don't need to run
- Avoiding unnecessary re-builds after test/publishing infrastructure fixes
Note: For simple retries of failed jobs, use the Azure Pipelines UI "Re-run failed jobs" feature instead.
Scenario: Test failed, need to run publish anyway
- Set `sourceBuildPipelineRunId` to the run that built the images
- Set `stages` to `"publish"`

How it works:

- `sourceBuildPipelineRunId` tells the pipeline which previous run to pull artifacts from
- The `download-build-artifact.yml` step uses this ID to fetch `image-info.json` from that run
- The specified stage(s) use the downloaded image info to know which images exist
Common recovery patterns:
| Scenario | stages | sourceBuildPipelineRunId |
|---|---|---|
| Normal full build | `"build,test,sign,publish"` | `$(Build.BuildId)` (default) |
| Re-run publish after infra fix | `"publish"` | ID of the successful build run |
| Re-test after infra fix | `"test"` | ID of the build run to test |
| Re-sign after infra fix | `"sign"` | ID of the build run to sign |
| Build only (no publish) | `"build"` | `$(Build.BuildId)` (default) |
| Test + publish (skip build) | `"test,publish"` | ID of the build run |
| Sign + publish (skip build/test) | `"sign,publish"` | ID of the build run |
In the Azure DevOps UI:
When you queue a new run, you can override these as runtime parameters:
- Set `stages` to the stage(s) you want to run
- Set `sourceBuildPipelineRunId` to the run ID containing the artifacts you need (find the build ID in the URL when viewing a pipeline run, e.g., `buildId=123456`)
This avoids the multi-hour rebuild cycle when you just need to retry a failed operation.
When you trigger a pipeline run, you might find that your Dockerfile isn't being built.
If your Dockerfile doesn't appear in any build job, first verify the Dockerfile is included in the manifest file.
How to verify: Check manifest.json to ensure your Dockerfile path is defined under the appropriate repo and image. You can also run `generateBuildMatrix` locally to see which Dockerfiles are included:

```powershell
./eng/docker-tools/Invoke-ImageBuilder.ps1 "generateBuildMatrix --manifest manifest.json --type platformDependencyGraph"
```

How to fix: Add the Dockerfile to manifest.json under the correct repo, image, and platform configuration.
If the Dockerfile is in the manifest but no build job is created for it, the platform was likely excluded from the build matrix by matrix trimming.
How to verify: Look at the "Generate platformDependencyGraph Matrix" step output in the GenerateBuildMatrix job. This is an example of what the output in that step looks like:
```
windowsLtsc2025Amd64:
  src-windowsservercore-ltsc2025-helix-graph:
    imageBuilderPaths: --path src/windowsservercore/ltsc2025/helix/amd64 --path src/windowsservercore/ltsc2025/helix/webassembly-net8/amd64 --path src/windowsservercore/ltsc2025/helix/webassembly/amd64
    legName: windows-ltsc2025amd64src-windowsservercore-ltsc2025-helix-graph
    osType: windows
    architecture: amd64
    osVersions: --os-version windowsservercore-ltsc2025
```

If your Dockerfile path doesn't appear in any of the matrix legs, it was trimmed.
How to fix: Set the noCache parameter to true when queuing the build.
If your build job runs but you see CACHE HIT in the output of the Build Images step and the Dockerfile isn't actually built, the build-time caching determined that the image doesn't need to be rebuilt. This is an example of what the output in that step looks like:
```
Image info's Dockerfile commit: https://github.com/dotnet/dotnet-buildtools-prereqs-docker/blob/aa85f0dcc3b3d6757c80dc8c2a6f38c290b372cc/src/windowsservercore/ltsc2025/helix/amd64/Dockerfile
Latest Dockerfile commit: https://github.com/dotnet/dotnet-buildtools-prereqs-docker/blob/aa85f0dcc3b3d6757c80dc8c2a6f38c290b372cc/src/windowsservercore/ltsc2025/helix/amd64/Dockerfile
Dockerfile commits match: True
CACHE HIT
-- EXECUTING: docker pull mcr.microsoft.com/dotnet-buildtools/prereqs@sha256:40d36a0aab610f4d513ed7c7300a5d962968a547ffe8a859a0e599691b74b77f
```
How to fix: Set the noCache parameter to true when queuing the build.