[b/r] Add OpenStackBackupConfig controller and backup/restore labeling #1868

Open
stuggi wants to merge 2 commits into openstack-k8s-operators:main from stuggi:backup_restore_controller

Conversation

@stuggi
Contributor

@stuggi stuggi commented Mar 31, 2026

  • Add OpenStackBackupConfig CRD and controller that watches CRD instances across operators and labels namespace resources (Secrets, ConfigMaps, NADs, cert-manager Issuers) with backup.openstack.org labels for backup/restore integration
  • Wire backup/restore labeling into the ControlPlane controller: CA cert secrets get backup labels via SecretTemplate, internal service cert requests get restore=false, and ReconcileBackupConfig creates/updates the BackupConfig CR with spec defaults

Commit 1: [b/r] Add OpenStackBackupConfig controller

Introduces the backup.openstack.org/v1beta1 API group with the OpenStackBackupConfig CRD. The controller:

  • Discovers CRD instances by reading the backup.openstack.org/restore and backup.openstack.org/restore-order labels from the CRD definitions and applies them to all instances. The CRDs are read only once, at startup, and the result is cached rather than re-read on each reconcile. This keeps the approach dynamic: a new CRD only needs the two labels, and the controller only needs the matching RBAC permissions.
  • Labels Secrets, ConfigMaps, and NADs in the namespace with configurable restore ordering
  • Labels custom cert-manager Issuers (those without ownerReferences); operator-created Issuers are skipped
  • Supports per-resource annotation overrides (backup.openstack.org/restore, backup.openstack.org/restore-order) to customize or exclude individual resources
  • Includes envtest coverage
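
The per-resource override described above can be sketched as follows. This is a minimal illustration, not code from the PR: `resolveBackupLabels` is a hypothetical helper, and the default values ("true" and "10") are assumptions chosen for the example. Only the two `backup.openstack.org` keys come from the description.

```go
package main

import "fmt"

// Label and annotation keys named in the PR description.
const (
	restoreKey      = "backup.openstack.org/restore"
	restoreOrderKey = "backup.openstack.org/restore-order"
)

// resolveBackupLabels is a hypothetical helper, not a function from the PR.
// It starts from the default labels for a resource kind and lets the
// resource's own annotations override either key.
func resolveBackupLabels(defaults, annotations map[string]string) map[string]string {
	labels := make(map[string]string, 2)
	for _, key := range []string{restoreKey, restoreOrderKey} {
		if v, ok := defaults[key]; ok {
			labels[key] = v
		}
		if v, ok := annotations[key]; ok {
			labels[key] = v // the per-resource override wins
		}
	}
	return labels
}

func main() {
	// Assumed defaults; in the PR these would come from the BackupConfig spec.
	defaults := map[string]string{restoreKey: "true", restoreOrderKey: "10"}

	// A resource that opts out of restore through an annotation.
	optOut := map[string]string{restoreKey: "false"}

	fmt.Println(resolveBackupLabels(defaults, nil))
	fmt.Println(resolveBackupLabels(defaults, optOut))
}
```

In this sketch an annotation replaces only the key it names, so a resource that overrides the restore flag still inherits the default restore order.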

Commit 2: [b/r] Add backup/restore labels to ControlPlane controller

Integrates backup/restore into the existing ControlPlane reconciliation:

  • ReconcileBackupConfig in internal/openstack/backup.go creates the OpenStackBackupConfig CR with spec defaults via CreateOrPatch
  • CA cert secrets labeled at creation time in ca.go via SecretTemplate
  • Internal service cert requests labeled with restore=false (regenerated by cert-manager on restore)
  • ControlPlane controller watches Secrets for annotation changes via backup.AnnotationChangedPredicate
  • CRD label additions for backup.openstack.org/restore and backup.openstack.org/restore-order on ControlPlane, Version, DataPlaneNodeSet, and DataPlaneService types
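
As an illustration of the Secret watch, the comparison an annotation-change predicate needs to make can be sketched with plain maps. This is an assumption about the behavior, not the lib-common implementation of `backup.AnnotationChangedPredicate`; `backupAnnotationsChanged` is a hypothetical name.

```go
package main

import "fmt"

// Annotation keys named in the PR description.
var backupAnnotationKeys = []string{
	"backup.openstack.org/restore",
	"backup.openstack.org/restore-order",
}

// backupAnnotationsChanged is a hypothetical helper. It reports whether any
// backup-related annotation was added, removed, or modified between the old
// and new version of an object. Unrelated annotations are ignored.
func backupAnnotationsChanged(oldAnn, newAnn map[string]string) bool {
	for _, key := range backupAnnotationKeys {
		oldVal, oldOK := oldAnn[key]
		newVal, newOK := newAnn[key]
		if oldOK != newOK || oldVal != newVal {
			return true
		}
	}
	return false
}

func main() {
	before := map[string]string{"example.com/unrelated": "a"}
	unrelated := map[string]string{"example.com/unrelated": "b"}
	optedOut := map[string]string{
		"example.com/unrelated":        "a",
		"backup.openstack.org/restore": "false",
	}

	fmt.Println(backupAnnotationsChanged(before, unrelated))
	fmt.Println(backupAnnotationsChanged(before, optedOut))
}
```

Filtering on just these keys would keep the ControlPlane controller from reconciling on every unrelated Secret update.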

Jira: OSPRH-22912
Jira: OSPRH-22913
Jira: OSPRH-26645

Depends-On: openstack-k8s-operators/lib-common#680

@openshift-ci
Contributor

openshift-ci bot commented Mar 31, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: stuggi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@github-actions

github-actions bot commented Mar 31, 2026

OpenStackControlPlane CRD Size Report

| Metric | Value |
| --- | --- |
| CRD JSON size | 322464 bytes (315KB) |
| Base branch size | 322326 bytes |
| Change | +0.04% |
| Status | yellow (growing) |

Threshold reference

| Color | Range | Meaning |
| --- | --- | --- |
| 🟢 green | < 300KB | Comfortable |
| 🟡 yellow | 300–400KB | Growing |
| 🟠 orange | 400–750KB | Concerning |
| 🔴 red | > 750KB | Approaching 1.5MB etcd limit (cut in half to allow space for update) |

@stuggi stuggi requested review from abays and dprince and removed request for rabi and rebtoor March 31, 2026 17:01
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/dd57c92b72a04ef0929c08fe0728effe

openstack-k8s-operators-content-provider FAILURE in 9m 19s
⚠️ podified-multinode-edpm-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ cifmw-crc-podified-edpm-baremetal SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ adoption-standalone-to-crc-ceph-provider SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ openstack-operator-tempest-multinode SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ openstack-operator-edpm-baremetal-minor-update SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/fe53444e267b46bfab002d52d844e719

openstack-k8s-operators-content-provider FAILURE in 7m 42s
⚠️ podified-multinode-edpm-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ cifmw-crc-podified-edpm-baremetal SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ adoption-standalone-to-crc-ceph-provider SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ openstack-operator-tempest-multinode SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ openstack-operator-edpm-baremetal-minor-update SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider

@stuggi stuggi force-pushed the backup_restore_controller branch from 2e3227a to ddaf0cb Compare April 7, 2026 13:43
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/b224d61474934085ac7a999c8bb3f7a1

openstack-k8s-operators-content-provider FAILURE in 8m 02s
⚠️ podified-multinode-edpm-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ cifmw-crc-podified-edpm-baremetal SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ adoption-standalone-to-crc-ceph-provider SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ openstack-operator-tempest-multinode SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
openstack-operator-docs-preview POST_FAILURE in 2m 37s
⚠️ openstack-operator-edpm-baremetal-minor-update SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider

@stuggi stuggi force-pushed the backup_restore_controller branch 2 times, most recently from dccad21 to 3c0c72e Compare April 8, 2026 06:24
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/baec74827fef49899c7ef7d9c71c34a7

openstack-k8s-operators-content-provider FAILURE in 7m 42s
⚠️ podified-multinode-edpm-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ cifmw-crc-podified-edpm-baremetal SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ adoption-standalone-to-crc-ceph-provider SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ openstack-operator-tempest-multinode SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ openstack-operator-edpm-baremetal-minor-update SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider

@stuggi stuggi force-pushed the backup_restore_controller branch from 3c0c72e to 378a2cb Compare April 10, 2026 12:49
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/d108ae22d2ac4105b454c6997c44c2ad

openstack-k8s-operators-content-provider FAILURE in 7m 49s
⚠️ podified-multinode-edpm-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ cifmw-crc-podified-edpm-baremetal SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ adoption-standalone-to-crc-ceph-provider SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ openstack-operator-tempest-multinode SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ openstack-operator-edpm-baremetal-minor-update SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider

@stuggi
Contributor Author

stuggi commented Apr 12, 2026

/retest

@stuggi stuggi force-pushed the backup_restore_controller branch from 378a2cb to a0ec96e Compare April 13, 2026 05:31
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/ed602d0559ba42829787fa1571d0dbc3

openstack-k8s-operators-content-provider FAILURE in 7m 55s
⚠️ podified-multinode-edpm-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ cifmw-crc-podified-edpm-baremetal SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ adoption-standalone-to-crc-ceph-provider SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ openstack-operator-tempest-multinode SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ openstack-operator-edpm-baremetal-minor-update SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider

@stuggi stuggi force-pushed the backup_restore_controller branch from a0ec96e to d101e85 Compare April 16, 2026 05:50
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/c733c0866b64471cb1201d4ae9f38243

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 20m 31s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 24m 19s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 31m 09s
adoption-standalone-to-crc-ceph-provider FAILURE in 1h 10m 38s
✔️ openstack-operator-tempest-multinode SUCCESS in 1h 39m 07s
✔️ openstack-operator-edpm-baremetal-minor-update SUCCESS in 2h 06m 54s

Add the BackupConfig CRD, API types, controller, RBAC, samples, and
envtests for the backup/restore labeling feature. The controller watches
CRD instances across operators and labels resources (secrets, configmaps,
NADs) with backup.openstack.org labels for backup/restore integration.
Supports annotation overrides on individual resources to customize
restore ordering or exclude from backup.

Custom Issuer labeling is handled by the ControlPlane controller in
ca.go, not by the BackupConfig controller.

Jira: OSPRH-22912
Jira: OSPRH-22913

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Martin Schuppert <mschuppert@redhat.com>
@stuggi stuggi force-pushed the backup_restore_controller branch from d101e85 to d5cd091 Compare April 16, 2026 14:09
Wire the BackupConfig reconciliation into the ControlPlane controller
with proper condition handling (OpenStackControlPlaneBackupConfigReady).
Add backup/restore labels to CA cert secrets via SecretTemplate, and
restore=false labels to internal service cert requests. Add the
ReconcileBackupConfig call, secret watch with annotation change
predicate, and RBAC for openstackbackupconfigs. Set BackupConfig spec
defaults in the CreateOrPatch mutate function.

Label custom Issuers for backup/restore in addIssuerLabelAnnotation
after removeIssuerLabel so the MatchingLabels query only uses CA
selector labels. Remove getCertSecretBackupLabels wrapper, call
backup.GetCertSecretBackupLabels directly. Return error from
GetCertSecretBackupLabels for non-NotFound errors. Rename GetConfig
parameter from gvk to crdName.

Jira: OSPRH-22912
Jira: OSPRH-22913

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Martin Schuppert <mschuppert@redhat.com>
@stuggi stuggi force-pushed the backup_restore_controller branch from d5cd091 to 1f0e69e Compare April 16, 2026 14:33
@openshift-ci
Contributor

openshift-ci bot commented Apr 16, 2026

@stuggi: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/openstack-operator-build-deploy-kuttl-4-18 | 1f0e69e | link | true | /test openstack-operator-build-deploy-kuttl-4-18 |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
