Cache OpenStack ProviderClient per auth context#317
Conversation
|
Warning Review limit reached
More reviews will be available in 33 minutes and 30 seconds. Learn how PR review limits work. Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file). ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits. 🚦 How do rate limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughAdds an ChangesAgent Pod Eviction Gate for Offboarding
OpenStack Provider Client Caching
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
internal/openstack/service_client.go (1)
41-55: 💤 Low valueCache key omits fields that can also affect the Keystone token.
The comment states the key includes "all fields that affect the resulting Keystone token" but it excludes alternative identity selectors like
ProjectID,ProjectDomainID,UserID,UserDomainID,DomainID/DomainName, and application credentials. If a future caller specifiesProjectIDinstead ofProjectName, they would collide with a different auth context.Current upstream usage (e.g.,
OnboardingControllerusingProjectName/ProjectDomainName) is safe, but consider either:
- Adding the missing ID fields to the key, or
- Updating the comment to accurately reflect the subset of fields covered.
Option 1: Expand the cache key to cover ID-based selectors
func cacheKey(authInfo *clientconfig.AuthInfo) string { if authInfo == nil { return "" } - // Include all fields that affect the resulting Keystone token and catalog - // so a future auth context differing only in user, domain, or AuthURL - // doesn't silently reuse the wrong cached provider. + // Include fields that affect the resulting Keystone token and catalog. + // This covers both name-based and ID-based selectors to prevent collision + // when callers use different identity strategies. return strings.Join([]string{ authInfo.AuthURL, authInfo.ProjectName, + authInfo.ProjectID, authInfo.ProjectDomainName, + authInfo.ProjectDomainID, authInfo.Username, + authInfo.UserID, authInfo.UserDomainName, + authInfo.UserDomainID, + authInfo.DomainName, + authInfo.DomainID, }, "\x00") }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/openstack/service_client.go` around lines 41 - 55, The cacheKey function claims to include all fields affecting the Keystone token but omits alternative identity selectors like ProjectID, ProjectDomainID, UserID, UserDomainID, DomainID, DomainName, and application credentials. This can cause cache collisions when different auth contexts use ID-based selectors instead of name-based ones. Update the strings.Join call in cacheKey to include all the missing ID and alternative fields from the authInfo struct alongside the existing name-based fields, ensuring the cache key accounts for all ways callers can specify authentication parameters.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/controller/hypervisor_controller.go`:
- Around line 298-300: The hv.List call on line 298-300 uses a field selector
for "spec.nodeName" on Pods, but this field index is not registered with the
controller's cached client. To fix this, either register the field index by
calling IndexField with "spec.nodeName" in the SetupWithManager method for the
Pod type, or configure Pod to bypass the cache by adding it to the
cache.Options.ByObject configuration in the main setup.
---
Nitpick comments:
In `@internal/openstack/service_client.go`:
- Around line 41-55: The cacheKey function claims to include all fields
affecting the Keystone token but omits alternative identity selectors like
ProjectID, ProjectDomainID, UserID, UserDomainID, DomainID, DomainName, and
application credentials. This can cause cache collisions when different auth
contexts use ID-based selectors instead of name-based ones. Update the
strings.Join call in cacheKey to include all the missing ID and alternative
fields from the authInfo struct alongside the existing name-based fields,
ensuring the cache key accounts for all ways callers can specify authentication
parameters.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f0f0c9a6-9a49-426b-9515-f34863b08adf
📒 Files selected for processing (11)
api/v1/hypervisor_types.gocharts/openstack-hypervisor-operator/templates/role.yamlgo.modinternal/controller/constants.gointernal/controller/gardener_node_lifecycle_controller.gointernal/controller/gardener_node_lifecycle_controller_test.gointernal/controller/hypervisor_controller.gointernal/controller/hypervisor_controller_test.gointernal/controller/offboarding_controller.gointernal/controller/offboarding_controller_test.gointernal/openstack/service_client.go
The seven controllers using the default credentials each call AuthenticatedClient from SetupWithManager, deserialising the full Keystone service catalog every time. With a large multi-region catalog this is wasteful regardless of whether it is a proximate OOM cause. Providers are now cached per auth context (AuthURL, project, user, and their domains) so Keystone is hit once for the default credentials and once for the test-project credentials.
b0b6f49 to
dec2a6d
Compare
Merging this branch will decrease overall coverage
Coverage by fileChanged files (no unit tests)
Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code. |
There was a problem hiding this comment.
Pull request overview
This PR reduces repeated Keystone authentication/catalog deserialization by caching OpenStack ProviderClients per auth context, and extends the hypervisor offboarding workflow to (a) taint nodes to evict agent pods and (b) gate Nova compute-service deletion until those agent pods are confirmed gone.
Changes:
- Cache
gophercloud.ProviderClientinstances keyed by auth context and deduplicate concurrent auth viasingleflight. - Add an offboarding
NoExecutetaint and compute/report a newAgentPodsEvictedcondition from live pod state. - Prevent Nova compute-service deletion during offboarding until
AgentPodsEvicted=True, and update RBAC/tests accordingly.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| internal/openstack/service_client.go | Introduces cached ProviderClient creation and a helper to derive service clients from a cached provider. |
| internal/controller/offboarding_controller.go | Gates compute-service deletion on a new “agent pods evicted” status condition. |
| internal/controller/offboarding_controller_test.go | Updates tests to set/bypass the new eviction gate and adds a gate-specific test case. |
| internal/controller/hypervisor_controller.go | Computes and persists AgentPodsEvicted based on pods running on the node. |
| internal/controller/hypervisor_controller_test.go | Adds envtest coverage for AgentPodsEvicted condition scenarios (tolerating pods, running pods, deletion-in-progress). |
| internal/controller/gardener_node_lifecycle_controller.go | Applies the offboarding taint once VM eviction completes; watches Hypervisor changes relevant to tainting. |
| internal/controller/gardener_node_lifecycle_controller_test.go | Adds tests ensuring taint application and idempotency. |
| internal/controller/constants.go | Adds the shared taintKeyOffboarding constant. |
| api/v1/hypervisor_types.go | Adds the ConditionTypeAgentPodsEvicted condition type. |
| charts/openstack-hypervisor-operator/templates/role.yaml | Grants pod get/list/watch for the new pod inspection logic. |
| go.mod | Promotes logr and x/sync to direct dependencies used by new code. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
The seven controllers using the default credentials each call
AuthenticatedClientfromSetupWithManager, deserialising the fullKeystone service catalog every time. With a large multi-region catalog
this is wasteful regardless of whether it is a proximate OOM cause.
Providers are now cached per auth context (AuthURL, project, user, and
their domains) so Keystone is hit once for the default credentials and
once for the test-project credentials.
Concurrent callers with the same key share a single in-flight request
via
singleflight, preventing duplicate Keystone round-trips onstartup.
Summary by CodeRabbit
Release Notes
New Features
Improvements
Tests
Chores