Fix autostart-on-host-failure tag never matching, load tags eagerly#3
Open
Brittlejf wants to merge 2 commits into
Open
Fix autostart-on-host-failure tag never matching, load tags eagerly#3Brittlejf wants to merge 2 commits into
Brittlejf wants to merge 2 commits into
Conversation
instance.tags is a TagList of Tag objects, so testing membership with "autostart_instance_on_host_failure" in instance.tags compared the string against Tag objects and never matched. tag_state was therefore always False, which made expect_running always False, so no instance was ever resumed after a host reboot -- the opposite of the intended opt-in autostart behaviour. Compare against the tag names instead, and add unit tests covering the tagged and untagged cases. The existing failed-resume test is updated to tag its instance so it still reaches the resume path now that resume is gated on the tag.
_init_instance now reads instance.tags for every instance to decide whether to resume it on host boot. tags was not in the expected_attrs passed to InstanceList.get_by_host, so each access triggered a lazy load -- one extra DB query per instance during nova-compute startup. Add 'tags' to expected_attrs so they are joined in up front.
jhindersson
approved these changes
Jun 10, 2026
Member
Author
How
|
Without 'tags' in expected_attrs (before) |
With 'tags' (this PR) |
|
|---|---|---|
instance.tags after load |
not set → first access lazy-loads from DB (_load_tags, 1 query) |
already set to empty TagList |
| Extra query at startup | yes, one per instance | none |
"..." in [t.tag for t in instance.tags] |
False (after the lazy query) |
False (no query) |
Net effect: identical behavior for untagged servers — tag_state is False either way, so they are not autostarted. The only difference is where the empty tag list comes from: an up-front outer join instead of a per-instance lazy query.
One caveat worth noting: joinedload is a real JOIN, so for instances that do have many tags it multiplies rows in the result set. Tags are short and capped (MAX_TAG_LENGTH plus a per-instance tag-count limit), so this stays in line with how nova already uses joinedload for relationships like security_groups.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
PR #2 added opt-in autostart of instances tagged
autostart_instance_on_host_failureafter a hypervisor reboot, but thetag check never matched, so the feature did the opposite of its intent.
This fixes the check and avoids a per-instance DB query at startup.
Background
_init_instancedecides whether to resume a guest on host boot:instance.tagsis aTagListofTagobjects, not strings, so"..." in instance.tagscompares the string againstTagobjects andnever matches.
tag_statewas therefore alwaysFalse, makingexpect_runningalwaysFalse— so no instance was ever resumedafter a host reboot, including the tagged ones the feature is meant to
autostart.
Changes
(
[tag.tag for tag in instance.tags]) instead of theTagobjects.init_hostnow passes'tags'inexpected_attrstoInstanceList.get_by_host, so tags are joined inup front rather than lazy-loaded once per instance during
nova-computestartup.(does not resume) cases, and update the existing failed-resume test to
tag its instance so it still reaches the resume path now that resume is
gated on the tag.
Testing
nova.tests.unit.compute.test_compute_mgrresume tests pass understestr;flake8is clean on the changed files.