Fix autostart-on-host-failure tag never matching, load tags eagerly #3

Open
Brittlejf wants to merge 2 commits into elastx/yoga from fix/autostart-tag-membership-check

@Brittlejf

Member

Summary

PR #2 added opt-in autostart of instances tagged
autostart_instance_on_host_failure after a hypervisor reboot, but the
tag check never matched, so the feature did the opposite of its intent.
This fixes the check and avoids a per-instance DB query at startup.

Background

_init_instance decides whether to resume a guest on host boot:

tag_state = False
if "autostart_instance_on_host_failure" in instance.tags:
    tag_state = True
expect_running = (db_state == power_state.RUNNING and
                  drv_state != db_state and tag_state)

instance.tags is a TagList of Tag objects, not strings, so
"..." in instance.tags compares the string against Tag objects and
never matches. tag_state was therefore always False, making
expect_running always False — so no instance was ever resumed
after a host reboot, including the tagged ones the feature is meant to
autostart.
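
The mismatch can be reproduced without nova. The following is a minimal sketch, assuming a stand-in `Tag` class (hypothetical, not nova's real Tag object) that, like the real one, does not compare equal to a plain string. A plain list stands in for the TagList.

```python
from dataclasses import dataclass


@dataclass
class Tag:
    """Hypothetical stand-in for a nova Tag object; only .tag matters here."""
    resource_id: str
    tag: str


AUTOSTART_TAG = "autostart_instance_on_host_failure"

# Plain list standing in for instance.tags (a TagList in nova).
tags = [Tag(resource_id="uuid-1", tag=AUTOSTART_TAG)]

# Membership compares the string with each Tag object, which is never equal.
matches_objects = AUTOSTART_TAG in tags

# Comparing against the tag names gives the intended answer.
matches_names = AUTOSTART_TAG in [t.tag for t in tags]

print(matches_objects, matches_names)  # False True
```

The first check is the shape of the original code and the second is the shape of the fix.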

Changes

  • Fix the tag check: compare against the tag names
    ([tag.tag for tag in instance.tags]) instead of the Tag objects.
  • Load tags eagerly: init_host now passes 'tags' in
    expected_attrs to InstanceList.get_by_host, so tags are joined in
    up front rather than lazy-loaded once per instance during
    nova-compute startup.
  • Tests: add unit tests for the tagged (resumes) and untagged
    (does not resume) cases, and update the existing failed-resume test to
    tag its instance so it still reaches the resume path now that resume is
    gated on the tag.
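
The corrected decision can be sketched as a standalone function covering the tagged and untagged cases. This is an illustration only, not the code in _init_instance: the `expect_running` helper, the `RUNNING`/`SHUTDOWN` placeholder constants, and the `SimpleNamespace` stand-ins for instances and tags are all assumptions made for the example.

```python
from types import SimpleNamespace

# Placeholder values standing in for nova's power_state constants.
RUNNING = 1
SHUTDOWN = 4

AUTOSTART_TAG = "autostart_instance_on_host_failure"


def expect_running(instance, db_state, drv_state):
    """Hypothetical helper mirroring the fixed check on tag names."""
    tag_names = [tag.tag for tag in instance.tags]
    tag_state = AUTOSTART_TAG in tag_names
    return db_state == RUNNING and drv_state != db_state and tag_state


# Stand-in instances: one carrying the autostart tag, one with no tags.
tagged = SimpleNamespace(tags=[SimpleNamespace(tag=AUTOSTART_TAG)])
untagged = SimpleNamespace(tags=[])

# DB says RUNNING, hypervisor says SHUTDOWN: only the tagged one resumes.
print(expect_running(tagged, RUNNING, SHUTDOWN))    # True
print(expect_running(untagged, RUNNING, SHUTDOWN))  # False
```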

Testing

nova.tests.unit.compute.test_compute_mgr resume tests pass under
stestr; flake8 is clean on the changed files.

instance.tags is a TagList of Tag objects, so testing membership with
"autostart_instance_on_host_failure" in instance.tags compared the
string against Tag objects and never matched. tag_state was therefore
always False, which made expect_running always False, so no instance
was ever resumed after a host reboot -- the opposite of the intended
opt-in autostart behaviour.

Compare against the tag names instead, and add unit tests covering the
tagged and untagged cases. The existing failed-resume test is updated to
tag its instance so it still reaches the resume path now that resume is
gated on the tag.

_init_instance now reads instance.tags for every instance to decide
whether to resume it on host boot. tags was not in the expected_attrs
passed to InstanceList.get_by_host, so each access triggered a lazy
load -- one extra DB query per instance during nova-compute startup.

Add 'tags' to expected_attrs so they are joined in up front.
@Brittlejf Brittlejf self-assigned this Jun 10, 2026
@Brittlejf

Member Author

How 'tags' in expected_attrs affects servers with no tags set

Untagged servers are unaffected — they get an empty TagList, not an error or an extra query. Walking the load path:

  1. DB layer (nova/db/main/api.py): instance_get_all_by_host → _instance_get_all_query. tags is not a "manual join" column (only metadata/system_metadata/pci_devices/fault are), so it falls into the else branch and is loaded via orm.joinedload('tags').
  2. joinedload on a one-to-many relationship is a LEFT OUTER JOIN, so an instance with no tags is still returned — just with zero joined tag rows. db_inst['tags'] comes back as an empty list.
  3. Object layer (nova/objects/instance.py:429): because 'tags' is in expected_attrs, _from_db_object runs obj_make_list(...) over that empty list and sets instance['tags'] to an empty TagList.
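
The outer-join behavior in step 2 can be checked in isolation. The sketch below uses the standard-library `sqlite3` module with a simplified two-table schema. The table and column names are assumptions for illustration, not nova's actual schema or the SQL that SQLAlchemy emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instances (uuid TEXT PRIMARY KEY, host TEXT);
    CREATE TABLE tags (resource_id TEXT, tag TEXT);
    INSERT INTO instances VALUES ('tagged', 'compute1'),
                                 ('untagged', 'compute1');
    INSERT INTO tags VALUES ('tagged', 'autostart_instance_on_host_failure');
""")

# LEFT OUTER JOIN keeps instances that have no matching tag rows.
rows = conn.execute("""
    SELECT i.uuid, t.tag
    FROM instances AS i
    LEFT OUTER JOIN tags AS t ON t.resource_id = i.uuid
    WHERE i.host = 'compute1'
    ORDER BY i.uuid
""").fetchall()

# Group joined rows per instance; a NULL tag means "no tags", not "no instance".
tags_by_instance = {}
for uuid, tag in rows:
    tags_by_instance.setdefault(uuid, [])
    if tag is not None:
        tags_by_instance[uuid].append(tag)

print(tags_by_instance)
```

The untagged instance still appears in the result, with an empty list of tags.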

So for a server with no tags:

| | Without 'tags' in expected_attrs (before) | With 'tags' (this PR) |
|---|---|---|
| instance.tags after load | not set → first access lazy-loads from DB (_load_tags, 1 query) | already set to empty TagList |
| Extra query at startup | yes, one per instance | none |
| "..." in [t.tag for t in instance.tags] | False (after the lazy query) | False (no query) |

Net effect: identical behavior for untagged servers — tag_state is False either way, so they are not autostarted. The only difference is where the empty tag list comes from: an up-front outer join instead of a per-instance lazy query.
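
The query-count difference can be modeled with a toy lazy-loading object. This is a sketch under stated assumptions: `FakeDB` and `Instance` are hypothetical classes written for this example, tags are plain strings, and the single `tags_for_all` call stands in for the one joined instance query. None of this is nova code.

```python
AUTOSTART_TAG = "autostart_instance_on_host_failure"


class FakeDB:
    """Hypothetical in-memory store that counts how often it is queried."""

    def __init__(self, tags_by_uuid):
        self.tags_by_uuid = tags_by_uuid
        self.queries = 0

    def tags_for(self, uuid):
        self.queries += 1  # one query per instance (lazy load)
        return list(self.tags_by_uuid.get(uuid, []))

    def tags_for_all(self, uuids):
        self.queries += 1  # one query for the whole host (eager load)
        return {u: list(self.tags_by_uuid.get(u, [])) for u in uuids}


class Instance:
    """Hypothetical instance whose tags are lazy-loaded unless preset."""

    def __init__(self, uuid, db, tags=None):
        self.uuid = uuid
        self._db = db
        self._tags = tags  # None means "not loaded yet"; [] means "no tags"

    @property
    def tags(self):
        if self._tags is None:
            self._tags = self._db.tags_for(self.uuid)
        return self._tags


uuids = ["a", "b", "c"]
data = {"a": [AUTOSTART_TAG]}  # only "a" is tagged

lazy_db = FakeDB(data)
lazy = [Instance(u, lazy_db) for u in uuids]
lazy_result = [AUTOSTART_TAG in inst.tags for inst in lazy]

eager_db = FakeDB(data)
loaded = eager_db.tags_for_all(uuids)
eager = [Instance(u, eager_db, tags=loaded[u]) for u in uuids]
eager_result = [AUTOSTART_TAG in inst.tags for inst in eager]

print(lazy_result == eager_result, lazy_db.queries, eager_db.queries)
```

Both paths give the same answers. Only the number of round trips differs, growing with the instance count in the lazy case.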

One caveat worth noting: joinedload is a real JOIN, so for instances that do have many tags it multiplies rows in the result set. Tags are short and capped (MAX_TAG_LENGTH plus a per-instance tag-count limit), so this stays in line with how nova already uses joinedload for relationships like security_groups.

@Brittlejf Brittlejf marked this pull request as ready for review June 10, 2026 09:47