x86_64: lock IA32_FEATURE_CONTROL with VMXON enabled at boot #1
Open
tonicmuroq wants to merge 1 commit into
Guest hypervisors (Windows Hyper-V, WSL2) check IA32_FEATURE_CONTROL at boot and refuse VMXON unless firmware has set the lock bit with VMXON-outside-SMX enabled; otherwise they report "virtualization not enabled in firmware". rust-hypervisor-firmware never touched this MSR, so a Windows guest saw VMX in CPUID (VMMonitorModeExtensions=True) but VirtualizationFirmwareEnabled=False, and Hyper-V launch failed with Event 41 "Either VMX not present or not enabled in BIOS". Nested virtualization was therefore unusable.

Set the MSR the way real firmware does: when CPUID reports VMX, lock IA32_FEATURE_CONTROL with VMXON-outside-SMX enabled, before handing control to the OS. Skipped when the lock bit is already set (idempotent) and when VMX is absent (AMD / nested off), so AMD and non-nested guests are unaffected.

Verified on a Cloud Hypervisor Windows 11 guest: VirtualizationFirmwareEnabled flips False to True and the guest boots normally.
Problem
Windows guests under Cloud Hypervisor cannot run Hyper-V / WSL2 (and therefore Docker Desktop), failing with "Virtualization support not detected". Inside the guest:

- `VMMonitorModeExtensions = True`, `SecondLevelAddressTranslation = True` (VMX + EPT are exposed in CPUID)
- `VirtualizationFirmwareEnabled = False`
- `Microsoft-Windows-Hyper-V-Hypervisor` Event 41: "Hypervisor launch failed; Either VMX not present or not enabled in BIOS."

Root cause
`IA32_FEATURE_CONTROL` (MSR `0x3A`) is owned by firmware. Real BIOS/UEFI locks it at boot with the VMXON-outside-SMX bit set; a guest hypervisor reads it and refuses `VMXON` if the lock bit is clear. rust-hypervisor-firmware never wrote this MSR, so even though the VMX feature bit is present in CPUID, the guest concludes virtualization is "not enabled in firmware" and nested virt is unusable.

Fix
In `rust64_start` (x86_64), after paging setup and before handing off to the OS, do what real firmware does: when CPUID reports VMX, lock `IA32_FEATURE_CONTROL` with VMXON-outside-SMX enabled (`0x5`).

`CPUID.1:ECX[5]` (VMX) is Intel-specific, so on AMD (or with nested virtualization off) `has_vmx` is false and the function is a no-op; AMD and non-nested guests are untouched.

Test
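The decision described under Fix is small enough to check without touching a real MSR. The following is a minimal sketch, not the code in this PR: the function names `feature_control_to_write` and `vmxon_permitted_outside_smx` are hypothetical, and OR-ing the new bits into the current value is an assumption (the PR only states that the written value is `0x5`). The bit positions follow the Intel SDM definition of `IA32_FEATURE_CONTROL` and `CPUID.1:ECX`.

```rust
/// IA32_FEATURE_CONTROL bit 0: lock bit. Once set, the MSR cannot be
/// written again until the next reset.
const FEATURE_CONTROL_LOCKED: u64 = 1 << 0;
/// IA32_FEATURE_CONTROL bit 2: allow VMXON outside SMX operation.
const FEATURE_CONTROL_VMXON_OUTSIDE_SMX: u64 = 1 << 2;
/// CPUID.1:ECX bit 5: processor advertises VMX.
const CPUID_1_ECX_VMX: u32 = 1 << 5;

/// Firmware side (hypothetical helper): given CPUID.1:ECX and the current
/// MSR value, return the value firmware should write, or `None` when the
/// MSR must be left alone (no VMX, or already locked).
fn feature_control_to_write(cpuid_1_ecx: u32, current: u64) -> Option<u64> {
    let has_vmx = (cpuid_1_ecx & CPUID_1_ECX_VMX) != 0;
    let already_locked = (current & FEATURE_CONTROL_LOCKED) != 0;
    if !has_vmx || already_locked {
        return None;
    }
    // Assumption: keep whatever bits are already set and add lock + VMXON
    // outside SMX. With `current == 0` this is exactly 0x5.
    Some(current | FEATURE_CONTROL_LOCKED | FEATURE_CONTROL_VMXON_OUTSIDE_SMX)
}

/// Guest side (hypothetical helper): the firmware check a guest hypervisor
/// can make before attempting VMXON outside SMX. Both bits must be set.
fn vmxon_permitted_outside_smx(feature_control: u64) -> bool {
    let required = FEATURE_CONTROL_LOCKED | FEATURE_CONTROL_VMXON_OUTSIDE_SMX;
    (feature_control & required) == required
}
```

Assuming the MSR reads as 0 and VMX is advertised, `feature_control_to_write` returns `Some(0x5)`; `vmxon_permitted_outside_smx(0)` is false and `vmxon_permitted_outside_smx(0x5)` is true, which is consistent with the `VirtualizationFirmwareEnabled` change reported in this section. The sketch only models the firmware check; it says nothing about whether `VMXON` itself succeeds.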
Built (`x86_64-unknown-none`, pinned nightly) and deployed as `CLOUDHV.fd` to a Cloud Hypervisor host, fresh cold-boot of a Windows 11 guest with the stock cloud-hypervisor binary: `VirtualizationFirmwareEnabled` flips False → True.

This change fixes the firmware-reported state, but it does not yet yield a working nested hypervisor, and may make the failure mode worse.
After the firmware flips `VirtualizationFirmwareEnabled` to True, I enabled `VirtualMachinePlatform` + `Microsoft-Hyper-V-All` and set `hypervisorlaunchtype auto`, then rebooted the guest. The guest did not come back:

This is the same hang signature as a guest stuck in early boot. The likely reading: with the firmware flag now set, Windows'
`hvix64` gets past the firmware check (no more Event 41) and proceeds to the actual `VMXON`, then hangs there; i.e. the underlying nested VMX does not actually function in this host (this is a doubly-nested case: GCE nested-virt L0→L1, Cloud Hypervisor L1→L2 Windows, and Windows Hyper-V would need L2→L3). So this MSR fix appears necessary but not sufficient, and it converts a graceful "not enabled in firmware" failure into a hard hang.

Do not treat this as enabling Docker/WSL2 on Windows guests until the post-`VMXON` hang is understood. Verification of whether L2→L3 nesting is even supported on the host should land before relying on this.

Targets origin (`cocoonstack`), not upstream.