Windows Server 2008 R2 Hyper-V Machines Won't Start After Failed SP1 Install
What a fun day! I awoke to a call that all of our server’s VMs were down and inaccessible. Remoted in to find that the VMMS service wouldn’t start.
The Hyper-V Virtual Machine Management service failed to start due to the following error:
The service did not respond to the start or control request in a timely fashion.
It took literally all day, but I finally got it fixed and wanted to write up notes on how. I wasn’t able to get any help with this from Google, but I have to assume others have had or will have this problem. This post will hopefully shed some light on the situation.
The troubleshooting was an epic process. Several times I thought I had it beat, only to realize I had made a lateral move. Rather than going through the whole story, the bottom line is that a failed installation of SP1 left our system in an inconsistent state: some files reflected the RTM version, others SP1.
In our case, the first indication that something was wrong came when I launched vmms.exe from a command prompt and got a long error (that I’m kicking myself now for not writing down!) referencing a procedure not found in VID.DLL. Comparing the dates of VMMS.EXE and VID.DLL showed that the EXE was SP1 and the DLL was RTM. I copied the DLL from another working server and put it in place, but was greeted with what appeared to be a permissions error when opening Hyper-V Manager.
I fixed the failed SP1 install by following the earlier post I made on the topic; the exact same update was preventing the install. Repairing SP1 did not fix the problem. Uninstalling and reinstalling the Hyper-V role didn’t fix it, either. At one point I got an error about WMI components, which I fixed after Googling something along the lines of “reinstall hyper-v wmi,” though that didn’t fix it, either. After all that, I was able to get the list of machines to display in Hyper-V Manager, but they failed when I tried to start them, saying that “one of the hyper-v components is not running.” Google gave me nothing of use.
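
For anyone who wants to check for the same mixed RTM/SP1 state, here is a rough Python sketch. I compared file dates by hand; this compares file version strings instead, on the assumption that Windows Server 2008 R2 RTM files carry build 7600 and SP1 files carry build 7601. The function names and the sample version strings are my own illustration, and the script does not read versions from disk: you would copy them from each file's Details tab.

```python
# Assumption: in a version like 6.1.7601.17514, the third field is the build.
# 7600 = Windows Server 2008 R2 RTM, 7601 = SP1.
BUILD_LABELS = {7600: "RTM", 7601: "SP1"}


def classify_build(file_version):
    """Label a version string such as '6.1.7601.17514' as RTM, SP1, or unknown."""
    parts = file_version.strip().split(".")
    if len(parts) < 3 or not parts[2].isdigit():
        return "unknown"
    return BUILD_LABELS.get(int(parts[2]), "unknown")


def find_mismatches(versions, expected="SP1"):
    """Return the sorted file names whose build label differs from `expected`."""
    return sorted(
        name for name, version in versions.items()
        if classify_build(version) != expected
    )


if __name__ == "__main__":
    # Example values typed in by hand, not read from a real server.
    observed = {
        "vmms.exe": "6.1.7601.17514",
        "vid.dll": "6.1.7600.16385",
        "vid.sys": "6.1.7600.16385",
    }
    for name in find_mismatches(observed):
        print(f"{name} is {classify_build(observed[name])}, expected SP1")
```

Anything this flags is a candidate for the kind of mismatch described above, but it is only a hint, not proof that the file is the culprit.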
What did fix it? The Event Log for VMMS showed this error:
‘Virtualization Infrastructure’ driver required by the Virtual Machine Management service is not installed or is disabled. Check your settings or try reinstalling the Hyper-V role.
That error didn’t tell me much on its own. Since it spoke of a driver, I checked Device Manager, and under System Devices there was “Virtualization Infrastructure Driver” with a yellow exclamation point. Its properties indicated that Windows couldn’t verify the signature of a driver file. I went to the Details tab and looked at the files that make up this driver: VID.DLL, which I knew about, and VID.SYS, located in c:\windows\system32\drivers. VID.DLL showed an SP1 version but VID.SYS showed RTM. I copied the SP1 version of VID.SYS from a working SP1 machine, backed up the RTM file, replaced it with the SP1 file, disabled the driver, re-enabled it, and everything was good! …With that driver, at least.
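
The back-up-then-replace step is the part I would least want to get wrong, so here is a minimal Python sketch of it. I did this by hand in Explorer; the function name `replace_with_backup` and the `.rtm.bak` suffix are made up for illustration, and the demo only touches throwaway files in a temp folder. On a real server you would need an elevated prompt to write to system32\drivers, and you would still have to disable and re-enable the driver afterward.

```python
import shutil
import tempfile
from pathlib import Path


def replace_with_backup(target, replacement, backup_suffix=".rtm.bak"):
    """Copy `target` aside, then overwrite it with `replacement`.

    Returns the path of the backup. Refuses to clobber an existing backup.
    """
    target = Path(target)
    replacement = Path(replacement)
    if not replacement.is_file():
        raise FileNotFoundError(f"replacement not found: {replacement}")
    backup = target.with_name(target.name + backup_suffix)
    if backup.exists():
        raise FileExistsError(f"backup already exists: {backup}")
    shutil.copy2(target, backup)       # keep the old (RTM) file
    shutil.copy2(replacement, target)  # drop in the new (SP1) file
    return backup


if __name__ == "__main__":
    # Demo with stand-in files; nothing here touches a real driver.
    work = Path(tempfile.mkdtemp())
    (work / "vid.sys").write_text("rtm contents")
    (work / "vid_from_sp1_box.sys").write_text("sp1 contents")
    saved = replace_with_backup(work / "vid.sys", work / "vid_from_sp1_box.sys")
    print("backup written to", saved)
```

Keeping the original file next to the replacement means you can put it back if the swap makes things worse.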
The Hyper-V machines still wouldn’t start. They were all in a Saved State, and I theorized that after all of my repair attempts, including removing and reinstalling the Hyper-V role, those Saved States would be no good. I deleted the Saved States using the link at the bottom right of Hyper-V Manager when a machine is selected, tried to start, and got a NEW error about the virtual network adapter! That one was caused by the removal and reinstallation of Hyper-V, for sure. When I right-clicked each VM and opened its settings, the network adapter read “Configuration Error.” I changed it to the new virtual network, started the VM, and everything started up normally. What a day.