Monday, July 16, 2012

VMware ESXi booting from USB ... when things go wrong

Greetings, my Internet audience,

Recently I moved my stack of not-so-new network equipment & servers to a wonderful, lovely Ucoustic soundproof rack (I can't say enough good things about how well it reduces machine noise, but that's a post for another day).

What you and I likely both know about moving not-so-new equipment is that there is always a risk that something will go wrong.  And predictably, one of my ESXi hosts, the one containing the two office Active Directory/DNS/DHCP servers, started having problems the next day.  First the VMs were only intermittently reachable by ping, VM console, Remote Desktop, or SMB, following the most annoying pattern of:

reachable by ping < 2ms for 30-40 pings
reachable by ping ~40000ms for 30-40 pings (Seriously!  I mean 40 SECONDS)
unreachable by ping for about 30-40 pings

over and over and over again.  Nothing in the ESXi logs for the host or VM, nothing in the Windows server system event logs.  So of course I rebooted the VM.  No change.  Aggravating.
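If you capture the ping round-trip times, the fast/slow/unreachable cycle described above is easy to confirm by collapsing the log into runs. Here is a minimal Python sketch, assuming you already have the times in milliseconds with None standing in for a lost ping. The function names, the 1000 ms threshold, and the sample data are all made up for illustration; they are not from my actual logs.

```python
from itertools import groupby


def classify(rtt_ms, slow_threshold_ms=1000.0):
    """Label one ping result; None means no reply came back."""
    if rtt_ms is None:
        return "unreachable"
    return "slow" if rtt_ms >= slow_threshold_ms else "fast"


def phases(samples, slow_threshold_ms=1000.0):
    """Collapse consecutive pings with the same label into (label, count) runs."""
    labels = (classify(s, slow_threshold_ms) for s in samples)
    return [(label, sum(1 for _ in run)) for label, run in groupby(labels)]


if __name__ == "__main__":
    # Hypothetical samples imitating the pattern: ~35 fast, ~35 very slow, ~35 lost.
    one_cycle = [1.5] * 35 + [40000.0] * 35 + [None] * 35
    for label, count in phases(one_cycle * 2):
        print(label, count)
```

A repeating sequence of runs of similar length points at something cyclic rather than random packet loss, which is worth knowing before you start rebooting things.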

So I did what many before me and after me would try ... I shut down all the VMs on the ESXi host, put the ESXi host in maintenance mode, and restarted it.

Got the normal Dell BIOS messages about the processors, memory, the RAID controller, a battery error on the RAID controller we've been getting for a while, a message about the remote configuration utility, and then nothing.  The cursor blinked annoyingly, tauntingly, but nothing.

Long story short, the USB thumb drive that held the bootable ESXi image was out of order, dead, kaput.

But all was not lost, as I had first feared!  And to make a longer story even more brief, I was able to create a new bootable ESXi USB thumb drive, boot from that, do the basic configuration (IP, network, name, gateway, DNS servers), and then log on to the ESXi host through vSphere Client & add the VMs back to inventory.  As I get a moment I'll elucidate how I did this, but if you are in this position and need help, post a comment and I'll let you know how I did it.
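For the add-back-to-inventory step, it helps to know where each VM's .vmx file lives, since that is the file you point vSphere Client at. The sketch below is only an illustration, not the exact procedure used here: it assumes you can run Python somewhere the datastore is visible and that the datastores are mounted under /vmfs/volumes. The function name is hypothetical.

```python
import os


def find_vmx_files(datastore_root):
    """Return the sorted paths of every .vmx file found under datastore_root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(datastore_root):
        for name in filenames:
            if name.lower().endswith(".vmx"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


if __name__ == "__main__":
    # Assumed mount point; adjust to wherever your datastores actually appear.
    for path in find_vmx_files("/vmfs/volumes"):
        print(path)
```

Each path printed is a candidate to browse to in the datastore browser and add to inventory. The script only reads directory listings and never modifies the datastore.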

A bit of useful info can be found on this VMware site, at the bottom under "Disaster Recovery".  The one caveat I'll add: if you have internal disks containing VMs, REMOVE them before installing ESXi on the USB thumb drive, otherwise you risk erasing the very VMs you are desperately trying to resuscitate.