SANs and ESX guests

So, I learned a valuable lesson today: if you want to restart a SAN array, FOR GOODNESS SAKE power off any VM guests using the volumes it hosts first.

To flesh out the details, we are moving all our volumes off the loaner PS5000 and onto our new PS6000. During the move I was only seeing one interface being used, so I figured that was a config error. I made sure all the eth interfaces were up and had addresses, and spoke to tech support about it. They said a restart of the array might help things.

Well, they didn't mention shutting down attached guests first! I knew you shouldn't restart an array with guests still attached, but it didn't click that our file server was using a volume on it and should have been powered off first. I restarted the array and tried to move a volume again, but it was still only using one eth interface, albeit a different one this time.
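In hindsight, a quick pre-restart check would have caught the file server. Here's a minimal sketch in Python of what I mean, assuming you already have a list of which guests sit on which volumes. The VM names, volume names, and the `guests_to_power_off` helper are all made up for illustration; in real life you'd pull the inventory from vSphere rather than hard-code it.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical inventory: VM name -> (power state, volumes backing its disks).
# These names are invented for the example, not our real environment.
INVENTORY: Dict[str, Tuple[str, Set[str]]] = {
    "fileserver01": ("poweredOn", {"ps5000-vol1"}),
    "print01": ("poweredOn", {"ps6000-vol2"}),
    "test01": ("poweredOff", {"ps5000-vol1"}),
}


def guests_to_power_off(
    inventory: Dict[str, Tuple[str, Set[str]]],
    array_volumes: Set[str],
) -> List[str]:
    """Return powered-on guests with at least one disk on the array's volumes."""
    return sorted(
        name
        for name, (state, volumes) in inventory.items()
        if state == "poweredOn" and volumes & array_volumes
    )


if __name__ == "__main__":
    # Volumes hosted on the array about to be restarted (assumed names).
    blockers = guests_to_power_off(INVENTORY, {"ps5000-vol1"})
    if blockers:
        print("Power these off before restarting the array:", ", ".join(blockers))
    else:
        print("No running guests on this array.")
```

If that list isn't empty, the restart waits until those guests are shut down cleanly.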

It turns out, according to another tech support rep, that the PS doesn't treat moving volumes as a priority, and therefore only uses one eth interface to do it. Argh!

So I spent the morning cleaning up the disaster that ensued. No file server = no My Docs, and no My Docs = hung logons, can't save files, etc. Since it was a VM, the ESX host needed to be rebooted, as that's the only way to clear something like this up. I tried to shut down the VM on its own, but it timed out... after 20 minutes. Yeah, a 20 minute timeout. That's a bug!

Anyways, things are up and running now, and lesson learned without too much of a cost. Still kinda feel stupid about it though.
