I recently wrapped up some lab testing of VMware HA, specifically its isolation response. The tests covered various network configurations and were meant to document how VMware HA isolation response behaves under different circumstances. I thought I’d share some of my findings here in the hope that others find them useful as well. (Some of what’s listed below is just common sense, but I’m including it anyway for completeness.)
- Ensure that the vSwitch hosting the Service Console has at least two uplinks. Instead of leaving that second uplink mostly unused, you can place other traffic on the same vSwitch and use the “Override vSwitch failover order” option to direct each type of traffic preferentially onto certain uplinks. (I’ll most likely post a separate blog entry to explain that in more detail.)
- Ensure that DNS is working correctly on all ESX hosts in the HA-enabled cluster. Verify host name resolution for both short names and fully-qualified domain names (FQDNs). Although I’ve seen numerous recommendations to hard-code entries into /etc/hosts, that approach is difficult to manage and does not scale well. Just fix DNS instead.
- Ensure that the Service Console’s default gateway responds to ping. If it does not, you’ll need to use the das.usedefaultisolationaddress and das.isolationaddress parameters to change the address VMware HA pings to determine whether a host is isolated. Chad Sakac recently discussed these items as well, so check that entry for additional information.
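The DNS check above is easy to script. What follows is only a rough sketch, not anything VMware ships: it assumes Python 3 on a management workstation that uses the same DNS servers as the ESX hosts, and the host names, the domain, and the function names `check_names` and `mismatches` are all made up for illustration. It resolves each host's short name and FQDN and flags any host where either lookup fails or the two answers differ.

```python
import socket


def check_names(hosts, domain, resolve=socket.gethostbyname):
    """Resolve each host's short name and FQDN.

    Returns a dict mapping every name tried to its IPv4 address,
    or to None if the lookup failed.
    """
    results = {}
    for host in hosts:
        for name in (host, f"{host}.{domain}"):
            try:
                results[name] = resolve(name)
            except OSError:  # socket.gaierror is a subclass of OSError
                results[name] = None
    return results


def mismatches(hosts, domain, results):
    """Return hosts whose short name or FQDN failed to resolve,
    or whose two names resolved to different addresses."""
    bad = []
    for host in hosts:
        short_ip = results.get(host)
        fqdn_ip = results.get(f"{host}.{domain}")
        if short_ip is None or fqdn_ip is None or short_ip != fqdn_ip:
            bad.append(host)
    return bad


if __name__ == "__main__":
    # Hypothetical cluster members and domain; substitute your own.
    hosts = ["esx01", "esx02", "esx03"]
    domain = "example.com"

    results = check_names(hosts, domain)
    for name, ip in results.items():
        print(f"{name:30} {ip or 'FAILED'}")

    bad = mismatches(hosts, domain, results)
    if bad:
        print("Fix DNS for:", ", ".join(bad))
```

Keep in mind that this only tells you what the machine running the script sees. Short-name resolution depends on that machine's DNS search suffix, so to be thorough you would still confirm the same lookups from each host's Service Console.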
…[more about HA Configuration on http://blog.scottlowe.org]
very helpful information. Thanks Scott