Weird ESX networking issues

Problem: Three VMs are dropping network connectivity

SQL4
SRV_GP
YYZ-MAIL-1

(200 packets sent over an hour or so; percentages are replies received)
SQL4->gateway : 100%
SQL4->service console5: 100%
SQL4->10.1.1.2: 100%
SQL4->YYZ-MAIL-1: 93%

(75 packets sent over 20 minutes or so)
10.1.1.2->gateway: 100%
10.1.1.2->service console5: 100%
10.1.1.2->SQL4: 73% and dropping
10.1.1.2->YYZ-MAIL-1: 100% so far
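
For anyone wanting to repeat this kind of test, here is a rough sketch of how the reply percentages above could be pulled out of saved ping output. It assumes the summary line format printed by the Windows ping command; the function name `reply_rate` and the sample line are made up for illustration (the sample numbers were picked to match the 93% figure, they are not the actual output from these servers).

```python
import re

# Assumption: the summary line looks like the one Windows ping prints, e.g.
# "Packets: Sent = 200, Received = 186, Lost = 14 (7% loss),"
SUMMARY_RE = re.compile(r"Sent = (\d+), Received = (\d+)")


def reply_rate(ping_output):
    """Return the percentage of pings answered, or None if no summary is found."""
    match = SUMMARY_RE.search(ping_output)
    if match is None:
        return None
    sent = int(match.group(1))
    received = int(match.group(2))
    if sent == 0:
        # Nothing was sent, so a percentage would be meaningless.
        return None
    return 100.0 * received / sent


# Illustrative sample line, not real output from the servers in this post.
sample = "    Packets: Sent = 200, Received = 186, Lost = 14 (7% loss),"
print(f"{reply_rate(sample):.0f}%")
```

Logging this per source/destination pair at regular intervals would make a slow slide like "73% and dropping" easier to spot than eyeballing a console window.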

(after testing like crazy)

On the advice of the VMware rep (well, something he said might fix it), we tried separating the service console vSwitch from the LAN vSwitch, giving each one its own NIC, and then rebooting. My understanding was that sharing a single NIC between the service console and the LAN was supported in 4.0. Guess not.

So far, all the servers we fixed are 100% fine, but that could just be due to the reboots. We'll have to wait 2-3 weeks to verify that: the issue first appeared about two weeks after we got set up, and after some changes it reappeared two weeks later. So the second round of changes has been done, which means another two weeks of things working either way. If there have been no recurrences after a month, I'll be satisfied.
