Loss of network connectivity. (Weird problem)
Here is the scenario:
We have 5 ADs housing 5000 machines separated geographically (branches in various cities). Every branch has 1 (max 2) machine(s) with billing enabled. These are the only machines which can be used to print and to transfer data to removable media. These machines usually run with power user privileges. We use DHCP with MAC address binding for all machines. Except for the billing machines, the others run with user rights (restricted to an extent). All machines run XP. All of them have an AV, updated. None of them are infected (at least most of them). Firewall is OFF on all machines because we have a HIPS package (AV, Firewall, Proactive and this and that!)
For the last few days these machines (THE BILLING ONES) have been "mysteriously" going off the network, and here is what happens:
I can ping the website.
I can trace to the website.
I can access other machines on the network.
But somehow they just can't access any WEBSITE (internet or intranet) through IE; it just dies out.
I have flushed the DNS cache and could still ping a website successfully, so at least DNS isn't the problem. I have stopped the dnscache service and pinged, and it works. I have checked the firewall / AV logs and found ABSOLUTELY NOTHING (sort of surprising). I have stopped the HIPS package to see if it is the cause of the problem, but it still doesn't work!
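Since ping and traceroute only prove that ICMP gets through, it may help to test name resolution, the TCP connection to port 80, and the HTTP reply as three separate steps outside of IE. Below is a minimal sketch, assuming Python can be installed on an affected machine (it is not part of XP). The name check_layers is made up for illustration, and the script connects directly, so it ignores any proxy configured in IE.

```python
import socket


def check_layers(host, port=80, timeout=5.0):
    """Report how far a plain HTTP request gets: DNS, TCP connect, HTTP reply.

    Hypothetical helper for illustration; it talks to the server directly
    and does not use any proxy settings configured in the browser.
    """
    result = {"dns": None, "tcp": False, "http": None}

    # Step 1: name resolution (the same thing ping does before sending ICMP).
    try:
        result["dns"] = socket.gethostbyname(host)
    except OSError:
        return result

    # Step 2: TCP handshake to the web port. Ping does not exercise this.
    try:
        with socket.create_connection((result["dns"], port), timeout=timeout) as sock:
            result["tcp"] = True

            # Step 3: send a tiny HTTP request and keep only the status line.
            request = "HEAD / HTTP/1.0\r\nHost: {}\r\n\r\n".format(host)
            sock.sendall(request.encode("ascii"))
            reply = sock.recv(256)
            result["http"] = reply.split(b"\r\n", 1)[0].decode("latin-1") or None
    except OSError:
        # Refused, timed out, or reset: whatever was reached so far is kept.
        pass

    return result
```

Running check_layers against an intranet site from a failing billing machine and from a working machine in the same VLAN would show whether the failure is at the TCP connect or only once HTTP data is exchanged.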
Windows logs (all of them: app, system, IE, security) show nothing.
Here is the interesting part: when I manually change the IP address of the machine to an IP which is not allocated (in the same VLAN, of course), the machine starts working perfectly! Also, with an average of 35 machines per branch, why is it only the billing machine that dies out?
I ran a sniffer and Process Monitor (not Process Explorer) side by side and found hardly anything. The only thing that stood out was that, in a 20-minute capture, after 8 minutes there was a HUGE ARP broadcast coming from the switch IP. The switch (gateway for this VLAN) was sending an exceedingly large number of ARP requests to the broadcast address, around 1300 packets before it stopped. This goes on for two minutes and then everything calms down. Then after 30 minutes or so this activity seems to come up again, but I am not sure, since I didn't have enough time at the location where I was gathering data. I can confirm this tomorrow.
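To confirm the ARP burst and its 30-minute repeat from a longer capture, the requests can be counted per sender and per minute. This is a rough sketch only, assuming the sniffer saved a classic libpcap file (not pcapng) of untagged Ethernet frames. The function name count_arp_requests and the one-minute bucket size are my own choices, not anything from the sniffer.

```python
import struct
from collections import Counter

ETHERTYPE_ARP = 0x0806
ARP_REQUEST = 1


def count_arp_requests(pcap_bytes, bucket_seconds=60):
    """Count ARP requests per (sender IP, time bucket) in a classic pcap file.

    Assumptions: classic libpcap format, Ethernet link type, no VLAN tags.
    Buckets are numbered from the timestamp of the first packet.
    """
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"
    else:
        raise ValueError("not a classic pcap file (pcapng is not handled here)")

    linktype = struct.unpack(endian + "I", pcap_bytes[20:24])[0]
    if linktype != 1:
        raise ValueError("only Ethernet captures are handled")

    counts = Counter()
    first_ts = None
    offset = 24  # skip the 24-byte global header
    while offset + 16 <= len(pcap_bytes):
        # Per-packet header: seconds, microseconds, captured length, wire length.
        ts_sec, _usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", pcap_bytes[offset:offset + 16])
        frame = pcap_bytes[offset + 16:offset + 16 + incl_len]
        offset += 16 + incl_len
        if first_ts is None:
            first_ts = ts_sec

        # 14 bytes of Ethernet header plus 28 bytes of ARP payload.
        if len(frame) < 42:
            continue
        if struct.unpack("!H", frame[12:14])[0] != ETHERTYPE_ARP:
            continue
        if struct.unpack("!H", frame[20:22])[0] != ARP_REQUEST:
            continue

        sender_ip = ".".join(str(b) for b in frame[28:32])
        bucket = (ts_sec - first_ts) // bucket_seconds
        counts[(sender_ip, bucket)] += 1
    return counts
```

Reading the capture with open(path, "rb").read() and printing the sorted result would give one count per sender IP per minute, so a burst like the roughly 1300 requests from the switch should stand out as one or two very large buckets.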
I am willing to share logs and sniffed data for those who want it. ANY HELP WOULD BE GREATLY APPRECIATED.
THANK YOU VERY MUCH IN ADVANCE.