Intermittent Network Pauses
Hi There,
We have an HP Class-C chassis with 4 blade servers, all running Windows Server 2003. Two of these servers are clustered in active/passive mode. The cluster is connected to an HP EVA3000 SAN array, and the cluster's network interface connects to a Cisco 3750 stack.

For the past several months we have been having issues where the clients, running Windows XP, lose their network drives and reconnect after a pause of approx 10-15 seconds; in many cases a reboot of the workstation is required.

We have performed a number of tests to isolate the root cause of the issue. The network was fully checked and ruled out as the cause. Packet capture did not reveal anything unusual, except traffic stopping from the cluster during the pause.

We performed a number of tests on the cluster nodes, and one of them was a copy test. We ran perfmon and started copying files between local drives and SAN drives.

Test 1: Copy a gigabyte file from local C: to D:
Test 2: Copy the same file from local C: to SAN
Test 3: Copy the same file from SAN to local C:

During all of the above tests we observed, on perfmon, the CPU utilisation dropping to zero and the network interface utilisation dropping to zero at exactly the same time. While the CPU utilisation recovered almost immediately, the network utilisation stayed at 0% for the duration of the copy. These tests caused exactly the same outages that our users experience. While this was happening I could still ping the server at all times.

The servers are running Windows 2003 Server SP2.
RSS is disabled on the NICs.
TOE is disabled.
Teaming is disabled as well.

Has anyone seen or had this or a similar issue? Please help.

Thank you
|
|||
|
Re: Intermittent Network Pauses
I have this exact same problem. My setup is: Windows 2003 R2 SP2 x64, active/passive setup attached to a Xiotech array. The systems are Dell 2950 servers, quad processor, 8 GB of memory.

We have the exact same problem: every so often network activity to the servers pauses for 10-15 seconds. We had this problem pre-SP2 and upgraded to SP2 to try and mitigate it.

Things I have discovered:

1. It seems to correlate with periods when many connections are being timed out. So if you watch tcpmon (Sysinternals tool) and see a bunch of TIME_WAIT connections, then when the system pauses the number of TIME_WAIT connections will be drastically lower. But correlation isn't causation. I think this is a side effect, not the problem.

2. I get the feeling that it has something to do with RPC getting hung up doing reverse lookups. But I can't prove it.

Have you disabled the TCP chimney stuff?

--
Squidi
------------------------------------------------------------------------
Squidi's Profile: http://forums.techarena.in/member.php?userid=48647
View this thread: http://forums.techarena.in/showthread.php?t=962570
http://forums.techarena.in
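One way to put numbers on the TIME_WAIT observation is to count connections per TCP state at regular intervals and compare the counts before and during a pause. The following is only a minimal sketch, not something posted in the thread. It assumes a POSIX shell with awk and sort is available on the server (for example through Cygwin, which is an assumption), and that `netstat -an` prints the TCP state in the last column. The function name `tcp_state_counts` is made up for illustration.

```shell
#!/bin/sh
# Count TCP connections per state from `netstat -an` style output on stdin.
# Assumption: TCP rows start with "TCP"/"tcp" and end with the state name.
tcp_state_counts() {
    awk 'tolower($1) ~ /^tcp/ && NF >= 4 { count[$NF]++ }
         END { for (s in count) print s, count[s] }' | sort
}

# Example use (hypothetical): sample every 5 seconds with a timestamp.
#   while true; do date; netstat -an | tcp_state_counts; sleep 5; done
```

Logging the counts alongside timestamps makes it possible to line them up with the moments users report a dropout, which is easier than watching a live connection list.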
|
|||
|
Re: Intermittent Network Pauses
We've seen this exact problem as well since around February or March this year. Our hardware is Dell PowerEdge 2650s (2-node active/passive cluster on W2k3 SP2 32-bit Enterprise), with a Dell/EMC SAN. It seems to be somehow connected to increased network activity or large file transfers, but there are never any useful events in the logs or illuminating activity on any performance counters. Despite many, many hours spent on the phone, so far neither MS nor Dell has been able to isolate the root cause.
|
|||
|
Re: Intermittent Network Pauses
$hawn,

Have you looked into your storport driver versions? I figure you have, but I have a very similar setup to yours at a different facility, and I overlooked the Microsoft KBs that upgrade the storport driver. We had a nagging performance issue there that was caused by older storport drivers.

The other guy,

Do you have VSS in use in any form? We don't use it for snapshots, but we have a backup program that uses it to back up the SQL database on one of the nodes. Do you ever see VSS messages in your event viewer?

--
Squidi
|
|||
|
Re: Intermittent Network Pauses
Yes. Actually today we replaced an entire server with a new Dell; all brand new hardware and the latest drivers for everything. The dropouts are still happening. Dell swears there's nothing wrong with the SAN.

I'm starting to think that it could be a Win 2003 OS update. I might try removing all updates since January to see if it goes away...

> Have you looked into your storport driver versions? I figure you have,
> but I have a very similar setup to yours at a different facility and I
> overlooked the Microsoft KB's that upgrade the storport and we had a
> nagging performance issue that was caused by older storport drivers.
|
|||
|
Re: Intermittent Network Pauses
$hawn;3729513 Wrote:
> Yes. Actually today we replaced an entire server with a new Dell; all
> brand new hardware and the latest drivers for everything. The dropouts
> are still happening.
> Dell swears there's nothing wrong with the SAN.
>
> I'm starting to think that it could be a Win 2003 OS update? I might
> try removing all updates since January to see if it goes away...

If it works, let me know. The problem started in October (ish) of 2007 for me, and I stopped updating the machines shortly thereafter because I didn't want to throw in extra variables. Then I did all of the updates (including SP2) because I was out of ideas.

--
Squidi
|
|||
|
RE: Intermittent Network Pauses
We logged a call with MS and they asked us to upgrade a couple of drivers. We will be doing the upgrade in the next day or so and will post the outcome.

Drivers to be upgraded:

1) Update the elxstor driver to the latest version.
   ELXSTOR.SYS | Emulex | 5.1:20.7 | Aug 04 2006 | Storport Miniport Driver for LightPulse HBAs

2) Update hpcisss2.sys, or contact HP to get the latest ProLiant Support Pack.
   HPCISSS2.SYS | Hewlett-Packard Company | 6.8:0.32 | Jun 21 2007 | Smart Array SAS/SATA Controller Storport Driver
|
|||
|
Re: Intermittent Network Pauses
So, did the new elxstor driver do the trick?

--
Squidi
|
|||
|
Re: Intermittent Network Pauses
Hi Squidi,

Nah, no difference at all. Still pausing. The data has been re-arranged in the SAN, but that still did not help. Any more ideas, Squidi?

"Squidi" wrote:
> So, new elxstor driver do the trick?
|
|||
|
Re: Intermittent Network Pauses
Below is a little Perl script that keeps track of when it happens (roughly). Part of the problem with figuring this out is that there is no record of when or how often it happens.

If you point it at a text file on a share (smbstat \\server\share\file.txt), it opens it, shows it to you, closes it, and records the time in the log file.

I then look through the log file with something like:

awk '{if ( $7 > 1 ) {print $0}}' smbstat4.log

which spits out everything that took longer than a second.

No making fun of my perlfu. This is one of the ways to do it!

I also posted on Microsoft's forums. The response was basically, "Call PSS". Both of your clusters are on the supported list, so maybe you would have better luck.

---perl-------------------------------------------------
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Usage: smbstat <uncpathnametofile>
if ( @ARGV == 0 ) {
    die "Usage: smbstat <uncpathnametofile>\n";
}
my $filename = $ARGV[0];

while (1) {
    open my $log, '>>', 'smbstat4.log' or die $!;

    # Time one open/read/stat/close cycle against the file on the share.
    my $before = [gettimeofday];
    open my $net, '<', $filename or die $!;
    while (<$net>) {
        print $_;
    }
    my $size = ( stat($filename) )[7];    # element 7 of stat() is the size in bytes
    print $size;
    close $net;
    my $after = [gettimeofday];

    # Append "<local time> Interval: <seconds>" so the interval is field 7 for awk.
    my $i = tv_interval( $before, $after );
    my $t = localtime;
    print {$log} $t, " Interval: $i \n";
    close $log;

    sleep(5);
}
--------------------------------------

--
Squidi
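The awk filter above can be extended into a short summary of how often the pauses occur and how bad the worst one was. This is a minimal sketch rather than part of the original post. It assumes the log lines have the shape the script writes (a `localtime` timestamp followed by `Interval: <seconds>`, so the seconds land in field 7), and the function name `slow_summary` is hypothetical.

```shell
#!/bin/sh
# Summarize an smbstat-style log.
# Assumed line format: "Thu Sep  4 10:15:02 2008 Interval: 0.0123"
slow_summary() {
    # $1 = log file, $2 = threshold in seconds (defaults to 1)
    awk -v limit="${2:-1}" '
        $6 == "Interval:" {
            total++
            if ($7 + 0 > limit + 0) {
                slow++
                print                     # show each slow sample, like the one-liner
            }
            if ($7 + 0 > max) max = $7 + 0
        }
        END {
            printf "samples=%d slow=%d max=%s\n", total, slow, max + 0
        }
    ' "$1"
}

# Example use: slow_summary smbstat4.log 1
```

The slow samples are printed first with their timestamps, followed by one summary line, so the output can be compared directly against the times users reported losing their drives.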