Mark Minasi's Reader Forum
Slow network and loss of network connection

ivbec
Here To Stay

USA
100 Posts
Posted - 08/04/2011 :  3:42:56 PM
Hi All,

We have 3 AD sites: the Main Office, a branch office, and a learning center. We are using SBS 2003 at the Main Office and Server 2003 R2 at the other branches. Both are at SP2 level.

The network problems started 1 month ago when the machines from the MO were moved to the branch site. These sites were under the same SBS domain but not under the same server. None of the settings on the workstations or servers were changed, and DHCP is used at both sites.

The symptoms occur only at the branch site. They include slow network performance and loss of work when trying to save documents (the program freezes and closes), and when users re-open the documents their work has not been saved. Also, they are unable to browse the network; all the shares are mapped. They periodically lose their connection to the Exchange server at the MO. The workstations are getting the following error in the system log intermittently, several times a day (lasting anywhere from a few seconds or minutes up to 3 hours at a time):

Event Type: Warning
Event Source: MRxSmb
Event Category: None
Event ID: 3019
Date: 8/4/2011
Time: 12:34:09 PM
User: N/A
Computer: xxxxx
Description:
The redirector failed to determine the connection type.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 00 00 00 00 04 00 4e 00 ......N.
0008: 00 00 00 00 cb 0b 00 80 ....Ë..€
0010: 00 00 00 00 84 01 00 c0 ....„..À
0018: 00 00 00 00 00 00 00 00 ........
0020: 00 00 00 00 00 00 00 00 ........

The server has been intermittently getting the above error as well as the following error:

Event Type: Error
Event Source: MRxSmb
Event Category: None
Event ID: 8003
Date: 8/3/2011
Time: 10:49:30 PM
User: N/A
Computer: xxxxx
Description:
The master browser has received a server announcement from the computer xxxxxx that believes that it is the master browser for the domain on transport NetBT_Tcpip_{C6C2219E-02EE-4524-87. The master browser is stopping or an election is being forced.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 00 00 00 00 03 00 4e 00 ......N.
0008: 00 00 00 00 43 1f 00 c0 ....C..À
0010: 00 00 00 00 00 00 00 00 ........
0018: 00 00 00 00 00 00 00 00 ........
0020: 00 00 00 00 00 00 00 00 ........
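
For anyone trying to read the "Data:" dumps in the two events above, here is a minimal Python sketch that turns the hex columns into little-endian 32-bit values. The helper names (parse_event_data, dwords) are made up for this example, and the interpretation is only an observation from these two dumps: the value at offset 0x0c appears to carry the event ID in its low 16 bits, and the value at offset 0x14 in event 3019 looks like an NTSTATUS-style error code.

```python
import struct

# Hex columns copied from the two "Data:" dumps above (ASCII column dropped).
EVENT_3019 = """
0000: 00 00 00 00 04 00 4e 00
0008: 00 00 00 00 cb 0b 00 80
0010: 00 00 00 00 84 01 00 c0
0018: 00 00 00 00 00 00 00 00
0020: 00 00 00 00 00 00 00 00
"""

EVENT_8003 = """
0000: 00 00 00 00 03 00 4e 00
0008: 00 00 00 00 43 1f 00 c0
0010: 00 00 00 00 00 00 00 00
0018: 00 00 00 00 00 00 00 00
0020: 00 00 00 00 00 00 00 00
"""


def parse_event_data(dump):
    """Convert 'offset: b0 b1 ... b7' lines into raw bytes."""
    raw = bytearray()
    for line in dump.strip().splitlines():
        _offset, _sep, rest = line.partition(":")
        # Take at most eight hex bytes per line; ignore anything after them.
        raw.extend(int(tok, 16) for tok in rest.split()[:8])
    return bytes(raw)


def dwords(raw):
    """Read the bytes as little-endian unsigned 32-bit values."""
    count = len(raw) // 4
    return list(struct.unpack("<%dI" % count, raw[:count * 4]))


for label, dump in (("3019", EVENT_3019), ("8003", EVENT_8003)):
    values = dwords(parse_event_data(dump))
    for index, value in enumerate(values):
        if value:  # skip the all-zero fields
            print("event %s offset 0x%02x: 0x%08x (low 16 bits = %d)"
                  % (label, index * 4, value, value & 0xFFFF))
```

Looking up the value at offset 0x14 in an NTSTATUS reference may give a better hint than the event description alone.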


The workstations have a mixture of Broadcom and Intel NICs. They were all tested using Dell's diagnostic utility and passed. They can all ping the server and each other by name and IP address. I also ran DCdiag on the server and it passed all tests. I ran nbtstat -a on all the workstations and there were no "name in conflict" errors. I also ran nbtstat -RR to release and refresh the registered NetBIOS names. I then ran the browstat status command from the resource kit and received the following:

Status for domain xxxxxx on transport \Device\NetBT_Tcpip_{C6C2219E-02EE-4524-87CC-FE9CE8031B9D}
Browsing is active on domain.
Master browser name is: SBSSRV
Master browser is running build 3790
3 backup servers retrieved from master SBSSRV
\\SPELLDC
\\CLASSDC
\\SBSSRV
There are 27 servers in domain xxxxxx on transport \Device\NetBT_Tcpip_{C6C2219E-02EE-4524-87CC-FE9CE8031B9D}
There are 1 domains in domain xxxxxxx on transport \Device\NetBT_Tcpip_{C6C2219E-02EE-4524-87CC-FE9CE8031B9D}

At this point we had a consultant come in to troubleshoot. He found no problems with the hardware or my network configuration. The only thing he could think of was that the SBS server is running WINS and using NetBIOS over TCP/IP for communication as well as DNS, while the server at the branch site is not configured to use it. He disabled NetBIOS on the workstations, which seemed to resolve some of the problems we were having when browsing the network (folders were slow to open).

Yesterday the problem came back. To see if it made a difference, I installed WINS on the server at the branch and set WINS to point to itself. I re-enabled NetBIOS on the workstations and they can now browse the network. None of the other problems have been solved.

We are a non-profit agency and this is costing us money. Any ideas of what's going on, and how I can fix it?

Ivan

Douggg
Major Contributor

972 Posts
Posted - 08/04/2011 :  6:47:22 PM
Before you can fix the problem you need to identify the cause. What you need to do is figure out whether you have a network issue or a server issue. The easiest way to determine that is to get a Wireshark trace (www.wireshark.org). It's free and is the perfect tool for diagnosing this type of issue. Get some traces and we'll help you figure it out. It sounds to me like you have an incorrect setting. We need to figure out if it's a network or server setting.

ivbec
Here To Stay

USA
100 Posts
Posted - 08/05/2011 :  5:15:56 PM
Hi Douggg,

I've downloaded Wireshark, but I will have to wait till Monday to run it.

Ivan

Douggg
Major Contributor

972 Posts
Posted - 08/05/2011 :  6:43:05 PM
Start a capture to see what you find.

ivbec
Here To Stay

USA
100 Posts
Posted - 08/07/2011 :  9:14:39 PM
I ran a couple of captures today on one of the workstations getting the errors and saved them, but I don't know how to interpret the results.

I did notice a couple of things. There are tons of broadcasts (ARP). Some of those ARP broadcasts were for an external address that turned out to be for funwebsearch. I ran Malwarebytes, which completely removed it. Also, there are a lot of Browser announcements, each time for a different workstation attempting to be the master browser. They seem to correspond with the MRxSmb errors in the workstations' event logs. Also, I'm getting a lot of SNMP from the router.

Ivan

Douggg
Major Contributor

972 Posts
Posted - 08/08/2011 :  02:12:38 AM
You are off to a good start. It appears you have already found some issues with Wireshark. Use a MAC address or IP address filter to monitor conversations on the server. Then look at who the server is having conversations with. Then look at response times and for TCP retransmissions.

ivbec
Here To Stay

USA
100 Posts
Posted - 08/08/2011 :  11:12:33 AM
Douggg,

I have never used Wireshark before and the documentation is confusing. How do I set up the filters?

Ivan

Douggg
Major Contributor

972 Posts
Posted - 08/08/2011 :  12:04:06 PM
Start a capture. Establish a connection with one of the hosts that's slow/dropping, then stop the capture.
Use the following filter with the IP address of the problem host.
ip.addr==<ip>

Then see why the conversation is slow or drops.

TCP conversations begin with a three-way handshake: SYN, SYN/ACK, ACK. Look for TCP retransmits, long latency between frames, abnormal resets or TCP terminations, etc.
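
To make the filter advice a bit more concrete, here is a small Python sketch that builds Wireshark display-filter strings for one host. It only produces text to paste into the filter bar. The helper names and the example address 192.168.1.50 are hypothetical; the field names (tcp.analysis.retransmission, tcp.flags.reset, tcp.analysis.keep_alive, tcp.flags.syn) are standard Wireshark display-filter fields, and display filters are case-sensitive.

```python
import ipaddress

# Display-filter expressions for the symptoms mentioned above.
SYMPTOMS = {
    "retransmissions": "tcp.analysis.retransmission",
    "resets": "tcp.flags.reset==1",
    "keep_alives": "tcp.analysis.keep_alive",
    "handshake": "tcp.flags.syn==1",
}


def host_filter(ip):
    """Return a display filter matching traffic to or from one IPv4 host."""
    ipaddress.IPv4Address(ip)  # raises ValueError if the address is malformed
    return "ip.addr==%s" % ip


def symptom_filter(ip, symptom):
    """Combine the host filter with one of the symptom expressions."""
    return "%s && %s" % (host_filter(ip), SYMPTOMS[symptom])


# 192.168.1.50 is a made-up address; substitute the problem host.
for name in SYMPTOMS:
    print(symptom_filter("192.168.1.50", name))
```

Starting with the plain host filter and then narrowing to one symptom at a time keeps the packet list short enough to read.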


ivbec
Here To Stay

USA
100 Posts
Posted - 08/08/2011 :  2:47:55 PM
I've not been able to connect to one of the hosts having the problem. I keep getting the following message from Wireshark:

Can't get a list of interfaces: Is the server properly installed
on 192.168.20.12? connect()failed: A connection attempt failed
because the connected party did not properly respond after a
period of time, or established connection failed because connected
host has failed to respond.

Would doing this over a remote connection be giving me a problem? Also, there is a GPO turning Windows Firewall on, that may be what's blocking the ability to connect.
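
One way to separate "the firewall is blocking it" from "nothing is listening" is a plain TCP connect test. The sketch below is a guess at a useful check, not something taken from the Wireshark documentation: the error text looks like it comes from Wireshark's remote-capture interface, which talks to a capture daemon (rpcapd) on the remote host, and port 2002 is assumed here as that daemon's default. The function name can_connect is made up for this example.

```python
import socket

RPCAP_PORT = 2002  # assumption: default port of the remote capture daemon


def can_connect(host, port=RPCAP_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts (packets silently dropped) and refusals (no listener).
        return False
```

Calling can_connect("192.168.20.12") from the machine running Wireshark returns False both when a firewall drops the packets and when the daemon is not running, so a False result still needs a check of the service on the remote host.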

Ivan

Douggg
Major Contributor

972 Posts
Posted - 08/08/2011 :  3:34:08 PM
It doesn't matter if you are doing this over a remote connection. The firewall could be the issue, depending on what the rules are. You need to get it resolved.

Edited by - Douggg on 08/08/2011 3:34:29 PM

ivbec
Here To Stay

USA
100 Posts
Posted - 08/15/2011 :  4:31:48 PM
First, how do I find out the latency between the server and the slow/dropping hosts?

I turned off Windows Firewall on all the computers just to be sure it wasn't causing a problem. Also, the sites are connected via hardware VPNs.

When I connected from the server to the hosts that are slow and did a TCP capture, I noticed that there are a lot of TCP Keep-Alives, retransmits, and resets. When I removed the TCP filter from the server and just connected to the hosts, there was constant chatter between all the hosts in the site and several computers at our main office. Most of that was with the one I sit at (I was not logged on to that computer at the time), and it also included several resets and retransmits.

Ivan

Douggg
Major Contributor

972 Posts
Posted - 08/15/2011 :  5:18:40 PM
Look at the delta times between data frames. Are you seeing any retransmissions or packet loss?
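
If the capture is large, the delta times can also be checked outside Wireshark. The Python sketch below assumes you have the frame timestamps of one conversation as a list of seconds (for example, exported from the capture); the function name slow_gaps, the 0.2 second threshold, and the sample timestamps are all made up for illustration.

```python
def slow_gaps(timestamps, threshold=0.2):
    """Return (frame_index, delta_seconds) for gaps larger than threshold.

    timestamps: frame arrival times in seconds, in capture order.
    """
    gaps = []
    for i in range(1, len(timestamps)):
        delta = timestamps[i] - timestamps[i - 1]
        if delta > threshold:
            gaps.append((i, delta))
    return gaps


# Hypothetical timestamps: a stall before the fourth frame (index 3).
sample = [0.00, 0.01, 0.02, 1.50, 1.51]
for index, delta in slow_gaps(sample):
    print("frame %d arrived %.2f s after the previous one" % (index, delta))
```

Long gaps that line up with retransmissions usually point at loss on the path rather than a slow server.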

ivbec
Here To Stay

USA
100 Posts
Posted - 08/15/2011 :  9:27:18 PM
Yes, there are a lot of retransmissions and resets between frames. I also looked at delta times and noticed that there is latency between many of the frames. Sometimes this occurs in one direction, and other times in both directions of the conversation between the server and the host.

Ivan

Douggg
Major Contributor

972 Posts
Posted - 08/16/2011 :  12:50:23 AM
You found where to look. It appears you have a networking issue. Now you need to find out where.

The most common cause is a duplex mismatch. Make certain the switch and server are both set to auto/auto.


ivbec
Here To Stay

USA
100 Posts
Posted - 08/16/2011 :  12:16:22 PM
That site is using an unmanaged Linksys 10/100 8-port switch, so the only thing I can go by are the LEDs. I'll have to check tomorrow when I'll be at the site.

I already verified that the server has duplex set to auto-sensing.

Ivan

Douggg
Major Contributor

972 Posts
Posted - 08/16/2011 :  6:16:23 PM
Try a netstat -s

ivbec
Here To Stay

USA
100 Posts
Posted - 08/16/2011 :  8:45:54 PM
Here are the results of the netstat -s

C:\Documents and Settings\Administrator>netstat -s

IPv4 Statistics

Packets Received = 15972559
Received Header Errors = 0
Received Address Errors = 165
Datagrams Forwarded = 0
Unknown Protocols Received = 0
Received Packets Discarded = 1363
Received Packets Delivered = 15971113
Output Requests = 15731415
Routing Discards = 0
Discarded Output Packets = 0
Output Packet No Route = 0
Reassembly Required = 0
Reassembly Successful = 0
Reassembly Failures = 0
Datagrams Successfully Fragmented = 0
Datagrams Failing Fragmentation = 0
Fragments Created = 0

ICMPv4 Statistics

                          Received    Sent
Messages                  309913      307134
Errors                    0           0
Destination Unreachable   4624        1754
Time Exceeded             1           0
Parameter Problems        0           0
Source Quenches           0           0
Redirects                 0           0
Echos                     160506      144812
Echo Replies              144782      160506
Timestamps                0           0
Timestamp Replies         0           0
Address Masks             0           0
Address Mask Replies      0           0

TCP Statistics for IPv4

Active Opens = 130753
Passive Opens = 238063
Failed Connection Attempts = 1375
Reset Connections = 25945
Current Connections = 179
Segments Received = 15377974
Segments Sent = 15154882
Segments Retransmitted = 40243

UDP Statistics for IPv4

Datagrams Received = 215874
No Ports = 272248
Receive Errors = 1
Datagrams Sent = 224205

Ivan

Douggg
Major Contributor

972 Posts
Posted - 08/16/2011 :  9:43:49 PM
Hmmm, TCP retransmits don't look bad. Segments Retransmitted = 40243 is a small percentage of the segments actually sent (about 0.27% of 15154882), which is good.
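
For anyone who wants to repeat that check on their own server, here is a minimal Python sketch that pulls the TCP counters out of pasted netstat -s text and computes the retransmission percentage. The function names are made up for this example, and the parsing assumes the "Name = value" layout shown in the output above.

```python
import re

# TCP section copied from the netstat -s output posted above.
NETSTAT_TCP = """
Active Opens = 130753
Passive Opens = 238063
Failed Connection Attempts = 1375
Reset Connections = 25945
Current Connections = 179
Segments Received = 15377974
Segments Sent = 15154882
Segments Retransmitted = 40243
"""


def tcp_counters(text):
    """Parse 'Name = number' lines into a dict of integer counters."""
    pattern = r"^[ \t]*([A-Za-z ]+?)[ \t]*=[ \t]*(\d+)[ \t]*$"
    return {name: int(value)
            for name, value in re.findall(pattern, text, re.MULTILINE)}


def retransmit_percent(counters):
    """Retransmitted segments as a percentage of segments sent."""
    return 100.0 * counters["Segments Retransmitted"] / counters["Segments Sent"]


counters = tcp_counters(NETSTAT_TCP)
print("retransmitted: %.2f%% of segments sent" % retransmit_percent(counters))
```

These counters are totals since boot, so a short burst of heavy retransmission can hide inside a low overall percentage; comparing two snapshots taken during a slow period is more telling.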


budig
Welcome Newcomer

USA
2 Posts
Posted - 04/16/2012 :  12:29:09 PM
Ivan - I am experiencing the EXACT same issue as the one you posted about, with the redirector / master browser causing network disconnects with SBS W2K3R2.

Did you ever resolve this?

Tim Jenni

Douggg
Major Contributor

972 Posts
Posted - 04/16/2012 :  1:09:16 PM
Check to see if you have bufferbloat.
http://www.bufferbloat.net/

ivbec
Here To Stay

USA
100 Posts
Posted - 04/16/2012 :  4:03:06 PM
We replaced the Linksys switch and router with a Netgear GS716T and a SonicWall TZ100t. Since then we've had no problems.

Ivan

Douggg
Major Contributor

972 Posts
Posted - 04/17/2012 :  11:44:04 AM
ivbec, did you ever investigate to find the actual cause of the problem? Since you replaced two devices that connect to each other, I'm thinking maybe you had a duplex mismatch.

budig, this might be the issue you have. Take a look at the duplex settings and see if both sides are set for auto. A quick way to check is to look at the netstat statistics. (Look at the number of TCP retransmissions.)

By the way, SonicWall suffers from buffer bloat.
http://www.bufferbloat.net/projects/bloat/wiki/Sonicwall_firewall

ivbec
Here To Stay

USA
100 Posts
Posted - 04/17/2012 :  8:37:53 PM
I checked and both sides are set to auto. I also concluded it was a duplex problem. Replacing the switch resolved the TCP retransmissions, and the redirector failures ended. The computer browser immediately started working again. The router was replaced for a different reason: the old one was failing and the Internet connection kept dropping.

Ivan

Rastor728
Old Timer

USA
736 Posts
Posted - 04/18/2012 :  10:24:05 AM
There have been times when "auto" really doesn't work too well for some equipment, especially between different manufacturers. In those cases I prefer a manual setting for links between routing and switching devices.

What would Clark Kent do to someone who stole his identity?

Douggg
Major Contributor

972 Posts
Posted - 04/18/2012 :  11:15:54 AM
Rastor728 - What you are saying was correct for Cisco equipment 10+ years ago. (They had a bug. Cisco has since fixed the bug, so it's no longer a problem.) Today you definitely want to set everything to auto/auto. The reason is that software updates will change the setting to the default, which is auto. So if you've set one side to fixed, you will have a duplex mismatch.

http://en.wikipedia.org/wiki/Duplex_mismatch
(Take a look at the last two sentences.)
Mark Minasi's Reader Forum © 2002-2011 Mark Minasi