March 7th, 2003, 10:55 PM
The Wierd Tracert Results thread got me thinking about a long-standing problem on my network that no one has been able to resolve. Two servers, identical hardware, identical operating system (NetWare), pretty similar amounts of data on both. Both are simple file/print servers, nothing strange running on them. I was called in because the local SysAdmin noticed that one box takes about 45 minutes to do the nightly backup, while the other takes almost 6 hours. No one is accessing these boxes at night, and there shouldn't be any open files. They both back up to the same device using ArcServe 6.6.
So, my guy couldn't find anything wrong with the servers or the backup system. I got my start in IT as a wire dog and am now the Cisco guy, and he was (is) pretty sure it's an infrastructure problem. So I went down to the site with my handy-dandy Fluke meter and did a network health check. Everything looked totally normal. Both servers connect to an old Cabletron fiber MUX and are on the exact same network segment. In fact, their addresses are x.x.x.19 and .20.
The only bit of strangeness I could find was this. When I pinged .19 from a host on the segment, the TTL didn't decrement a bit; that is, it remained 255 as it should have. When I pinged .20 from the same host, on the same segment, the TTL decremented to 128. It is .20 that takes forever to back up.
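A rough sketch of one way to read those two numbers, offered as an assumption rather than a diagnosis: the TTL in a ping reply starts at whatever default the replying host's IP stack uses and drops by one per router hop. Common defaults include 64, 128 and 255, so a reply TTL of 128 on the local segment may simply mean .20 starts at 128, not that the packet crossed 127 hops. The list of defaults and the function names below are illustrative only, and nothing here says which default NetWare itself uses.

```python
# Common default initial TTLs used by IP stacks (assumed list, not exhaustive).
COMMON_INITIAL_TTLS = (32, 64, 128, 255)


def guess_initial_ttl(observed_ttl: int) -> int:
    """Return the smallest common initial TTL that is >= the observed TTL."""
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    return min(t for t in COMMON_INITIAL_TTLS if t >= observed_ttl)


def estimate_hops(observed_ttl: int) -> int:
    """Estimate router hops as (guessed initial TTL - observed TTL)."""
    return guess_initial_ttl(observed_ttl) - observed_ttl


# The two reply TTLs reported in the post above.
for host, ttl in ((".19", 255), (".20", 128)):
    print(f"{host}: observed TTL {ttl}, "
          f"guessed initial TTL {guess_initial_ttl(ttl)}, "
          f"estimated hops {estimate_hops(ttl)}")
```

Under that reading both hosts come out at zero hops, which would at least be consistent with a flat segment. It would not by itself explain the slow backup.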
I tested and swapped out the patch cords and moved the servers to different ports on the MUX, but nothing changed the results. There is only a dumb L2 switch and the MUX between the servers and the host I pinged from, so it's not a misconfigured VLAN or anything of that nature. I've run this past a couple of people, but no one seems to know what would cause this. I have no idea if the TTL weirdness has anything to do with the backup issues, but we really can't find anything else wrong.
March 7th, 2003, 10:57 PM
Hrmm... what version of NetWare?
And how many users on each box (how many are saving/printing)?
How much free space on each?
How much space taken up?
March 7th, 2003, 11:05 PM
Heh, odd indeed!
Free guess: could the subnet mask be misconfigured somewhere?
Otherwise, if you try a tracert instead of the ping, are the routes the same? (Normally no hops at all, but I'd expect something weird from a tracert to the .20 host...)
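To make the subnet mask guess concrete, here is a small sketch using Python's standard `ipaddress` module. The 192.0.2.x addresses and the masks are hypothetical stand-ins for the real x.x.x.19 and .20, which aren't given in the thread. The idea is that a host with a wrong mask may decide its neighbour is off-link and send traffic to the gateway instead.

```python
import ipaddress


def on_link(src_ip: str, src_mask: str, dst_ip: str) -> bool:
    """True if src, using its own mask, sees dst as being on the local subnet."""
    src_net = ipaddress.ip_interface(f"{src_ip}/{src_mask}").network
    return ipaddress.ip_address(dst_ip) in src_net


def masks_agree(ip_a: str, mask_a: str, ip_b: str, mask_b: str) -> bool:
    """True if both hosts compute the same network from their own masks."""
    net_a = ipaddress.ip_interface(f"{ip_a}/{mask_a}").network
    net_b = ipaddress.ip_interface(f"{ip_b}/{mask_b}").network
    return net_a == net_b


# Hypothetical addresses: both hosts with a /24 mask agree on the network.
print(masks_agree("192.0.2.19", "255.255.255.0", "192.0.2.20", "255.255.255.0"))

# Hypothetical misconfiguration: .20 with a /30 mask no longer sees .19 as local.
print(on_link("192.0.2.20", "255.255.255.252", "192.0.2.19"))
```

If the second kind of check came back false on the real boxes, replies from .20 would take a detour through the router, which is the sort of thing a tracert should reveal.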
Credit travels up, blame travels down -- The Boss
March 7th, 2003, 11:10 PM
Wow MsMittens you are speedy with the reply.....especially since I know you are over on IRC as I type this too.
They're both NetWare 5.0. The rest I'm going to have to ballpark, because I'm not responsible for them on a day-to-day basis.
If I were to hazard a guess, however, I would say both have about 200 users each. I don't know which of them (if either) houses the replica. Everything at this site is in its own tree, but there are two more servers there on a different segment. I do know that both hold user files: just office documents, etc. Nothing heavy. As I recall, both have mirrored 18.2 GB hard drives, with about 5 GB free on one and 6 GB on the other. All the volumes (SYS included) had plenty of space and had been purged recently, so I didn't pay a whole lot of attention, as it didn't seem to be the cause of the problem.
--edit: Ammo, I checked for that. Subnets are identical. Both tracerts show 0 hops. I also did a traceroute from an ssh connection to the router back into the network, and it still showed 0 hops to both.
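For a sense of scale, here is a back-of-the-envelope calculation from the ballpark figures in this thread (18.2 GB drives with roughly 5 to 6 GB free, 45 minutes versus almost 6 hours). Which server has 5 GB free and which has 6 GB isn't stated, so the pairing below is an assumption, and so is the idea that the whole used space gets backed up every night.

```python
def throughput_mb_per_s(data_gb: float, minutes: float) -> float:
    """Average throughput in MB/s, using decimal units (1 GB = 1000 MB)."""
    return data_gb * 1000 / (minutes * 60)


DRIVE_GB = 18.2  # drive size recalled in the post above

# Assumed pairing: the fast box has 5 GB free, the slow box has 6 GB free.
fast = throughput_mb_per_s(DRIVE_GB - 5, 45)      # about 45 minutes
slow = throughput_mb_per_s(DRIVE_GB - 6, 6 * 60)  # almost 6 hours

print(f"fast box: {fast:.2f} MB/s")
print(f"slow box: {slow:.2f} MB/s")
print(f"ratio:    {fast / slow:.1f}x")
```

Swapping the 5 GB and 6 GB figures barely moves the result. Either way the slow box is pushing data several times slower than its twin, so the difference in data size alone doesn't account for the gap.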
March 7th, 2003, 11:17 PM
You think that one might be housing the backup?
I can ask a friend of mine as he does a fair amount of Novell work and he's a CNE. Maybe he might have an idea of what's causing this.
March 7th, 2003, 11:28 PM
Well, they all hold a replica; what I meant to say is I don't know which holds the master replica.
I appreciate any help you or your friend could offer, but please don't spend too much time on it. I'm one of 6 CNEs at work and we have a couple of dozen CNAs.
I am, I admit, the least guru-ish of the CNEs when it comes to NetWare because I run the WAN, so I could certainly miss a NetWare issue. I've also put this problem in front of a couple of CCIE buddies I have and they've been stumped too.
As none of the users complain, we aren't too worried about it; we have much more pressing concerns. I just wondered if anyone knew what it was.