loading critical patches for mission critical servers

June 15th, 2006, 03:57 PM
brokencrow

Quote:

I ALWAYS have a verified backup...

How are you verifying your backups?
June 15th, 2006, 04:25 PM
morganlefay

Quote:

How are you verifying your backups?

I restore selected files to another location, confirming that I can read and restore from the media and that it has not somehow become damaged and/or corrupted.

MLF
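
A test restore like this can be partly automated. Below is a minimal sketch in Python, assuming the selected files have already been restored to a scratch directory by whatever backup tool is in use. The function names and directory arguments are hypothetical, not taken from any backup product.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(original_dir, restored_dir, relative_paths):
    """Return the relative paths whose restored copy is missing or differs."""
    failures = []
    for rel in relative_paths:
        original = Path(original_dir) / rel
        restored = Path(restored_dir) / rel
        # A missing restored file counts as a failure without hashing anything.
        if not restored.is_file() or sha256_of(original) != sha256_of(restored):
            failures.append(rel)
    return failures
```

An empty list means every sampled file was readable and matched. A mismatch can also just mean the live file changed after the backup ran, so comparing against hashes recorded at backup time would be more reliable than comparing against the live copy.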
June 15th, 2006, 05:09 PM
mohaughn

Quote:

Originally posted here by gore
I thought 99.999 was 5 minutes per year.

Sorry gore.. You were right.. I was thinking of six sigma, roughly 99.9997%, which means that you have 3.4 defects per 1 million opportunities. This is what is considered the highest level of "perfection." This is what Six Sigma tries to achieve through process improvement.

So if you are looking at availability, each minute represents one opportunity for a defect, meaning that if you are down for 10 minutes, you just had ten defects.

60 minutes x 24 hours x 365 days = 525,600 minutes in a year.

So if you are going to strive for 3.4 defects per 1 million opportunities, you can have approximately 1.8 minutes of downtime per year, which works out to about 99.9997% availability.
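
For anyone who wants to check the arithmetic, here is a small Python sketch. The function names are made up for illustration, and it assumes, as the post does, that each minute of a 365-day year is one opportunity for a defect.

```python
MINUTES_PER_YEAR = 60 * 24 * 365  # 525,600 minutes in a non-leap year


def downtime_from_availability(availability_percent):
    """Minutes of downtime per year allowed at a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def downtime_from_dpm(defects_per_million):
    """Minutes of downtime per year allowed at a given defects-per-million rate."""
    return MINUTES_PER_YEAR * defects_per_million / 1_000_000


budgets = [
    ("99.999% (five nines)", downtime_from_availability(99.999)),
    ("99.9999% (six nines)", downtime_from_availability(99.9999)),
    ("3.4 defects per million", downtime_from_dpm(3.4)),
]
for label, minutes in budgets:
    print(f"{label}: {minutes:.2f} minutes/year")
```

This gives roughly 5.26 minutes for five nines, 0.53 minutes for exactly six nines, and 1.79 minutes for 3.4 defects per million, so the 3.4-per-million figure sits between five and six nines.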

My Exchange servers are currently running at 99.98%. Our target is 250 DPM (defects per million), and we have been hitting 200 DPM for the last three years, which gives us a sigma level of 5.04.

However, we don't calculate our availability based on system uptime. We base it on available user minutes. So we take the total number of users on each system, averaged out for each month, and then multiply that by the number of minutes in that month. The way that we calculate our impact to availability is to multiply the number of users on that system the day of the outage by the total length of the outage.

So, if we have 4000 users on a system and a 20-minute outage, we multiply 4000 x 20 = 80,000 IUMs (impacted user minutes). If the month has 31 days, that would be 60 x 24 x 31 = 44,640 available minutes.

Multiply the available system minutes by users to get 44,640 x 4000 = 178,560,000 user minutes per month, which can also be called opportunities for defect. If that is the only outage for that month, it works out to 99.96% availability or 4.82 sigma.

But we also get a short window each month for maintenance, and using clustering, we never come close to exceeding our maintenance window.
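
The user-minute calculation above can be sketched in Python as follows. The function names are hypothetical. The post does not say how the sigma level is derived, so the conventional 1.5 sigma shift used here is an assumption. It does reproduce both the 5.04 and 4.82 figures quoted above.

```python
from statistics import NormalDist


def user_minute_availability(users, outage_minutes, days_in_month):
    """Return (impacted user minutes, available user minutes, availability %)."""
    impacted = users * outage_minutes
    available = users * 60 * 24 * days_in_month
    return impacted, available, 100 * (1 - impacted / available)


def sigma_level(defects_per_million, shift=1.5):
    """Sigma level for a defect rate, using the conventional 1.5 sigma shift
    (an assumption; the post does not state which convention is used)."""
    yield_fraction = 1 - defects_per_million / 1_000_000
    return NormalDist().inv_cdf(yield_fraction) + shift


impacted, available, percent = user_minute_availability(4000, 20, 31)
outage_dpm = impacted / available * 1_000_000

print(f"{impacted:,} of {available:,} user minutes impacted")
print(f"availability {percent:.2f}%, sigma {sigma_level(outage_dpm):.2f}")
print(f"200 DPM is sigma {sigma_level(200):.2f}")
```

Weighting by users means an outage on a lightly loaded system costs less availability than the same outage on a busy one, which plain uptime would not show.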
June 15th, 2006, 05:27 PM
gore

Ahh, yea I was sitting here reading that like "Ok, I thought for sure it was 5 minutes because that was pounded into me in Security +" but I ask anyway to keep it open heh.

You seem to work in a fairly high end place huh?
June 15th, 2006, 06:13 PM
mohaughn

Quote:

Originally posted here by gore
Ahh, yea I was sitting here reading that like "Ok, I thought for sure it was 5 minutes because that was pounded into me in Security +" but I ask anyway to keep it open heh.

You seem to work in a fairly high end place huh?

Yeah.. All of my experience for the last ten years has been in the telecom sector, with some consulting work in the financial sector. It has its advantages, but disadvantages as well. Most of the time, really large corporations heavily segment their IT operations. For instance, right now I only do Exchange email and BlackBerry, so I only get exposure to networking and AD when it impacts my servers. But luckily we do everything on our servers: hardware, security, OS, and application. And how many people can say they have an OC-192 backbone between their servers..
June 15th, 2006, 06:40 PM
gore

OC-192.... Good God.... Now I'm a little rusty, but that, if I am right, is just under 10 Gbps... Wow... I know an OC-256 is about 13 Gbps.... Man, that's awesome lol. I couldn't handle that connection. My HDs aren't fast enough and neither are the NICs.
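
For reference, SONET OC-n line rates are n times the OC-1 rate of 51.84 Mbit/s, so the figures above can be checked with a few lines of Python. The function name is made up for illustration, and the result is the raw line rate in gigabits (not gigabytes) per second, before framing overhead.

```python
OC1_MBPS = 51.84  # SONET OC-1 line rate in Mbit/s


def oc_rate_gbps(n):
    """Raw line rate of an OC-n circuit in Gbit/s (overhead included)."""
    return n * OC1_MBPS / 1000


for level in (192, 256):
    print(f"OC-{level}: {oc_rate_gbps(level):.2f} Gbit/s")
```

OC-192 comes out at about 9.95 Gbit/s, the standard level usually described as a 10 Gb link, and OC-256 at about 13.27 Gbit/s.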

I think you were the one who sent me a pic of one of your SUSE servers weren't you? I know you have a nice set up that's for sure.
June 15th, 2006, 11:20 PM
RoadClosed

Quote:

How do you risk unstable when the new Kernel doesn't get used unless you reboot? When you install it and don't reboot, you simply aren't using it.

I was referring to the posts on applying the kernel without rebooting. I wouldn't call that a stable system.

Quote:

I don't know of a single instance where someone is running a 99.999%(actually it is 99.9999%) available system,

I don't even think NORAD reached that. They do however have failover out the ass.
June 16th, 2006, 01:14 AM
gore

RC: The new kernel wouldn't be in use. That's why on SUSE it says that if you decide to use the new kernel, you have to reboot. It isn't used until a reboot. My mail server right now has had a kernel update installed for 2 months and I haven't rebooted for it yet. It won't use the new one until a reboot happens. It just stores it for when you do.

As for NORAD.... Ugh, just once, I want to see what kind of systems they have. I can't even imagine. I've never been there so I can't say for sure, but from what I hear they have systems up and ready in case the failover systems go too.... Mmmmmmm.
June 16th, 2006, 01:59 AM
mohaughn

Gore- Yeah, it is a 10 Gb link. There are actually several of them. We are a tier 1 carrier. I'd be lying if I said that all of that bandwidth was used just for us though. It is using MPLS switching, so there is actually a good deal of other corporate traffic travelling over our backbone at the same time. Although it is funny when we have AD replication issues and Microsoft immediately wants to know about network saturation... Yeah, I don't think so..

We actually have billing systems on MVS that have been up for years. MVS is great like that, it is completely compartmentalized. So there is hardly ever a need to reboot the entire system. There could be individual applications running on that mainframe that haven't been up for years, but the core systems have been. And a lot of those systems haven't had software updates for years other than stuff done for Y2k.
June 16th, 2006, 02:11 AM
gore

This is hopefully not too off topic, but what kind of cooling do you have for a system like that? It can't possibly be running cool after years and years of being up..

And is it MVS or VMS? Kind of got confused there.

Could be a typo, but then again it could be something I haven't heard of so I have to check that.

Man, I never get to play with high end stuff like that. The only thing I have is PC hardware, and OS wise, I have Solaris, Linux, BSD, Windows, DOS, BeOS....
