Some things I've always wondered about providers and how they run their operations.
My background is big IT: my employers have typically owned their own DCs, with local sysadmins at each site, big support contracts, etc. So I've done a lot of datacenter-oriented work, just not with public DCs. And obviously, no one who advertises here runs their own datacenter.
So let's say you want to stand up a VPS hosting company in, say, Kansas City (just picked that at random). You buy a server, ship it to your house, configure it how you like, and then ship it off to the DC in KC where you rent colo.
Let's assume:
- we're using rack-grade hardware here, not Best Buy gear
- the DC is remote, not something the provider can just drive over to
So my questions:
(1) A while later, let's say a hard drive goes bad. We're using RAID of some flavor, so the node stays online. Do providers typically keep a hot spare in the array so the rebuild starts automatically, and then replace the failed drive afterward?
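For what it's worth, on the Linux software-RAID side this is trivial to set up. A minimal sketch, assuming md RAID managed with mdadm; the device names are hypothetical, and hardware RAID controllers have their own equivalents:

```python
import subprocess

# Minimal sketch: attach a hot spare to a Linux md RAID array.
# Assumes Linux software RAID (mdadm) and root privileges;
# device names below are hypothetical.

ARRAY = "/dev/md0"    # hypothetical array device
SPARE = "/dev/sdc1"   # hypothetical spare partition

# Adding a device to a healthy array marks it as a spare; md pulls it
# in and starts rebuilding the moment a member drive fails.
subprocess.run(["mdadm", "--manage", ARRAY, "--add", SPARE], check=True)

# Verify the spare is registered (look for "spare" in the device list).
subprocess.run(["mdadm", "--detail", ARRAY], check=True)
```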
(2) Where does the replacement drive come from? The DC staff charge some "remote hands" fee to do the work, but do providers typically keep spare drives in some sort of holding area on site so one can be swapped in? Or do they ship one when needed?
(3) I suppose that with a hot spare or an on-site replacement, customers only suffer while the array rebuilds. Without either, they have to wait while you ship a drive, running the whole time on a degraded array: slow, and one more failure away from data loss.
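Related: the provider has to notice the degradation in the first place. A minimal monitoring sketch, assuming Linux md RAID and parsing /proc/mdstat (mdadm --monitor can do this natively; printing here stands in for a real pager or email alert):

```python
import re

# Minimal sketch: detect degraded Linux md arrays from /proc/mdstat.
# A healthy 2-disk mirror shows "[2/2] [UU]"; a degraded one shows
# "[2/1] [U_]", where the underscore is the missing member.

def degraded_arrays(mdstat_path="/proc/mdstat"):
    with open(mdstat_path) as f:
        text = f.read()
    bad = []
    # Each array stanza starts at a line like "md0 : active raid1 ..."
    for name, body in re.findall(r"^(md\d+) : (.*?)(?=^md\d+ : |\Z)",
                                 text, re.M | re.S):
        status = re.search(r"\[\d+/\d+\] \[([U_]+)\]", body)
        if status and "_" in status.group(1):
            bad.append(name)
    return bad

if __name__ == "__main__":
    for md in degraded_arrays():
        print(f"ALERT: {md} is degraded")  # real setups would page here
```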
(4) What if the whole node goes offline, say the motherboard dies? Do providers typically keep a spare node, e.g., 5 nodes in a DC, of which 4 serve customers and 1 is a hot standby? Otherwise I'm guessing the fastest resolution is to buy or warranty-replace a mobo and have DC remote hands install it (I'm guessing that isn't cheap), which could easily mean 24+ hours of downtime. Or overload the other nodes while customers get moved off... that could get ugly. Or do providers just play the odds and assume that over the life of a server, only drives are likely to fail?
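To put rough numbers on the play-the-odds question, here's a back-of-envelope sketch. Every MTBF and MTTR figure below is a made-up assumption for illustration, not industry data:

```python
# Back-of-envelope: expected node downtime per year from motherboard-class
# failures, under three repair strategies. All figures are illustrative
# assumptions, not industry data.

HOURS_PER_YEAR = 8766
MOBO_MTBF_HOURS = 500_000                    # assumed board MTBF

failures_per_year = HOURS_PER_YEAR / MOBO_MTBF_HOURS   # ~0.018/yr

strategies = {                               # hours to restore service
    "spare node on site (move customers)":  2,
    "remote hands + spare board on site":   4,
    "warranty RMA, ship, then remote hands": 48,
}

for name, mttr_hours in strategies.items():
    expected = failures_per_year * mttr_hours
    print(f"{name:40s} ~{expected * 60:.0f} min/yr expected downtime")
```

With numbers like these, even the slow path only costs on the order of an hour of *expected* downtime per node per year, which may explain why many providers gamble; the catch is that the customer who actually hits the failure eats the whole 48 hours.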
(5) Same question if the node just needs an upgrade: say you want to add RAM, drives, whatever. There's remote hands, but what happens to customers in the meantime?
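My assumption (not something I've confirmed with any particular provider) is that on KVM-type platforms the answer for planned work is to live-migrate guests to another node first. A minimal sketch via libvirt's virsh, with hypothetical guest and host names, assuming libvirt on both hosts and shared or mirrored storage:

```python
import subprocess

# Minimal sketch: live-migrate a KVM guest off a node before planned
# maintenance, using libvirt's virsh CLI. Guest and destination names
# are hypothetical.

GUEST = "customer-vps-42"                      # hypothetical guest name
DEST = "qemu+ssh://node2.example.com/system"   # hypothetical target host

# --live keeps the guest running while memory is copied; customers see
# a brief pause at cutover instead of a maintenance-window outage.
subprocess.run(["virsh", "migrate", "--live", GUEST, DEST], check=True)
```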
My suspicion is that kiddie hosts run everything "naked": no hot spare, no spare on site, no spare node. Bigger hosts run everything redundant: hot spares in the arrays, spare drives/RAM/boards on site, and a spare node.
Now, for smaller hosts who rent servers instead, the dedi provider owns a lot of this responsibility. If a mobo goes bad, the provider is on the hook to pay for the replacement, keep spare parts around, etc.
(6) But for those who rent servers, wouldn't they also need to rent a spare if they want to keep customers from suffering through an extended outage when the server fails?
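One way to reason about (6) is simple break-even arithmetic: the spare's rent versus the expected cost of an extended outage. A sketch with made-up numbers:

```python
# Break-even sketch for renting a warm spare dedi. All numbers are
# made-up assumptions for illustration.

spare_rent_per_month = 80.0    # assumed monthly rent on the idle spare
outage_prob_per_month = 0.01   # assumed chance of a multi-day failure
outage_cost = 5_000.0          # assumed churn/SLA-credit cost per outage

expected_outage_cost = outage_prob_per_month * outage_cost

print(f"spare rent:      ${spare_rent_per_month:.2f}/mo")
print(f"expected outage: ${expected_outage_cost:.2f}/mo")
print("rent the spare" if expected_outage_cost > spare_rent_per_month
      else "play the odds")
```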