Did you hear the one about the company that bought 50 new servers and only had enough electrical power to run six?
That's the kind of story Phil Pokorny hears every day, and it's no joke to the
"It's the seemingly little things that system administrators overlook that tend to cause big problems," says Pokorny, Penguin Computing director of engineering. He offers these tips to help system administrators remember the little things that count.
Set up an NFS (network file system) install area to simplify installing software on multiple Linux servers.
You'd be surprised to know how many sys admins still install software by feeding installation CDs one at a time into more than one server. Pokorny is more appalled than surprised, as he sees this happening often in the field.
"One of the simplest things to do is set up an NFS install area with Red Hat to basically copy the CD images into a directory and then export that via NFS," says Pokorny. "SuSE has a similar capability, also really easy to use with a utility called YAFT. Use YAFT with SuSE to create the NFS install area."
Once that's done, just boot up the CD, set up the network install and point it at the NFS server. "Then you never have to repeat any more CDs and can easily install software on many servers at once," Pokorny says.
It's also possible to customize that NFS install environment to automate answering installation questions, even to the point of automating the initial boot.
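As a rough illustration of the setup Pokorny describes, the Python sketch below builds the two lines of configuration involved on a Red Hat-style system: an /etc/exports entry that shares the copied CD images read-only, and the kickstart directive that points an automated install at that share. The server name, directory and client network are hypothetical, and the function names are invented for this example; check your distribution's documentation before relying on the exact options.

```python
def exports_line(install_dir: str, client_network: str) -> str:
    """Build an /etc/exports entry that shares install_dir read-only.

    'ro' keeps clients from modifying the install tree; 'sync' is a
    common, conservative export option.
    """
    return f"{install_dir} {client_network}(ro,sync)"


def kickstart_source(nfs_server: str, install_dir: str) -> str:
    """Build the kickstart line that selects an NFS installation source."""
    return f"nfs --server={nfs_server} --dir={install_dir}"


if __name__ == "__main__":
    # Hypothetical values; substitute your own server, path and network.
    print(exports_line("/export/install/redhat", "192.168.1.0/24"))
    print(kickstart_source("installserver.example.com", "/export/install/redhat"))
```

After adding the exports entry on the install server, the NFS export table has to be reloaded (typically with exportfs -ra) before clients can mount the directory.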
Measure server racks before buying.
Don't scoff. Pokorny has seen many deployments and expansions stalled by servers not fitting into racks.
"Getting the servers to fit in the racks can be a challenge if you don't purchase them all from the same vendor," says Pokorny. "There are many different styles."
Identify the style of mounting holes in your rack, then ask your server vendors to confirm that their rails fit that particular style.
"Do this task up front so that you don't get bit by it later on when you are ready to deploy," says Pokorny. "Then, you are waiting on some $100 piece of rail equipment and can't deploy."
Don't underestimate the power needs of new servers.
"Over the last two years, the power usage per machine has easily doubled and in some cases tripled," says Pokorny.
Before buying servers, compare their power requirements to the data center's capacity. Many data centers are not set up to handle the amount of electricity required by a modern server, says Pokorny, who has seen many deployments stalled by underestimations of power needs.
Power outages caused by overloads in data centers are common, too, Pokorny says. To prevent them, keep track, on a regular schedule, of how much power the data center is using. Get actual power measurements, either by measuring usage yourself or by consulting with server and equipment vendors. Be sure to measure power usage under actual or worst-case workloads.
Don't rely on a product's power usage specifications alone. "The rating of a power supply itself is not a good indication of the actual power that the system uses," says Pokorny. "They are frequently over-specified, so your power [supply] will be rated for, maybe, 600 watts but the system might only actually use 300."
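To make the comparison concrete, here is a minimal Python sketch that estimates how many servers a single circuit can carry. The 208-volt, 30-amp circuit and the 80 percent derating factor are assumptions chosen for illustration, not figures from Pokorny, and the function names are invented for this example. The 300- and 600-watt values echo his quote. Substitute measurements from your own data center.

```python
import math


def usable_watts(volts: float, amps: float, derate: float = 0.8) -> float:
    """Power a circuit can supply continuously.

    The 0.8 derating factor is an assumption (a common rule of thumb
    for continuous loads); confirm the right value with an electrician.
    """
    return volts * amps * derate


def servers_per_circuit(watts_per_server: float, volts: float, amps: float,
                        derate: float = 0.8) -> int:
    """Whole number of servers that fit within the circuit's usable power."""
    return int(usable_watts(volts, amps, derate) // watts_per_server)


def circuits_needed(server_count: int, per_circuit: int) -> int:
    """Circuits required to power server_count servers."""
    return math.ceil(server_count / per_circuit)


if __name__ == "__main__":
    measured = servers_per_circuit(300, volts=208, amps=30)   # measured draw
    nameplate = servers_per_circuit(600, volts=208, amps=30)  # supply rating
    print("Servers per circuit at 300 W measured:", measured)
    print("Servers per circuit at 600 W nameplate:", nameplate)
    print("Circuits for 50 servers (measured):", circuits_needed(50, measured))
```

Planning from the nameplate rating alone would halve the number of servers per circuit in this example, which is why measured draw under a realistic workload is the better input.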
Factor in other electricity drains, particularly air conditioning.
"For every watt of power used by a server, there is an equivalent amount of air conditioning needed," says Pokorny. "People don't properly size the air conditioning needed related to the amount of power that they have. That's one of the biggest issues in setting up new data centers or deploying servers in existing data centers."
Use IPMI when it's available.
The Intelligent Platform Management Interface (IPMI) is a standard that defines how administrators monitor system hardware and sensors, control system components and retrieve logs of important system events to conduct remote management and recovery. IPMI is included in the majority of new servers.
"What is really great about IPMI is that it actually lets you administer a system remotely over a network that it is already connected, so you can turn the box on, off, reset it, check its temperature, check the fan speed and more," says Pokorny. "If there is a fault in the box, it will be logged by the onboard diagnostics and you can retrieve that diagnostic log."
"The value of being able to access a server remotely is tremendous," says Pokorny. "If you're a system administrator, and a server goes down at two in the morning, you don't have to drive to the office just to hit the reset button. From your office at home, you can type in the appropriate commands and have the system reset and back up. Then, you can go back to bed."
This was first published in October 2005.