Although LAN networking hardware doesn't track Moore's law, it has been tracking a nice curve of increasing performance at a decreasing cost. Where chip density and processor power are ever inching forward, network bandwidth in the server room will plateau for a while, and then jump, often by a factor of 10 (or an order of magnitude for those of you still counting in decimal). And like most everything else in this industry (uh, saving... for software), Ethernet networking gear has become much more reliable and far easier to administer than in the olden days.
Here's how it's supposed to work. You plug your server into your nice Ethernet switch using a commodity cable, auto-negotiation occurs, and everything works peachy keen at the fastest speed settings commonly available on both ends of that cable. However, this is not always the case. If both sides don't agree on the settings, each will configure itself as it sees fit, and the usual result is a duplex mismatch: one end running half duplex while the other runs full. To put it bluntly, auto-negotiation fails.
In my experience with the auto-negotiation problem, the link will seem to work fine -- you're able to ping the box, and to log in -- so you go about your business because you assume that your work is done. Then, as soon as you put some real traffic on the line, you'll probably notice that something's awry. Depending on the logic in the NIC device driver and how your switch is configured, throughput can be miserable. Indeed, your 100Mbps full-duplex connection may exhibit performance no better than a fast analog modem.
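To put rough numbers on that gap, here is a back-of-the-envelope sketch in Python. The 40 MB tarball size and the 56kbps modem rate are assumptions picked for illustration, not measurements, and the arithmetic ignores protocol overhead entirely.

```python
def transfer_seconds(size_bytes: int, link_bits_per_second: float) -> float:
    """Ideal time to move size_bytes over a link, ignoring protocol overhead."""
    return size_bytes * 8 / link_bits_per_second


TARBALL_BYTES = 40 * 1000 * 1000  # assumed file size, for illustration only
FAST_ETHERNET_BPS = 100e6         # 100Mbps link
MODEM_BPS = 56e3                  # assumed "fast analog modem" rate

if __name__ == "__main__":
    for label, rate in (("100Mbps Ethernet", FAST_ETHERNET_BPS),
                        ("56kbps modem", MODEM_BPS)):
        print(f"{label}: {transfer_seconds(TARBALL_BYTES, rate):.1f} seconds")
```

Under those assumptions the healthy link needs a few seconds and the modem-grade link needs more than an hour and a half, so a transfer that drags on for minutes is a strong hint that the link is not running the way you think it is.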
In these extreme cases, you should be able to note many collisions or framing errors on your system. Basically, both ends of the link are interrupting each other in a pathologically rude manner. You can typically view this with your system's version of "ifconfig" ("netstat -e" under Windows). Also note that sometimes the problem exists only in one direction, which can make it more difficult to troubleshoot. I typically like to push and pull a tarball of the Linux kernel source back and forth from the box before I'm confident that the link is operating correctly. (Both "scp" and "lftp" do a good job of reporting link throughput for this task.)
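If you'd rather script the check than eyeball "ifconfig" output, the following is a minimal sketch for Linux. It assumes a kernel that exposes per-interface counters under /sys/class/net/<interface>/statistics, and the interface name "eth0" is only a placeholder. The idea is to take one snapshot before the tarball transfer and one after, then see which counters grew.

```python
from pathlib import Path

# Counters that tend to climb when the two ends disagree about duplex.
SUSPECT_COUNTERS = ("collisions", "rx_crc_errors", "rx_frame_errors")


def read_link_counters(interface: str, sysfs_root: str = "/sys/class/net") -> dict:
    """Return the suspect counters for one interface as {name: int}."""
    stats_dir = Path(sysfs_root) / interface / "statistics"
    counters = {}
    for name in SUSPECT_COUNTERS:
        counter_file = stats_dir / name
        if counter_file.is_file():  # skip counters this kernel does not expose
            counters[name] = int(counter_file.read_text().strip())
    return counters


def counters_that_grew(before: dict, after: dict) -> dict:
    """Return how much each counter increased between two snapshots."""
    return {name: after[name] - before[name]
            for name in after
            if name in before and after[name] > before[name]}


if __name__ == "__main__":
    # "eth0" is an assumed interface name; substitute your own.
    print(read_link_counters("eth0"))
```

An empty result from counters_that_grew() after a large transfer is a good sign. Since the problem can exist in only one direction, it is worth taking snapshots around both the push and the pull.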
So, now you might be thinking: "No, that can't happen here." Well, you may be surprised. I've seen it pretty frequently across all types of hardware and operating systems, even in places where folks should know better (like in data centers) and with network gear ranging from little plastic four-port SoHo switches to full-blown "enterprise" concentrators.
Okay, so what to do about it? It's best to be explicit. It would be nice if the computer and switch could sort out their differences, but since they can't, we need to tell them exactly what we want them to do and how to go about it. On the switch, configure the port(s) with fixed speed and duplex settings. On the server or workstation, configure the NIC to match. Here's a list of common tools you can use to do this:
- Linux: mii-tool (and its cousin mii-diag); ethtool
- Solaris: ndd; hmeconfig
- IBM AIX: smit or smitty; entstat (to view framing errors and such)
- Windows: You can right-click on the device and modify the device driver properties.
WARNING: Almost all of these operating systems give you enough rope to hang yourself. For example, say you're on the wrong end of the link (i.e., not on the console), and you disable auto-negotiation and then force the NIC to a setting that's completely incompatible with the switch port. Then, you may find yourself walking -- no, running -- toward the server room to get on the console and undo what you've just done. In other words, you can knock the box right off the network if you try hard enough.
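With that warning in mind, here is a cautious sketch of the Linux case using ethtool. The helper name ethtool_force_command is made up for this example, and "eth0" is an assumed interface name. The script only builds and prints the command, so you can read it over (ideally from the console) before running it yourself as root. It deliberately accepts only 10 and 100 Mbps, since gigabit over copper relies on auto-negotiation.

```python
import shlex


def ethtool_force_command(interface: str, speed_mbps: int, duplex: str) -> list:
    """Build the argv that forces speed and duplex and turns auto-negotiation off."""
    if speed_mbps not in (10, 100):
        raise ValueError("this sketch only handles 10 or 100 Mbps")
    if duplex not in ("half", "full"):
        raise ValueError("duplex must be 'half' or 'full'")
    return ["ethtool", "-s", interface,
            "speed", str(speed_mbps),
            "duplex", duplex,
            "autoneg", "off"]


if __name__ == "__main__":
    # Dry run: print the command instead of executing it.
    print(shlex.join(ethtool_force_command("eth0", 100, "full")))
```

Whatever values you choose here must match what you set on the switch port; forcing only one end of the link tends to recreate the very mismatch you were trying to fix.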
One more tip for Linux admins: Many of the modular device drivers will allow you to specify the parameters at module load time, so you can set these up in your /etc/modules or /etc/modules.conf file. The output of "dmesg" corresponding to the module initialization will often tell you what parameters were negotiated for the link.
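Because "dmesg" wording varies from driver to driver, a script can instead ask the kernel directly what the link ended up with. This sketch assumes a Linux kernel that exposes the "speed" and "duplex" attributes under /sys/class/net/<interface>, which not every kernel or driver does; "eth0" is again a placeholder, and both function names are invented for the example.

```python
from pathlib import Path


def read_link_settings(interface: str, sysfs_root: str = "/sys/class/net") -> dict:
    """Return {'speed': ..., 'duplex': ...} as strings, or None where unreadable."""
    base = Path(sysfs_root) / interface
    settings = {}
    for attribute in ("speed", "duplex"):
        try:
            settings[attribute] = (base / attribute).read_text().strip()
        except OSError:  # attribute missing, or the link is down
            settings[attribute] = None
    return settings


def matches_expected(settings: dict, speed_mbps: int, duplex: str) -> bool:
    """True when the link reports exactly the speed and duplex we configured."""
    return (settings.get("speed") == str(speed_mbps)
            and settings.get("duplex") == duplex)


if __name__ == "__main__":
    current = read_link_settings("eth0")  # assumed interface name
    print(current, "OK" if matches_expected(current, 100, "full") else "CHECK LINK")
```

Keep in mind that this only reports what the server's end believes. The switch port has to be checked separately, since the two ends disagreeing is the whole problem.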
So, check to make sure that things are talking as they should on layer 2 before you spend an hour walking through your Apache configuration with a fine-toothed comb to combat that "slow response time." Also, check it out before you jump to conclusions about needing hardware upgrades because developers are complaining about the speed of the CVS server. Ahem...I'm guilty of both of these mistakes. This is a quick and easy step that should be the first in any server tuning exercise.
Tony Mancill is the author of "Linux Routers: A Primer for Network Administrators" from Prentice Hall PTR. He can be reached at firstname.lastname@example.org.