The trouble with write performance problems is that they often hide behind other symptoms. In all cases, however, the top command is a good place to start. The wa parameter in the CPU line shows the percentage of time that the CPUs have spent waiting for I/O to complete. A high value typically indicates a slow storage channel.
A high value for the wa parameter indicates that the storage channel is struggling
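For readers who want the number behind wa without the top interface, here is a minimal Python sketch. It assumes a Linux system where the first line of /proc/stat holds cumulative CPU time counters, with iowait as the fifth value. The function names are illustrative and not part of any tool mentioned in this article, and top may compute its figure slightly differently, so treat the result as an approximation.

```python
import time


def read_cpu_line(path="/proc/stat"):
    """Return the aggregate 'cpu' line (Linux only; the path is an assumption)."""
    with open(path) as stat_file:
        return stat_file.readline()


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two 'cpu' lines, in percent."""
    # Fields after the label: user nice system idle iowait irq softirq steal.
    # Guest time is already counted in user/nice, so only 8 fields are summed.
    old = [int(value) for value in before.split()[1:9]]
    new = [int(value) for value in after.split()[1:9]]
    deltas = [n - o for o, n in zip(old, new)]
    total = sum(deltas)
    return 100.0 * deltas[4] / total if total else 0.0


def sample_iowait(interval=1.0):
    """Take two snapshots 'interval' seconds apart and return the wa percentage."""
    first = read_cpu_line()
    time.sleep(interval)
    return iowait_percent(first, read_cpu_line())
```

Calling sample_iowait() on a Linux machine while a heavy write job is running should give a value in the same range as the wa figure in top.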
Just looking at top, however, is not enough. Consider a small test in which we write the current memory state to disk, using dd if=/dev/kcore of=/kcore.img bs=4096. On a machine with 4 GB of RAM, that is a lot of work. While the job runs, everything else slows down noticeably, which means there is a performance problem.
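A smaller and safer way to generate a comparable write load is to time a sequential write yourself. The sketch below is an illustration and not the test used for this article: it writes a temporary file in 4096-byte blocks (the same block size as the dd command), forces the data to disk with fsync, and reports the rate in MB/s. The function name and the default size of 64 MB are assumptions chosen for the example.

```python
import os
import tempfile
import time


def measure_write_throughput(directory, total_bytes=64 * 1024 * 1024, block_size=4096):
    """Write total_bytes in block_size chunks, fsync, and return the rate in MB/s."""
    block = b"\0" * block_size
    blocks = total_bytes // block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as out:
            for _ in range(blocks):
                out.write(block)
            out.flush()
            # Push the data to the device instead of leaving it in the page cache.
            os.fsync(out.fileno())
        elapsed = max(time.perf_counter() - start, 1e-9)
    finally:
        os.remove(path)  # clean up the temporary file
    return (blocks * block_size) / elapsed / 1_000_000
```

Run it against a directory on the storage you want to examine, for example measure_write_throughput("/tmp"), and watch top in another terminal while it runs. A single short run gives only a rough figure; caching in the storage layer can still distort it.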
The trouble with top, however, is that it does not make a write performance problem easy to see. How visible the problem is depends on the number of CPU cores you have. On a 16-core server, the write problem may claim all CPU cycles on one core. You would not see that in the default top overview, because it shows the average across all CPUs. With one core completely claimed by writes, the wa parameter in top would show only about 6% (1/16 = 6.25%). To get more detail, the first thing to do is press the 1 key in the top interface, which gives you a line for each CPU core. The test system used for writing this article has only two cores, so the results are not spectacular. On a system with many cores, however, the differences displayed can be significant.
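The averaging effect can be reproduced with a few lines of Python. The sketch below assumes the Linux /proc/stat layout, in which each core has its own line (cpu0, cpu1, and so on) with iowait as the fifth counter. It uses synthetic data for a 16-core machine on which one core waits for I/O during the whole interval; the function name is illustrative.

```python
def per_core_iowait(before, after):
    """Map each core name (cpu0, cpu1, ...) to its iowait percentage."""

    def parse(text):
        cores = {}
        for line in text.splitlines():
            fields = line.split()
            # Skip the aggregate "cpu" line and lines unrelated to CPU time.
            if fields and fields[0].startswith("cpu") and fields[0] != "cpu":
                cores[fields[0]] = [int(value) for value in fields[1:9]]
        return cores

    old, new = parse(before), parse(after)
    result = {}
    for name, counters in new.items():
        if name not in old:
            continue
        deltas = [n - o for o, n in zip(old[name], counters)]
        total = sum(deltas)
        result[name] = 100.0 * deltas[4] / total if total else 0.0
    return result


# Synthetic 16-core example: cpu0 waits on I/O for the whole interval,
# the other 15 cores are idle.
before = "\n".join(f"cpu{i} 0 0 0 0 0 0 0 0" for i in range(16))
after = "\n".join(
    f"cpu{i} 0 0 0 0 100 0 0 0" if i == 0 else f"cpu{i} 0 0 0 100 0 0 0 0"
    for i in range(16)
)
per_core = per_core_iowait(before, after)
average = sum(per_core.values()) / len(per_core)
print(f"cpu0: {per_core['cpu0']:.2f}%  average: {average:.2f}%")
# prints "cpu0: 100.00%  average: 6.25%"
```

The per-core view exposes the saturated core, while the average looks harmless. That is the difference between the default top display and the display you get after pressing 1.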
So, if only one core out of 16 is completely busy waiting for the slow storage channel, then the 15 remaining cores can do the work, right? Too often, the answer is "no." If you have just one storage channel, then all CPUs have to go through that single channel. If one CPU is completely busy waiting for the storage channel, then the other cores won't get prompt responses from it either. So everything may look all right in the top window while performance is in fact badly degraded.
Fortunately, there is iotop, which reports the processes that are most active with regard to I/O. Most Linux distributions don't install it by default, so install it manually using your distribution's package manager (for instance, zypper install iotop if you're using SUSE). The good thing about iotop is that it shows you which process is generating the most I/O at the moment, and how much. If you compare the I/O load caused by this process with the capacity of your storage channel, you'll quickly know whether you have a storage problem and, if so, where it comes from. Then you can troubleshoot the issue and optimize your Linux server's write performance.
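The comparison between a process's I/O load and the channel capacity can be sketched in Python as well. This is not how iotop itself is implemented. It assumes a Linux kernel that exposes per-process counters in /proc/<pid>/io (reading them for other users' processes normally requires root), and the 100 MB/s capacity is a made-up figure that you would replace with a measured value for your own storage. All function names are illustrative.

```python
def parse_proc_io(text):
    """Parse the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if value.strip():
            counters[key.strip()] = int(value)
    return counters


def write_rate_mb_s(before, after, interval_s):
    """Write rate in MB/s between two /proc/<pid>/io snapshots."""
    written = parse_proc_io(after)["write_bytes"] - parse_proc_io(before)["write_bytes"]
    return written / interval_s / 1_000_000


def channel_utilization(rate_mb_s, capacity_mb_s):
    """Share of the storage channel capacity used by this process, in percent."""
    return 100.0 * rate_mb_s / capacity_mb_s


# Two sample snapshots taken one second apart (synthetic values).
snapshot_1 = "rchar: 10\nwchar: 20\nread_bytes: 0\nwrite_bytes: 1000000\n"
snapshot_2 = "rchar: 10\nwchar: 20\nread_bytes: 0\nwrite_bytes: 91000000\n"

rate = write_rate_mb_s(snapshot_1, snapshot_2, interval_s=1.0)
# 100 MB/s is a hypothetical channel capacity, not a measured one.
usage = channel_utilization(rate, capacity_mb_s=100.0)
print(f"{rate:.1f} MB/s written, {usage:.0f}% of the assumed capacity")
```

A process that accounts for most of the channel capacity on its own, as in this synthetic case, is the first place to look when the server feels slow during writes.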
ABOUT THE AUTHOR: Sander van Vugt is an author and independent technical trainer, specializing in Linux since 1994. Vugt is also a technical consultant for high-availability (HA) clustering and performance optimization, as well as an expert on SLED 10 administration.
This was first published in March 2010