Understanding Linux system performance management using top


Sander van Vugt, Contributor
03.09.2009

If there's something wrong with the performance of your Linux server, chances are that you're already using top to find out what's happening. It seems, however, that few people really know how to tell what their system is doing from the information that top provides. Here I will explain how to read that performance data.

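You do not need to be root to run top, although killing or renicing processes that belong to other users from within top does require root privileges. To start, open a console on your favorite Linux distribution and enter the top command. The result should look similar to this: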

The first piece of relevant information that top provides can be found in the first line: the load average parameters. These describe how busy your computer is at the moment. The average workload of your server is always given as three numbers: the load average for the last minute, the last five minutes and the last fifteen minutes. You should always start by interpreting these numbers, as they tell you whether your system is overloaded.
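The same three figures can be read outside of top. The following is a minimal sketch in Python, assuming a Linux host; the function name load_per_core is made up for illustration, and dividing by the core count is simply one way to make the figures comparable between machines of different sizes.

```python
import os


def load_per_core(load_averages, cores):
    """Divide each load average by the number of CPU cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return tuple(round(value / cores, 3) for value in load_averages)


if __name__ == "__main__":
    # os.getloadavg() returns the 1, 5 and 15 minute load averages.
    one, five, fifteen = os.getloadavg()
    cores = os.cpu_count() or 1  # cpu_count() can return None
    print("load averages:", one, five, fifteen)
    print("per core:", load_per_core((one, five, fifteen), cores))
```

A per-core figure that stays above 1.0 suggests that processes are queuing for CPU time.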

To understand the load average values, you must relate them to the number of CPUs or CPU cores in your computer. If you're not sure how many you have, press the 1 key while the top interface is active; this gives you a line for each CPU core that is present in your computer. When a CPU core has been completely busy in the last minute, top will show you 1.00 on a one-core system. If you have eight cores and one has been completely busy while the others were doing nothing, top still shows 1.00, which amounts to only 0.125 per core, because the load average is not divided by the number of cores. In order to interpret the load average, you need to know the value at which your server is fully used, which is 1.00 per core. For instance, on a four-core machine, that would be 4.00. A value that stays above that is bad, as it indicates that queuing occurs and processes are waiting for their slice of system time. Anything below this value is good. If your system is getting beyond the ideal value, the next step is to determine what exactly is happening. Listing 2 gives an example of a one-core system that is too busy:

If the workload is getting too high, you need to find out what is happening. To do this, look at the CPU line(s). You will see no fewer than eight different parameters, of which only three really matter. First is the "us" parameter. This indicates the percentage of time your system is busy handling requests that were made in user space. If a task is not in user space, it is a high-privileged task that runs in system (kernel) space, which you can see reflected in the "sy" parameter. In kernel space, processes can communicate directly with the drivers. Therefore, you should worry more if your system shows a high load in system space. The third important parameter in the CPU line is "wa". This stands for waiting, and indicates the percentage of time your system waits for I/O devices. A high value here indicates a problem on the I/O channel; normally this is a hard disk that is too slow or a misconfigured network.
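To pull these three values out of a captured CPU line, something like the following sketch may help. It is written in Python; the function name parse_cpu_line and the sample line in the usage note are invented for illustration, and the pattern assumes the field layout used by common versions of top ("12.5%us" in older releases, "12.5 us" in newer ones).

```python
import re

# A number, an optional percent sign, then a two-letter field name.
_FIELD = re.compile(r"(\d+(?:\.\d+)?)\s*%?\s*([a-z]{2})\b")


def parse_cpu_line(line):
    """Return the fields of a top CPU line as a dict of floats."""
    return {name: float(value) for value, name in _FIELD.findall(line)}


def key_values(line):
    """Keep only the three fields discussed above: us, sy and wa."""
    fields = parse_cpu_line(line)
    return {name: fields.get(name, 0.0) for name in ("us", "sy", "wa")}
```

For a made-up line such as "Cpu(s):  5.0%us,  2.0%sy,  0.0%ni, 10.0%id, 83.0%wa,  0.0%hi,  0.0%si,  0.0%st", key_values returns the us, sy and wa percentages, and the large wa figure points at I/O as the bottleneck.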

The second listing example shows that the system is far too busy waiting for I/O. This is all too common: many system performance problems are related to slow I/O devices. One solution is to install a faster hard drive, but before doing that, it is a good idea to check the BIOS of your server and see if there are parameters that you can tune. One of the most important candidates is the write cache parameter. By writing data to the write cache before writing it to the disk platters themselves, you can dramatically reduce waiting times. Since the write cache is roughly 1,000 times as fast as the hard disk, chances are that you can gain a lot by enabling this feature.

Use top to reveal memory efficiency on Linux servers
Apart from the information on how busy your system's CPU is, top also shows you how memory-efficient your server is. You can find information about this in the lines that start with Mem: and Swap:. Let's start with swap. This is RAM that is emulated on the hard drive, and ideally your computer should never have to use it. There are some exceptions, though: for instance, if your server runs Oracle, SAP or any other application that is built to use swap. But normally, Linux starts swapping only when it runs out of physical memory. As another exception, your server could pre-allocate some swap so that it can use it faster if it's needed. In most cases, however, you should add RAM to a server that starts swapping.

After you have verified that your system isn't swapping, you should find out what it is doing with available memory. To understand memory, you should know that Linux uses memory quite efficiently. If there's no real need for memory to service processes, it will be used as read cache or write buffers. The read cache contains files that were recently read from your computer's hard drive. The kernel keeps them in RAM because you might need them again, and if you do, it's a lot faster to serve these files from RAM than from the hard disk. The write buffers, on the other hand, are used as a waiting room for your server's hard drive. Instead of offering data directly to the hard drive, the operating system places it in the write buffers, where it can wait until the hard disk has time to flush these buffers (i.e., write them to disk). This also gives you a performance benefit.

The nice thing about read cache and write buffers is that the operating system can make them available almost instantaneously when it needs memory. Therefore, you should add the read cache and write buffers to the total amount of available memory. A nice way of doing this is by using the free -m command. On the -/+ buffers/cache line, you can see how much free memory your computer really has. Newer versions of free drop this line and report a comparable figure in the available column instead.

As you can see in this listing, at first sight it looks as if this server has almost no available memory left, but once you know that buffers and cache can be released immediately, you can see that it has more than enough.
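The addition described above can be sketched in Python. The example assumes the usual /proc/meminfo layout on Linux (a field name, a colon, and a value in kB); the function name reclaimable_free_mb is hypothetical.

```python
def reclaimable_free_mb(meminfo_text):
    """Add free memory, buffers and cache, and return the sum in MB."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])  # values are in kB
    total_kb = fields["MemFree"] + fields["Buffers"] + fields["Cached"]
    return total_kb // 1024


if __name__ == "__main__":
    # Read the live values; this path exists on Linux only.
    with open("/proc/meminfo") as handle:
        print(reclaimable_free_mb(handle.read()), "MB")
```

This mirrors the arithmetic behind the -/+ buffers/cache line. Recent kernels also export a MemAvailable field, which is usually a better estimate because not all cache can be dropped.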

Determining active processes on a Linux server
The last interesting part of top is where it shows the most active processes on your server. This is not hard to determine: the most active process is listed first in the process list. If this process uses too many system resources, top offers some options to handle it. You can terminate it by pressing the k key from the top interface. Top will then ask you which signal you want to send to this process. You should always try signal 15 first; this is the nice way of asking the process to please stop its activity. If that doesn't work, use signal 9, which terminates the process without further waiting.
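The "signal 15 first, signal 9 only if needed" routine can also be scripted. The sketch below is in Python and assumes a process that was started from the same program through subprocess.Popen; the function name stop_process and the five-second grace period are arbitrary choices for illustration. For an unrelated process you would send the signals with os.kill and its PID instead.

```python
import signal
import subprocess


def stop_process(proc, grace_seconds=5.0):
    """Ask politely with SIGTERM (15), fall back to SIGKILL (9)."""
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=grace_seconds)
        return signal.SIGTERM
    except subprocess.TimeoutExpired:
        # The process ignored or did not finish handling signal 15.
        proc.send_signal(signal.SIGKILL)
        proc.wait()
        return signal.SIGKILL
```

The return value tells you which signal was needed, which is worth logging: a process that regularly needs signal 9 gets no chance to clean up after itself.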

Another way of taming a process is by "renicing" it, i.e., adjusting the priority the process is using. To do this, press the r key from the top interface. By giving a process a negative nice value, you increase its priority with regard to other processes. By assigning a positive nice value, you give more room to the other processes. The values you can use are between -20 and 19. It's a good idea not to assign the value of -20. By doing this, you would give the highest possible priority to a process, thus allowing it to leave no time for the other processes (if it is a busy process).
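Outside of top, the same adjustment can be made from a script. This is a small sketch in Python that uses os.setpriority, which is available on Unix-like systems; the helper names validate_nice and renice are invented here, and the range check simply encodes the -20 to 19 limits mentioned above.

```python
import os

NICE_MIN, NICE_MAX = -20, 19


def validate_nice(value):
    """Return value unchanged if it is a legal nice value."""
    if not NICE_MIN <= value <= NICE_MAX:
        raise ValueError(f"nice value must be between {NICE_MIN} and {NICE_MAX}")
    return value


def renice(pid, value):
    """Set the nice value of a process and return the resulting value."""
    os.setpriority(os.PRIO_PROCESS, pid, validate_nice(value))
    return os.getpriority(os.PRIO_PROCESS, pid)
```

Calling renice(pid, 10) makes the process with that PID more polite towards others. Lowering the nice value normally requires root privileges, so a call with a negative value may raise PermissionError for ordinary users.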

ABOUT THE AUTHOR: Sander van Vugt is an author and independent technical trainer, specializing in Linux since 1994. Vugt is also a technical consultant for high-availability (HA) clustering and performance optimization, as well as an expert on SLED 10 administration.
