As a Linux administrator, you should already know the basics of performance tuning. You probably recognize classic Unix/Linux monitoring commands such as vmstat and top, while system profilers such as OProfile and the GNOME System Monitor may be new to you. If you were born on Planet Unix or are rather new to Linux, you might be surprised to find that typing in iostat or sar gets you nothing, unless you download the sysstat suite of monitoring tools, which includes sar, iostat and mpstat.
Performance tuning is not only about running some commands; it is about proactively monitoring your system, particularly when there are no performance problems. Here we cover Linux performance tuning methodology and provide detailed steps to guide you through your tuning lifecycle. We'll introduce some of the monitoring tools you should use, provide an overview of performance tuning, and discuss considerations that can impact overall performance.
When investigating a performance problem, start by monitoring the CPU utilization statistics. It is important to observe system performance continuously, because you need to compare loaded system data against normal usage data (i.e. the baseline). The CPU is one of the fastest components of the system, and if it stays 80-90% busy, overall system-wide performance will suffer. To improve on this, follow a careful system tuning methodology.
Linux performance tuning methodology
Baseline Before doing anything, you must establish a baseline. The baseline will serve as a reference point for your analysis, prior to any tuning of your system. The baseline itself should combine system monitoring reports, as well as a snapshot of the system hardware. When monitoring your system, try to use a minimum of two tools. This way, you'll be able to validate your analysis using multiple data sets. Also, try to establish a Service Level Agreement (SLA) with your functional and business teams, which defines acceptable levels of performance for your system.
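The baseline capture can be scripted so the same data is collected each time. Here is a minimal sketch; the output path, sample interval and the particular commands collected are illustrative choices, not requirements.

```shell
#!/bin/sh
# Capture a baseline snapshot into a timestamped file for later comparison.
# The output path and the set of commands collected are illustrative choices.
OUT=/tmp/baseline.$(date +%Y%m%d-%H%M%S).txt
{
    echo "=== uname ===";   uname -a
    echo "=== cpuinfo ==="; cat /proc/cpuinfo
    echo "=== meminfo ==="; cat /proc/meminfo
    echo "=== disk ===";    df -k
    echo "=== vmstat ==="
    # vmstat may be absent on minimal installs, so guard the call
    if command -v vmstat >/dev/null 2>&1; then vmstat 1 2; fi
} > "$OUT" 2>&1
echo "baseline saved to $OUT"
```

Run it during normal load to establish the reference point, and again during the stress test to get your before/after comparison.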
Stress test/monitor Here is where you stress your system. You will be monitoring the system at a peak workload, prior to making any changes to your system. By doing this, you will be able to have before and after snapshots of your system when you have completed tuning your environment.
Identification of bottleneck At this stage you will analyze the data you have collected to determine the location of any potential bottleneck. In doing so, you will make certain that you use several monitoring tools so that you have conclusive, corroborating evidence.
Tuning It is time to fine tune your system for optimal performance. You will find that there are many different ways to tune a system, depending on the nature of the bottleneck (whether it is CPU, Memory, I/O or Network related.) It is very important that you make only one change at a time in order to determine which operation successfully resolved the problem. Often, tuning up a system involves more common sense than it does, say, kernel tinkering. For example, if you are running several demanding workloads at the same time each day, you may determine that scheduling them off-hours or during a batch cycle is the best solution.
Repeat Repeat all steps, starting with the stress test. Occasionally, you will find that by fixing one bottleneck you have created another. I have seen this many times, especially with CPU problems. By tuning your CPU you allow the system to work harder. In doing so, you may have inadvertently created a memory or I/O problem. This is not a bad thing, as you are now getting closer to resolving some of your performance woes.
One important point that cannot be overstated is that you should always be monitoring your system. Performance tuning is a proactive process, which too often occurs in reactive mode only (i.e. when users start to scream and holler.)
First we will take a snapshot of our system, to determine exactly what we are dealing with. Viewing the redhat-release file and running the uname and runlevel commands gives us a nice preliminary view. The RPM that we will use is sysstat-4.0.7-4.rhl9.1.i386.rpm.
[root@172_29_137_29 system]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5 (Tikanga)
[root@172_29_137_29 system]# uname -a
Linux 172_29_137_29.dal-ebis.ihost.com 2.6.18-8.el5 #1 SMP Fri Jan 26 14:15:14 EST 2007 x86_64 x86_64 x86_64 GNU/Linux
[root@172_29_137_29 system]# runlevel
N 5
With these commands, we find out that we are using RHEL5 on a 2.6 kernel in runlevel 5. Another useful command is dmesg; because it gives you so much information (too much to show here), you should always save it to a file.
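A simple way to capture it, for example (the path is an arbitrary choice, and dmesg may be restricted to root on some systems):

```shell
#!/bin/sh
# Save the kernel ring buffer to a dated file; the path is an arbitrary choice.
OUT=/tmp/dmesg.$(date +%Y%m%d).out
if ! dmesg > "$OUT" 2>/dev/null; then
    # some systems restrict dmesg to root
    echo "dmesg not readable by this user" > "$OUT"
fi
echo "kernel messages saved to $OUT"
```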
The proc directory is a place where one can gather all sorts of information directly from the kernel through the proc filesystem. Let's look at CPU and memory information.
# cd /proc
[root@172_29_137_29 proc]# more cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      : Intel(R) Xeon(TM) CPU 3.60GHz
stepping        : 1
cpu MHz         : 3600.259
cache size      : 1024 KB
[root@172_29_137_29 proc]# more meminfo
MemTotal:      8180780 kB
MemFree:       6561876 kB
Buffers:        131400 kB
Cached:        1270592 kB
SwapCached:          0 kB
Active:         881748 kB
Inactive:       589724 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      8180780 kB
LowFree:       6561876 kB
SwapTotal:     1020116 kB
SwapFree:      1020116 kB
Dirty:             848 kB
Writeback:           0 kB
AnonPages:       69424 kB
Mapped:          30836 kB
Slab:           117028 kB
PageTables:       4976 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:   5110504 kB
Committed_AS:   122244 kB
VmallocTotal: 34359738367 kB
VmallocUsed:      5040 kB
VmallocChunk: 34359733223 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB
For disk, we'll use the traditional df command:
[root@172_29_137_29 /]# df -k
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda2             18145120   3708228  13515164  22% /
/dev/sda1               101086     10503     85364  11% /boot
tmpfs                  4090388         0   4090388   0% /dev/shm
192.168.1.12:/stage/middleware/LINUX
                      77529088  62420104  15108984  81% /stage/middleware
192.168.1.12:/userdata/20004773
                      10485760     85952  10399808   1% /home/u0004773
The lsmod command will give you the list of installed kernel modules.
[root@172_29_137_29 /]# lsmod | more
Module                  Size  Used by
oprofile              139953  1
e1000                 156497  0
autofs4                56393  2
hidp                   83649  2
nfs                   286093  2
lockd                  96209  2 nfs
Perhaps my favorite snapshot utility is sysreport. It is perfect for archiving (it dumps the output into a tar archive) and/or sending off data to engineers for further analysis.
[root@172_29_137_29 sys]# sysreport -norpm

This utility will go through and collect some detailed information about the hardware and setup of your Red Hat Linux system. This information will be used to diagnose problems with your system and will be considered confidential information. Red Hat will use this information for diagnostic purposes ONLY.

Determining Red Hat Linux version:                      [ OK ]
Determinding your current hostname:                     [ OK ]
Getting the date:                                       [ OK ]
Checking your systems current uptime and load average:  [ OK ]
Checking available memory:                              [ OK ]
Checking free disk space:                               [ OK ]
Getting information about RHN
Gathering information on SELinux setup
Collecting log files from RHN                           [ OK ]
Please enter your case number (if you have one): 123
Please send /root/172_29_137_29.dal-ebis.ihost.com-123.20070930214831.tar.bz2 to your support representative.
My favorite monitoring command is vmstat, an old-school Unix command-line utility. I love it because it is a quick, easy way of looking at the overall health of your system (RAM, I/O and CPU).
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd    free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 6564232 130912 1270228   0    0     1     1   22   10  0  0 100  0  0
 0  0      0 6564232 130912 1270228   0    0     0     0 1204 1827  1  1 99  0  0
 0  0      0 6564232 130912 1270228   0    0     0     0 1161 1602  0  1 99  0  0
 0  0      0 6564232 130912 1270228   0    0     0   900 1161 2136  1  1 98  0  0
 0  0      0 6564108 130912 1270228   0    0     0  1592 1138 1902  1  1 97  2  0
 0  0      0 6564108 130912 1270228   0    0     0     0 1169 2322  1  1 98  0  0
 0  0      0 6563984 130912 1270228   0    0     0     0 1120 1896  1  1 99  0  0
 0  0      0 6563984 130912 1270228   0    0     0     0 1074  868  0  0 99  0  0
 0  0      0 6563984 130912 1270228   0    0     0    24 1179 2200  1  1 99  0  0
 0  0      0 6563984 130912 1270228   0    0     0  1340 1141 2436  0  1 97  1  0
Other useful command line utilities are mpstat, sar and iostat. Mpstat and sar are used for CPU information, while iostat is used for I/O. Free is a nice utility which displays the total amount of physical RAM and swap on the system. It tells us what is used, free and shared.
[root@172_29_137_29 dev]# free
             total       used       free     shared    buffers     cached
Mem:       8180780    1616108    6564672          0     130880    1270132
-/+ buffers/cache:     215096    7965684
Swap:      1020116          0    1020116
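Where sysstat is installed, mpstat, iostat and sar all take an interval and a sample count. A guarded sweep of the three might look like this; the 1-second interval and two samples are arbitrary examples:

```shell
#!/bin/sh
# Run the three sysstat reporters with a 1-second interval, 2 samples each.
# Guarded so the script still completes where sysstat is not installed.
CHECKED=0
for tool in mpstat iostat sar; do
    CHECKED=$((CHECKED + 1))
    if command -v "$tool" >/dev/null 2>&1; then
        case "$tool" in
            mpstat) mpstat -P ALL 1 2 ;;  # per-processor utilization
            iostat) iostat -x 1 2 ;;      # extended per-device I/O statistics
            sar)    sar -u 1 2 ;;         # overall CPU utilization
        esac
    else
        echo "$tool not installed (part of the sysstat package)"
    fi
done
```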
A general-purpose process monitoring tool is OProfile. It uses the processor's performance monitoring hardware to collect kernel-level profiling information about the executables on a system. To start OProfile:
[root@172_29_137_29 ~]# opcontrol --start Using default event: GLOBAL_POWER_EVENTS:100000:1:1:1 Using 2.6+ OProfile kernel interface. Using log file /var/lib/oprofile/oprofiled.log Daemon started. Profiler running. [root@172_29_137_29 ~]#
Then we dump the collected profiling data:

[root@172_29_137_29 oprofile]# opcontrol --dump
Then we run the report:
[root@172_29_137_29 oprofile]# opreport
CPU: P4 / Xeon with 2 hyper-threads, speed 3600.26 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000
GLOBAL_POWER_E...|
  samples|      %|
------------------
   130995 96.3787 no-vmlinux
     2473  1.8195 libc-2.5.so
      516  0.3796 oprofiled
      510  0.3752 bash
      260  0.1913 radeon_drv.so
      199  0.1464 ld-2.5.so
      155  0.1140 libcrypto.so.0.9.8b
      145  0.1067 sshd
      119  0.0876 libtermcap.so.2.0.8
      112  0.0824 libpython2.4.so.1.0.#prelink#.sdY6Ku (deleted)
       92  0.0677 libpthread-2.5.so
       73  0.0537 irqbalance
       60  0.0441 sendmail.sendmail
       56  0.0412 libusb-0.1.so.4.4.4.#prelink#.JepXp7 (deleted)
       40  0.0294 ls
       18  0.0132 pcscd
        9  0.0066 libxaa.so
        8  0.0059 libevent-1.1a.so.1.0.2
        7  0.0052 selectmodule.so
        6  0.0044 grep
        6  0.0044 libglib-2.0.so.0.1200.3.#prelink#.mgeaha (deleted)
        6  0.0044 init
        6  0.0044 gpm
        5  0.0037 gawk
        5  0.0037 libselinux.so.1
        5  0.0037 automount
        4  0.0029 libnss_files-2.5.so
        4  0.0029 crond
        3  0.0022 gconfd-2
        2  0.0015 more
        2  0.0015 libacl.so.1.1.0
        2  0.0015 librt-2.5.so
        2  0.0015 libsepol.so.1
        2  0.0015 syslogd
        2  0.0015 Xorg
        2  0.0015 libgconf-2.so.4.1.0.#prelink#.bhvs1R (deleted)
        1 7.4e-04 date
        1 7.4e-04 libpcre.so.0.0.1
        1 7.4e-04 expr
        1 7.4e-04 ophelp
        1 7.4e-04 tr
Though it is true that some of this information can be ascertained using ps commands, I like using this because of its reporting capabilities and the extent of the data that it keeps.
No monitoring article would be complete without mention of top. Top is a nice character-based tool that displays CPU, memory and process information. I like this tool when I'm doing performance monitoring over a given period of time and I need to look at many different facets of the system on one screen.
Figure 1 – top
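When you want top's view over a period of time rather than interactively, its batch mode is the scriptable route. A small sketch; trimming to the first 15 lines is just an example choice:

```shell
#!/bin/sh
# top -b (batch mode) prints to stdout instead of redrawing the screen,
# which makes it suitable for logging snapshots over a period of time.
if command -v top >/dev/null 2>&1; then
    SNAP=$(top -b -n 1 | head -15)   # one iteration, summary lines only
else
    SNAP="top not installed"
fi
printf '%s\n' "$SNAP"
```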
Finally, for those who prefer a graphical interface, the GNOME System Monitor is your tool. After getting X running, type in: gnome-system-monitor
Figure 2 – Resources view
There are three tabs for processes, filesystems and resources.
Next, we'll fine-tune the Virtual Memory Manager (VMM) of Linux. The VMM manages the allocation of both RAM and virtual pages. Given the choice between RAM and paging space, the preference is to use physical memory, if the RAM is available. We will be working with sysctl, which is used to configure kernel parameters.
Let's pull out only the memory tunables by looking for tunables with the vm prefix:
[root@172_29_137_29 sys]# sysctl -a | grep vm | more
vm.min_slab_ratio = 5
vm.min_unmapped_ratio = 1
vm.zone_reclaim_mode = 0
vm.swap_token_timeout = 300 0
vm.legacy_va_layout = 0
vm.vfs_cache_pressure = 100
vm.block_dump = 0
vm.laptop_mode = 0
vm.max_map_count = 65536
vm.percpu_pagelist_fraction = 0
vm.min_free_kbytes = 11495
vm.drop_caches = 0
vm.lowmem_reserve_ratio = 256 256 32
vm.hugetlb_shm_group = 0
vm.nr_hugepages = 0
vm.swappiness = 60
vm.nr_pdflush_threads = 2
vm.dirty_expire_centisecs = 2999
vm.dirty_writeback_centisecs = 499
vm.dirty_ratio = 40
vm.dirty_background_ratio = 10
vm.page-cluster = 3
vm.overcommit_ratio = 50
vm.panic_on_oom = 0
vm.overcommit_memory = 0
Let's pick one of them: vm.swappiness, which controls how aggressively the kernel swaps process memory out to swap space. The lower you set it, the more Linux will prefer to keep processes in physical RAM rather than swap them out. You should lower it if you have plenty of RAM but are seeing too much paging. The default is 60, so we're going to change this to 20.
[root@172_29_137_29 sys]# sysctl -w vm.swappiness="20" vm.swappiness = 20 [root@172_29_137_29 sys]#
You can make your changes permanent by editing /etc/sysctl.conf.
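A sketch of persisting the change follows. It works against a scratch file here; on a real system you would append the same line to /etc/sysctl.conf itself and run sysctl -p as root:

```shell
#!/bin/sh
# Persist a kernel tunable: add it to a sysctl configuration file, then reload.
# A scratch copy is used here; on a real system, edit /etc/sysctl.conf directly.
CONF=/tmp/sysctl.conf.demo
echo "vm.swappiness = 20" >> "$CONF"
# `sysctl -p <file>` loads settings from the named file (root required to apply)
if command -v sysctl >/dev/null 2>&1; then
    sysctl -p "$CONF" 2>/dev/null
fi
grep "vm.swappiness" "$CONF"
```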
Another tip I highly recommend is to run chkconfig. This utility shows which services are configured to start in each runlevel:
[root@172_29_137_29 etc]# chkconfig --list
NetworkManager           0:off 1:off 2:off 3:off 4:off 5:off 6:off
NetworkManagerDispatcher 0:off 1:off 2:off 3:off 4:off 5:off 6:off
xinetd based services:
        tcpmux-server:  off
        time-dgram:     off
        time-stream:    off
[root@172_29_137_29 etc]#
Turn off any services that you are not using. You will find a lot more RAM available, as well as some additional CPU clock cycles.
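A hedged sketch of that cleanup; the service names below are examples only, so review your own chkconfig --list output before disabling anything:

```shell
#!/bin/sh
# Disable example services at boot and stop them now (SysV-era tooling).
# The names are illustrative; check what your system actually runs first.
PROCESSED=0
for svc in sendmail gpm pcscd; do
    PROCESSED=$((PROCESSED + 1))
    if command -v chkconfig >/dev/null 2>&1; then
        chkconfig "$svc" off 2>/dev/null   # remove from boot runlevels
    fi
    if command -v service >/dev/null 2>&1; then
        service "$svc" stop 2>/dev/null    # stop it in the current session
    fi
done
echo "processed $PROCESSED services"
```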
And that is Red Hat Enterprise Linux 5 performance tuning, for now. There is a lot more to performance tuning than we can detail in one small article, but we covered the important parts: methodology, system snapshots, monitoring commands and basic tuning concepts. We defined the Virtual Memory Manager, looked at some kernel parameters and made tuning changes. I highly recommend that you maintain several environments for your critical systems, such as development, QA and production. System changes should never be made in a production environment without first being tested on a backed-up system. Tuning methodology is important. Always adhere to one.
About the author: Ken Milberg is a systems consultant with two decades of experience working with Unix and Linux systems. He is a SearchEnterpriseLinux.com Ask the Experts advisor and columnist.
This was first published in October 2007