Tracking performance on your Linux hosts can be a complex exercise that relies on command line tools and manual aggregation of data. Patterns can be hard to discern and correlation requires large amounts of data. Thankfully, there are tools available that make it easier to spot the patterns and make the correlations you need. We're going to look at one of those tools: Munin.
Munin is a data collection and graphing tool with a client-server architecture. Munin allows you to track statistics from your hosts, which it calls "nodes", and send them to a central server where you can then display them as graphs. You can see an example Munin graph showing Disk IO
Munin is quick and easy to install on most Linux distributions and is available as packages. On both Red Hat and Ubuntu/Debian you need the munin, munin-node and munin-common packages (this combination assumes you want to monitor the server too). For example, on Red Hat:
$ sudo yum install munin munin-common munin-node
Munin installs its configuration into the /etc/munin directory. Let's start by configuring our Munin server. The main server configuration file is munin.conf, which controls the server's settings and contains the configuration for each node. In most cases the default settings are acceptable, but there are some options you should be aware of: dbdir, htmldir, logdir and rundir.
The dbdir setting controls where Munin stores the collected statistical data in the form of RRD files. It defaults to /var/lib/munin on both Red Hat and Ubuntu.
The htmldir setting controls where Munin writes its output, the HTML files which display the graphs. This defaults to /var/www/html/munin on Red Hat platforms and /var/cache/munin/www on Ubuntu. This is the directory which we want to serve with a web server, for example Apache. One of the best ways to do this is to create an Apache virtual host that will display our nodes' graphs:
<VirtualHost *:80>
    # Assumption: Apache listens on port 80 on all addresses; adjust to suit.
    ServerAdmin webmaster@localhost
    ServerName munin.example.com
    DocumentRoot /var/www/html/munin
    <Directory />
        Options FollowSymLinks
        AllowOverride None
    </Directory>
    LogLevel notice
    CustomLog /var/log/apache2/munin.access.log combined
    ErrorLog /var/log/apache2/munin.error.log
    ServerSignature On
</VirtualHost>
The logdir and rundir settings control the location of Munin's log and PID files respectively.
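To check which of these four directories are set on a given server, something like the following sketch can help. It assumes the configuration file is at /etc/munin/munin.conf (following the /etc/munin directory mentioned above), and the show_munin_dirs helper name is purely illustrative. Lines that are commented out are shown too, since they usually document the defaults.

```shell
#!/bin/sh
# Print the dbdir, htmldir, logdir and rundir lines from a munin.conf file,
# including commented-out lines (those that start with '#').
# Assumption: the default path below; pass another path as the first argument.
show_munin_dirs() {
    conf_file="${1:-/etc/munin/munin.conf}"
    grep -E '^[[:space:]]*#?[[:space:]]*(dbdir|htmldir|logdir|rundir)[[:space:]]' "$conf_file"
}
```

Call show_munin_dirs with no argument to inspect the default file, or pass the path to a different munin.conf.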
Lastly, we must define any nodes that will be reporting to this server in the munin.conf file using the format:
[hostname.example.com]
    address 10.0.0.1
    use_node_name yes

[hostname2.example.com]
    address 10.0.0.2
    use_node_name yes
Each node is specified by name in square brackets, followed by its IP address. The use_node_name setting controls which name Munin uses when it asks the node for data: set to yes, it uses the host name the node announces when the server connects; set to no, it uses the name in the square brackets. Alternatively, you can use the includedir option to specify a single directory from which Munin will load all files.
I often use this to manage Munin configurations with Puppet exported resources by having each Puppet client create a node in a separate file and having Munin load each node from that file.
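As a rough sketch of the one-file-per-node approach, the helper below writes a node definition into a directory that munin.conf is assumed to load with a line such as "includedir /etc/munin/munin-conf.d". That directory name, the .conf file suffix and the write_node_file helper are assumptions for illustration; use whatever directory your own munin.conf names.

```shell
#!/bin/sh
# Write one node definition per file into a directory that munin.conf
# loads through its includedir option.
# Assumption: munin.conf contains "includedir /etc/munin/munin-conf.d";
# the directory name and this helper are illustrative only.
write_node_file() {
    node_name="$1"      # e.g. hostname.example.com
    node_address="$2"   # e.g. 10.0.0.1
    conf_dir="${3:-/etc/munin/munin-conf.d}"
    mkdir -p "$conf_dir" || return 1
    cat > "$conf_dir/$node_name.conf" <<EOF
[$node_name]
    address $node_address
    use_node_name yes
EOF
}
```

Running write_node_file hostname.example.com 10.0.0.1 as root would create one file for that node; a configuration management tool can manage the same files without this helper.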
Now that the server is configured we need to configure the nodes. Install the munin-node package on each node and then configure the munin-node.conf file in the /etc/munin directory. Most of the settings don't need to be changed (the munin-node.conf documentation has a full list of options), but you will need to change the allow option, which controls which hosts can connect to the node and retrieve statistics. Set it to the IP address of the Munin server.
The IP address must be specified in the form of a Perl regular expression. If you have more than one Munin server you can specify multiple allow lines.
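Because the allow value is a regular expression, the dots in the address should be escaped and the pattern anchored at both ends so that it matches only that address. The sketch below builds such a line from a plain IPv4 address; the allow_line helper name is illustrative, and 10.0.0.1 is the example server address used earlier.

```shell
#!/bin/sh
# Build an allow line for munin-node.conf from a plain IPv4 address.
# Dots are escaped so they match literally, and ^ and $ anchor the
# pattern to the whole address. The helper name is illustrative.
allow_line() {
    escaped=$(printf '%s' "$1" | sed 's/\./\\./g')
    printf 'allow ^%s$\n' "$escaped"
}

allow_line 10.0.0.1    # prints: allow ^10\.0\.0\.1$
```

Add one such line to munin-node.conf for each Munin server that should be able to connect.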
Each Munin node listens on TCP port 4949, and the Munin server connects to that port to collect data, so you'll need to ensure the port is open on your host firewalls and on any intervening firewalls to allow connections from the server to the node. You can change this port using the port option in the munin-node.conf file.
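If your nodes use iptables, a rule along the following lines would admit the Munin server. This is only a sketch: it assumes you append to the INPUT chain, it prints the command for review rather than running it, and the munin_fw_rule helper name is illustrative.

```shell
#!/bin/sh
# Print (but do not run) an iptables command that accepts TCP connections
# from the Munin server to the munin-node port on this node.
# Assumptions: the node uses iptables and rules are appended to INPUT.
munin_fw_rule() {
    server_ip="$1"
    node_port="${2:-4949}"   # 4949 unless changed with the port option
    printf 'iptables -A INPUT -p tcp -s %s --dport %s -j ACCEPT\n' \
        "$server_ip" "$node_port"
}

munin_fw_rule 10.0.0.1
```

Pass a second argument if you have changed the port, then run the printed command as root once you are happy with it.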
In addition to its base configuration, we also need to tell the node what data to collect. Munin uses a modular framework in which plug-ins specify what it will monitor. For example, there are plug-ins for monitoring CPU, load, and memory amongst many others. The plug-ins in use are listed in the /etc/munin/plugins directory in the form of symlinks to the plug-ins themselves. Adding a new plug-in to Munin is simply a matter of symlinking the plug-in file into the /etc/munin/plugins directory. If a plug-in requires any configuration, for example specifying the user the plug-in should execute as, then you'll find a configuration file in the /etc/munin/plugin-conf.d directory.
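The symlinking step can be sketched as a small helper. It assumes the packaged plug-ins live in /usr/share/munin/plugins, which is common but may differ on your distribution, and the enable_plugin name is illustrative.

```shell
#!/bin/sh
# Enable a plug-in by symlinking it into the directory munin-node reads.
# Assumption: packaged plug-ins live in /usr/share/munin/plugins; pass a
# different library directory as the second argument if yours differs.
enable_plugin() {
    plugin="$1"
    lib_dir="${2:-/usr/share/munin/plugins}"
    enabled_dir="${3:-/etc/munin/plugins}"
    if [ ! -f "$lib_dir/$plugin" ]; then
        echo "no such plug-in: $lib_dir/$plugin" >&2
        return 1
    fi
    ln -s "$lib_dir/$plugin" "$enabled_dir/$plugin"
}
```

Run as root, enable_plugin cpu would link the cpu plug-in; restart munin-node afterwards so it picks up the change.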
By default, Munin enables a wide variety of metrics, and you probably won't need to change the basic inclusions initially as they give you a good starting collection of statistics. Munin also comes with a broad collection of useful plug-ins you can enable, and there is a plug-in exchange with a variety of community-contributed plug-ins. Plug-ins are also very simple to develop using the language of your choice.
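To give a feel for how little a plug-in needs, here is a minimal sketch in shell. A plug-in describes its graph when called with the argument config, and prints the current value when called with no argument. The metric (the number of entries in a directory), the WATCH_DIR variable and the "entries" field name are all illustrative assumptions, not part of Munin.

```shell
#!/bin/sh
# A minimal Munin plug-in sketch: graphs the number of entries in a directory.
# WATCH_DIR and the "entries" field name are illustrative assumptions.
WATCH_DIR="${WATCH_DIR:-/tmp}"

munin_plugin() {
    case "${1:-}" in
        config)
            # Called with "config", a plug-in describes its graph.
            echo "graph_title Entries in $WATCH_DIR"
            echo "graph_vlabel entries"
            echo "graph_category disk"
            echo "entries.label entries"
            ;;
        *)
            # Called with no argument, a plug-in prints the current value.
            count=$(ls -A "$WATCH_DIR" | wc -l | tr -d ' ')
            echo "entries.value $count"
            ;;
    esac
}

munin_plugin "$@"
```

Saved as an executable file and symlinked into /etc/munin/plugins, a script like this would be picked up by munin-node in the same way as the bundled plug-ins.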
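Finally, we start Munin on each node (including the server, if you are monitoring it too) by running the munin-node init script. The Munin master has no daemon of its own to start; the packages normally schedule it to run from cron.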
$ sudo /etc/init.d/munin-node start
This will start Munin monitoring the required statistics and the Munin master will periodically query each node for its data and populate the data files on the Munin server. The resulting graphs can then be displayed via a Web server on the Munin master.
And that's it! You can now see graphical representations of the behavior of your hosts and hopefully be able to detect performance trends and issues. If you need more help, you can find more in the Munin documentation. If Munin isn't to your taste, collectd is a similar tool, written in C, that potentially provides better performance and scalability. Unlike Munin, collectd does not directly produce graphs or have a console. It requires additional software to provide this functionality, and a variety of consoles are available.
ABOUT THE AUTHOR: James Turnbull works for the National Australia Bank as the manager of the CERT (Computer Emergency Response Team). He is an experienced infrastructure architect with a background in Linux/Unix, AS/400, Windows, and storage systems. He has been involved in security consulting, infrastructure security design, and SLA and service definition, and has an abiding interest in security metrics and measurement. James is also involved in the Free and Open Source Software community as a developer and contributor. He has authored several articles and books, including Pro Linux System Administration and Pulling Strings with Puppet: Configuration Management Made Easy.
This was first published in July 2010