Network monitoring with Nagios, part two

Welcome to the world of Nagios, an open source network monitoring tool. Besides being free, powerful and flexible, it can save IT managers a lot of time by automating network monitoring.

In this section of my introduction to Nagios, we'll look at an example of a Nagios configuration. In part one, I discussed the usefulness and architecture of Nagios.

Nagios Configuration

As the previous paragraph implies, configuration plays a large role in successful Nagios operation. The configuration mechanics are conceptually quite straightforward, but they require attention to detail. Essentially, a hierarchy of hosts and services is defined, with options that specify which check should be run and what should be done after a failed check.

Here is an example of a host configuration file entry:

define host{
    host_name               linux-server
    alias                   linux-server
    address                 192.168.1.254
    check_command           check-host-alive
    max_check_attempts      5
    contact_groups          linux-admins
    notification_interval   30
    notification_period     24x7
    notification_options    d,u,r
    }

Most of the entries are self-explanatory. The machine has a name, an address, a check that should be run (check-host-alive), and a maximum number of checks that should be performed before concluding a problem exists. If there is a problem, the group linux-admins is notified of the state changes listed in notification_options (d for down, u for unreachable, r for recovery), at all hours of the day or night (24x7), with a repeat notification every 30 minutes while the problem persists. So for this resource, the machine itself is checked to see if it is up and running.
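Both check-host-alive and 24x7 are only names; each must be defined elsewhere in the configuration. As a rough sketch of what those definitions might look like, the following is modeled on the sample files that ship with Nagios. The ping thresholds and packet count are illustrative assumptions, and $USER1$ is assumed to point at the plugin directory, as it conventionally does in resource.cfg:

```
# Command referenced by check_command in the host entry.
# Thresholds are assumptions: warn at 3000 ms / 80% loss,
# critical at 5000 ms / 100% loss, one packet per check.
define command{
    command_name    check-host-alive
    command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 1
    }

# Time period referenced by notification_period.
define timeperiod{
    timeperiod_name 24x7
    alias           24 Hours A Day, 7 Days A Week
    sunday          00:00-24:00
    monday          00:00-24:00
    tuesday         00:00-24:00
    wednesday       00:00-24:00
    thursday        00:00-24:00
    friday          00:00-24:00
    saturday        00:00-24:00
    }
```

Nagios substitutes the host's address value for $HOSTADDRESS$ when it runs the check, so one command definition can serve every host.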

Here is an example of a service configuration file entry:

define service{
    host_name               linux-server
    service_description     check-disk-sda1
    check_command           check-disk!/dev/sda1
    max_check_attempts      5
    normal_check_interval   5
    retry_check_interval    3
    check_period            24x7
    notification_interval   30
    notification_period     24x7
    notification_options    w,c,r
    contact_groups          linux-admins
    }

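Again, most of the entries are easy to understand. This service runs on the host defined in the previous example. (Services must have an entry for the server they reside on.) The entry supplies a service description, the command used to check the service, a maximum number of checks, and so on. In the check_command line, the exclamation point separates the command name (check-disk) from the argument passed to it (/dev/sda1). The notification options here are w, c and r: warning, critical and recovery.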
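The check-disk command itself has to be defined before the service entry can use it. A minimal sketch, assuming the standard check_disk plugin is installed in the directory that $USER1$ points to; the 20% and 10% free-space thresholds are illustrative assumptions, not values from the example above:

```
# $ARG1$ receives whatever follows the "!" in check_command,
# which is /dev/sda1 in the service entry.
define command{
    command_name    check-disk
    command_line    $USER1$/check_disk -w 20% -c 10% -p $ARG1$
    }
```

With this definition, the plugin would report a warning when free space on the partition drops below 20% and a critical state below 10%, which is what the w and c notification options react to.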

An obvious question is, "Now that I'm monitoring all of my hardware and software, how do I find out what's going on?" In addition to the problem notification mechanism listed in each configuration entry ("notification_options"), Nagios provides a number of prewritten CGI scripts that provide monitoring information; in essence a system status dashboard. These scripts provide listings of overall system status, network problems, trends, and so on. Between the dashboard information and the notifications, Nagios enables you to take a more proactive approach to managing your IT infrastructure.
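Who actually receives a notification is determined by the contact group named in each entry. A minimal sketch of how linux-admins might be defined follows. The contact jdoe and the e-mail address are hypothetical, and the notify-by-email and host-notify-by-email command names are borrowed from the Nagios sample configuration and would need to be defined as commands themselves:

```
define contactgroup{
    contactgroup_name   linux-admins
    alias               Linux Administrators
    members             jdoe            ; hypothetical contact
    }

define contact{
    contact_name                    jdoe
    alias                           John Doe
    service_notification_period     24x7
    host_notification_period        24x7
    service_notification_options    w,u,c,r
    host_notification_options       d,u,r
    service_notification_commands   notify-by-email
    host_notification_commands      host-notify-by-email
    email                           jdoe@example.com
    }
```

A notification goes out only if the options on the host or service entry and the options on the contact both allow it, so the two sets are worth reviewing together.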

Nagios Recommendations

Like all network management tools, Nagios is fairly complex to set up and requires ongoing tuning to ensure that the level of information provided is right -- neither too much detail nor too little information. Here are some recommendations about how to get the best use of your Nagios implementation:

  • Begin by planning what you need to keep track of, prioritized by most important resources first.
  • Work incrementally, first getting those most-important resources under management before moving on to less-important resources. For example, in most organizations e-mail is more important than FTP availability, so begin by putting e-mail under Nagios management. Working incrementally can reduce the burden of implementing a network management system.
  • Plan on regular reviews of the type and level of information you're getting, especially in the first few months. The purpose of these reviews is to get the system configured properly so that you can then use it on an ongoing, relatively easy basis.
  • Take advantage of the Nagios community. A large number of sample configurations, dashboard extensions, and custom plugins are available, which can make it easier to get your Nagios implementation up and running.
  • Document your configuration. Comment your configuration files so it's clear what resources you're managing and what plugins you have running. Also, write up some external documentation on your Nagios implementation so that someone can pick it up later and get a good overview of how your management scheme works.
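To illustrate the incremental and documentation recommendations together, here is a sketch of a commented entry that puts e-mail under management first. The host mail-server, the contact group mail-admins, and the check_smtp command name are all hypothetical. The host and contact group are assumed to be defined elsewhere, and the command is assumed to wrap the standard check_smtp plugin:

```
# Priority 1: e-mail availability.
# Checks that the SMTP service on mail-server accepts connections.
# Owner: mail-admins. Review thresholds after the first month.
define service{
    host_name               mail-server     ; hypothetical host
    service_description     SMTP
    check_command           check_smtp      ; assumed to be defined elsewhere
    max_check_attempts      3
    normal_check_interval   5
    retry_check_interval    1
    check_period            24x7
    notification_interval   30
    notification_period     24x7
    notification_options    w,c,r
    contact_groups          mail-admins     ; hypothetical group
    }
```

Lines beginning with # are comments, and a semicolon starts a comment at the end of a line, so notes like these can live in the configuration file itself.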

Nagios is very powerful and can make your life much easier after it's up and running. It's significantly less expensive than the commercial alternatives. And, best of all, because it's open source, it offers the ability to take advantage of the entire community's work with Nagios by sharing plugins and experience.


Bernard Golden is CEO of Navica Inc., a systems integration firm specializing in open source software. He writes a column for SearchEnterpriseLinux.com called Golden's Rules and answers your questions about open source software issues.

This was first published in April 2005
