13.3 Organizing Nagios’ Configuration Files Sanely


This is what the default /usr/local/nagios/etc directory looks like after following the previous recipes:

$ cd /usr/local/nagios/
$ tree etc
etc
|-- cgi.cfg-sample
|-- commands.cfg-sample
|-- htpasswd.users
|-- localhost.cfg-sample
|-- nagios.cfg-sample
`-- resource.cfg-sample

|-- bigger.cfg-sample
|-- cgi.cfg-sample
|-- commands.cfg-sample
|-- minimal.cfg-sample
|-- misccommands.cfg-sample
|-- nagios.cfg-sample
`-- resource.cfg-sample



I like to organize them like this:

$ tree --dirsfirst etc
etc
|-- lan_objects
|   |-- commands.cfg
|   |-- contacts.cfg
|   |-- hosts.cfg
|   |-- services.cfg
|   `-- timeperiods.cfg
|-- sample
|   |-- cgi.cfg-sample
|   |-- commands.cfg-sample
|   |-- localhost.cfg-sample
|   |-- nagios.cfg-sample
|   `-- resource.cfg-sample
|-- cgi.cfg
|-- htpasswd.users
|-- nagios.cfg
`-- resource.cfg



How do all those files get there? First, move all the sample files into the sample/ directory. Then, enter the sample/ directory and copy these files into etc/ and lan_objects/:

$ cd etc
# mkdir lan_objects
# mkdir sample
# mv *-sample sample
# cd sample
# cp cgi.cfg-sample ../cgi.cfg
# cp resource.cfg-sample ../resource.cfg
# cp commands.cfg-sample ../lan_objects/commands.cfg
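
The same steps can be wrapped in a small shell function, which makes it easy to rehearse them on a scratch copy before touching the real directory. This is only a sketch: the name organize_nagios_etc is invented for illustration, and it assumes every sample file ends in -sample, as in the listing above.

```shell
# organize_nagios_etc DIR
# Hypothetical helper (not part of Nagios): rearranges DIR the way the
# commands above do. DIR defaults to /usr/local/nagios/etc.
# The body runs in a subshell, so the caller's working directory is unchanged.
organize_nagios_etc() (
    dir=${1:-/usr/local/nagios/etc}
    cd "$dir" || exit 1
    mkdir -p lan_objects sample || exit 1
    # Move every *-sample file into sample/; the directory named "sample"
    # itself does not match this glob.
    mv ./*-sample sample/ || exit 1
    cp sample/cgi.cfg-sample cgi.cfg &&
    cp sample/resource.cfg-sample resource.cfg &&
    cp sample/commands.cfg-sample lan_objects/commands.cfg
)
```

To try it safely, copy the directory first (cp -a /usr/local/nagios/etc /tmp/etc-test) and run organize_nagios_etc /tmp/etc-test.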






The rest will be created as we need them in the next few recipes.

See the next recipe to learn how to configure Nagios to use your nice new directory organization, and to get started monitoring the local system.



Discussion

All Nagios configuration files must end in .cfg.

You are perfectly welcome to use a graphical file manager to shuffle everything around. It’s easier and faster.

cgi.cfg, nagios.cfg, and resource.cfg are the primary Nagios configuration files, so they don’t go with the others. htpasswd.users must be in the same directory as nagios.cfg.

The files in the lan_objects/ directory are called object files. A Nagios object is a single unit, such as a host, a command, a service, a contact, or one of the groups they belong to. These objects are inheritable and reusable, which simplifies administration.
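
Since every configuration file must end in .cfg, it can be worth checking that nothing in lan_objects/ has slipped in under another name, such as a backup copy called hosts.cfg.bak. A minimal sketch using find; the function name list_non_cfg is made up for this example.

```shell
# list_non_cfg DIR
# Hypothetical helper: prints every regular file under DIR whose name does
# not end in .cfg, one per line. Prints nothing when all names are fine.
# DIR defaults to the lan_objects/ directory used in this chapter.
list_non_cfg() {
    find "${1:-/usr/local/nagios/etc/lan_objects}" -type f ! -name '*.cfg'
}
```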



See Also

• man 1 tree

• man 1 cp



13.4 Configuring Nagios to Monitor Localhost

Problem

You’ve successfully installed Nagios, configured Apache, and set up your configuration files in an orderly manner as outlined in the previous recipe. Reading the local Nagios documentation at http://localhost/nagios is nice, but you really want to get going on setting up Nagios to keep an untiring eye on your network. What’s the next step?



Solution

Nagios is best set up in small steps, so we’ll start with monitoring five basic functions on the Nagios server: ping, disk usage, local users, total processes, and CPU load. This is a long recipe, but when you’re finished, you’ll have your basic Nagios framework constructed.

Copy the following five configuration files exactly as shown, except where it says to use your own information, and put them in the directories as outlined in the previous recipe:

• /usr/local/nagios/etc/nagios.cfg

• /usr/local/nagios/etc/lan_objects/timeperiods.cfg

• /usr/local/nagios/etc/lan_objects/contacts.cfg






Chapter 13: Network Monitoring with Nagios



• /usr/local/nagios/etc/lan_objects/hosts.cfg

• /usr/local/nagios/etc/lan_objects/services.cfg

Obviously, retyping all this is the path to madness, so please visit http://www.oreilly.com/catalog/9780596102487 to download them.

First, create nagios.cfg:

################

# nagios.cfg

# main Nagios configuration file

################

log_file=/usr/local/nagios/var/nagios.log

cfg_dir=/usr/local/nagios/etc/lan_objects

object_cache_file=/usr/local/nagios/var/objects.cache

resource_file=/usr/local/nagios/etc/resource.cfg

status_file=/usr/local/nagios/var/status.dat

nagios_user=nagios

nagios_group=nagios

check_external_commands=1

command_check_interval=-1

command_file=/usr/local/nagios/var/rw/nagios.cmd

comment_file=/usr/local/nagios/var/comments.dat

downtime_file=/usr/local/nagios/var/downtime.dat

lock_file=/usr/local/nagios/var/nagios.lock

temp_file=/usr/local/nagios/var/nagios.tmp

event_broker_options=-1

log_rotation_method=d

log_archive_path=/usr/local/nagios/var/archives

use_syslog=1

log_notifications=1

log_service_retries=1

log_host_retries=1

log_event_handlers=1

log_initial_states=0

log_external_commands=1

log_passive_checks=1

service_inter_check_delay_method=s

max_service_check_spread=30

service_interleave_factor=s

host_inter_check_delay_method=s

max_host_check_spread=30

max_concurrent_checks=0

service_reaper_frequency=10

auto_reschedule_checks=0

auto_rescheduling_interval=30

auto_rescheduling_window=180






sleep_time=0.25

service_check_timeout=60

host_check_timeout=30

event_handler_timeout=30

notification_timeout=30

ocsp_timeout=5

perfdata_timeout=5

retain_state_information=1

state_retention_file=/usr/local/nagios/var/retention.dat

retention_update_interval=60

use_retained_program_state=1

use_retained_scheduling_info=0

interval_length=60

use_aggressive_host_checking=0

execute_service_checks=1

accept_passive_service_checks=1

execute_host_checks=1

accept_passive_host_checks=1

enable_notifications=1

enable_event_handlers=1

process_performance_data=0

obsess_over_services=0

check_for_orphaned_services=0

check_service_freshness=1

service_freshness_check_interval=60

check_host_freshness=0

host_freshness_check_interval=60

aggregate_status_updates=1

status_update_interval=15

enable_flap_detection=0

low_service_flap_threshold=5.0

high_service_flap_threshold=20.0

low_host_flap_threshold=5.0

high_host_flap_threshold=20.0

date_format=us

p1_file=/usr/local/nagios/bin/p1.pl

illegal_object_name_chars=`~!$%^&*|'"<>?,( )=

illegal_macro_output_chars=`~$&|'"<>

use_regexp_matching=0

use_true_regexp_matching=0

admin_email=nagios

admin_pager=pagenagios

daemon_dumps_core=0
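
Every directive in nagios.cfg is a plain key=value line, so the paths it names are easy to pull out and check before starting the daemon. The following is a rough sketch, not a Nagios tool; cfg_value is a name invented here.

```shell
# cfg_value FILE KEY
# Hypothetical helper: prints the value of the first KEY=value line in FILE,
# skipping comment lines. Returns non-zero if KEY is not set.
cfg_value() {
    awk -F= -v key="$2" '
        /^[ \t]*#/ { next }        # ignore comment lines
        $1 == key {
            sub(/^[^=]*=/, "")     # keep everything after the first "="
            print
            found = 1
            exit
        }
        END { exit !found }
    ' "$1"
}

# Example (not run here): warn if the object directory does not exist.
#   dir=$(cfg_value /usr/local/nagios/etc/nagios.cfg cfg_dir)
#   [ -d "$dir" ] || echo "cfg_dir is missing: $dir"
```

Splitting only on the first = matters because some values, such as illegal_object_name_chars, contain an = themselves.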






Now, create timeperiods.cfg:

# Time periods
# All times are valid for all
# checks and notifications
define timeperiod{
        timeperiod_name 24x7
        alias           24 Hours A Day, 7 Days A Week
        sunday          00:00-24:00
        monday          00:00-24:00
        tuesday         00:00-24:00
        wednesday       00:00-24:00
        thursday        00:00-24:00
        friday          00:00-24:00
        saturday        00:00-24:00
        }



Next, create contacts.cfg. The contact_name must be a Nagios user with a Nagios login in htpasswd.users, and an email account:

################
# Contacts- individuals and groups
################
define contact{
        contact_name                    nagios
        alias                           Nagios Admin
        service_notification_period     24x7
        host_notification_period        24x7
        service_notification_options    w,u,c,r
        host_notification_options       d,r
        service_notification_commands   notify-by-email
        host_notification_commands      host-notify-by-email
        email                           nagios@alrac.net
        }

# contact groups
# Nagios only talks to contact groups, not individuals
# members must be Nagios users, alias and contact_group
# are whatever you want

define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 nagios
        }



Next, create hosts.cfg:

################
# Hosts file- individual hosts and host groups
################
# Generic host definition template - This is NOT a real host, just a template!
define host{
        name                            generic-host
        notifications_enabled           1
        event_handler_enabled           1
        flap_detection_enabled          1
        failure_prediction_enabled      1
        process_perf_data               1
        retain_status_information       1
        retain_nonstatus_information    1
        ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        register                        0
        }
# local host definition
define host{
        use                     generic-host
        host_name               localhost
        alias                   Nagios Server
        address                 127.0.0.1
        check_command           check-host-alive
        max_check_attempts      10
        check_period            24x7
        notification_interval   120
        notification_period     24x7
        notification_options    d,r
        contact_groups          admins
        }
##############
# Host groups
##############
# Every host must belong to a host group
define hostgroup{
        hostgroup_name  test
        alias           Test Servers
        members         localhost
        }



Finally, create services.cfg:

################
# Services
################
# Generic service definition template - This is NOT a real service, just a template!
define service{
        name                            generic-service
        active_checks_enabled           1
        passive_checks_enabled          1
        parallelize_check               1
        obsess_over_service             1
        check_freshness                 0
        notifications_enabled           1
        event_handler_enabled           1
        flap_detection_enabled          1
        failure_prediction_enabled      1
        process_perf_data               1
        retain_status_information       1
        retain_nonstatus_information    1
        ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        register                        0
        }

# Define a service to "ping" the local machine
define service{
        use                     generic-service
        host_name               localhost
        service_description     PING
        is_volatile             0
        check_period            24x7
        max_check_attempts      4
        normal_check_interval   5
        retry_check_interval    1
        contact_groups          admins
        notification_options    w,u,c,r
        notification_interval   960
        notification_period     24x7
        check_command           check_ping!100.0,20%!500.0,60%
        }



# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
        use                     generic-service
        host_name               localhost
        service_description     Root Partition
        is_volatile             0
        check_period            24x7
        max_check_attempts      4
        normal_check_interval   5
        retry_check_interval    1
        contact_groups          admins
        notification_options    w,u,c,r
        notification_interval   960
        notification_period     24x7
        check_command           check_local_disk!20%!10%!/
        }



# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
        use                     generic-service
        host_name               localhost
        service_description     Current Users
        is_volatile             0
        check_period            24x7
        max_check_attempts      4
        normal_check_interval   5
        retry_check_interval    1
        contact_groups          admins
        notification_options    w,u,c,r
        notification_interval   960
        notification_period     24x7
        check_command           check_local_users!20!50
        }



# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
        use                     generic-service
        host_name               localhost
        service_description     Total Processes
        is_volatile             0
        check_period            24x7
        max_check_attempts      4
        normal_check_interval   5
        retry_check_interval    1
        contact_groups          admins
        notification_options    w,u,c,r
        notification_interval   960
        notification_period     24x7
        check_command           check_local_procs!250!400
        }

# Define a service to check the load on the local machine.
define service{
        use                     generic-service
        host_name               localhost
        service_description     Current Load
        is_volatile             0
        check_period            24x7
        max_check_attempts      4
        normal_check_interval   5
        retry_check_interval    1
        contact_groups          admins
        notification_options    w,u,c,r
        notification_interval   960
        notification_period     24x7
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }



OK, we’re almost there! Make all the files in lan_objects/ owned and writable by the nagios user:

# chown nagios:nagios /usr/local/nagios/etc/lan_objects/*

# chmod 0644 /usr/local/nagios/etc/lan_objects/*






Adjust these file ownerships and modes as shown:

# chown nagios:nagios /usr/local/nagios/etc/nagios.cfg
# chmod 0644 /usr/local/nagios/etc/nagios.cfg
# chown nagios:nagios /usr/local/nagios/etc/resource.cfg
# chmod 0600 /usr/local/nagios/etc/resource.cfg
# chown nagios:nagios /usr/local/nagios/etc/cgi.cfg
# chmod 0644 /usr/local/nagios/etc/cgi.cfg
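
To confirm the result, the modes can be read back with stat. This sketch assumes GNU coreutils, whose stat accepts -c with the %a (octal mode) format; BSD stat takes different options. Both function names are made up for this example.

```shell
# has_mode FILE MODE
# Hypothetical helper: succeeds if FILE's permission bits equal MODE, given
# in octal without a leading zero (for example 644). Needs GNU stat.
has_mode() {
    [ "$(stat -c '%a' "$1")" = "$2" ]
}

# check_main_cfg_modes [ETC_DIR]
# Prints one line for each of the three main files whose mode differs from
# the values set above; prints nothing when all three match.
check_main_cfg_modes() {
    etc=${1:-/usr/local/nagios/etc}
    has_mode "$etc/nagios.cfg" 644   || echo "nagios.cfg: unexpected mode"
    has_mode "$etc/resource.cfg" 600 || echo "resource.cfg: unexpected mode"
    has_mode "$etc/cgi.cfg" 644      || echo "cgi.cfg: unexpected mode"
}
```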



Now, you can run Nagios’ syntax checker. You need to do this as root:

# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg



You should see a lot of output ending in these lines:

Total Warnings: 0
Total Errors:   0
Things look okay - No serious problems were detected during the pre-flight check



If there are any errors, it will tell you exactly what you need to fix. When you get a clean run, start up the Nagios daemon:

# /etc/init.d/nagios start
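
The two steps combine naturally, so that the daemon is started only after a clean check. A minimal sketch, assuming the nagios binary exits with a non-zero status when verification fails (worth confirming on your version); the function name and the NAGIOS_* variables are introduced here purely for illustration.

```shell
# Paths used in this chapter; override them in the environment if yours differ.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}
NAGIOS_INIT=${NAGIOS_INIT:-/etc/init.d/nagios}

# verify_then_start
# Hypothetical helper: runs the pre-flight check and starts the daemon only
# if the check succeeds. Run it as root, like the commands above.
verify_then_start() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG"; then
        "$NAGIOS_INIT" start
    else
        echo "Pre-flight check failed; not starting Nagios." >&2
        return 1
    fi
}
```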



Now, log in to the Nagios web interface at http://localhost/nagios, and start clicking on various links in the left navigation bar. The Service Detail page should look like Figure 13-2.



Figure 13-2. Service Detail page on a fresh Nagios installation




This means you have successfully gotten Nagios up and running and monitoring localhost. Congratulations!



Discussion

You may name Nagios configuration files whatever you want, as long as they have the .cfg extension; this is required.

You won’t be able to access all of the Nagios web interface pages yet; you’ll get an “It appears as though you do not have permission to view the information you requested...” error on some of them because we haven’t set the correct CGI permissions yet. See the next recipe to learn how to do this.

During its initial run, my Nagios system couldn’t run the “Total Processes” check. The error message was check_procs: Unknown argument - (null). This means that either one of the options in the command definition (commands.cfg) was incorrect, or the service definition (services.cfg) was incorrect. I used the default files, so chances are you fine readers might encounter the same error. A quick comparison showed a mismatch between the two:

# commands.cfg
# 'check_local_procs' command definition
define command{
        command_name    check_local_procs
        command_line    $USER1$/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$
        }
# services.cfg
define service{
        use                     generic-service
        host_name               localhost
        service_description     Total Processes
        <...>
        check_command           check_local_procs!250!400!
        }



Compare the command_line and check_command lines. The check_local_procs command wants three arguments, but the service definition check_local_procs!250!400! only defined two. Because all I want is to keep track of the total number of running processes, the first two arguments are sufficient. Deleting -s $ARG3$ and restarting Nagios fixed it.

When the total number of running processes reaches 250, Nagios sends a warning. 400 is critical.

The exclamation points simply separate the two alert values; they don’t mean you need to get excited.
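
A mismatch like this one can be caught by counting. The sketch below compares the highest $ARGn$ macro in a command_line with the number of !-separated arguments in a check_command. Both function names are invented for this example, and the parsing is deliberately naive: an empty argument, such as the one after a trailing !, is not counted.

```shell
# max_arg_macro COMMAND_LINE
# Hypothetical helper: prints the highest n for which $ARGn$ appears in
# COMMAND_LINE, or 0 if there is none.
max_arg_macro() {
    printf '%s\n' "$1" | awk '{
        max = 0
        while (match($0, /\$ARG[0-9]+\$/)) {
            n = substr($0, RSTART + 4, RLENGTH - 5) + 0   # digits between "$ARG" and "$"
            if (n > max) max = n
            $0 = substr($0, RSTART + RLENGTH)             # keep scanning after this macro
        }
        print max
    }'
}

# count_check_args CHECK_COMMAND
# Hypothetical helper: prints how many non-empty arguments follow the
# command name in a check_command value.
count_check_args() {
    printf '%s\n' "$1" | awk -F'!' '{
        n = 0
        for (i = 2; i <= NF; i++) if ($i != "") n++
        print n
    }'
}
```

For the definitions above, max_arg_macro prints 3 for the command_line and count_check_args prints 2 for check_local_procs!250!400!, which is exactly the gap that caused the error.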






See Also

• Local Nagios documentation: http://localhost/nagios

• For definitions of the options in object definition files, which are all the files in lan_objects/, start at “Template-Based Object Configuration”: http://localhost/nagios/docs/xodtemplate.html

• For nagios.cfg and resource.cfg, see “Main Configuration File Options”: http://localhost/nagios/docs/configmain.html

• For cgi.cfg, see “CGI Configuration File Options” (http://localhost/nagios/docs/configcgi.html) and “Authentication And Authorization In The CGIs” (http://localhost/nagios/docs/cgiauth.html)

• Nagios.org: http://www.nagios.org/



13.5 Configuring CGI Permissions for Full Nagios Web Access

Problem

You have followed all the steps so far, but when you log in to the Nagios web interface, you can’t access all of the pages. Instead, you get this error: “It appears as though you do not have permission to view information you requested.... If you believe this is an error, check the HTTP server authentication requirements for accessing this CGI and check the authorization options in your CGI configuration file.” How do you fix this?



Solution

Uncomment these lines in /usr/local/nagios/etc/cgi.cfg, and make sure the correct Nagios user is named, which in this chapter is nagios:

authorized_for_all_services=nagios

authorized_for_all_hosts=nagios

authorized_for_system_commands=nagios

authorized_for_configuration_information=nagios

authorized_for_all_service_commands=nagios

authorized_for_all_host_commands=nagios



Make sure this line is uncommented and set to 1:

use_authentication=1



This requires all CGI scripts to use authentication. Disabling this opens a great big security hole; for example, any random person on your LAN could write whatever they want to your command file.

Save the changes, and try again. Now, your nagios user should have full access to all pages on the Nagios web interface, including the ability to run commands.
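
If you would rather script the edit, sed can uncomment the six directives and set the user in one pass. This is a sketch under assumptions: it presumes each directive starts its line, optionally preceded by a single #, and that the user name contains no characters special to sed (such as / or &). It writes through a temporary file instead of relying on sed -i, whose syntax differs between implementations. The function name grant_cgi_access is hypothetical.

```shell
# grant_cgi_access FILE USER
# Hypothetical helper: uncomments the six authorized_for_* directives listed
# above in FILE and sets each of them to USER. All other lines are left
# untouched. The body runs in a subshell, so its variables stay private.
grant_cgi_access() (
    file=$1
    user=$2
    tmp=$(mktemp) || exit 1
    sed -E "s/^#?[[:space:]]*(authorized_for_(all_services|all_hosts|system_commands|configuration_information|all_service_commands|all_host_commands))=.*/\1=$user/" \
        "$file" > "$tmp" &&
        cat "$tmp" > "$file"       # overwrite in place: keeps owner and mode
    status=$?
    rm -f "$tmp"
    exit "$status"
)
```

Running grant_cgi_access /usr/local/nagios/etc/cgi.cfg nagios would apply the settings shown above; use_authentication is not touched and still needs checking by hand.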





