Anonymous, 08/20/2008 01:37 PM
Installation Guide: Update Monit

= Monit: Monitoring of Services =
This installation guide will describe how to set up ''independent'' instances of [http://www.tildeslash.com/monit/ Monit] on the master node and each slave.[[br]]
In the future, [http://www.tildeslash.com/mmonit/ M|Monit] should be considered: it allows easy single-point administration and monitoring from the master node.
== Installation ==
 * Add the DAG repository on the ''master node'' and ''slave nodes''. Enter at the command line as ''root'':
{{{
 wget http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/rpmforge-release-0.3.6-1.el5.rf.x86_64.rpm
 rpm -Uvh rpmforge-release-0.3.6-1.el5.rf.x86_64.rpm
}}}
 * Install Monit on the ''master node'' and ''slave nodes''. Enter at the command line as ''root'':
{{{
 yum install monit
}}}
== Configuration ==
=== Master node ===
On the master node, the following services will be monitored:[[br]]
''apache'', ''cron'', ''devices'' (/ & /home), ''mysql'', ''nfs'' (/home_nfs), ''ntp'', ''pbs_mom'', ''pbs_sched'', ''pbs_server'', ''postfix'', ''ssh'', ''system'', ''ypbind'', ''yppasswd'', ''ypserv'',[[br]]
and whether all ''slaves'' are reachable (''ping'').[[br]]
Currently, the monitoring of ''pbs_maui'' is switched off in favour of ''pbs_sched''.[[br]]
 * Download the [source:Externals/Cluster/procksi_monit.tgz configuration files] from the repository and extract the files. Enter at the command line:
{{{
tar -xvzf procksi_monit.tgz
}}}
 * Copy the files in ''./monit/master'' to the appropriate directories (''/etc/'', ''/etc/monit.d/'', ''/home/procksi/monit/'').
 * Change the ownership of the Monit token file. Enter at the command line:
{{{
chown -R procksi:procksi_dev /home/procksi/monit/token
}}}
 * Edit the Apache configuration file ''/etc/httpd/conf/httpd.conf'':
{{{
#General Aliases for Monitoring and Testing
Alias /monit/    "/home/procksi/monit/"
Alias /ganglia/  "/usr/local/ganglia/html/"
Alias /trees/    "/home/procksi/trees/"
#Conditional Logging: Don't log Ganglia and Monit requests
SetEnvIf Request_URI "ganglia" dontlog
SetEnvIf Request_URI "^\/monit\/token$" dontlog
}}}
 * Restart the Apache server. Enter at the command line as ''root'':
{{{
/sbin/service httpd restart
}}}
 * Make the Monit daemon start at bootup. Enter at the command line as ''root'':
{{{
/sbin/chkconfig  monit  on
}}}
 * Start the Monit daemon. Enter at the command line as ''root'':
{{{
/sbin/service  monit  start
}}}
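
Before relying on the master node setup, it may be worth checking two things that the steps above leave implicit. The following sketch is not part of the shipped configuration files, and the function names `uses_dontlog` and `verify_monit` are hypothetical. It assumes that the `dontlog` variable is meant to be consumed by a `CustomLog ... env=!dontlog` directive (`SetEnvIf` on its own only sets the variable), and it uses Monit's `-t` option to check the control file syntax before printing the service status.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of procksi_monit.tgz.

# Succeeds if the given Apache configuration contains a CustomLog directive
# that skips requests marked with the "dontlog" environment variable.
uses_dontlog() {
    grep -Eq '^[[:space:]]*CustomLog[[:space:]].*env=!dontlog' "$1"
}

# Checks the Monit control file syntax, then prints the status of all
# monitored services. Intended to be run as root on the node itself.
verify_monit() {
    monit -t && monit status
}
```

For example, `uses_dontlog /etc/httpd/conf/httpd.conf || echo "no CustomLog line uses env=!dontlog"` reports a missing directive, and `verify_monit` can be run right after the daemon has been started.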
=== Slave nodes ===
On each slave node, the following services will be monitored:[[br]]
''devices'' (/ and /scratch), ''nfs'' (/home), ''ntp'', ''pbs_mom'', ''ssh'', ''system'', ''ypbind''

 * Download the [source:Externals/Cluster/procksi_monit.tgz configuration files] from the repository and extract the files. Enter at the command line:
{{{
tar -xvzf procksi_monit.tgz
}}}
 * Copy the files in ''./monit/slave'' to the appropriate directories (''/etc/'', ''/etc/monit.d/'').
 * Edit ''/etc/monit.d/system'' and set the correct host name for each slave node.
 * Make the Monit daemon start at bootup. Enter at the command line as ''root'':
{{{
/sbin/chkconfig  monit  on
}}}
 * Start the Monit daemon. Enter at the command line as ''root'':
{{{
/sbin/service  monit  start
}}}
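
The host name edit in ''/etc/monit.d/system'' has to be repeated on every slave, so it may be convenient to script it. The sketch below assumes that the file contains a line of the form `check system <name>`, which is Monit's syntax for the system check; the actual contents of the shipped file are not shown on this page, and the function name `set_monit_system_host` is hypothetical. It should be run before the daemon is started.

```shell
#!/bin/sh
# Sketch only: assumes the file has a "check system <name>" line.

# Usage: set_monit_system_host FILE HOSTNAME
set_monit_system_host() {
    file=$1
    host=$2
    tmp=$(mktemp) || return 1
    # Rewrite the name after "check system"; all other lines stay untouched.
    # Writing back with cat keeps the owner and permissions of the original file.
    sed "s/^\([[:space:]]*check system\)[[:space:]].*/\1 $host/" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

On a slave this could be called as `set_monit_system_host /etc/monit.d/system "$(uname -n)"`, assuming the node name reported by `uname -n` is the name that should appear in the alerts.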
== Online Monitoring ==
The status of each monitored service, process, file, etc. is available through Monit's integrated web server on port 2812, reachable from ''localhost'' and selected machines. The username and password can be found on the secret [[wiki:secretAuthentication authentication]] page.

 || master01 || [http://procksi0.cs.nott.ac.uk:2812]
 || slave01  || [http://procksi1.cs.nott.ac.uk:2812]
 || slave02  || [http://procksi2.cs.nott.ac.uk:2812]
 || slave03  || [http://procksi3.cs.nott.ac.uk:2812]
 || slave04  || [http://procksi4.cs.nott.ac.uk:2812]
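
To check from the command line that every web interface in the table answers, a loop along the following lines could be used. This is only a sketch: it assumes that `curl` is installed and that the machine running it is one of the selected machines allowed to connect. `MONIT_USER` and `MONIT_PASS` are placeholder variables for the credentials from the authentication page, and the function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: host names are taken from the table above.
MONIT_HOSTS="procksi0 procksi1 procksi2 procksi3 procksi4"

# Prints the URL of Monit's web interface for a short host name.
monit_url() {
    printf 'http://%s.cs.nott.ac.uk:2812/\n' "$1"
}

# Requests each web interface once and reports whether it answered.
# MONIT_USER and MONIT_PASS must be set by the caller (placeholders).
check_monit_web() {
    for h in $MONIT_HOSTS; do
        if curl -s -f -o /dev/null --max-time 5 \
                -u "$MONIT_USER:$MONIT_PASS" "$(monit_url "$h")"; then
            echo "$h: OK"
        else
            echo "$h: NOT REACHABLE"
        fi
    done
}
```

After setting the two variables, calling `check_monit_web` prints one line per node.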
== Offline Monitoring ==
Monit sends alerts to "procksi@cs.nott.ac.uk" when a service becomes unavailable, has been restarted, or a similar event occurs.