
= Installation Guide =

This guide describes the installation procedure for ''ProCKSI''.

||'''Release''' ||procksi_8-2
||'''Environment''' ||Cluster with one master node and several slave nodes

All installations must be done with ''root'' access.

= System Design and Requirements =


== System Overview: Hardware, Software Requirements ==
We assume that the following software components are already installed on the '''master node''':

||'''Operating system''' || ''Centos5'' (''RHEL5'')

||'''Webserver''' || Apache2
||'''Database''' || MySQL

||'''Email server''' || Postfix (SMTP)
||'''Queuing system''' || PBS torque + maui

The '''slave nodes''' only require the following components:

||'''Operating system''' || ''Centos5'' (''RHEL5'')

||'''Queuing system''' || PBS torque

The configuration for these components will be described later in this installation guide.

= URL and Email Forwarding =
ProCKSI uses URL and email forwarding in order to provide a stable internet address and corresponding email addresses.

== Provider ==
These are the data for the domain and email provider:

||'''Provider''' || [http://www.planetdomain.com/ukcheap/home.jsp www.planetdomain.com/ukcheap/home.jsp]
||'''Login''' || nkrasnogor
||'''Password''' || [BBSRC GRANT NUMBER]

== Domains ==
ProCKSI's main domain name is [http://www.procksi.net www.procksi.net], which is redirected to [http://procksi.cs.nott.ac.uk procksi.cs.nott.ac.uk], an alias for [http://procksi0.cs.nott.ac.uk procksi0.cs.nott.ac.uk]. All other domain names are redirected to this main domain name.



||'''Domain Name''' ||'''Redirected to''' ||'''Expires at'''
||www.procksi.net ||[http://procksi.cs.nott.ac.uk procksi.cs.nott.ac.uk] ||11-01-2011
||www.procksi.org ||[http://www.procksi.net/ www.procksi.net] ||11-01-2011
||www.procksi.com ||[http://www.procksi.net/ www.procksi.net] ||11-01-2011
||www.procksi.info ||[http://www.procksi.net/ www.procksi.net] ||11-01-2008



== DNS Settings ==
The primary and secondary DNS servers must be set as follows:

{{{
Primary ns1.iprimus.com.au
Secondary ns2.iprimus.com.au
}}}

The following changes must be made manually in ''Advanced DNS settings'':

{{{
CNAME *.procksi.net procksi.cs.nott.ac.uk.
CNAME *.procksi.org www.procksi.net.
CNAME *.procksi.com www.procksi.net.
CNAME *.procksi.info www.procksi.net.
}}}
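
Once the changes have propagated, the records can be verified from any host, for example with ''dig'' (an arbitrary name such as ''foo'' should resolve via the wildcard CNAME):
{{{
dig +short CNAME foo.procksi.net
# expected: procksi.cs.nott.ac.uk.
}}}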


== Email Settings ==
The following email addresses must be created and redirected to ''procksi@cs.nott.ac.uk'', which must already exist:

||'''Email Address''' ||'''Redirected to'''
||admin@procksi.net ||procksi@cs.nott.ac.uk
||develop@procksi.net ||procksi@cs.nott.ac.uk
||info@procksi.net ||procksi@cs.nott.ac.uk
||research@procksi.net ||procksi@cs.nott.ac.uk
||pbs@procksi.net ||procksi@cs.nott.ac.uk
||webmaster@procksi.net||procksi@cs.nott.ac.uk

The following changes must be made manually in ''Advanced DNS settings:''

{{{
MX @.procksi.net mailhost.planetdomain.com 10
}}}


== Domain Usage Monitoring ==
The usage of ProCKSI's domains is monitored.

||'''Provider''' ||[http://www.sitemeter.com www.sitemeter.com]
||'''Login''' ||s18procksi
||'''Password''' ||FAKUIL


All HTML documents must contain the following code in order to be tracked correctly.

{{{
<!-- Site Meter -->
<script type="text/javascript" src="http://s18.sitemeter.com/js/counter.js?site=s18procksi">
</script>
<noscript>
<a href="http://s18.sitemeter.com/stats.asp?site=s18procksi" target="_top">
<img src="http://s18.sitemeter.com/meter.asp?site=s18procksi"
alt="Site Meter" border="0"/>
</a>
</noscript>

<!-- Copyright (c)2006 Site Meter -->
}}}


= Data Management and Exchange =
The master node and all slave nodes must be able to communicate with each other and exchange data. Therefore, a common user management and a shared file system are necessary.

== Network Configuration ==
Make the following changes on the head node and each compute node:


* Modify ''/etc/sysconfig/network'' in order to enable networking, set the hostname, and disable Zero Configuration Networking:
{{{
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=[Add Hostname]
NOZEROCONF=yes
}}}

* Configure the internal network interface (eth0) in ''/etc/sysconfig/networking/devices/ifcfg-eth0'':
{{{
DEVICE=eth0
TYPE=Ethernet
ONBOOT=yes
BOOTPROTO=static
HWADDR=[Add MAC Address]
IPADDR=[Add Internal IP Address]
BROADCAST=192.168.199.255
GATEWAY=192.168.0.10
NETWORK=192.168.199.0
NETMASK=255.255.255.0
}}}

* Configure the external network interface (eth1) in ''/etc/sysconfig/networking/devices/ifcfg-eth1'':
{{{
DEVICE=eth1
TYPE=Ethernet
ONBOOT=yes
BOOTPROTO=static
HWADDR=[Add MAC Address]
IPADDR=[Add External IP Address]
BROADCAST=128.243.21.255
GATEWAY=128.243.21.1
NETWORK=128.243.21.0
NETMASK=255.255.255.0
}}}

* Add a default gateway, and routes to the internal and external networks, to the routing table (if not already done automatically):
{{{
/sbin/route add -net 192.168.199.0 netmask 255.255.255.0 dev eth0
/sbin/route add -net 128.243.21.0 netmask 255.255.255.0 dev eth1
/sbin/route add default gw 128.243.21.1 dev eth1
}}}

== User Management ==
Make the following changes on the head node and each compute node:

* Add a new user into ''/etc/passwd'':
{{{
procksi:x:510:510:ProCKSI-Server:/home/procksi:/bin/bash
}}}

* Add an entry for the new user into ''/etc/shadow'' if desired:
{{{
procksi:[ENCRYPTED_PASSWORD]:13483:0:99999:7:::
}}}

* Add a new group into ''/etc/group'', and add all users who should have access:
{{{
procksi:x:510:dxb
}}}
The members of group ''procksi'' are now ''procksi'' and ''dxb''.

* Generate the home directory for ''procksi'' (a sketch follows below).
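
A minimal sketch for generating the home directory, assuming the UID/GID 510 defined above:
{{{
# create the home directory and hand it over to the new user
mkdir -p /home/procksi
chown procksi:procksi /home/procksi
chmod 755 /home/procksi
}}}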


== Firewall ==
All network traffic using the internal (private) network is trusted and considered to be secure. So no firewall is needed on the internal network interface (''eth0'').

* Modify ''/etc/sysconfig/iptables'' on the head node and on each compute node. [[BR]]
If ''eth0'' is on the private network, add
{{{
-A RH-Firewall-1-INPUT -i eth0 -j ACCEPT
}}}
directly after
{{{
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
}}}
* Restart the firewall on the head node and on each compute node:
{{{
/sbin/service iptables restart
}}}
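
To confirm that the new rule is active, list the chain with interface details; the ''eth0'' ACCEPT rule should appear near the top:
{{{
/sbin/iptables -L RH-Firewall-1-INPUT -n -v
}}}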

Changes in the firewall settings regarding the external network interface (''eth1'') will be described in other sections where necessary.



== Host Name Resolution ==
As each node has two network interfaces (i.e. each is a multihomed host), host name resolution must be configured correctly in order to prioritise the internal, trusted network for communication between the nodes.

* The official hostname for each node must be set to the ''internal'' name of the machine in ''/etc/sysconfig/network''. This is an example for the head node:

{{{
HOSTNAME=procksi0-priv.cs.nott.ac.uk
}}}

The compute nodes must be named and configured accordingly.
* Add the following to ''/etc/hosts'' on the master node:
{{{
127.0.0.1 master01.procksi.local master01 localhost.localdomain localhost
}}}
and alter the line for each slave node (slave01 ... slaveXX) accordingly.
* Add the following to ''/etc/hosts'' on the head node and each compute node:
{{{
192.168.199.10 master01.procksi.local master01 m01
192.168.199.11 slave01.procksi.local slave01 s01
192.168.199.12 slave02.procksi.local slave02 s02
192.168.199.13 slave03.procksi.local slave03 s03
192.168.199.14 slave04.procksi.local slave04 s04
}}}
Edit ''/etc/host.conf'' so that local settings in ''/etc/hosts'' take precedence over DNS:
{{{
order hosts,bind
}}}
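
A quick way to confirm the lookup order is ''getent'', which uses the same resolver configuration and should return the internal address from ''/etc/hosts'':
{{{
getent hosts master01
# 192.168.199.10  master01.procksi.local master01 m01
}}}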

== Data Access ==
The head node hosts a RAID system of hard disks that stores all data generated by ProCKSI on the slave nodes and on the master node itself. This partition must be accessible by all nodes and is therefore exported as a network file system (NFS). Executables used by ProCKSI must be installed locally on each slave node for better performance.

* Add the following to the end of ''/etc/exports'' on the master node (''master01''):
{{{
/home 192.168.199.0/24(rw,async,no_subtree_check,no_root_squash)
}}}
* Add the following to the end of ''/etc/fstab'' on each slave node (''slave01'' ... ''slaveXX''):
{{{
master01:/home /home nfs bg,hard,intr,tcp 0 0
}}}
* Tune NFS by increasing the number of nfsd threads. Modify ''/etc/sysconfig/nfs'' on the master node:
{{{
RPCNFSDCOUNT=16
}}}

* Make the NFS daemons start at bootup. Enter at the command line of the master node and each slave node:
{{{
/sbin/chkconfig nfs on
}}}

* Start the NFS daemons. Enter at the command line on the master node and on each slave node (a quick check follows below):
{{{
/sbin/service nfs start
}}}
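
To verify the export and the mount, the following sanity checks can be run on any slave node:
{{{
showmount -e master01   # should list /home
df -h /home             # should show master01:/home as the mounted filesystem
}}}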


Generate the following temp directory on the master node and each slave node, ideally on a separate partition:
{{{
mkdir /scratch
}}}

== Time Synchronisation ==
The system time on all nodes must be synchronised because a) data is written to and read from a common, shared file system, and may even expire after a certain period of time and must then be deleted, and b) system logs are maintained independently on each node, but their entries must be correlatable with each other.

* Add your own time server to ''/etc/ntp/ntpservers'':
{{{
128.243.21.16 #marian.cs.nott.ac.uk
128.243.21.17 #robin.cs.nott.ac.uk
128.243.21.18 #tuck.cs.nott.ac.uk
128.243.21.19 #pat.cs.nott.ac.uk
}}}

* Modify ''/etc/ntp.conf'' in order to permit systems on the subnet to synchronise with this time service:
{{{
# -- CLIENT NETWORK -------
restrict 192.168.199.0 mask 255.255.255.0 nomodify notrap
broadcastclient
}}}
* Modify ''/etc/ntp.conf'' and add further time servers:
{{{
# --- OUR TIMESERVERS -----
server 128.243.21.16 #marian.cs.nott.ac.uk
restrict 128.243.21.16 mask 255.255.255.255 nomodify notrap noquery
server 128.243.21.17 #robin.cs.nott.ac.uk
restrict 128.243.21.17 mask 255.255.255.255 nomodify notrap noquery
server 128.243.21.18 #tuck.cs.nott.ac.uk
restrict 128.243.21.18 mask 255.255.255.255 nomodify notrap noquery
server 128.243.21.19 #pat.cs.nott.ac.uk
restrict 128.243.21.19 mask 255.255.255.255 nomodify notrap noquery
}}}
* Make the NTP daemon start at bootup. Enter at the command line:
{{{
/sbin/chkconfig ntpd on
}}}
* Start the NTP daemon. Enter at the command line:
{{{
/sbin/service ntpd start
}}}
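
After a few minutes, check that the daemon has synchronised with at least one time server; an asterisk in the first column marks the currently selected peer:
{{{
ntpq -p
}}}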

= Queuing System =
The queueing system (resource manager) is the heart of distributed computing on a cluster. It consists of three parts: the server, the scheduler, and the machine-oriented mini-server (MOM) that executes the jobs.



We are assuming the following configuration:

||PBS TORQUE|| version 2.1.6 ||server, basic scheduler, mom
||MAUI || version 3.2.6.p18 ||scheduler

The sources can be obtained from:

||PBS TORQUE ||http://www.clusterresources.com/pages/products/torque-resource-manager.php
||MAUI ||http://www.clusterresources.com/pages/products/maui-cluster-scheduler.php


The install directories for ''TORQUE'' and ''MAUI'' will be:

||PBS TORQUE ||''/usr/local/torque''
||MAUI ||''/usr/local/maui''


== TORQUE ==

=== Register new services ===
Edit ''/etc/services'' and add at the end:
{{{
# PBS/Torque services

pbs 15001/tcp # pbs_server
pbs 15001/udp # pbs_server
pbs_mom 15002/tcp # pbs_mom <-> pbs_server
pbs_mom 15002/udp # pbs_mom <-> pbs_server
pbs_resmom 15003/tcp # pbs_mom resource management
pbs_resmom 15003/udp # pbs_mom resource management
pbs_sched 15004/tcp # pbs scheduler (pbs_sched)
pbs_sched 15004/udp # pbs scheduler (pbs_sched)
}}}




=== Setup and Configuration on the Head Node ===
Extract and build the TORQUE distribution on the head node. Configure server, monitor, and clients to use secure file transfer (scp).
{{{
export TORQUECFG=/usr/local/torque
tar -xzvf TORQUE.tar.gz
cd TORQUE
}}}
Configure for a 64-bit machine with the following compiler options:
{{{
FFLAGS = "-m64 -march=nocona -O3 -fPIC"
CFLAGS = "-m64 -march=nocona -O3 -fPIC"
CXXFLAGS = "-m64 -march=nocona -O3 -fPIC"
LDFLAGS = "-L/usr/local/lib -L/usr/local/lib64"
}}}
Configure, build, and install:
{{{
./configure --enable-server --enable-monitor --enable-clients \
            --with-server-home=$TORQUECFG --with-server-name \
            --with-rcp=scp --disable-filesync
make
make install
}}}


If not configured otherwise, binaries are installed in ''/usr/local/bin'' and ''/usr/local/sbin''. You should have these directories included in your path. Alternatively, you can configure TORQUE to place the binaries in the default system directories with
{{{
./configure --bindir=/usr/bin --sbindir=/usr/sbin
}}}


Initialise/configure the queueing system's server daemon (pbs_server):
{{{
pbs_server -t create
}}}

Set the PBS operators and managers (each must be a valid user name):
{{{
qmgr
> set server_name = procksi0-priv.cs.nott.ac.uk
> set server scheduling = true
> set server operators += "root@procksi.cs.nott.ac.uk"
> set server operators += "procksi@procksi.cs.nott.ac.uk"
> set server managers += "root@procksi.cs.nott.ac.uk"
> set server managers += "procksi@procksi.cs.nott.ac.uk"
}}}

Allow only ''procksi'' and ''root'' to submit jobs into the queue:
{{{
> set server acl_users = "root,procksi"
> set server acl_user_enable = true
}}}

Set the default queue to ''batch'':
{{{
> set server default_queue=batch
}}}


Set the email address for emails sent by PBS:
{{{
> set server mail_from = pbs@procksi.net
}}}


Allow submissions from compute hosts (only):
{{{
> set server allow_node_submit = true
> set server submit_hosts = procksi0-priv.cs.nott.ac.uk
> set server submit_hosts += procksi1-priv.cs.nott.ac.uk
> set server submit_hosts += procksi2-priv.cs.nott.ac.uk
}}}


Restrict nodes that can access the PBS server:
{{{
> set server acl_hosts = procksi0-priv.cs.nott.ac.uk
> set server acl_hosts += procksi1-priv.cs.nott.ac.uk
> set server acl_hosts += procksi2-priv.cs.nott.ac.uk
> set server acl_host_enable = true
}}}
Also set the following in ''torque.cfg'' in order to use the internal interface:
{{{
SERVERHOST procksi0-priv.cs.nott.ac.uk
ALLOWCOMPUTEHOSTSUBMIT true
}}}


Configure the main queue ''batch'':
{{{
> create queue batch queue_type=execution
> set queue batch started=true
> set queue batch enabled=true
> set queue batch resources_default.nodes=1
}}}

Configure queue ''test'' accordingly.

Configure default node to be used (see below):
{{{
> set server default_node = slave
}}}

Specify all compute nodes to be used by ProCKSI by creating/editing ''$TORQUECFG/server_priv/nodes''. This may include the same machine where pbs_server runs. If a compute node has more than one processor, just add np=X after the name, with X being the number of processors. Add node attributes so that a subset of nodes can be requested during the submission stage.
{{{
procksi0-priv.cs.nott.ac.uk np=1 procksi head
procksi1-priv.cs.nott.ac.uk np=2 procksi slave slave1
procksi2-priv.cs.nott.ac.uk np=2 procksi slave slave2
}}}

Although the head node (''procksi0'') has two processors as well, we allow only one of them to be used by the queueing system, as the other processor will be handling all frontend communication and I/O. (Make sure that hyperthreading is disabled on the head node and on all compute nodes!)


Build packages for the compute nodes and copy them to each compute node:
{{{
cd TORQUE
make packages
# copy the packages to each compute node (procksi1 and procksi2):
scp torque-package-mom-linux-i686.sh procksi1:
scp torque-package-clients-linux-i686.sh procksi1:
scp torque-package-mom-linux-i686.sh procksi2:
scp torque-package-clients-linux-i686.sh procksi2:
}}}
ATTENTION: This only works for identical architectures! Thus, building on an Intel head node and deploying to AMD slaves does not work!

=== Setup and Configuration on the Compute Nodes ===
Install the prepared packages. A directory similar to ''$TORQUECFG'' will be created automatically.
{{{
pdsh -w procksi1,procksi2 sh torque-package-mom-linux-i686.sh --install
pdsh -w procksi1,procksi2 sh torque-package-clients-linux-i686.sh --install
}}}


Check that the compute nodes know the head node:
{{{
cat $TORQUECFG/server_name
# procksi0-priv.cs.nott.ac.uk
}}}

Configure the compute nodes by creating/editing ''$TORQUECFG/mom_priv/config''. The ''$pbsserver'' line specifies the PBS server, the ''$restricted'' line specifies hosts that are trusted to access mom services as non-root, and the ''$usecp'' line allows data to be copied via NFS instead of SCP.

{{{
$pbsserver procksi0-priv.cs.nott.ac.uk
$loglevel 255
$restricted procksi?-priv.cs.nott.ac.uk
$usecp procksi0-priv.cs.nott.ac.uk:/home/procksi /home/procksi
}}}

Start the queueing system (manually) in the correct order:
* Start the mom:
{{{
/usr/local/sbin/pbs_mom
}}}
* Kill the server:
{{{
/usr/local/sbin/qterm -t quick
}}}
* Start the server:
{{{
/usr/local/sbin/pbs_server
}}}
* Start the scheduler:
{{{
/usr/local/sbin/pbs_sched
}}}

If you want to use MAUI as the final scheduler, remember to kill ''pbs_sched'' after testing the TORQUE installation.

Check that all nodes are properly configured and reporting correctly (an end-to-end test follows below):
{{{
qstat -q
pbsnodes -a
}}}
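
As a simple end-to-end test, submit a trivial job as user ''procksi'' and watch it pass through the queue (the job merely sleeps for 30 seconds):
{{{
echo "sleep 30" | qsub -q batch -l nodes=1:procksi
qstat -a
}}}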



=== Prologue and Epilogue Scripts ===
The ''prologue'' script is executed just before the submitted job starts. Here, it generates a unique temp directory for each job in ''/scratch''. It must be installed on each node:
{{{
cp $PROCKSI/install/prologue $TORQUECFG/mom_priv
chmod 500 $TORQUECFG/mom_priv/prologue
}}}

The ''epilogue'' script is executed right after the submitted job has ended. Here, it deletes the job's temp directory from ''/scratch.'' It must be installed on each node:
{{{
cp $PROCKSI/install/epilogue $TORQUECFG/mom_priv
chmod 500 $TORQUECFG/mom_priv/epilogue
}}}
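
For reference, minimal sketches of what these two scripts might look like (the actual scripts ship with ProCKSI in ''$PROCKSI/install''; TORQUE passes the job id as the first and the job owner as the second argument):
{{{
#!/bin/sh
# prologue (sketch): create a per-job temp directory in /scratch
jobid=$1
user=$2
mkdir -p /scratch/$jobid
chown $user /scratch/$jobid
exit 0
}}}
{{{
#!/bin/sh
# epilogue (sketch): remove the per-job temp directory from /scratch
jobid=$1
rm -rf /scratch/$jobid
exit 0
}}}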


== MAUI ==

=== Register new services ===
Edit ''/etc/services'' and add at the end:
{{{
# PBS/MAUI services
pbs_maui 42559/tcp # pbs scheduler (maui)
pbs_maui 42559/udp # pbs scheduler (maui)
}}}

=== Setup and Configuration on the Head Node ===
Extract and build the MAUI distribution.
{{{
export MAUIDIR=/usr/local/maui
tar -xzvf MAUI.tar.gz
cd MAUI
}}}

Configure for a 64-bit machine with the following compiler options:
{{{
FFLAGS = "-m64 -march=[Add Architecture] -O3 -fPIC"
CFLAGS = "-m64 -march=[Add Architecture] -O3 -fPIC"
CXXFLAGS = "-m64 -march=[Add Architecture] -O3 -fPIC"
LDFLAGS = "-L/usr/local/lib -L/usr/local/lib64"
}}}
Attention: For Intel Xeon processors use ''-march=nocona'', for AMD Opteron processors use ''-march=opteron''.

Configure, build, and install:
{{{
./configure --with-pbs=$TORQUECFG --with-spooldir=$MAUIDIR
make
make install
}}}

Fine-tune MAUI in ''$MAUIDIR/maui.cfg'':
{{{
SERVERHOST procksi0-priv.cs.nott.ac.uk

# primary admin must be first in list
ADMIN1 procksi
ADMIN1 root

# Resource Manager Definition
RMCFG[PROCKSI0-PRIV.CS.NOTT.AC.UK] TYPE=PBS HOST=PROCKSI0-PRIV.CS.NOTT.AC.UK PORT=15001 EPORT=15004
# EPORT can alternatively be 15017 - try!

SERVERPORT 42559
SERVERMODE NORMAL

# Node Allocation
NODEALLOCATIONPOLICY PRIORITY
NODECFG[DEFAULT] PRIORITY='-JOBCOUNT'
}}}

Configure attributes of compute nodes:
{{{
qmgr
> set node procksi0.cs.nott.ac.uk properties = "procksi,head"
> set node procksi1.cs.nott.ac.uk properties = "procksi,slave"
> set node procksi2.cs.nott.ac.uk properties = "procksi,slave"
}}}

Request a job to be run on specific nodes (at submission time):

* Run on any compute node:
{{{
qsub -q batch -l nodes=1:procksi
}}}
* Run on any slave node:
{{{
qsub -q batch -l nodes=1:slave
}}}
* Run on the head node:
{{{
qsub -q batch -l nodes=1:head
}}}


Start the MAUI scheduler manually. Make sure that pbs_sched is no longer running.

* Start the scheduler:
{{{
/usr/local/sbin/maui
}}}
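
Check that MAUI talks to the PBS server and sees all nodes (assuming the MAUI ''bin'' directory is in your path):
{{{
showq          # queue overview; all processors should be reported
diagnose -n    # per-node diagnostics
}}}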


Make the entire queueing system start at bootup:
{{{
cp /home/procksi/latest/install/pbs_head-node /etc/init.d/pbs
/sbin/chkconfig --add pbs
/sbin/chkconfig pbs on
}}}

=== Setup and Configuration on the Compute Nodes ===
__Attention:__ If the head node is a compute node itself, do NOT proceed with the following steps as the head node was configured in the previous step!


Make the entire queueing system start at bootup:
{{{
cp /home/procksi/latest/install/pbs_compute-node /etc/init.d/pbs
/sbin/chkconfig --add pbs
/sbin/chkconfig pbs on
}}}

= Cluster Monitoring =

== Ganglia ==
"Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids."

* Download the latest release of the ''Ganglia Monitoring Core'' from [http://ganglia.sourceforge.net/ http://ganglia.sourceforge.net].

* Install Ganglia into ''/usr/local/ganglia'', its web frontend into ''/usr/local/ganglia/html/'', and its databases into ''/usr/local/ganglia/rrds/''.
* Install the ''Ganglia Monitoring Daemon'' (gmond) on each node, and the ''Ganglia Meta Daemon'' (gmetad) on the head node.

=== Ganglia Monitoring Daemon ===

* Configure the ''Ganglia Monitoring Daemon'' in ''/etc/gmond.conf'':
* Set the name of the cluster:
{{{
cluster {
name = "ProCKSI"
}
}}}
* Set the IP address and port for multicast data exchange:
{{{
udp_send_channel {
mcast_join = 239.2.11.71
port = 8649
}
udp_recv_channel {
mcast_join = 239.2.11.71
port = 8649
bind = 239.2.11.71
}
}}}
* Add an additional route for correct data exchange via multicast using the ''internal'' interface (''eth0''). Modify ''/etc/init.d/gmond'' in the start and stop sections, respectively:
{{{
#Add multicast route to internal interface
/sbin/route add -host 239.2.11.71 dev eth0
daemon $GMOND
}}}
{{{
#Remove multicast route to internal interface
/sbin/route delete -host 239.2.11.71 dev eth0
killproc gmond
}}}
* Make the Ganglia Monitoring Daemon start at bootup.
{{{
/sbin/chkconfig gmond on
}}}
* Start the Ganglia Monitoring Daemon:
{{{
/sbin/service gmond start
}}}

=== Ganglia Meta Daemon ===
* Install and configure the ''Ganglia Meta Daemon'' (gmetad) on the head node.
* Make the Ganglia Meta Daemon start at bootup.
{{{
/sbin/chkconfig gmetad on
}}}
* Start the Ganglia Meta Daemon:
{{{
/sbin/service gmetad start
}}}


=== Further Customisation ===
In order to display more fine-grained time intervals, edit the following files in ''/usr/local/ganglia/html/'':
* '''header.php'''
{{{
if (!$physical) {
$context_ranges[]="10 minutes";
$context_ranges[]="20 minutes";
$context_ranges[]="30 minutes";
$context_ranges[]="1 hour";
$context_ranges[]="2 hours";
$context_ranges[]="4 hours";
$context_ranges[]="8 hours";
$context_ranges[]="12 hours";
$context_ranges[]="1 day";
$context_ranges[]="2 days";
$context_ranges[]="week";
$context_ranges[]="month";
$context_ranges[]="year";
}}}

* '''get_context.php'''
{{{
switch ($range) {
case "10 minutes": $start = -600; break;
case "20 minutes": $start = -1200; break;
case "30 minutes": $start = -1800; break;
case "1 hour": $start = -3600; break;
case "2 hours": $start = -7200; break;
case "4 hours": $start = -14400; break;
case "8 hours": $start = -28800; break;
case "12 hours": $start = -43200; break;
case "1 day": $start = -86400; break;
case "2 days": $start = -172800; break;
case "week": $start = -604800; break;
case "month": $start = -2419200; break;
case "year": $start = -31449600; break;
}}}

== !JobMonarch ==
!JobMonarch is an add-on to Ganglia which provides PBS job monitoring through the web browser.

See [http://subtrac.rc.sara.nl/oss/jobmonarch/wiki/Documentation http://subtrac.rc.sara.nl/oss/jobmonarch/wiki/Documentation] for information on requirements, configuration and installation.

= Additional Software =

== PERL Libraries ==
Please make sure that the following libraries are installed in the official library directory, and install all dependent libraries if necessary. For ''Image::Magick'', use the corresponding libraries that come with the main installation.

||Error|| 0.17008
||Config::Simple||4.58
||DBI||1.53||Remember to install DBD::mysql from the OS sources, too!
||CGI||3.25
||CGI::Session||4.13
||Data::!FormValidator||4.40
||HTML::Template||2.8
||HTML::Template::Pro||0.64
||MIME::Lite||
||!FreezeThaw||
||Storable||
||Time::Format||
||IMAP::Client||
||Time::Local||
||Clone||
||SOAP::Lite||
||Inline::Python||


Perl modules are best installed with the CPAN shell:
{{{
perl -MCPAN -eshell
}}}
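
A single module can also be installed non-interactively, e.g. ''DBI'':
{{{
perl -MCPAN -e 'install "DBI"'
}}}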

== Third Party Executables for ProCKSI ==
Generate the following directories on each compute node to contain important executables:
{{{
/usr/local/procksi/Cluster/
/usr/local/procksi/DaliLite
/usr/local/procksi/MaxCMO
/usr/local/procksi/MolScript
}}}


For the following installation of the ProCKSI server components, these executables must be present:
{{{
/usr/local/procksi/Cluster/qclust
/usr/local/procksi/DaliLite/DaliLite
/usr/local/procksi/MaxCMO/ProtCompVNS
/usr/local/procksi/MolScript/molauto
/usr/local/procksi/MolScript/molscript
}}}

== Image Software ==

=== Installation ===
* Install ''!ImageMagick'' from [http://www.imagemagick.org www.imagemagick.org] if not already installed.
* Install ''!MolScript'' from [http://www.avatar.se/molscript www.avatar.se/molscript]. Please link the MesaGL libraries instead of the OpenGL libraries; a modified makefile can be found under [source:ProCKSI/install/Makefile.molscript]


=== Virtual Display ===
!MolScript needs an X display in order to generate images (jpg, gif, …). It is possible to use the console X display for the OpenGL bits even if no one is logged in. Therefore, ''procksi'' must be authenticated and allowed to use this X display virtually.

Get and unpack the ProCKSI x-authentication patch from [repos:Externals/Cluster/xauth.tgz].

On each node, copy the magic cookie file for X authentication:
{{{
cp :0.Xauth /var/gdm/:0.Xauth
}}}

On each node, copy the scripts for automatic X authentication:
{{{
cp procksixauth /usr/local/sbin/procksixauth
cp :0 /etc/gdm/Init/:0
}}}

Restart the X display manager for the changes to take effect:
{{{
/usr/sbin/gdm-restart
}}}

The virtual X display can be used with unix socket '':0'', e.g.:
{{{
molauto protein.pdb | DISPLAY=unix:0.0 molscript -jpeg -out protein.jpeg
}}}

= ProCKSI Server Component =

== Installation and Basic Configuration ==
This section describes the installation and configuration of the ProCKSI server component. This includes the configuration of the web server and the database.

The server component will be installed into the home directory of the user ''procksi''. Therefore, make sure that it lies on a separate partition or hard disk with plenty of space; in the best case, this will be a RAID system.

Get the latest release of the server component, referred to in the following as ''RELEASE'', and extract it into ''/home/procksi/RELEASE'':
{{{
tar -xvzf RELEASE.tgz
}}}

Create a softlink from ''RELEASE'' to a generic directory ''/home/procksi/latest''. This will be accessed by the web server:
{{{
ln -s /home/procksi/RELEASE /home/procksi/latest
}}}

In order to test new versions, referred to in the following as ''TEST'', before taking them officially online, create a softlink from ''TEST'' to a generic directory ''/home/procksi/test''. This will be accessed by the web server:
{{{
ln -s /home/procksi/TEST /home/procksi/test
}}}

In case you want to bring the test version online, just delete the softlinks and repeat the previous steps for the new release. Please make sure that both softlinks always exist!

Change into the administrative directory and run the installation script. Change the server settings, database settings and directory settings if necessary.
{{{
cd /home/procksi/latest/admin
./configure.pl
}}}

== Database Configuration ==
Make sure that the MySQL daemon is running, and that it will start at boot time:
{{{
/sbin/service mysqld start
/sbin/chkconfig --add mysqld
/sbin/chkconfig mysqld on
}}}

Make sure that you have access to the MySQL database management as ''root'', and log in as user ''root'' with the corresponding password:
{{{
mysql -u root -p
}}}

Create the new MySQL users ''procksi_user'' and ''procksi_admin'':
{{{
USE mysql;
INSERT INTO user SET host='localhost', user='procksi_user', password=PASSWORD('password_procksi_user');
INSERT INTO user SET host='localhost', user='procksi_admin', password=PASSWORD('password_procksi_admin');
FLUSH PRIVILEGES;
}}}

Repeat these steps analogously for ''procksi0-priv'', ''procksi1-priv'', and ''procksi2-priv.''

Create a new database:
{{{
CREATE DATABASE procksi_latest;
}}}

Give privileges to users ''procksi_user'' and ''procksi_admin'' for all compute nodes:
{{{
GRANT ALL ON procksi_latest.* TO procksi_admin@localhost WITH GRANT OPTION;
GRANT SELECT, UPDATE, INSERT, DELETE ON procksi_latest.* TO procksi_user@localhost;
GRANT ALL ON procksi_latest.* TO procksi_admin@procksi0.cs.nott.ac.uk WITH GRANT OPTION;
GRANT SELECT, UPDATE, INSERT, DELETE ON procksi_latest.* TO procksi_user@procksi0.cs.nott.ac.uk;
GRANT SELECT, UPDATE, INSERT, DELETE ON procksi_latest.* TO procksi_user@procksi1.cs.nott.ac.uk;
GRANT SELECT, UPDATE, INSERT, DELETE ON procksi_latest.* TO procksi_user@procksi2.cs.nott.ac.uk;
FLUSH PRIVILEGES;
}}}

If you change the password for ''procksi_user'', please make sure that you also change it in ''/home/procksi/latest/config/main.ini''

Import the main database ''procksi_latest'' from the backup given in ''/home/procksi/RELEASE/admin'':
{{{
mysql -u procksi_admin -p procksi_latest < procksi_latest.sql
}}}

In order to create a database ''procksi_test'' for the test version, repeat the previous steps and set the privileges accordingly.


== Web Server Configuration ==
Make the following changes to the Apache configuration file (''/etc/httpd/conf/httpd.conf''):
{{{
User procksi
Group procksi
ServerAdmin procksi@cs.nott.ac.uk
ServerName procksi.cs.nott.ac.uk
DocumentRoot /home/procksi/latest/html
<Directory "/home/procksi/latest/html">
AllowOverride AuthConfig
</Directory>
LogFormat "%t %h %l %u \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%t %h %l %u \"%r\" %>s %b" common
LogFormat "%t %{Referer}i -> %U" referer
LogFormat "%t %{User-agent}i" agent

#Exclude Logging of Ganglia Requests
SetEnvIf Request_URI "ganglia" ganglia

#
# The location and format of the access logfile (Common Logfile Format).
# If you do not define any access logfiles within a <VirtualHost>
# container, they will be logged here. Contrariwise, if you *do*
# define per-<VirtualHost> access logfiles, transactions will be
# logged therein and *not* in this file.
#

CustomLog /home/procksi/latest/logs/access.log common env=!ganglia

#
# If you would like to have agent and referer logfiles, uncomment the
# following directives.
#

CustomLog /home/procksi/latest/logs/referer.log referer env=!ganglia
CustomLog /home/procksi/latest/logs/agent.log agent env=!ganglia

#
# For a single logfile with access, agent, and referer information
# (Combined Logfile Format), use the following directive:
#
#CustomLog logs/access_log combined env=!ganglia

ScriptAlias /cgi-bin/ /home/procksi/latest/cgi-bin/

<Directory "/home/procksi/latest/cgi-bin">
AllowOverride None
Options None
Order allow,deny
Allow from all
</Directory>

Alias /data/ /home/procksi/latest/data/
Alias /images/ /home/procksi/latest/images/
Alias /styles/ /home/procksi/latest/styles/
Alias /applets/ /home/procksi/latest/applets/
Alias /scripts/ /home/procksi/latest/scripts/
Alias /ganglia/ /usr/local/ganglia/html/

#Redirection
Redirect /trac https://psiren.cs.nott.ac.uk/projects/procksi/

AddLanguage de .de
AddLanguage en .en
AddLanguage es .es
AddLanguage fr .fr
LanguagePriority en es de fr

Alias /errordocs/ "/home/procksi/errordocs"
<IfModule mod_negotiation.c>
<IfModule mod_include.c>
<Directory /home/procksi/errordocs>
AllowOverride none
Options MultiViews IncludesNoExec FollowSymLinks
AddType text/html .shtml
<FilesMatch "\.shtml[.$]">
SetOutputFilter INCLUDES
</FilesMatch>
</Directory>

ErrorDocument 400 /errordocs/400_BAD_REQUEST
ErrorDocument 401 /errordocs/401_UNAUTHORIZED
ErrorDocument 403 /errordocs/403_FORBIDDEN
ErrorDocument 404 /errordocs/404_NOT_FOUND
ErrorDocument 405 /errordocs/405_METHOD_NOT_ALLOWED
ErrorDocument 406 /errordocs/406_NOT_ACCEPTABLE
ErrorDocument 408 /errordocs/408_REQUEST_TIMEOUT
ErrorDocument 410 /errordocs/410_GONE
ErrorDocument 411 /errordocs/411_LENGTH_REQUIRED
ErrorDocument 412 /errordocs/412_PRECONDITION_FAILED
ErrorDocument 413 /errordocs/413_REQUEST_ENTITY_TOO_LARGE
ErrorDocument 414 /errordocs/414_REQUEST_URI_TOO_LARGE
ErrorDocument 415 /errordocs/415_UNSUPPORTED_MEDIA_TYPE
ErrorDocument 500 /errordocs/500_INTERNAL_SERVER_ERROR
ErrorDocument 501 /errordocs/501_NOT_IMPLEMENTED
ErrorDocument 502 /errordocs/502_BAD_GATEWAY
ErrorDocument 503 /errordocs/503_SERVICE_UNAVAILABLE
ErrorDocument 506 /errordocs/506_VARIANT_ALSO_VARIES
</IfModule>
</IfModule>

<Location /server-status>
SetHandler server-status
Order deny,allow
Deny from all
Allow from .cs.nott.ac.uk
</Location>

<Location /server-info>
SetHandler server-info
Order deny,allow
Deny from all
Allow from .cs.nott.ac.uk
</Location>
}}}

Make sure that the server accepts connections to port 80. Check the firewall settings in ''/etc/sysconfig/iptables'' for the following entry:
{{{
-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 80 -j ACCEPT
}}}

Make sure that the apache daemon is running, and that it will start at boot time:
{{{
/sbin/service httpd start
/sbin/chkconfig --add httpd
/sbin/chkconfig httpd on
}}}

== Email Configuration ==
The ProCKSI server component sends emails to the user on several occasions. In order to make sure that they are delivered correctly even when the internet is temporarily unavailable, a local SMTP server (''postfix'') is set up. It will accept emails from the private network only, store them temporarily (if necessary), and forward them to an email relay server.

Make sure that ''postfix'' is the default mailing software (and not ''sendmail''!).
{{{
system-switch-mail -activate postfix
}}}

Make the following changes to the ''postfix ''configuration file (''/etc/postfix/main.cf''):
{{{
myhostname = procksi0.cs.nott.ac.uk
mydomain = cs.nott.ac.uk
myorigin = $mydomain
inet_interfaces = all
mydestination = $myhostname, localhost.$mydomain, localhost
mynetworks_style = subnet
virtual_alias_maps = hash:/etc/postfix/virtual
relayhost = marian.cs.nott.ac.uk
}}}

Create or modify ''/etc/postfix/virtual'':
{{{
root root@localhost
postmaster postmaster@localhost
adm root@localhost
}}}

Generate the corresponding database file (''virtual.db''):
{{{
postmap /etc/postfix/virtual
}}}

Make sure that the postfix daemon is running, and that it will start at boot time:
{{{
/sbin/service postfix start
/sbin/chkconfig --add postfix
/sbin/chkconfig postfix on
}}}
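
A quick delivery test from the command line (assuming ''mailx'' is installed); check ''/var/log/maillog'' if the message does not arrive via the relay host:
{{{
echo "Test from ProCKSI head node" | mail -s "postfix test" admin@procksi.net
}}}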

Make sure that the firewall is not open for port 25 or port 28!

Check that the SMTP server in ''/home/procksi/latest/conf/main.ini'' is correctly set to ''procksi0.cs.nott.ac.uk''.

== Garbage Cleanup Scheduling ==
After a certain period of time, given in ''/home/procksi/latest/conf/main.ini'', sessions and requests expire and must be deleted.

Edit the crontab file of user ''procksi''. The following entries cover the ''latest'' version:
{{{
crontab -e
0-59/1 * * * * /home/procksi/latest/cron/check_sessions.sh
1-59/1 * * * * /home/procksi/latest/cron/check_tasks.sh
2-59/1 * * * * /home/procksi/latest/cron/check_requests.sh
}}}

Analogously for ''/home/procksi/test''.

== Linking External Software ==
Make sure that all links in ''/home/procksi/latest/bin'' point to the correct files of the operating system:
{{{
sh, compress, bzip2, gzip, zip, ppmz, qsub
}}}

Make sure that all further executable links in ''/home/procksi/latest/bin'' point to the correct files on the file system:
{{{
exec_cluster, exec_DaliLite, exec_MaxCMO, exec_molauto, exec_molscript
}}}
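
A short loop can be used to detect broken links; it reports any link in the directory whose target does not exist:
{{{
for f in /home/procksi/latest/bin/*; do
    [ -e "$f" ] || echo "Broken link: $f"
done
}}}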