Last change on this file since 223 was 223, checked in by bastiaans, 15 years ago

DESCRIPTION
===========

        Job Monarch is a set of tools to monitor and optionally archive (batch) job information.

        It is an add-on for the Ganglia monitoring system and plugs in to an existing Ganglia setup.

        To view an operational setup with Job Monarch, have a look here: http://ganglia.sara.nl/


        Job Monarch stands for 'Job Monitoring and Archiving' tool and consists of three (3) components:

        * jobmond

                The Job Monitoring Daemon.

                Gathers PBS/Torque batch statistics on jobs/nodes and submits them into
                Ganglia's XML stream.

                Through this daemon, users are able to view the PBS/Torque batch system and the
                jobs/nodes that are in it (whether running or queued).

        * jobarchived

                The Job Archiving Daemon (optional).

                Listens to Ganglia's XML stream and archives the job and node statistics.
                It stores the job statistics in a PostgreSQL database and the node statistics
                in RRD files.

                Through this daemon, users are able to look up an old/finished job
                and view all its statistics.

                Optional: run this daemon only if your users have a use for it.
                It can be a heavy application to run, even though it is optimized
                (staged/buffered writes and multithreaded), and not everyone may need it.

        * web

                The Job Monarch web interface.

                This interfaces with the jobmond data and (optionally) the jobarchived data and
                presents the data and graphs.

                It does this in a layout similar to Ganglia itself, so navigation and usage are intuitive.


REQUIREMENTS
============

        all:

                - Python 2.3 or higher

        jobmond:

                - pbs_python v2.8.1 or higher
                  ftp://ftp.sara.nl/pub/outgoing/pbs_python.tar.gz

                - gmond v3.0.1 or higher
                  http://www.ganglia.info

        jobarchived:

                - PostgreSQL v7.xx
                  http://www.postgres.org

                - rrdtool v1.xx
                  http://people.ee.ethz.ch/~oetiker/webtools/rrdtool/

                - python-pgsql v4.x.x
                  http://sourceforge.net/projects/pypgsql/

                - gmetad v3.x.x
                  http://www.ganglia.info

        web:

                - PHP v4.1 or higher
                  http://www.php.net

                - php-pgsql v4.x.x
                  (should come with Postgres)

                - GD v2.x
                  http://www.boutell.com/gd/

                - Ganglia web frontend v3.x.x
                  http://www.ganglia.info
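
        Several of the requirements above are command-line programs. As a rough
        sketch, the shell function below reports which commands cannot be found
        on PATH. The name missing_tools is hypothetical and not part of Job
        Monarch, and the command names you pass to it are assumptions about how
        the packages are installed on your system. It checks neither versions
        nor the Python/PHP modules.

```shell
# missing_tools NAME...
# Print each named command that cannot be found on PATH.
# Hypothetical helper, not part of Job Monarch.
missing_tools() {
    for tool in "$@"; do
        # 'command -v' succeeds only when the shell can resolve the name
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

        Example (prints nothing when every command is found):

         > missing_tools python gmond gmetad psql rrdtool php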


INSTALLATION
============

        Prior to installing the software, make sure you meet the requirements
        mentioned above.

        NOTE: You can install to other paths/directories if your setup is different.

        * jobmond

                1. Copy jobmond.py:

                 > cp jobmond/jobmond.py /usr/local/sbin/jobmond.py

                2. Copy jobmond.conf:

                 > cp jobmond/jobmond.conf /etc/jobmond.conf

        * jobarchived

                1. Create a PostgreSQL database for jobarchived:

                 > createdb jobarchive

                2. Set up jobarchived's tables:

                 > psql -f jobarchived/job_dbase.sql jobarchive

                3. Copy jobarchived.conf:

                 > cp jobarchived/jobarchived.conf /etc/jobarchived.conf

                4. Copy jobarchived.py and DBClass.py:

                 > cp jobarchived/jobarchived.py /usr/local/sbin/jobarchived.py
                 > cp jobarchived/DBClass.py /usr/local/sbin/DBClass.py
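
                To confirm that the database from step 1 exists, one option is to
                filter the database list that psql prints. The function below is
                a hypothetical helper (db_listed is not part of Job Monarch); it
                assumes the unaligned, tuples-only listing of 'psql -l -A -t',
                where each database is one line with '|' between fields.

```shell
# db_listed NAME
# Read 'psql -l -A -t' output on stdin and succeed if a database
# called NAME appears in the first field of any line.
# Hypothetical helper, not part of Job Monarch.
db_listed() {
    awk -F'|' -v name="$1" '
        $1 == name { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}
```

                Example:

                 > psql -l -A -t | db_listed jobarchive && echo "jobarchive exists"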

        * web

                1. Copy the Job Monarch template to your Ganglia installation:

                 > cp -a web/templates/job_monarch /var/www/ganglia/templates

                2. Copy the web interface files to the addons directory in Ganglia:

                 > cp -a web/addons/job_monarch /var/www/ganglia/addons
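
        The copy steps above can be collected into one function, which is handy
        for trying the layout in a scratch directory first. This is only a
        sketch: install_job_monarch and its ROOT staging prefix are hypothetical,
        the paths are the defaults used in this file, and the database steps are
        left out. Unlike the plain cp commands, it creates missing target
        directories.

```shell
# install_job_monarch SRC ROOT
# Copy the files named in the installation steps from the unpacked source
# tree SRC to the same paths below ROOT. ROOT is a hypothetical staging
# prefix; pass "" to use the real system paths (needs write access).
install_job_monarch() {
    src=$1
    root=$2
    sbin="$root/usr/local/sbin"
    etc="$root/etc"
    ganglia="$root/var/www/ganglia"

    mkdir -p "$sbin" "$etc" "$ganglia/templates" "$ganglia/addons" || return 1

    # jobmond
    cp "$src/jobmond/jobmond.py" "$sbin/jobmond.py" || return 1
    cp "$src/jobmond/jobmond.conf" "$etc/jobmond.conf" || return 1

    # jobarchived (database creation is not covered here)
    cp "$src/jobarchived/jobarchived.conf" "$etc/jobarchived.conf" || return 1
    cp "$src/jobarchived/jobarchived.py" "$sbin/jobarchived.py" || return 1
    cp "$src/jobarchived/DBClass.py" "$sbin/DBClass.py" || return 1

    # web: template and addon directories are copied recursively
    cp -a "$src/web/templates/job_monarch" "$ganglia/templates" || return 1
    cp -a "$src/web/addons/job_monarch" "$ganglia/addons" || return 1
}
```

        Example, staging into /tmp/jm-root from the top of the source tree:

         > install_job_monarch . /tmp/jm-root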

CONFIGURATION
=============

        After installation, each component requires additional configuration.

        * jobmond

                1. Edit jobmond's config to reflect your settings:

                 - In /etc/jobmond.conf

                   ( see config comments for syntax and explanation )

        * jobarchived

                1. Edit jobarchived's config to reflect your settings:

                 - In /etc/jobarchived.conf

                   ( see config comments for syntax and explanation )

        * web

                1. Change Ganglia's web template to Job Monarch:

                 - In /var/www/ganglia/conf.php:

                 > $template_name = "job_monarch";

                2. Change Job Monarch's config to reflect your settings:

                 - In /var/www/ganglia/addons/job_monarch/conf.php

                   ( see config comments for syntax and explanation )
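
        The template change in step 1 of the web configuration can also be
        scripted. The function below is a minimal sketch, not part of Job
        Monarch: set_template_name is a hypothetical name, and it assumes
        conf.php already contains a line that starts with '$template_name ='.
        If no such line exists, the file is left unchanged.

```shell
# set_template_name CONF NAME
# Rewrite the '$template_name = ...;' line in a Ganglia conf.php.
# Hypothetical helper; assumes the assignment starts at the beginning of a
# line and that NAME contains no '/', '&' or '\' characters.
set_template_name() {
    conf=$1
    name=$2
    tmp=$(mktemp) || return 1
    # After the shell has processed the quoting, sed receives:
    #   s/^\$template_name[[:space:]]*=.*/$template_name = "NAME";/
    if sed "s/^\\\$template_name[[:space:]]*=.*/\$template_name = \"$name\";/" "$conf" > "$tmp"; then
        # Write back through the original file so its owner and mode are kept
        cat "$tmp" > "$conf"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return $status
}
```

        Example:

         > set_template_name /var/www/ganglia/conf.php job_monarch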

START
=====

        * jobmond

                The Job Monitor has to run on a machine that is allowed to
                query the PBS/Torque server.
                If you have 'acl_hosts' enabled on your PBS/Torque server,
                make sure that jobmond's machine is listed in it.

                1. Start the Job Monitor:

                 > /usr/local/sbin/jobmond.py -c /etc/jobmond.conf
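
                For the 'acl_hosts' check mentioned above, one possible approach
                is to filter the server settings printed by qmgr. The function
                below is a hypothetical helper (acl_hosts_lists is not part of
                Job Monarch or Torque). It assumes lines of the form
                "set server acl_hosts = name" and "set server acl_hosts += name",
                which may differ on your installation, and it compares names
                literally, so wildcard entries are not matched.

```shell
# acl_hosts_lists HOST
# Read qmgr 'print server' style output on stdin and succeed if HOST
# appears in an acl_hosts line. Hypothetical helper; the expected line
# format is an assumption about what your qmgr prints.
acl_hosts_lists() {
    awk -v host="$1" '
        $1 == "set" && $2 == "server" && $3 == "acl_hosts" {
            # the value may hold one name or a comma-separated list
            n = split($5, names, ",")
            for (i = 1; i <= n; i++)
                if (names[i] == host) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}
```

                Example, run on the PBS/Torque server with the name of
                jobmond's machine:

                 > qmgr -c 'print server' | acl_hosts_lists monitor.example.org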

        * jobarchived

                1. Start the Job Archiver:

                 > /usr/local/sbin/jobarchived.py -c /etc/jobarchived.conf

        * web

                Doesn't require you to (re)start anything.
                ( make sure the PostgreSQL database is running though )
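
        Both start commands above have the same shape: a program followed by
        '-c' and a config file. As a small sketch, the wrapper below checks that
        the program is executable and the config is readable before starting
        it. start_daemon is a hypothetical name and not part of Job Monarch.

```shell
# start_daemon PROGRAM CONFIG
# Run PROGRAM with '-c CONFIG', but refuse to start when the program is
# not executable or the config file is not readable.
# Hypothetical wrapper, not part of Job Monarch.
start_daemon() {
    program=$1
    config=$2
    if [ ! -x "$program" ]; then
        echo "not executable: $program" >&2
        return 1
    fi
    if [ ! -r "$config" ]; then
        echo "cannot read config: $config" >&2
        return 1
    fi
    "$program" -c "$config"
}
```

        Example:

         > start_daemon /usr/local/sbin/jobmond.py /etc/jobmond.conf
         > start_daemon /usr/local/sbin/jobarchived.py /etc/jobarchived.conf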

CONTACT
=======

        To contact the author for anything from bugfixes to flame/hate mail:

        * Ramon Bastiaans

          <ramon ( a t ) sara ( d o t ) nl>