source: trunk/INSTALL @ 222

Last change on this file since 222 was 222, checked in by bastiaans, 18 years ago

jobarchived/make_dbase.sh:

  • no longer needed for now

INSTALL:

  • added installation and configuration explanation
DESCRIPTION
===========

        Job Monarch is a set of tools to monitor and optionally archive (batch) job information.

        It is an add-on for the Ganglia monitoring system and plugs into an existing Ganglia setup.

        To view an operational setup with Job Monarch, have a look here: http://ganglia.sara.nl/


        Job Monarch stands for 'Job Monitoring and Archiving' tool and consists of three (3) components:

        * jobmond

                The Job Monitoring Daemon.

                Gathers PBS/Torque batch statistics on jobs/nodes and submits them into
                Ganglia's XML stream.

                Through this daemon, users are able to view the PBS/Torque batch system and the
                jobs/nodes that are in it (either running or queued).

        * jobarchived

                The Job Archiving Daemon (optional).

                Listens to Ganglia's XML stream and archives the job and node statistics.
                It stores the job statistics in a PostgreSQL database and the node statistics
                in RRD files.

                Through this daemon, users are able to look up an old/finished job
                and view all its statistics.

                Optional: run this daemon only if your users have a use for it.
                It can be a heavy application to run, even though it is optimized (staged/buffered
                writes and multithreaded), and not everyone may need it.

        * web

                The Job Monarch web interface.

                It interfaces with the jobmond data and (optionally) with jobarchived, and presents
                the data and graphs.

                It does this in a layout similar to Ganglia itself, so navigation and usage are intuitive.


REQUIREMENTS
============

        all:

                - Python 2.3 or higher

        jobmond:

                - pbs_python v2.8.1 or higher
                  ftp://ftp.sara.nl/pub/outgoing/pbs_python.tar.gz

                - gmond v3.0.1 or higher
                  http://www.ganglia.info

        jobarchived:

                - PostgreSQL v7.xx
                  http://www.postgres.org

                - rrdtool v1.xx
                  http://people.ee.ethz.ch/~oetiker/webtools/rrdtool/

                - python-pgsql v4.x.x
                  http://sourceforge.net/projects/pypgsql/

                - gmetad v3.x.x
                  http://www.ganglia.info

        web:

                - PHP v4.1 or higher
                  http://www.php.net

                - php-pgsql v4.x.x
                  (should come with Postgres)

                - GD v2.x
                  http://www.boutell.com/gd/

                - Ganglia web frontend v3.x.x

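        As a quick pre-install check, the sketch below reports which command-line tools
        are missing from PATH. It is a minimal sketch and not part of Job Monarch: the
        function name missing_tools is hypothetical, the binary names in the example call
        are assumptions about how your distribution names them, and it checks neither
        version numbers nor the Python/PHP modules listed above.

```shell
#!/bin/sh
# missing_tools NAME...
# Print the name of every given command that cannot be found on PATH.
# (Hypothetical helper, not shipped with Job Monarch.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example call; the binary names are assumptions, adjust them to your system:
#   missing_tools python gmond gmetad psql rrdtool php
```

        An empty result only means the binaries were found; the minimum versions listed
        above still have to be checked by hand.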

INSTALLATION
============

        Before installing the software, make sure you meet the requirements
        mentioned above.

        NOTE: You can install to other paths/directories if your setup is different.

        * jobmond

                1. Copy jobmond.py:

                 > cp jobmond/jobmond.py /usr/local/sbin/jobmond.py

                2. Copy jobmond.conf:

                 > cp jobmond/jobmond.conf /etc/jobmond.conf

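        The two jobmond copy steps can be wrapped in a small function so that the target
        directories are easy to change. This is only a sketch: install_jobmond and its
        arguments are hypothetical names, and two behaviours go beyond the steps above and
        are assumptions: the script is made executable (the START section runs it
        directly), and an existing jobmond.conf is kept rather than overwritten.

```shell
#!/bin/sh
# install_jobmond SRC_DIR [SBIN_DIR] [CONF_DIR]
# SRC_DIR is the unpacked Job Monarch source tree.
# (Hypothetical helper; the defaults are the paths used in this document.)
install_jobmond() {
    src=$1
    sbin_dir=${2:-/usr/local/sbin}
    conf_dir=${3:-/etc}

    mkdir -p "$sbin_dir" "$conf_dir" || return 1

    # Step 1: copy the daemon. Making it executable is an assumption.
    cp "$src/jobmond/jobmond.py" "$sbin_dir/jobmond.py" || return 1
    chmod 755 "$sbin_dir/jobmond.py" || return 1

    # Step 2: copy the config, but keep a config that is already in place.
    if [ -e "$conf_dir/jobmond.conf" ]; then
        echo "keeping existing $conf_dir/jobmond.conf" >&2
    else
        cp "$src/jobmond/jobmond.conf" "$conf_dir/jobmond.conf" || return 1
    fi
}
```

        For example, "install_jobmond ." run from the top of the source tree uses the
        default paths, which normally requires root privileges.
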
        * jobarchived

                1. Create a PostgreSQL database for jobarchived:

                 > createdb jobarchive

                2. Set up jobarchived's tables:

                 > psql -f jobarchived/job_dbase.sql jobarchive

                3. Copy jobarchived/jobarchived.conf:

                 > cp jobarchived/jobarchived.conf /etc/jobarchived.conf

                4. Copy jobarchived.py and DBClass.py:

                 > cp jobarchived/jobarchived.py /usr/local/sbin/jobarchived.py
                 > cp jobarchived/DBClass.py /usr/local/sbin/DBClass.py

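        Database steps 1 and 2 are worth previewing before they touch a live PostgreSQL
        server. The sketch below wraps them with a dry-run switch. The names run,
        setup_jobarchive_db and DRY_RUN are hypothetical and not part of Job Monarch;
        createdb and psql are called exactly as in the steps above. Depending on your
        setup, they may need to be run as a database user that is allowed to create
        databases.

```shell
#!/bin/sh
# run CMD [ARG...]
# Execute the command, or only print it when DRY_RUN is set to a non-empty value.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# setup_jobarchive_db SRC_DIR [DB_NAME]
# Create the database and load the table definitions from the source tree.
# (Hypothetical helper; DB_NAME defaults to the name used in this document.)
setup_jobarchive_db() {
    src=$1
    db=${2:-jobarchive}

    run createdb "$db" || return 1
    run psql -f "$src/jobarchived/job_dbase.sql" "$db" || return 1
}
```

        With DRY_RUN=1 set, "setup_jobarchive_db ." only prints the two commands; without
        it, the commands are executed and the function stops at the first failure.
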
        * web

                1. Copy the Job Monarch template to your Ganglia installation:

                 > cp -a web/templates/job_monarch /var/www/ganglia/templates

                2. Copy the web interface files to the addon directory in Ganglia:

                 > cp -a web/addons/job_monarch /var/www/ganglia/addons

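        Because the web files are copied into an existing Ganglia web frontend, it helps
        to confirm that the frontend directory exists before copying. A minimal sketch,
        assuming a hypothetical helper named install_web; creating the templates and
        addons directories when they are missing is also an assumption and not something
        the steps above require.

```shell
#!/bin/sh
# install_web SRC_DIR [GANGLIA_DIR]
# Copy the Job Monarch template and addon into a Ganglia web frontend.
# (Hypothetical helper; GANGLIA_DIR defaults to the path used in this document.)
install_web() {
    src=$1
    ganglia_dir=${2:-/var/www/ganglia}

    # Refuse to continue when the Ganglia web frontend is not where we expect it.
    if [ ! -d "$ganglia_dir" ]; then
        echo "Ganglia web frontend not found in $ganglia_dir" >&2
        return 1
    fi

    mkdir -p "$ganglia_dir/templates" "$ganglia_dir/addons" || return 1
    cp -a "$src/web/templates/job_monarch" "$ganglia_dir/templates" || return 1
    cp -a "$src/web/addons/job_monarch" "$ganglia_dir/addons" || return 1
}
```

        Pass a second argument if your Ganglia frontend lives somewhere other than
        /var/www/ganglia.
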
CONFIGURATION
=============

        After installation, each component requires additional configuration.

        * jobmond

                1. Edit jobmond's config to reflect your settings:

                 - In /etc/jobmond.conf

                   ( see config comments for syntax and explanation )

        * jobarchived

                1. Edit jobarchived's config to reflect your settings:

                 - In /etc/jobarchived.conf

                   ( see config comments for syntax and explanation )

        * web

                1. Change your Ganglia web template to Job Monarch:

                 - In /var/www/ganglia/conf.php:

                 > $template_name = "job_monarch";

                2. Change Job Monarch's config to reflect your settings:

                 - In /var/www/ganglia/addons/job_monarch/conf.php

                   ( see config comments for syntax and explanation )

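        The template change in web step 1 can also be scripted. The sketch below rewrites
        the $template_name line with sed. It assumes that Ganglia's conf.php already
        contains an assignment that starts at the beginning of a line, such as
        $template_name = "default"; if no such line exists, the file is left untouched.
        The function name set_ganglia_template is hypothetical.

```shell
#!/bin/sh
# set_ganglia_template CONF_FILE [TEMPLATE]
# Point an existing $template_name assignment in CONF_FILE at TEMPLATE.
# (Hypothetical helper; TEMPLATE defaults to job_monarch.)
set_ganglia_template() {
    conf=$1
    template=${2:-job_monarch}
    tmp="$conf.tmp.$$"

    # Rewrite the assignment into a temporary file; all other lines pass through.
    sed 's|^\$template_name[[:space:]]*=.*|$template_name = "'"$template"'";|' \
        "$conf" > "$tmp" || { rm -f "$tmp"; return 1; }

    # Only touch the real file when the assignment is present in the result.
    if grep -q '^\$template_name = "'"$template"'";' "$tmp"; then
        # Write through the existing file so its owner and mode are preserved.
        cat "$tmp" > "$conf" && rm -f "$tmp"
    else
        rm -f "$tmp"
        echo "no \$template_name line found in $conf" >&2
        return 1
    fi
}
```

        Keep a backup of conf.php before running it, for example with
        "cp conf.php conf.php.orig".
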
START
=====

        * jobmond

                The Job Monitor has to be run on a machine that is allowed to
                query the PBS/Torque server.
                If you have 'acl_hosts' enabled on your PBS/Torque server,
                make sure that jobmond's machine is listed in it.

                1. Start the Job Monitor:

                 > /usr/local/sbin/jobmond.py -c /etc/jobmond.conf

        * jobarchived

                1. Start the Job Archiver:

                 > /usr/local/sbin/jobarchived.py -c /etc/jobarchived.conf

        * web

                The web interface does not require you to (re)start anything.
                ( make sure the Postgres database is running though )

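        Both daemons are started the same way, with the script path and -c followed by
        the config path. The sketch below adds two sanity checks before launching. The
        function name start_component is hypothetical, and the checks are an assumption
        about what is useful to verify, not something the daemons require.

```shell
#!/bin/sh
# start_component SCRIPT CONF
# Start a Job Monarch daemon after checking that its script and config are usable.
# (Hypothetical helper, not shipped with Job Monarch.)
start_component() {
    script=$1
    conf=$2

    [ -x "$script" ] || { echo "not executable: $script" >&2; return 1; }
    [ -r "$conf" ] || { echo "config not readable: $conf" >&2; return 1; }

    # Same invocation as in the steps above.
    "$script" -c "$conf"
}

# Example calls, using the paths from this document:
#   start_component /usr/local/sbin/jobmond.py /etc/jobmond.conf
#   start_component /usr/local/sbin/jobarchived.py /etc/jobarchived.conf
```

        The function returns 1 without starting anything when either check fails.
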
CONTACT
=======

        To contact the author for anything from bugfixes to flame/hate mail:

        * Ramon Bastiaans

          <ramon ( a t ) sara ( d o t ) nl>