source: trunk/INSTALL @ 222

Last change on this file since 222 was 222, checked in by bastiaans, 18 years ago

jobarchived/make_dbase.sh:

  • no longer needed for now

INSTALL:

  • added installation and configuration explanation
DESCRIPTION
===========

        Job Monarch is a set of tools to monitor and optionally archive (batch) job information.

        It is an add-on for the Ganglia monitoring system and plugs in to an existing Ganglia setup.

        To view an operational setup with Job Monarch, have a look here: http://ganglia.sara.nl/


        Job Monarch stands for 'Job Monitoring and Archiving' tool and consists of three (3) components:

        * jobmond

                The Job Monitoring Daemon.

                Gathers PBS/Torque batch statistics on jobs/nodes and submits them into
                Ganglia's XML stream.

                Through this daemon, users are able to view the PBS/Torque batch system and the
                jobs/nodes that are in it (whether running or queued).

        * jobarchived

                The Job Archiving Daemon (optional).

                Listens to Ganglia's XML stream and archives the job and node statistics.
                It stores the job statistics in a PostgreSQL database and the node statistics
                in RRD files.

                Through this daemon, users are able to look up an old/finished job
                and view all its statistics.

                Optional: use this daemon only if your users have a use for it.
                It can be a heavy application to run, even though it is optimized (staged/buffered
                writes and multithreaded), and not everyone may need it.

        * web

                The Job Monarch web interface.

                This interfaces with the jobmond data and (optionally) the jobarchived data and
                presents the data and graphs.

                It does this in a layout/setup similar to Ganglia itself, so the navigation and usage are intuitive.


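Since jobmond submits its statistics into Ganglia's XML stream, one way to see which metrics are present is to read that stream and list the metric names. The following is a minimal sketch, not part of Job Monarch: the function name list_metric_names is hypothetical, and the usage line assumes that gmond accepts TCP connections on localhost port 8649 and that nc is installed.

```shell
# Hypothetical helper (not shipped with Job Monarch):
# print the NAME attribute of every <METRIC> element read from standard input.
# Assumes one <METRIC ...> element per line with NAME as its first attribute.
list_metric_names() {
    sed -n 's/.*<METRIC NAME="\([^"]*\)".*/\1/p'
}

# Usage (host and port are assumptions about your gmond setup):
#   nc localhost 8649 | list_metric_names | sort -u
```

If the output contains no job-related names, jobmond may not be running or may be reporting to a different gmond.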
REQUIREMENTS
============

        all:

                - Python 2.3 or higher

        jobmond:

                - pbs_python v2.8.1 or higher
                  ftp://ftp.sara.nl/pub/outgoing/pbs_python.tar.gz

                - gmond v3.0.1 or higher
                  http://www.ganglia.info

        jobarchived:

                - PostgreSQL v7.xx
                  http://www.postgres.org

                - rrdtool v1.xx
                  http://people.ee.ethz.ch/~oetiker/webtools/rrdtool/

                - python-pgsql v4.x.x
                  http://sourceforge.net/projects/pypgsql/

                - gmetad v3.x.x
                  http://www.ganglia.info

        web:

                - PHP v4.1 or higher
                  http://www.php.net

                - php-pgsql v4.x.x
                  (should come with PostgreSQL)

                - GD v2.x
                  http://www.boutell.com/gd/

                - Ganglia web frontend v3.x.x


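A quick way to spot missing requirements is to check which of the expected programs are on the PATH. This is only a sketch: the function name missing_commands is hypothetical, the command names in the usage line are assumptions about how the packages are installed on your system, and it checks neither versions nor Python/PHP modules such as pbs_python or php-pgsql.

```shell
# Hypothetical helper (not shipped with Job Monarch):
# print the name of each given command that cannot be found on the PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Usage (command names are assumptions; adjust to your distribution):
#   missing_commands python psql createdb rrdtool php gmond gmetad
```

An empty result means every listed command was found; it does not confirm that the installed versions meet the minimums above.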
INSTALLATION
============

        Before installing the software, make sure you meet the requirements
        listed above.

        NOTE: You can install to other paths/directories if your setup is different.

        * jobmond

                1. Copy jobmond.py:

                 > cp jobmond/jobmond.py /usr/local/sbin/jobmond.py

                2. Copy jobmond.conf:

                 > cp jobmond/jobmond.conf /etc/jobmond.conf

        * jobarchived

                1. Create a PostgreSQL database for jobarchived:

                 > createdb jobarchive

                2. Set up jobarchived's tables:

                 > psql -f jobarchived/job_dbase.sql jobarchive

                3. Copy jobarchived/jobarchived.conf:

                 > cp jobarchived/jobarchived.conf /etc/jobarchived.conf

                4. Copy jobarchived.py and DBClass.py:

                 > cp jobarchived/jobarchived.py /usr/local/sbin/jobarchived.py
                 > cp jobarchived/DBClass.py /usr/local/sbin/DBClass.py

        * web

                1. Copy the Job Monarch template to your Ganglia installation:

                 > cp -a web/templates/job_monarch /var/www/ganglia/templates

                2. Copy the web interface files to the addons directory in Ganglia:

                 > cp -a web/addons/job_monarch /var/www/ganglia/addons

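Because the target directories may differ per setup, the jobmond copy steps above can be wrapped in a small function that takes the directories as arguments. This is a sketch under assumptions: the function name install_jobmond and its argument order are hypothetical, and it creates the target directories if they do not exist, which the steps above do not do.

```shell
# Hypothetical helper (not shipped with Job Monarch):
# copy jobmond's script and config from a source tree into the given directories.
#   $1 = top of the Job Monarch source tree
#   $2 = directory for jobmond.py
#   $3 = directory for jobmond.conf
install_jobmond() {
    src_dir=$1
    sbin_dir=$2
    etc_dir=$3
    mkdir -p "$sbin_dir" "$etc_dir" &&
    cp "$src_dir/jobmond/jobmond.py" "$sbin_dir/jobmond.py" &&
    cp "$src_dir/jobmond/jobmond.conf" "$etc_dir/jobmond.conf"
}

# Usage, matching the default paths used in this file:
#   install_jobmond . /usr/local/sbin /etc
```

If you install to other directories, pass the same config path to jobmond's -c option when starting it.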
CONFIGURATION
=============

        After installation, each component requires additional configuration.

        * jobmond

                1. Edit jobmond's config to reflect your settings:

                 - In /etc/jobmond.conf

                   ( see config comments for syntax and explanation )

        * jobarchived

                1. Edit jobarchived's config to reflect your settings:

                 - In /etc/jobarchived.conf

                   ( see config comments for syntax and explanation )

        * web

                1. Change Ganglia's web template to Job Monarch:

                 - In /var/www/ganglia/conf.php:

                 > $template_name = "job_monarch";

                2. Change Job Monarch's config to reflect your settings:

                 - In /var/www/ganglia/addons/job_monarch/conf.php

                   ( see config comments for syntax and explanation )

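To confirm that the template change in Ganglia's conf.php took effect, a simple text check can help. This is a sketch only: the function name template_is_job_monarch is hypothetical, it matches only an uncommented assignment written with double quotes as shown above, and it does not detect a later line that overrides the value.

```shell
# Hypothetical helper (not shipped with Job Monarch):
# succeed if the given PHP file contains a line that assigns
# "job_monarch" to $template_name (double-quoted, not commented out).
template_is_job_monarch() {
    grep -Eq '^[[:space:]]*\$template_name[[:space:]]*=[[:space:]]*"job_monarch"[[:space:]]*;' "$1"
}

# Usage, with the path used in this file:
#   template_is_job_monarch /var/www/ganglia/conf.php && echo "template set"
```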
START
=====

        * jobmond

                The Job Monitor has to run on a machine that is allowed to
                query the PBS/Torque server.
                If you have 'acl_hosts' enabled on your PBS/Torque server,
                make sure that jobmond's machine is listed in it.

                1. Start the Job Monitor:

                 > /usr/local/sbin/jobmond.py -c /etc/jobmond.conf

        * jobarchived

                1. Start the Job Archiver:

                 > /usr/local/sbin/jobarchived.py -c /etc/jobarchived.conf

        * web

                The web interface does not require you to (re)start anything.
                ( make sure the PostgreSQL database is running, though )

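Both daemons are started with a -c option that points at their config file, so a small wrapper can refuse to start when that file is missing or unreadable. This is a sketch, not part of Job Monarch: the function name start_daemon is hypothetical, and it makes no assumption about whether the daemons detach from the terminal.

```shell
# Hypothetical helper (not shipped with Job Monarch):
# start a daemon script with "-c CONFIG", but only if CONFIG is readable.
#   $1 = path to the daemon script
#   $2 = path to its config file
start_daemon() {
    script=$1
    conf=$2
    if [ ! -r "$conf" ]; then
        echo "cannot read config file: $conf" >&2
        return 1
    fi
    "$script" -c "$conf"
}

# Usage, with the paths used in this file:
#   start_daemon /usr/local/sbin/jobmond.py /etc/jobmond.conf
#   start_daemon /usr/local/sbin/jobarchived.py /etc/jobarchived.conf
```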
CONTACT
=======

        To contact the author about anything from bug fixes to flame/hate mail:

        * Ramon Bastiaans

          <ramon ( a t ) sara ( d o t ) nl>