Changes between Version 20 and Version 21 of WikiStart


Timestamp:
12/06/06 16:36:38 (17 years ago)
Author:
bastiaans
  • WikiStart

    v20 v21  
    66Job Monarch is an add-on to the [http://www.ganglia.info/ Ganglia Monitoring System] that provides (batch) job monitoring and a graphical overview of clusters and assorted batch systems. Monarch is an abbreviation of Monitoring and Archiving: it can also archive these job (monitoring) statistics, so that your (batch) cluster users can look up information on old (and possibly failed) jobs to analyze problems.
    77
     8== Features ==
     9
     10Job Monarch stands for 'Job Monitoring and Archiving' and consists of three (3) components:
     11
     12 __jobmond__::
     13
     14   The Job Monitoring Daemon.
     15                 
     16   Gathers PBS/Torque batch statistics on jobs/nodes and submits them into
     17   Ganglia's XML stream.
     18
     19   Through this daemon, users are able to view the PBS/Torque batch system and the
     20   jobs/nodes that are in it (whether running or queued).
     21
     22 __jobarchived (optionally)__::
     23
     24   The Job Archiving Daemon.
     25
     26   Listens to Ganglia's XML stream and archives the job and node statistics.
     27   It stores the job statistics in a PostgreSQL database and the node statistics
     28   in RRD files.
     29               
     30   Through this daemon, users are able to look up an old/finished job
     31   and view all its statistics.
     32
     33   Optional: you can choose to run this daemon if your users have a use for it.
     34   It can be a heavy application to run, and not everyone will need it.
     35
     36   * Key features
     37     * Multithreaded[[BR]]
     38       Will not miss any data regardless of (slow) storage
     39     * Staged writing[[BR]]
     40       Spread load over bigger time periods
     41     * High precision RRDs[[BR]]
     42       Allow for zooming in on old periods with high precision
     43     * Timeperiod RRDs[[BR]]
     44       Allow for a smaller number of files while still keeping the advantage of low disk usage
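
The "Staged writing" feature above can be illustrated with a small Python sketch. This is not jobarchived's actual code: the class name `StagedWriter`, the `sink` callable and the `flush_interval` parameter are all hypothetical, chosen only to show the general idea of buffering samples in memory and writing them out in batches so that storage load is spread over longer periods.

```python
import time


class StagedWriter:
    """Buffer samples in memory and hand them to storage in batches.

    Hypothetical sketch: 'sink' and 'flush_interval' are illustrative
    names, not jobarchived configuration options.
    """

    def __init__(self, sink, flush_interval=60.0, clock=time.time):
        self.sink = sink                      # callable receiving a list of samples
        self.flush_interval = flush_interval  # seconds between batch writes
        self.clock = clock                    # injectable clock, eases testing
        self.buffer = []
        self.last_flush = clock()

    def add(self, sample):
        """Queue one sample; write only once the interval has elapsed."""
        self.buffer.append(sample)
        if self.clock() - self.last_flush >= self.flush_interval:
            self.flush()

    def flush(self):
        """Pass all buffered samples to the sink as a single batch."""
        if self.buffer:
            self.sink(list(self.buffer))
            self.buffer = []
        self.last_flush = self.clock()
```

A caller would pass in whatever function performs the real write (for example one that updates RRD files) as `sink`; the writer itself only decides when a batch is due.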
     45               
     46 __web__::
     47
     48   The Job Monarch web interface.
     49
     50   This interfaces with the jobmond data and (optionally) the jobarchived and presents the
     51   data and graphs.
     52
     53   It does this in a similar layout/setup as Ganglia itself, so the navigation and usage are intuitive.
     54
     55   * Key features
     56     * Graphical usage[[BR]]
     57       Displays a graphical cluster overview so you can see the cluster (job) state
     58       in one view/image, plus an additional pie chart with information relevant to your
     59       current view
     60     * Filters[[BR]]
     61       Ability to filter output to limit the information displayed (useful for those
     62       clusters with 500+ jobs). This also filters the graphical overview images
     63       and pie chart so you only see the data relevant to the filter
     64     * Archive[[BR]]
     65       When jobarchived is enabled, users can go back as far as recorded in the database
     66       or archived RRDs to find out what happened to a crashed or old job
     67     * Zoom ability[[BR]]
     68       Users can zoom into a time period as small as the smallest grain of the RRDs
     69       (typically down to 10 seconds) when jobarchived is present
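
As a rough illustration of the "Filters" feature described above, the following Python sketch narrows a job list down to the entries that match every given criterion. It does not reflect how the web interface is implemented; the function name `filter_jobs` and the job fields (`id`, `owner`, `queue`, `status`) are assumptions made up for the example.

```python
def filter_jobs(jobs, **criteria):
    """Return only the jobs whose fields match every given criterion.

    Hypothetical sketch: the field names used below are illustrative,
    not Job Monarch's real data model.
    """
    return [
        job for job in jobs
        if all(job.get(field) == wanted for field, wanted in criteria.items())
    ]


# Made-up sample data for demonstration only.
jobs = [
    {"id": "101", "owner": "alice", "queue": "short", "status": "R"},
    {"id": "102", "owner": "bob", "queue": "long", "status": "Q"},
    {"id": "103", "owner": "alice", "queue": "long", "status": "R"},
]

running_for_alice = filter_jobs(jobs, owner="alice", status="R")
```

The same filtered list could then feed both the table output and any overview graphics, which is how a single filter ends up affecting every part of the view.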
     70
     71== Requirements ==
     72
     73 * Python 2.3 or higher
     74
     75 __jobmond__
     76
     77 * pbs_python v2.8.2 or higher[[BR]]
     78   ftp://ftp.sara.nl/pub/outgoing/pbs_python.tar.gz
     79 * gmond v3.0.1 or higher[[BR]]
     80   http://www.ganglia.info
     81
     82 __jobarchived__
     83
     84 * PostgreSQL v7.xx[[BR]]
     85   http://www.postgres.org
     86
     87 * rrdtool v1.xx[[BR]]
     88   http://people.ee.ethz.ch/~oetiker/webtools/rrdtool/
     89
     90 * python-pgsql v4.x.x[[BR]]
     91   http://sourceforge.net/projects/pypgsql/
     92
     93 * gmetad v3.x.x[[BR]]
     94   http://www.ganglia.info
     95
     96 __web__
     97
     98 * PHP v4.1 or higher[[BR]]
     99   http://www.php.net
     100 * php-mbstring (multibyte string handling support)[[BR]]
     101   (configure php with --enable-mbstring)
     102 * php-pgsql v4.x.x[[BR]]
     103   (should come with Postgres)
     104 * GD v2.x[[BR]]
     105   http://www.boutell.com/gd/
     106 * Ganglia web frontend v3.x.x[[BR]]
     107   http://www.ganglia.info
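
Several of the requirements above are stated as "version X or higher". A minimal sketch of how such a check could be done in (modern) Python is shown below. The helper names `parse_version` and `meets_minimum` and the `MINIMUMS` table are hypothetical; only the minimum version numbers themselves are taken from the list above, and the sketch assumes purely numeric, dot-separated version strings.

```python
def parse_version(text):
    """Turn '2.8.2' into (2, 8, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed, minimum):
    """True if the installed version is at least the required minimum."""
    return parse_version(installed) >= parse_version(minimum)


# Minimum versions as listed in the requirements; the dictionary itself
# is only an illustration and not part of Job Monarch.
MINIMUMS = {
    "python": "2.3",
    "pbs_python": "2.8.2",
    "gmond": "3.0.1",
    "php": "4.1",
}
```

Comparing tuples of integers rather than raw strings avoids the classic pitfall where "3.0.10" sorts before "3.0.9" as text.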
     108
    8109== Documentation ==
    9110
    10  * [wiki:ToBeAdded Installation]
    11  * [wiki:ToBeAdded Usage]
     111 * [wiki:Documentation/Requirements Requirements]
     112 * [wiki:Documentation/Installation Installation]
     113 * [wiki:Documentation/Usage Usage]
    12114
    13115== Source code ==