
Version 9 (modified by ramonb, 8 years ago)

--

Configuration

After installation each component requires additional configuration.

jobmond

Here is an example of the contents of a typical jobmond.conf file:

[DEFAULT]
# Specify debugging level here;
#
# 10 = gmetric commands
#
DEBUG_LEVEL     : 0

# Whether or not to run as a daemon in background
#
DAEMONIZE       : 1

# What Batch type is the system
# 
# Currently supported: pbs, slurm, sge (experimental), lsf (experimental)
#
BATCH_API       : pbs

# Which Batch server to monitor
#
BATCH_SERVER        : localhost

# Which queue(s) to report jobs of
# (optional)
#
#QUEUE          : long, short

# How many seconds interval for polling of jobs
#
# this directly affects how accurately the
# end time of a job can be determined
#
BATCH_POLL_INTERVAL : 30

# Location of gmond.conf
#
# Default: /etc/gmond.conf
#
# DEPRECATED!:      use GMETRIC_TARGET!
#
#GMOND_CONF     : /etc/gmond.conf

# Location of gmetric binary
#
# Default: /usr/bin/gmetric
#
# DEPRECATED!:      use GMETRIC_TARGET!
#
#GMETRIC_BINARY     : /usr/bin/gmetric

# Target of Gmetric's: where should we report to
# (usually: your udp_send_channel from gmond)
#
# Syntax: <ip>:<port>
#
GMETRIC_TARGET      : 239.2.11.71:8649

# Enable logging to syslog?
#
USE_SYSLOG                      : 1
# What level of messages should be logged to syslog?
#
# usually: lvl 0 (errors)
#
SYSLOG_LEVEL                    : 0

# Which facility to use in syslog
#
# Known:
#       KERN, USER, MAIL, DAEMON, AUTH, LPR,
#       NEWS, UUCP, CRON and LOCAL0 through LOCAL7
#
SYSLOG_FACILITY                 : DAEMON


# Whether or not to detect differences in
# time from Torque server and local time.
#
# Ideally both machines (if not the same)
# should have the same time (via ntp or whatever)
#
DETECT_TIME_DIFFS   : 1

# Regexp style hostname translation
#
# Useful if your Batch hostnames are not the same as your
# Ganglia hostnames (different network interfaces)
#
# Syntax: /orig/new/, /orig/new/
#
BATCH_HOST_TRANSLATE    :
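
As a rough illustration of the syntax shown above, the sketch below reads a jobmond.conf-style file with Python's standard configparser module and interprets the GMETRIC_TARGET and BATCH_HOST_TRANSLATE values. It is not jobmond's own parsing code: the function names (load_settings, parse_translate_rules, translate_host) and the dictionary keys are assumptions made for this example only.

```python
import configparser
import re

# Trimmed copy of the example file above (comments omitted).
SAMPLE = """\
[DEFAULT]
DEBUG_LEVEL         : 0
DAEMONIZE           : 1
BATCH_API           : pbs
BATCH_SERVER        : localhost
BATCH_POLL_INTERVAL : 30
GMETRIC_TARGET      : 239.2.11.71:8649
BATCH_HOST_TRANSLATE    :
"""


def parse_translate_rules(value):
    """Turn '/orig/new/, /orig/new/' into a list of (pattern, replacement).

    Simplification (assumption): patterns containing '/' or ',' are
    not supported by this sketch.
    """
    rules = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        parts = item.split("/")
        if len(parts) != 4 or parts[0] or parts[3]:
            raise ValueError("expected /orig/new/, got %r" % item)
        rules.append((parts[1], parts[2]))
    return rules


def translate_host(hostname, rules):
    """Apply each regexp rule, in order, to a batch hostname."""
    for pattern, replacement in rules:
        hostname = re.sub(pattern, replacement, hostname)
    return hostname


def load_settings(text):
    """Parse jobmond.conf-style text into a dict of typed values."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    cfg = parser["DEFAULT"]
    # Syntax is <ip>:<port>; split on the last colon.
    ip, _, port = cfg["GMETRIC_TARGET"].rpartition(":")
    return {
        "debug_level": cfg.getint("DEBUG_LEVEL"),
        "daemonize": cfg.getboolean("DAEMONIZE"),
        "batch_api": cfg["BATCH_API"],
        "poll_interval": cfg.getint("BATCH_POLL_INTERVAL"),
        "gmetric_target": (ip, int(port)),
        "host_translate": parse_translate_rules(
            cfg.get("BATCH_HOST_TRANSLATE", "")
        ),
    }


settings = load_settings(SAMPLE)
print(settings["gmetric_target"])
```

With a hypothetical rule such as /-ib$/-eth/, translate_host would map a batch hostname ending in -ib to the same name ending in -eth, which is the kind of interface-name mismatch the BATCH_HOST_TRANSLATE comment describes.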

DEBUG_LEVEL

  • required
  • valid values: any number between 0 - 20

Sets which level of messages is logged to syslog (in daemon mode) or printed to stdout (in foreground mode).

DAEMONIZE

  • required
  • valid values: 0 or 1
    • 0 : Don't daemonize: run in the foreground : any DEBUG_LEVEL messages are sent to stdout
    • 1 : Daemonize: run in the background : any DEBUG_LEVEL messages are sent to syslog

Determines whether or not jobmond should run as a daemon in the background.
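
The interaction between DEBUG_LEVEL and DAEMONIZE can be pictured with a small sketch. It assumes that a message is emitted when its level is at or below DEBUG_LEVEL; that threshold rule and the function name message_destination are assumptions for illustration, not a description of jobmond's internals.

```python
def message_destination(msg_level, debug_level, daemonize):
    """Return where a debug message would go, or None if suppressed.

    Assumption: messages with a level above DEBUG_LEVEL are dropped.
    """
    if msg_level > debug_level:
        return None
    # DAEMONIZE 1 -> syslog, DAEMONIZE 0 -> stdout (as listed above).
    return "syslog" if daemonize else "stdout"


for daemonize in (0, 1):
    for level in (0, 10):
        dest = message_destination(level, debug_level=5, daemonize=daemonize)
        print(daemonize, level, dest)
```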

BATCH_API

  • valid values: pbs, slurm, sge (experimental), lsf (experimental)

The type of batch system (API) in use.

BATCH_SERVER

  • optional
  • valid values: any text string

Tells jobmond whether to connect to a remote batch server (of type BATCH_API).

  • If set: connect with BATCH_API to BATCH_SERVER
  • If not set: use BATCH_API on the local system where jobmond is running (which should be the batch server)

QUEUE

  • optional
  • valid values: any text string or comma-separated list

If you would like to limit job reporting to certain queues only, you can specify them here.

If set: only jobs that reside in one of the listed queues are reported.
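
A minimal sketch of how a comma-separated QUEUE value could be applied as a filter. The job records here are plain dictionaries with a hypothetical "queue" key, and the function names are made up for this example; jobmond's real data structures may differ.

```python
def parse_queue_list(value):
    """Split 'long, short' into ['long', 'short'], ignoring blanks."""
    return [q.strip() for q in value.split(",") if q.strip()]


def filter_jobs(jobs, queues):
    """Keep only jobs in the given queues; an empty list means no limit."""
    if not queues:
        return list(jobs)
    wanted = set(queues)
    return [job for job in jobs if job["queue"] in wanted]


# Hypothetical job records for illustration.
jobs = [
    {"id": "101", "queue": "long"},
    {"id": "102", "queue": "express"},
    {"id": "103", "queue": "short"},
]
reported = filter_jobs(jobs, parse_queue_list("long, short"))
print([job["id"] for job in reported])
```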

BATCH_POLL_INTERVAL

  • required
  • valid values: any number (of seconds)

Sets how often jobmond will poll the BATCH_API and how often this info will be reported.

This directly affects how accurately jobarchived can detect finished jobs. For example: if this is set to 180 seconds and a job has finished, it may take jobarchived up to 180 seconds to set a finish time in the job database.
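
The effect of the poll interval on the recorded finish time can be worked through with simple arithmetic. The sketch below assumes polls happen at fixed multiples of the interval starting at time 0, and that a job finishing exactly at a poll is noticed by that poll; detection_time is a hypothetical helper, not part of Job Monarch.

```python
def detection_time(finish_time, poll_interval, first_poll=0):
    """Return the time of the first poll at or after finish_time.

    All values are whole seconds.
    """
    if finish_time <= first_poll:
        return first_poll
    elapsed = finish_time - first_poll
    polls = -(-elapsed // poll_interval)  # ceiling division on integers
    return first_poll + polls * poll_interval


# A job that really finishes 200 seconds after the first poll.
for interval in (30, 180):
    seen_at = detection_time(200, interval)
    print(interval, seen_at, seen_at - 200)
```

With a 180 second interval the job is first seen as finished at 360 seconds, 160 seconds late; with a 30 second interval it is seen at 210 seconds, 10 seconds late.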

jobarchived

  1. Edit jobarchived's config to reflect your settings:
    vi /etc/jobarchived.conf
    

(See the comments in the config file for syntax and explanation.)

web

  1. Change your Ganglia web template to Job Monarch:
    vi /var/www/ganglia/conf.php
    
    $template_name = "job_monarch";
    
  2. Change Job Monarch's config to reflect your settings:
    vi /var/www/ganglia/addons/job_monarch/conf.php
    

(See the comments in the config file for syntax and explanation.)