Opened 16 years ago

Closed 11 years ago

#62 closed defect (worksforme)

job archive question

Reported by: jsarlo@… Owned by: somebody
Priority: normal Milestone: 1.0
Component: jobarchived Version: 0.3
Keywords: Cc:
Estimated Number of Hours:

Description

Question about setting up job archive. If I have jobmond.py running on multiple clusters, but the data is only viewable through the web on a central repository web site, do I have to set up the archive database on each of the clusters or just on the central repository server? Also where would I run the two .py scripts - on each cluster or just the central one?

Thanks. Jeff

Change History (8)

comment:1 Changed 16 years ago by ramonb

  • Cc jsarlo@… added

Archiving multiple clusters in one jobarchived should work in theory, but it hasn't been tested thoroughly. You can run jobarchived on any machine that has read access to your gmetad ("trusted hosts" in gmetad.conf).

comment:2 Changed 16 years ago by ramonb

Also: the SQL database does not have to be on the same machine as jobarchived.

The archived RRD files created by jobarchived do need to be accessible to the web frontend, so that probably means you have to run it on your Ganglia web server.

comment:3 Changed 16 years ago by jsarlo@…

OK - I put all the requirements for jobarchive on my web server, where gmetad runs for all the clusters. I got the following when I tried to start jobarchived.py:

Starting jobarchived.py: Traceback (most recent call last):
  File "/opt/jobmonarch/sbin/jobmond.py", line 814, in <module>
    main()
  File "/opt/jobmonarch/sbin/jobmond.py", line 772, in main
    if not processArgs( sys.argv[1:] ):
  File "/opt/jobmonarch/sbin/jobmond.py", line 76, in processArgs
    return loadConfig( config_filename )
  File "/opt/jobmonarch/sbin/jobmond.py", line 131, in loadConfig
    BATCH_SERVER = cfg.get( 'DEFAULT', 'TORQUE_SERVER' )
  File "/usr/local/lib/python2.5/ConfigParser.py", line 520, in get
    raise NoOptionError(option, section)
ConfigParser.NoOptionError: No option 'torque_server' in section: 'DEFAULT'
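
For reference, this kind of NoOptionError is what the config parser raises when the requested option is absent from the section, for example because the option name is missing or misspelled in the config file. A minimal sketch, assuming Python 3 (where the module is named configparser, not ConfigParser as on the Python 2.5 in the traceback). The helper name read_batch_server and the sample config strings are made up for illustration and are not part of jobmond.py:

```python
import configparser  # called ConfigParser on Python 2.x


def read_batch_server(config_text):
    """Return the TORQUE_SERVER value, or None if the option is missing.

    Hypothetical helper for illustration; not taken from jobmond.py.
    """
    cfg = configparser.ConfigParser()
    cfg.read_string(config_text)
    try:
        # Option names are lower-cased by default, which is why the
        # error message reports 'torque_server' rather than 'TORQUE_SERVER'.
        return cfg.get('DEFAULT', 'TORQUE_SERVER')
    except configparser.NoOptionError:
        return None


# Assumed sample configs: one correct, one with a misspelled option name.
good_config = "[DEFAULT]\nTORQUE_SERVER = localhost\n"
typo_config = "[DEFAULT]\nTORQUE_SERVR = localhost\n"

if __name__ == "__main__":
    print(read_batch_server(good_config))
    print(read_batch_server(typo_config))
```

Checking the config file for the exact option name that the message reports is usually the quickest way to track this down.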

comment:4 Changed 16 years ago by jsarlo@…

Sorry, I had a typo which caused the error. I fixed that typo and got a new error, but I will work on that for a while.

comment:5 Changed 16 years ago by jsarlo@…

jobarchived lists python-pgsql v4.x.x as a requirement. All I could find was python-pgsql v2.5.1, which I installed. When I try to start up jobarchived.py, I get:

Starting jobarchived.py: FATAL ERROR: pyPgSQL python module not found

When I ran python setup.py install for python-pgsql, it installed the module into the site-packages directory under /usr/local/lib/python2.5.

Any ideas what I need to do to get it to find the module?

Thanks. Jeff
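
One way to narrow down a "module not found" error like this is to ask the interpreter that runs the script where it searches for modules and whether it can locate the one in question. A minimal sketch, assuming Python 3 (importlib.util.find_spec does not exist on Python 2.5, where a try/except ImportError around the import would serve the same purpose). The helper name module_visible is made up for illustration:

```python
import importlib.util
import sys


def module_visible(name):
    """Return True if this interpreter can locate a top-level module by name."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Which interpreter is running, and which directories it searches.
    print("interpreter:", sys.executable)
    for entry in sys.path:
        print("search path:", entry)
    # The module name is taken from the error message above.
    print("pyPgSQL importable:", module_visible("pyPgSQL"))
```

If the site-packages directory that setup.py installed into does not show up in the printed search path, the script is probably being run by a different Python than the one used for the install.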

comment:6 Changed 16 years ago by jsarlo@…

Any new ideas on how I can get jobarchive to run?

Thanks. Jeff

comment:7 Changed 14 years ago by mike.scchen@…

By default, Python does not look in /usr/local.

Just copy /usr/local/lib (i386 Linux?), including sub-directories, into /usr/lib, and it should work.
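
As a possible alternative to copying files, the install directory can be added to the module search path, either with the PYTHONPATH environment variable or from code before the import. A minimal sketch of the in-code variant, assuming the site-packages path mentioned earlier in this ticket. The helper name ensure_on_path is made up for illustration and is not part of jobarchived:

```python
import sys


def ensure_on_path(directory, path=None):
    """Append directory to the module search path if it is not already there.

    Returns True if the directory was added, False if it was already present.
    The path argument defaults to sys.path and exists only to make the
    helper easy to try out on a plain list.
    """
    if path is None:
        path = sys.path
    if directory in path:
        return False
    path.append(directory)
    return True


if __name__ == "__main__":
    # Assumed location, taken from the install path reported in comment 5.
    added = ensure_on_path("/usr/local/lib/python2.5/site-packages")
    print("added to sys.path:", added)
```

This only affects the process that runs it, so it would have to happen before the script tries to import pyPgSQL.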

comment:8 Changed 11 years ago by ramonb

  • Cc jsarlo@… removed
  • Milestone set to 1.0
  • Resolution set to worksforme
  • Status changed from new to closed