Custom Query (101 matches)

Results (1 - 3 of 101)

Ticket Resolution Summary Owner Reporter
#24 worksforme SGE support broken ramonb bastiaans
Description
After going through the instructions, I attempted to execute
jobmond.py. When I did that, I received the following error message:
cluster1:/usr/local/sbin # /usr/local/sbin/jobmond.py -c /etc/jobmond.conf
Traceback (most recent call last):
 File "/usr/local/sbin/jobmond.py", line 814, in ?
   main()
 File "/usr/local/sbin/jobmond.py", line 807, in main
   gather.daemon()
UnboundLocalError: local variable 'gather' referenced before assignment

An examination of the code reveals that the SGE data gathering code
was commented out on line 792. Uncommenting it had the following
effect:
cluster1:/usr/local/sbin # /usr/local/sbin/jobmond.py -c /etc/jobmond.conf
 File "/usr/local/sbin/jobmond.py", line 797
   debug_msg( 0, "fatal error: BATCH_API set to 'sge' but python module 'sge_drmaa' is not installed' )

                               ^
SyntaxError: EOL while scanning single-quoted string

Commenting out everything but "gather = SgeDataGatherer()" gave me the
following error:
cluster1:/usr/local/sbin # /usr/local/sbin/jobmond.py -c /etc/jobmond.conf
Traceback (most recent call last):
 File "/usr/local/sbin/jobmond.py", line 814, in ?
   main()
 File "/usr/local/sbin/jobmond.py", line 800, in main
   gather = SgeDataGatherer()
 File "/usr/local/sbin/jobmond.py", line 419, in __init__
   self.initSgeJobInfo()
 File "/usr/local/sbin/jobmond.py", line 426, in initSgeJobInfo
   self.qstatparser = SgeQstatXMLParser( SGE_QSTAT_XML_FILE )
NameError: global name 'SGE_QSTAT_XML_FILE' is not defined

At this point, I decided to search my systems for references to drmaa.
I saw several references to C++ example and header files related to
it. Is the sge_drmaa module supposed to be provided by Job Monarch or
Sun Grid Engine? 
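
The first traceback is consistent with `gather` being bound only inside a branch that was commented out, so `main()` reaches `gather.daemon()` with no value assigned. Below is a minimal sketch of that failure mode and of a fail-fast alternative. The names `broken_main`, `choose_gatherer`, and the `gatherers` mapping are hypothetical and are not taken from jobmond.py.

```python
def broken_main(batch_api):
    # Mirrors the failure mode: 'gather' is only bound inside one branch.
    if batch_api == "pbs":
        gather = object()
    # elif batch_api == "sge":    # branch commented out, as in the report
    #     gather = object()
    return gather  # UnboundLocalError when batch_api is "sge"


def choose_gatherer(batch_api, gatherers):
    """Return a gatherer for batch_api, or raise a clear error up front.

    'gatherers' is a hypothetical mapping of API name -> zero-argument factory.
    """
    try:
        factory = gatherers[batch_api]
    except KeyError:
        raise ValueError("unsupported BATCH_API: %r" % (batch_api,))
    return factory()
```

With a lookup like this, an unsupported or disabled batch API produces an explicit message at startup instead of an UnboundLocalError later in `main()`.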
#45 worksforme jobarchived storage threads can't be stopped if they take too long ramonb bastiaans
Description

jobarchived's storage threads need a way to be stopped when they take too long. Otherwise too many storage threads may be started, since stalled ones are not killed correctly.

Also see ticket #34
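
Python threads cannot be killed from outside, so one common approach is cooperative cancellation: the worker checks a stop flag between units of work, and the caller sets that flag once a deadline passes. The sketch below illustrates the idea only. `StoppableWorker`, `run_with_deadline`, and the `step` callback are hypothetical names, not part of jobarchived.

```python
import threading


class StoppableWorker(threading.Thread):
    """Worker that processes items one by one and can be asked to stop."""

    def __init__(self, work_items, step):
        super().__init__()
        self.daemon = True
        self._stop_requested = threading.Event()
        self._work_items = list(work_items)
        self._step = step          # callable applied to each item
        self.processed = 0

    def stop(self):
        # Only a request: the thread exits at its next check of the flag.
        self._stop_requested.set()

    def run(self):
        for item in self._work_items:
            if self._stop_requested.is_set():
                return
            self._step(item)
            self.processed += 1


def run_with_deadline(worker, timeout):
    """Start worker; return True if it finished within timeout seconds."""
    worker.start()
    worker.join(timeout)
    if worker.is_alive():
        worker.stop()
        worker.join()   # waits for the step in progress to finish
        return False
    return True
```

The limitation is that a single step that blocks indefinitely still cannot be interrupted; the flag is only checked between steps.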

#53 worksforme Error trying to run jobarchive ramonb mhanafi@…
Description

It looks like it doesn't find all the hosts and gives the following error. I have tried versions 0.3.1 and 0.4.

[root@aphrodite-adm jobarchived]# python jobarchived.py 
Mon 17 Mar 2008 15:37:36 - Checking database..
Mon 17 Mar 2008 15:37:36 - Check done.
Mon 17 Mar 2008 15:37:36 - Checking rrd archive..
Mon 17 Mar 2008 15:37:36 - Check done.
Mon 17 Mar 2008 15:37:36 - torque_xml_thread(): started.
Mon 17 Mar 2008 15:37:36 - torque_xml_thread(): Retrieving XML data..
Mon 17 Mar 2008 15:37:36 - torque_xml_thread(): Done retrieving.
Mon 17 Mar 2008 15:37:36 - ganglia_parse_thread(): Parsing XML..
Mon 17 Mar 2008 15:37:36 - main threading started.
Mon 17 Mar 2008 15:37:36 - XML: Processed 1492 elements - found 1 (updated) jobs
Mon 17 Mar 2008 15:37:36 - ganglia_xml_thread(): started.
Mon 17 Mar 2008 15:37:36 - ganglia_xml_thread(): Sleeping.. (15s)
Mon 17 Mar 2008 15:37:36 - torque_xml_thread(): Storing..
Mon 17 Mar 2008 15:37:36 - ganglia_parse_thread(): started.
Mon 17 Mar 2008 15:37:36 - ganglia_parse_thread(): Retrieving XML data..
Mon 17 Mar 2008 15:37:36 - ganglia_parse_thread(): Done retrieving.
Mon 17 Mar 2008 15:37:36 - ganglia_parse_thread(): Parsing XML..
Mon 17 Mar 2008 15:37:36 - ganglia_store_metric_thread(): started.
Mon 17 Mar 2008 15:37:36 - ganglia_store_metric_thread(): Storing data..
Mon 17 Mar 2008 15:37:36 - ganglia_store_thread(): started.
Mon 17 Mar 2008 15:37:36 - ganglia_store_thread(): Sleeping.. (360s)
Mon 17 Mar 2008 15:37:36 - Entering storeMetrics()
Mon 17 Mar 2008 15:37:36 - size of cluster 'aphrodite': 3 hosts 71 metrics 71 values 1027 bits 128 bytes 
Exception in thread store_metric_thread:
Traceback (most recent call last):
  File "/usr/lib64/python2.4/threading.py", line 442, in __bootstrap
    self.run()
  File "/usr/lib64/python2.4/threading.py", line 422, in run
    self.__target(*self.__args, **self.__kwargs)
  File "jobarchived.py", line 1378, in storeThread
    ret = self.myXMLHandler.storeMetrics()
  File "jobarchived.py", line 1104, in storeMetrics
    ret = rrdh.storeMetrics()
  File "jobarchived.py", line 1752, in storeMetrics
    create_ret = self.createCheck( hostname, metricname, period )
  File "jobarchived.py", line 1891, in createCheck
    heartbeat   = 8 * int( interval )
TypeError: int() argument must be a string or a number

Mon 17 Mar 2008 15:37:36 - ganglia_parse_thread(): Done parsing.
Mon 17 Mar 2008 15:37:36 - ganglia_parse_thread(): finished.
Mon 17 Mar 2008 15:37:36 - torque_xml_thread(): Done storing.
Mon 17 Mar 2008 15:37:36 - ganglia_parse_thread(): Done parsing.
Mon 17 Mar 2008 15:37:36 - torque_xml_thread(): Sleeping.. (15s)
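
The TypeError on `int( interval )` indicates that `interval` was neither a string nor a number when `createCheck` ran. The traceback does not show the actual value; `None` is one possibility. Below is a hedged sketch of a defensive conversion. The function name `heartbeat_for` and the fallback of 15 seconds are assumptions chosen for illustration, not values from jobarchived.py.

```python
def heartbeat_for(interval, default_interval=15):
    """Return 8 * interval, using default_interval if interval is unusable.

    default_interval is an arbitrary illustrative fallback.
    """
    try:
        seconds = int(interval)
    except (TypeError, ValueError):
        # TypeError: e.g. None; ValueError: e.g. a non-numeric string.
        seconds = default_interval
    return 8 * seconds
```

Whether falling back silently is appropriate depends on the data; logging the offending host and metric before substituting a default would make the underlying cause easier to find.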