Custom Query (101 matches)
Results (37 - 39 of 101)
Ticket | Resolution | Summary | Owner | Reporter |
---|---|---|---|---|
#279 | fixed | patch for SLURM usage | somebody | mrobbert@… |
Description |
related to recent changes in the pyslurm library that jobmond.py uses to interface with Slurm. It was recently reworked to bring it up to date with the most recent Slurm API, and along the way they seem to have changed some of their data structures. Below is a patch I used to get it to run at our site. I hope this helps.

--- jobmond/jobmond.py 2014-01-20 09:24:08.000000000 -0700
+++ /usr/local/sbin/jobmond.py 2014-12-16 17:22:00.501223234 -0700
@@ -1306,7 +1306,7 @@
 for node, attrs in slurm_nodes.items():
- ( num_state, name_state ) = attrs['node_state']
+ name_state = attrs['node_state']
 if name_state == 'DOWN':
@@ -1371,7 +1371,7 @@
 else:
 ppn = min_cpus
- ( something, status_long ) = self.getAttr( attrs, 'job_state' )
+ status_long = self.getAttr( attrs, 'job_state' )
 status = 'Q'

Thanks,
Mike Robbert
HPC Engineer
Colorado School of Mines |
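For sites that need to run against both older and newer pyslurm releases, the change in the patch above could also be handled with a small compatibility helper. This is only a sketch, assuming (as the patch suggests) that older pyslurm returned a `( number, name )` pair and newer versions return the name directly. The helper name `state_name` and the sample dictionaries are hypothetical and are not part of jobmond.py or pyslurm.

```python
def state_name(value):
    """Return the textual state from either assumed pyslurm layout.

    Assumption: old layout is a ( num_state, name_state ) pair,
    new layout is the state name on its own.
    """
    if isinstance(value, (tuple, list)):
        # Old layout: the name is the last element of the pair
        return value[-1]
    # New layout: the value already is the name
    return value


# Hypothetical sample data for illustration, not real pyslurm output
old_style_attrs = {'node_state': (1, 'DOWN')}
new_style_attrs = {'node_state': 'DOWN'}

old_name = state_name(old_style_attrs['node_state'])
new_name = state_name(new_style_attrs['node_state'])
```

With a helper like this, the two patched lines would call `state_name( attrs['node_state'] )` and `state_name( self.getAttr( attrs, 'job_state' ) )` instead of unpacking a tuple.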
|||
#169 | fixed | sorting (by node) does not work on web | sil | ramonb |
Description |
Thanks to Craig West for discovering this: Clicking the nodes column header should set descending/ascending sorting based upon hostname. This does not seem to work (correctly). |
|||
#24 | worksforme | SGE support broken | ramonb | bastiaans |
Description |
After going through the instructions, I attempted to execute jobmond.py. When I did that, I received the following error message:

cluster1:/usr/local/sbin # /usr/local/sbin/jobmond.py -c /etc/jobmond.conf
Traceback (most recent call last):
  File "/usr/local/sbin/jobmond.py", line 814, in ?
    main()
  File "/usr/local/sbin/jobmond.py", line 807, in main
    gather.daemon()
UnboundLocalError: local variable 'gather' referenced before assignment

An examination of the code reveals that the SGE data gathering code was commented out on line 792. Uncommenting it had the following effect:

cluster1:/usr/local/sbin # /usr/local/sbin/jobmond.py -c /etc/jobmond.conf
  File "/usr/local/sbin/jobmond.py", line 797
    debug_msg( 0, "fatal error: BATCH_API set to 'sge' but python module 'sge_drmaa' is not installed' )
    ^
SyntaxError: EOL while scanning single-quoted string

Commenting out everything but "gather = SgeDataGatherer()" gave me the following error:

cluster1:/usr/local/sbin # /usr/local/sbin/jobmond.py -c /etc/jobmond.conf
Traceback (most recent call last):
  File "/usr/local/sbin/jobmond.py", line 814, in ?
    main()
  File "/usr/local/sbin/jobmond.py", line 800, in main
    gather = SgeDataGatherer()
  File "/usr/local/sbin/jobmond.py", line 419, in __init__
    self.initSgeJobInfo()
  File "/usr/local/sbin/jobmond.py", line 426, in initSgeJobInfo
    self.qstatparser = SgeQstatXMLParser( SGE_QSTAT_XML_FILE )
NameError: global name 'SGE_QSTAT_XML_FILE' is not defined

At this point, I decided to search my systems for references to drmaa. I saw several references to C++ example and header files related to it. Is the sge_drmaa module supposed to be provided by Job Monarch or Sun Grid Engine? |
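The first two errors above look like generic Python problems rather than anything SGE-specific. `gather` appears to be assigned only inside per-batch-system branches, so with the SGE branch commented out nothing assigns it before `gather.daemon()` runs. The `debug_msg` line opens its string with `"` but closes it with `'`. The sketch below shows one way to avoid both. It is an illustration under those assumptions, not the actual jobmond.py code: `select_gatherer` and the `gatherers` mapping are hypothetical names.

```python
def select_gatherer(batch_api, gatherers):
    """Return a gatherer instance for batch_api, or fail with a clear message.

    gatherers is a hypothetical dict mapping a batch system name to a
    zero-argument factory, standing in for the branches in main().
    """
    try:
        factory = gatherers[batch_api]
    except KeyError:
        # The message uses matching double quotes around the whole string,
        # so the inner single quotes do not terminate it.
        raise ValueError(
            "fatal error: BATCH_API set to '%s' but no data gatherer "
            "is available for it" % batch_api
        )
    return factory()


# Hypothetical usage: only a 'pbs' gatherer is registered here
available = {'pbs': lambda: 'pbs-gatherer'}
gather = select_gatherer('pbs', available)
```

Because the lookup either returns a gatherer or raises, the caller can never reach `gather.daemon()` with `gather` unassigned.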