Custom Query (101 matches)

Results (52 - 54 of 101)

Ticket Resolution Summary Owner Reporter
#76 worksforme jobarchived does not change status to "F" ramonb j.kasiak@…
Description

Jobarchived does not update a job's status to "F" once it finishes. Jobmond runs on the head node; gmetad runs on a separate box. I've narrowed down the problem. When I run this on my gmetad box:

telnet -l ganglia localhost 8651 | grep -i monarch | grep -i 23055

<METRIC NAME="MONARCH-JOB-23055-0" VAL="status=R start_timestamp=1269222985 name=STDIN poll_interval=30 queue=batch reported=1269223164 requested_time=100:00:00 queued_timestamp=1269222984 owner=user1 nodes=p340050" TYPE="string" UNITS="" TN="442" TMAX="60" DMAX="0" SLOPE="both" SOURCE="gmond">
Connection closed by foreign host.

The job is still there!!! Only a restart of gmetad clears it. This is a problem, since jobarchived parses this XML, keeps the node in its array of active nodes, and so never gets to set the job_status to "F".
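To make the symptom easier to check, here is a minimal Python sketch that parses one such METRIC element and flags it as possibly stale. It is not jobarchived's actual code. The helper names `parse_job_metric` and `looks_stale` and the `factor` threshold are hypothetical, and the sample element is closed with `/>` so it parses on its own. It assumes, as I understand Ganglia's semantics, that TN is the number of seconds since the metric was last updated and that DMAX="0" means the metric is never expired.

```python
import xml.etree.ElementTree as ET

# Sample METRIC element copied from the telnet output above; the trailing
# "/>" is added here only so the element parses as a standalone document.
SAMPLE = (
    '<METRIC NAME="MONARCH-JOB-23055-0" VAL="status=R '
    'start_timestamp=1269222985 name=STDIN poll_interval=30 queue=batch '
    'reported=1269223164 requested_time=100:00:00 '
    'queued_timestamp=1269222984 owner=user1 nodes=p340050" TYPE="string" '
    'UNITS="" TN="442" TMAX="60" DMAX="0" SLOPE="both" SOURCE="gmond"/>'
)


def parse_job_metric(xml_text):
    """Parse one METRIC element into a dict (hypothetical helper)."""
    elem = ET.fromstring(xml_text)
    # VAL is a space-separated list of key=value pairs. Splitting on
    # whitespace assumes that no value (e.g. a job name) contains spaces.
    fields = dict(pair.split("=", 1) for pair in elem.get("VAL").split())
    return {
        "name": elem.get("NAME"),
        "fields": fields,
        "tn": int(elem.get("TN")),
        "tmax": int(elem.get("TMAX")),
        "dmax": int(elem.get("DMAX")),
    }


def looks_stale(metric, factor=4):
    """Hypothetical heuristic: the metric has not been refreshed for more
    than factor * TMAX seconds. The factor of 4 is an arbitrary choice."""
    return metric["tn"] > factor * metric["tmax"]


if __name__ == "__main__":
    metric = parse_job_metric(SAMPLE)
    print(metric["name"], metric["fields"]["status"], looks_stale(metric))
```

For the element above, TN (442) is well past 4 × TMAX (240), so the heuristic flags it, while DMAX="0" would explain why it is never dropped on its own.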

How can I fix this? Thanks, Jan

#160 fixed version 1.0 ramonb ramonb
Description

Work in progress to get Job Monarch working again with new Ganglia.

Should result in release: 1.0

Now have a semi-working setup:

Tested with Ganglia 3.40 and Ganglia-web2 3.5.6

Already fixed a lot, but still some stuff to do.

TODO:

  • Still a little slow
  • update web templates to ganglia-web2 html/css
  • fix running/queued jobs graph
  • rename metrics so they sort lower in the alphabetical metrics list
  • fix job arrays
  • check archive
  • address (missing) jobrange/jobstart line for RRDs in overview
#161 fixed xml parsetime not set in html ramonb ramonb
Description

When viewing the cluster overview job report, the bottom of the page says:

Downloading and parsing ganglia's XML tree took 0.0000s.

That is not correct.
