Custom Query (101 matches)

Results (1 - 3 of 101)

Ticket Resolution Summary Owner Reporter
#163 fixed Exception in thread store_metric_thread ramonb vitt@…
Description

I have tried version 1.0. The job archive is available in the web interface and shows the list of archived jobs, but no metrics are stored.

[root@master ~]# service jobarchived start
Starting Job Archiving Daemon: Sun 05 May 2013 23:53:00 - XML: Handler created
Sun 05 May 2013 23:53:00 - Checking database..
Sun 05 May 2013 23:53:00 - Check done.
Sun 05 May 2013 23:53:00 - Checking rrd archive..
Sun 05 May 2013 23:53:00 - Check done.
Sun 05 May 2013 23:53:00 - job_xml_thread(): started.
Sun 05 May 2013 23:53:00 - job_xml_thread(): Retrieving XML data..
Sun 05 May 2013 23:53:00 - job_xml_thread(): Done retrieving: data size 37183
Sun 05 May 2013 23:53:00 - job_xml_thread(): Parsing XML..
Sun 05 May 2013 23:53:00 - main threading started.
Sun 05 May 2013 23:53:00 - ganglia_xml_thread(): started.
Sun 05 May 2013 23:53:00 - ganglia_xml_thread(): Sleeping.. (15s)
Sun 05 May 2013 23:53:00 - ganglia_parse_thread(): started.
Sun 05 May 2013 23:53:00 - ganglia_parse_thread(): Retrieving XML data..
Sun 05 May 2013 23:53:00 - ganglia_parse_thread(): Done retrieving: data size 37183
Sun 05 May 2013 23:53:00 - ganglia_parse_thread(): Parsing XML..
Sun 05 May 2013 23:53:00 - ganglia_store_metric_thread(): started.
Sun 05 May 2013 23:53:00 - ganglia_store_metric_thread(): Storing data..
Sun 05 May 2013 23:53:00 - ganglia_store_thread(): started.
Sun 05 May 2013 23:53:00 - Entering storeMetrics()
Sun 05 May 2013 23:53:00 - size of cluster 'Test Cluster': 0 hosts 0 metrics 0 values 0 bits 0 bytes 
Sun 05 May 2013 23:53:00 - ganglia_store_thread(): Sleeping.. (60s)
Sun 05 May 2013 23:53:00 - Leaving storeMetrics()
Sun 05 May 2013 23:53:00 - ganglia_store_metric_thread(): Done storing.
Sun 05 May 2013 23:53:00 - ganglia_store_metric_thread(): finished.
Sun 05 May 2013 23:53:00 - XML: Start document
Sun 05 May 2013 23:53:00 - XML: Processed 518 elements - found 0 jobs
Sun 05 May 2013 23:53:00 - job_xml_thread(): Found 0 updated jobs.
Sun 05 May 2013 23:53:00 - job_xml_thread(): No jobs to store.
Sun 05 May 2013 23:53:00 - job_xml_thread(): Done parsing.
Sun 05 May 2013 23:53:00 - job_xml_thread(): Sleeping.. (15s)
Sun 05 May 2013 23:53:00 - ganglia_parse_thread(): Done parsing.
Sun 05 May 2013 23:53:00 - ganglia_parse_thread(): finished.
Sun 05 May 2013 23:53:15 - ganglia_xml_thread(): Done sleeping.
Sun 05 May 2013 23:53:15 - ganglia_xml_thread(): finished.
Sun 05 May 2013 23:53:15 - ganglia_parse_thread(): started.
Sun 05 May 2013 23:53:15 - ganglia_parse_thread(): Retrieving XML data..
Sun 05 May 2013 23:53:15 - ganglia_xml_thread(): started.
Sun 05 May 2013 23:53:15 - ganglia_xml_thread(): Sleeping.. (15s)
Sun 05 May 2013 23:53:15 - ganglia_parse_thread(): Done retrieving: data size 37196
Sun 05 May 2013 23:53:15 - ganglia_parse_thread(): Parsing XML..
Sun 05 May 2013 23:53:15 - job_xml_thread(): Retrieving XML data..
Sun 05 May 2013 23:53:15 - job_xml_thread(): Done retrieving: data size 37196
Sun 05 May 2013 23:53:15 - job_xml_thread(): Parsing XML..
Sun 05 May 2013 23:53:15 - XML: Start document
Sun 05 May 2013 23:53:15 - XML: Processed 518 elements - found 0 jobs
Sun 05 May 2013 23:53:15 - job_xml_thread(): Found 0 updated jobs.
Sun 05 May 2013 23:53:15 - job_xml_thread(): No jobs to store.
Sun 05 May 2013 23:53:15 - job_xml_thread(): Done parsing.
Sun 05 May 2013 23:53:15 - job_xml_thread(): Sleeping.. (15s)
Sun 05 May 2013 23:53:15 - ganglia_parse_thread(): Done parsing.
Sun 05 May 2013 23:53:15 - ganglia_parse_thread(): finished.
Sun 05 May 2013 23:53:30 - ganglia_xml_thread(): Done sleeping.
Sun 05 May 2013 23:53:30 - ganglia_xml_thread(): finished.
Sun 05 May 2013 23:53:30 - ganglia_xml_thread(): started.
Sun 05 May 2013 23:53:30 - ganglia_xml_thread(): Sleeping.. (15s)
Sun 05 May 2013 23:53:30 - ganglia_parse_thread(): started.
Sun 05 May 2013 23:53:30 - ganglia_parse_thread(): Retrieving XML data..
Sun 05 May 2013 23:53:30 - ganglia_parse_thread(): Done retrieving: data size 37194
Sun 05 May 2013 23:53:30 - ganglia_parse_thread(): Parsing XML..
Sun 05 May 2013 23:53:30 - job_xml_thread(): Retrieving XML data..
Sun 05 May 2013 23:53:30 - job_xml_thread(): Done retrieving: data size 37194
Sun 05 May 2013 23:53:30 - job_xml_thread(): Parsing XML..
Sun 05 May 2013 23:53:30 - XML: Start document
Sun 05 May 2013 23:53:30 - XML: Processed 518 elements - found 0 jobs
Sun 05 May 2013 23:53:30 - job_xml_thread(): Found 0 updated jobs.
Sun 05 May 2013 23:53:30 - job_xml_thread(): No jobs to store.
Sun 05 May 2013 23:53:30 - job_xml_thread(): Done parsing.
Sun 05 May 2013 23:53:30 - job_xml_thread(): Sleeping.. (15s)
Sun 05 May 2013 23:53:30 - ganglia_parse_thread(): Done parsing.
Sun 05 May 2013 23:53:30 - ganglia_parse_thread(): finished.
Sun 05 May 2013 23:53:45 - ganglia_xml_thread(): Done sleeping.
Sun 05 May 2013 23:53:45 - ganglia_xml_thread(): finished.
Sun 05 May 2013 23:53:45 - ganglia_parse_thread(): started.
Sun 05 May 2013 23:53:45 - ganglia_parse_thread(): Retrieving XML data..
Sun 05 May 2013 23:53:45 - ganglia_xml_thread(): started.
Sun 05 May 2013 23:53:45 - ganglia_xml_thread(): Sleeping.. (15s)
Sun 05 May 2013 23:53:45 - ganglia_parse_thread(): Done retrieving: data size 37162
Sun 05 May 2013 23:53:45 - ganglia_parse_thread(): Parsing XML..
Sun 05 May 2013 23:53:45 - ganglia_parse_thread(): Done parsing.
Sun 05 May 2013 23:53:45 - ganglia_parse_thread(): finished.
Sun 05 May 2013 23:53:45 - job_xml_thread(): Retrieving XML data..
Sun 05 May 2013 23:53:45 - job_xml_thread(): Done retrieving: data size 37162
Sun 05 May 2013 23:53:45 - job_xml_thread(): Parsing XML..
Sun 05 May 2013 23:53:45 - XML: Start document
Sun 05 May 2013 23:53:45 - XML: Processed 518 elements - found 0 jobs
Sun 05 May 2013 23:53:45 - job_xml_thread(): Found 0 updated jobs.
Sun 05 May 2013 23:53:45 - job_xml_thread(): No jobs to store.
Sun 05 May 2013 23:53:45 - job_xml_thread(): Done parsing.
Sun 05 May 2013 23:53:45 - job_xml_thread(): Sleeping.. (15s)
Sun 05 May 2013 23:54:00 - ganglia_store_thread(): Done sleeping.
Sun 05 May 2013 23:54:00 - ganglia_store_thread(): finished.
Sun 05 May 2013 23:54:00 - ganglia_store_metric_thread(): started.
Sun 05 May 2013 23:54:00 - ganglia_store_metric_thread(): Storing data..
Sun 05 May 2013 23:54:00 - Entering storeMetrics()
Sun 05 May 2013 23:54:00 - size of cluster 'Test Cluster': 1 hosts 97 metrics 388 values 6172 bits 771 bytes 
Sun 05 May 2013 23:54:00 - ganglia_store_thread(): started.
Sun 05 May 2013 23:54:00 - ganglia_store_thread(): Sleeping.. (60s)
Exception in thread store_metric_thread:
Traceback (most recent call last):
  File "/usr/lib64/python2.4/threading.py", line 442, in __bootstrap
    self.run()
  File "/usr/lib64/python2.4/threading.py", line 422, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/sbin/jobarchived", line 1464, in storeThread
    ret = self.myXMLHandler.storeMetrics()
  File "/usr/sbin/jobarchived", line 1188, in storeMetrics
    ret = rrdh.storeMetrics()
  File "/usr/sbin/jobarchived", line 1843, in storeMetrics
    create_ret = self.createCheck( hostname, metricname, period )
  File "/usr/sbin/jobarchived", line 1982, in createCheck
    heartbeat    = 8 * int( interval )
TypeError: int() argument must be a string or a number

Sun 05 May 2013 23:54:00 - ganglia_xml_thread(): Done sleeping.
Sun 05 May 2013 23:54:00 - ganglia_xml_thread(): finished.
Sun 05 May 2013 23:54:00 - job_xml_thread(): Retrieving XML data..
Sun 05 May 2013 23:54:00 - job_xml_thread(): Done retrieving: data size 37162
Sun 05 May 2013 23:54:00 - job_xml_thread(): Parsing XML..
Sun 05 May 2013 23:54:00 - XML: Start document
Sun 05 May 2013 23:54:00 - XML: Processed 518 elements - found 0 jobs
Sun 05 May 2013 23:54:00 - job_xml_thread(): Found 0 updated jobs.
Sun 05 May 2013 23:54:00 - job_xml_thread(): No jobs to store.
Sun 05 May 2013 23:54:00 - job_xml_thread(): Done parsing.
Sun 05 May 2013 23:54:00 - job_xml_thread(): Sleeping.. (15s)
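
The traceback above ends in `int( interval )` raising `TypeError`, which is what `int()` does when it is handed a value that is neither a string nor a number (`None` is one common case; the log does not show which value was actually passed). A minimal sketch of a defensive heartbeat calculation follows. The helper name `heartbeat_for` and the 15-second fallback are assumptions made for illustration, and this is not the fix that was applied in jobarchived:

```python
def heartbeat_for(interval, default_interval=15):
    """Return 8 * interval in seconds.

    default_interval is a hypothetical fallback, used only when
    interval cannot be converted to an integer.
    """
    try:
        seconds = int(interval)
    except (TypeError, ValueError):
        # TypeError: interval is e.g. None
        # ValueError: interval is e.g. an empty or non-numeric string
        seconds = default_interval
    return 8 * seconds


if __name__ == "__main__":
    for value in ("20", 10, None, ""):
        print(repr(value), "->", heartbeat_for(value))
```

Falling back silently hides the underlying cause, so in a real daemon it would probably also be worth logging the host and metric for which the interval was missing.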
#56 invalid Display of a name and information from nodes. somebody user
Description

Hello,

I see an odd situation in how jobmonarch displays information from nodes. My nodes are displayed under two names: a short name and a full DNS name (with the domain name), for example node01 and node01.local.net. As a result, the statistics shown in the web frontend contain two sets of nodes: one that is always present and shown (full name), and another that appears only while jobs are running on the node (short name). The reported number of nodes is therefore the real number (based on full names) plus the number of nodes currently used by jobs (based on short names). The cluster image for jobs also lists the nodes twice under different names, but only the nodes with short names are marked with "J".

I don't know how to solve this, because I don't know what causes it: jobmonarch, Ganglia, the batch system, or a DNS problem. What do people think about this, and how can I identify my problem?

Thanks.
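
One way to check whether the double counting described above comes purely from mixed short and fully qualified names is to reduce every name to its short form before counting. This is only a sketch: `short_name` and `count_unique_nodes` are hypothetical helpers, not part of jobmonarch or Ganglia, and the code assumes the entries are host names rather than IP addresses:

```python
def short_name(hostname):
    """Return the part of a host name before the first dot, lowercased."""
    return hostname.split(".", 1)[0].lower()


def count_unique_nodes(hostnames):
    """Count nodes after collapsing short and fully qualified names."""
    return len({short_name(name) for name in hostnames})


if __name__ == "__main__":
    # Names as they might appear when both sources are merged.
    seen = ["node01.local.net", "node02.local.net", "node01"]
    print(len(set(seen)), "names,", count_unique_nodes(seen), "nodes")
```

If the normalized count matches the real number of nodes, the two name forms are the likely cause, and the remaining question is which component reports which form.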

#69 wontfix Job information leaking over from one Ganglia cluster to another when clusters are in the same PBS queue ramonb renfro@…
Description

At one time, I had many Torque queues: one for each group of homogeneous systems. Since I couldn't rely on my users to consistently check qstat, showq, or Ganglia before submitting a job to a queue with free CPUs rather than a queue with none, I converted my Torque settings to put all cluster systems into one queue, and use Maui partitions to keep parallel jobs on a group of homogeneous systems. This has worked out great as far as queue efficiency is concerned.

Now that I'm getting Job Monarch integrated into the setup, I've noticed that active jobs in my batch queue show up in all cluster joblists and overviews, even when that particular cluster has no active jobs on its nodes. I'll try to attach screenshots, but if my users still have jobs running when you read this ticket, you can see for yourself on the live server:

Ganglia knows, for example, that "ChE Compute Nodes" has 9 systems in it named ch226-11 ... ch226-19. Monarch displays those, but also displays ch226-29 and ch226-31 from the "PNGV Project Compute Nodes" cluster that had active jobs. Any cluster view where Monarch was enabled had this effect.
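
The behavior the reporter expected can be sketched as a filter that keeps only the jobs with at least one node in the cluster being viewed. The function `jobs_for_cluster`, the job IDs, and the data layout below are assumptions for illustration and do not reflect how Job Monarch stores or selects jobs:

```python
def jobs_for_cluster(jobs, cluster_hosts):
    """Return the jobs that run on at least one host of the given cluster.

    jobs maps a job ID to the list of node names the job runs on.
    """
    hosts = set(cluster_hosts)
    return {
        job_id: nodes
        for job_id, nodes in jobs.items()
        if hosts.intersection(nodes)
    }


if __name__ == "__main__":
    # "ChE Compute Nodes": ch226-11 ... ch226-19, as named in the ticket.
    che_hosts = ["ch226-%d" % n for n in range(11, 20)]
    # Hypothetical job IDs; the node names come from the ticket.
    jobs = {"1001": ["ch226-29", "ch226-31"], "1002": ["ch226-12"]}
    print(jobs_for_cluster(jobs, che_hosts))
```

With such a filter, a job running only on ch226-29 and ch226-31 would not appear in the "ChE Compute Nodes" view, even though both clusters share one batch queue.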
