Opened 11 years ago

Last modified 4 years ago

#16 new enhancement

incremental/diff based job reporting

Reported by: bastiaans Owned by: ramonb
Priority: normal Milestone: 2.1x-tbd
Component: general Version:
Keywords: Cc:
Estimated Number of Hours:

Description

Only inject metrics when changes occur in the joblist, to improve performance and reduce overhead.

For example:

<GMETRIC NAME="+MONARCH-JOB-<ID>" VALUE="<stats>">
<GMETRIC NAME="-MONARCH-JOB-<ID>" VALUE="<stats>">

Or something similar. Perhaps a state file would be useful, for example for jobarchived?
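A minimal sketch of how such metrics could be derived from two consecutive joblist snapshots. It assumes that "+" marks a new or changed job and "-" a job that has disappeared (the reading confirmed in comment:3). The function name `diff_joblist` and the dict layout are hypothetical and not part of jobmond:

```python
def diff_joblist(previous, current):
    """Return (metric_name, value) pairs for what changed between two polls.

    previous/current: dict mapping job id -> stats string (hypothetical layout).
    """
    metrics = []
    # Jobs that are new, or whose stats changed, get a "+" metric.
    for job_id, stats in sorted(current.items()):
        if previous.get(job_id) != stats:
            metrics.append(("+MONARCH-JOB-%s" % job_id, stats))
    # Jobs that vanished from the joblist get a "-" metric.
    for job_id, stats in sorted(previous.items()):
        if job_id not in current:
            metrics.append(("-MONARCH-JOB-%s" % job_id, stats))
    return metrics


previous = {"101": "status=Q", "102": "status=R"}
current = {"101": "status=R", "103": "status=Q"}
changes = diff_joblist(previous, current)
```

When nothing changes between two polls the list is empty, so nothing would need to be injected for that interval.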

Change History (8)

comment:1 Changed 11 years ago by bastiaans

  • Owner changed from somebody to bastiaans
  • Status changed from new to assigned

comment:2 Changed 10 years ago by alexis.michon@…

Hi Ramon,

Can you explain in more detail how you see this functionality? In your example, I suppose that the + sign before "MONARCH-JOB" symbolizes a change in the status of a job, and the - sign the disappearance of the job. Are my guesses correct?

Thank you

Alexis MICHON

comment:3 Changed 10 years ago by bastiaans

  • Cc alexis.michon@… added

Hi Alexis,

Correct. I'm not sure yet about the technical details of how this would work, or whether it could work at all. This is just a feature on my wishlist that has been playing around in my head.

Right now jobmond uses a polling interval, for example 60 seconds. The backend currently ALWAYS reports ALL job info on every interval, i.e. every 60 seconds in this example.

When you maintain a 500+ node cluster, like one of our bigger Beowulfs, this tends to generate a lot of overhead. It would be great if jobmond only reported changes in job states.

For example: when a job starts running (going from state Q to state R), jobmond would only report jobid=xx state=R, and it would report that only once.
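Roughly, the backend side of that could look like the sketch below. This is only an illustration: `ChangeReporter` and its `poll` method are made-up names, and the real jobmond would get the job states from the batch system rather than from a dict passed in by hand.

```python
class ChangeReporter:
    """Remembers the last reported state per job and reports only transitions."""

    def __init__(self):
        self.last_reported = {}  # job id -> state last sent to the frontend

    def poll(self, job_states):
        """job_states: dict of job id -> state letter (e.g. "Q", "R")."""
        reports = []
        for job_id, state in sorted(job_states.items()):
            if self.last_reported.get(job_id) != state:
                reports.append("jobid=%s state=%s" % (job_id, state))
                self.last_reported[job_id] = state
        # Forget jobs that are no longer listed so the cache does not grow
        # forever; reporting their disappearance is left out of this sketch.
        for job_id in list(self.last_reported):
            if job_id not in job_states:
                del self.last_reported[job_id]
        return reports


reporter = ChangeReporter()
first = reporter.poll({"42": "Q"})   # job appears in the queue
second = reporter.poll({"42": "R"})  # job starts running
third = reporter.poll({"42": "R"})   # nothing changed since the last poll
```

Here `second` contains the single Q-to-R report and `third` is empty, which is the "report it only once" behaviour.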

However, there are a lot of caveats. For one, the frontend could desynchronize from the backend if it misses a state change. Furthermore, the frontend would require some sort of state file to keep a cache of the current job states. In addition, the server frontend would need to 'talk back' to the backend jobmond: when it first starts up, it would need to 'ask' for a complete state dump.

In the current gmetric communication setup for jobmond however, this is not possible. It is one-way only.
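To illustrate the desynchronization caveat, here is a sketch of a frontend-side cache. It rests on two assumptions that are not part of the current design: every update carries an increasing sequence number, and the backend sends a full dump on its own from time to time, since the frontend cannot ask for one over a one-way channel. All names (`JobStateCache`, `apply_delta`, `apply_full_dump`) are hypothetical.

```python
class JobStateCache:
    """Frontend-side cache that applies incremental updates and detects gaps."""

    def __init__(self):
        self.jobs = {}        # job id -> state
        self.last_seq = None  # sequence number of the last applied update
        self.in_sync = False  # False until a full dump has been seen

    def apply_full_dump(self, seq, jobs):
        """Replace the whole cache; this is the only way to (re)gain sync."""
        self.jobs = dict(jobs)
        self.last_seq = seq
        self.in_sync = True

    def apply_delta(self, seq, changed, removed):
        """Apply one incremental update; return False if it cannot be trusted."""
        # A delta is usable only if we are in sync and it directly
        # follows the last update we applied.
        if not self.in_sync or seq != self.last_seq + 1:
            self.in_sync = False
            return False
        self.jobs.update(changed)
        for job_id in removed:
            self.jobs.pop(job_id, None)
        self.last_seq = seq
        return True


cache = JobStateCache()
before_dump = cache.apply_delta(5, {"42": "R"}, [])  # fresh start, no dump yet
cache.apply_full_dump(10, {"42": "Q"})
in_order = cache.apply_delta(11, {"42": "R"}, [])    # follows seq 10
after_gap = cache.apply_delta(13, {}, ["42"])        # seq 12 was missed
```

After the missed update the cache marks itself as out of sync and ignores further deltas until the next full dump arrives; it can notice the problem, but it still cannot request the dump itself.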

Cheers,

Ramon.

comment:4 Changed 10 years ago by bastiaans

  • Cc alexis.michon@… removed

comment:5 Changed 10 years ago by alexis.michon@…

Thank you for the explanation. This would be a great feature. Here we have a small cluster (30 nodes), but sometimes more than 10000 jobs, and the load on the master is huge.

++

Alexis

comment:6 Changed 9 years ago by ramonb

  • Owner changed from bastiaans to ramonb
  • Status changed from assigned to new

my username changed

comment:7 Changed 5 years ago by ramonb

  • Milestone set to 2.0

comment:8 Changed 4 years ago by ramonb

  • Milestone changed from 2.0 to 2.1x-tbd