source: branches/1.1/CHANGELOG @ 935

Last change on this file since 935 was 935, checked in by ramonb, 11 years ago


  • updated for 1.1.1
        LEGEND  f: fixed - c: changed - a: added - r: removed


    web)

        c: column nodes renamed to: hosts
        a: sorting by hosts now implemented

    jobarchived)

        f: now properly exits on fatal xml errors
        f: prevent exception from occurring when no timed-out jobs are found
           during Housekeeping

    jobmond)

        f: BATCH_HOST_TRANSLATE no longer required in jobmond.conf


    web)

        a: archive search now has "include running jobs" option
        c: rewritten short versus FQDN hostname detection: now works properly
           with ganglia hosts not using FQDN hostnames
        f: display of xml parsetime for overview. no longer display parsetime
           for archive (no parsing done)
        f: down/offline nodes are now properly marked in cluster image again
        f: bug where "Unavailable" row would not be shown in overview summary
           table

    packaging)

        c: completely redone and rewritten by Olivier Lahaye - thanks!

    jobmond)

        a: now supports SLURM Workload Manager!
        a: warning if connecting to remote BATCH_SERVER is not supported by
           selected BATCH_API
        f: bug where incorrect commandline option would trigger traceback in
           usage()

    jobarchived)

        a: now performs regular database Housekeeping every 20 job XML
           iterations (previously only once at startup)
        a: now checks if ARCHIVE_DATASOURCES are present in gmetad.conf
        f: prevent an Exception from occurring when determining datasource
           polling interval
        f: bug where config file handle was not closed


    jobmond)

        a: now supports multiple udp send channels
        a: now supports job arrays
        c: updated Gmetric XDR protocol to be compatible with version 3.1+

        c: gmond.conf parsing has been rewritten to handle includes and
           multiple send channels
        c: METRIC_MAX_VAL_LEN is now determined from gmond.conf
        c: utilize new job monarch protocol

        f: can now handle new PBSQuery / pbs_python versions
        f: default gmond.conf search location is now /etc/ganglia/gmond.conf
        f: fatal errors are now printed to shell upon startup, not just syslog
        f: more error checking and miscellaneous bugfixes

    jobarchived)

        r: no longer use pyPgSQL for postgres database
        c: now use psycopg2 module for postgres database

        a: job thread now utilizes db commits and rollbacks
        a: now use USER/PASS authentication to database (instead of hostbased)

        c: database schema: changed job_id to varchar to support job arrays
        c: database schema: changed job_name max length to 255, just like
           torque
        c: database schema: added username/password role authentication
        c: utilize new job monarch protocol

        f: job thread no longer hangs when insert/update of a job in database
           fails
        f: rewrite of job (finished) detection: all finished jobs again
           properly detected
        f: job checking now done post-parsing not while parsing
        f: more error checking and miscellaneous bugfixes

    web)

        r: removed Pie chart
        r: removed TemplatePower
        r: removed php ini_set calls and time limit directive: should be
           handled in php.ini
        r: removed "Get Fresh Data" button: served no purpose anymore
        a: now utilize Dwoo templates for html output

        a: now use USER/PASS authentication to database (instead of hostbased)
        a: ClusterImage now drops a shadow below nodes
        a: RRDs now show "Last: Min: Avg: Max:" values in legend

        c: utilize new job monarch protocol
        c: all templates rewritten from TemplatePower to Dwoo
        c: graph.php now used for overview and archive
        c: RRDs job start/finish line is now dashed green/red line with legend

        f: some dbase fields are now CAST to INT for php since postgres now
           requires explicit casts
        f: sort order descending/ascending is now correct
        f: many, many speed and memory improvements
        f: more error checking and miscellaneous bugfixes


        jobmond)
                a:      SGE support
                        thanks to: Dave Love - d(d.o.t)love(a.t)liverpool(d.o.t)ac(d.o.t)uk
                        for writing it!
                a:      LSF support
                        thanks to: Mahmoud Hanafi - mhanafi(a.t)csc(d.o.t)com
                        for writing it!
                a:      GMETRIC_TARGET is now parsed from gmond.conf
                a:      GMETRIC_BINARY is now looked for in PATH
                f:      queue selection support is now working
                        thanks to: Craig West - cwest(a.t)astro(d.o.t)umass(d.o.t)edu
                        for the patch
        web)
                a:      large graphs link for job report
                        thanks to: Craig West - cwest(a.t)astro(d.o.t)umass(d.o.t)edu
                a:      SHOW_EMPTY_COLUMN, SHOW_EMPTY_ROW options for ClusterImage hostname parsing


        other)
                f:      updated INSTALL since "addons" directory is not included by default anymore in Ganglia
                        thanks to: Steven DuChene - linux(d.a.s.h)clusters(a.t)mindspring(d.o.t)com
                        for reporting it

        rpm)
                f:      add "addons" directory since it's not included by default anymore in Ganglia
                f:      properly rewrite WEBDIR path in %files when rebuilding rpms with Makefile

        web)
                f:      typo in empty_cpu variable: causing incorrect 'free cpu' count and similar errors
                        thanks to: Craig West - cwest(a.t)astro(d.o.t)umass(d.o.t)edu
                        for reporting it
                f:      changed erroneous domain detection a little
                        thanks to: Craig West - cwest(a.t)astro(d.o.t)umass(d.o.t)edu
                        for reporting it
                a:      now properly detects whether or not to use FQDN or short hostnames w/o domain
                        thanks to: Craig West - cwest(a.t)astro(d.o.t)umass(d.o.t)edu
                        thanks to: Jeffrey Sarlo - JSarlo(a.t)Central(d.o.t)UH(d.o.t)EDU
                        for the extensive testing and for reporting it

                        SPECIAL THANKS to the University of Houston for sending me a shirt!

        jobarchived)
                f:      properly catch postgres exception
                f:      don't use debug_message while loading config file


        web)
                a:      allow per-cluster settings/override options: see CLUSTER_CONFS option
                a:      clusterimage can now draw nodes at x,y position parsed from hostname
                        see SORTBY_HOSTNAME for this in clusterconf/example.php
                a:      clusterimage nodes are now clickable: has link to all jobs from that host
                a:      clusterimage nodes now have a tooltip: displays hostname and jobids for now
                a:      jobmonarch logo image
                        thanks to: Robin Day
                        for the design
                a:      rrd graph of running/queued jobs to overview
                a:      per-cluster settings for archive database
                        thanks to: Alexis Michon - alexis(d.o.t)michon(a.t)ibcp(d.o.t)fr
                        for the patch

                c:      host archive view is now more complete and detailed in the same manner as
                        Ganglia's own host view
                c:      host archive view available metric list is now compiled from disk,
                        so that the detailed archive host view works even when the node is currently down.
                c:      removed size restrictions from detailed host archive view

                f:      compatibility: removed php5 call
                        thanks to: Alexis Michon - alexis(d.o.t)michon(a.t)ibcp(d.o.t)fr
                        for the patch
                f:      prevent negative cpu/node calculation
                        thanks to: aloga(a.t)ifca(d.o.t)unican(d.o.t)es
                        for the patch
                f:      archive search not properly resetting nodes list
                        thanks to: Alexis Michon - alexis(d.o.t)michon(a.t)ibcp(d.o.t)fr
                        for the patch
                f:      detailed host view from jobarchive was broken since hostbased support of 0.2
                        now host view is properly set and parsed again
                        thanks to: Alexis Michon - alexis(d.o.t)michon(a.t)ibcp(d.o.t)fr
                        for reporting the bug and suggesting a patch
                f:      bug where jobstart redline indicator in host detail graphs was set incorrectly
                        or not at all due to a miscalculation in job times
                f:      bug where hostimage headertext xoffset was miscalculated, causing the column names
                        to overlap their position when the columnname was longer than the columnvalues

        jobmond)

                a:      syslog support
                a:      report number of running/queued jobs as separate metrics
                a:      native gmetric support, much faster and cleaner!
                        thanks to: Nick Galbreath - nickg(a.t)modp(d.o.t)com
                        for writing it and allowing inclusion in jobmond

                f:      crashing jobmond when multiple node amounts are requested in
                        a queued job: numeric_node variable not initialized properly
                        thanks to: aloga(a.t)ifca(d.o.t)unican(d.o.t)es
                        for supplying the patch
                        and many others for reporting and helping debug this
                f:      hanging/blocked, increased cpu usage and halted reporting
                        thanks to: Bas van der Vlies - basv(a.t)sara(d.o.t)nl
                        for discovering the origin of the bug
                        thanks to: Mickael Gastineau - gastineau(a.t)imcce(d.o.t)fr
                        for reporting it and testing the fix
                        thanks to: Craig West - cwest(a.t)astro(d.o.t)umass(d.o.t)edu
                        for reporting it and testing the fix
                f:      uninitialized variable in checkGmetricVersion()
                        thanks to: Peter Kruse - pk(a.t)q-leap(d.o.t)com
                        for the patch
                f:      undefined PBSError
                        thanks to: Peter Kruse - pk(a.t)q-leap(d.o.t)com
                        for reporting it

                r:      SGE support broken

        jobarchived)

                a:      can now use py-rrdtool api instead of pipes, much faster!
                        install py-rrdtool to use this
                        backwards compatible: falls back to pipes if module not installed

                c:      all XML input was unicode-encoded, which could cause errors,
                        now all properly converted to normal strings

                f:      when XML data source (gmetad) is unavailable parsethread didn't return correctly
                        which caused a large number of threads to spawn while consuming large amounts of memory
                f:      autocreate clusterdirs in archivedir
                f:      unhandled gather exception
                f:      incorrect stop_timestamping when jobs finished
                        thanks to: Alexis Michon - alexis(d.o.t)michon(a.t)ibcp(d.o.t)fr
                        for finding and debugging/testing it


        web)
                f:      misc. optimization and bugfixes
                f:      now fully compatible with latest PHP5 and PHP4

                c:      cluster image now incorporates small text description
                c:      monarch (cluster/host) images no longer displayed
                        for clusters that are not jobmond enabled
                c:      pie chart percentages are now cpu-based instead of node-based

                a:      host template for Ganglia
                        adds an extra monarch host image to Ganglia's host overview
                        which displays/links to the jobs on that host
                        NOTE!: be sure to copy/install new template from addons/templates
                a:      (optional) nodes hostnames column
                        thanks to: Daniel Barthel - daniel(d.o.t)barthel(a.t)nottingham(d.o.t)ac(d.o.t)uk
                        for the suggestion

        jobmond)

                f:      when a job metric is longer than maximum metric length,
                        the info is split up amongst multiple metrics
                f:      no longer exit when batch server is unavailable
                        thanks to: Peter Kruse - pk(a.t)q-leap(d.o.t)com
                        for the patch
                f:      fd closure bug causing stderr/stdout to remain open after daemonizing

                c:      rearranged code to allow support for other batch systems

                a:      (experimental) SGE (Sun Grid Engine) support as batch server
                        thanks to: Babu Sundaram - babu(a.t)cs(d.o.t)uh(d.o.t)edu
                        who developed it for OSCAR's Google-SoC project
                a:      pidfile support
                        thanks to: Michael Jeanson - michael(a.t)ccs(d.o.t)usherbrooke(d.o.t)ca
                        for the patch
                a:      usage display
                        thanks to: Michael Jeanson - michael(a.t)ccs(d.o.t)usherbrooke(d.o.t)ca
                        for the patch
                a:      queue selection support: ability to specify which QUEUEs to get jobinfo from
                        thanks to: Michael Jeanson - michael(a.t)ccs(d.o.t)usherbrooke(d.o.t)ca
                        for the patch

        jobarchived)

                f:      XML retrieval for Ganglia version >= 3.0.3 working properly again
                f:      database storing for Ganglia version >= 3.0.3 working properly again
                f:      fd closure bug causing stderr/stdout to remain open after daemonizing

                c:      misc. bugfixes to optimize XML connections
                c:      misc. bugfixes for misc. minor issues

                a:      cleaning of stale jobs in dbase: see JOB_TIMEOUT option


        web)

                f:      misc. layout bugs for overview & search
                f:      bug that occurred when calculating the number of nodes when there
                        was more than one job running on a machine

                c:      column requested memory is now optional through conf.php
                c:      search and overview tables are now full screen (100%)
                c:      overview jobnames are now cutoff at max 9 characters
                        to prevent (layout) skews in the tables
                c:      overview graphs are no longer downsized

                a:      (optional) column 'queued' (since) in overview
                a:      search results (can) now have a SEARCH_RESULT_LIMIT
                        this increases performance of the queries significantly!
                a:      date/time format as displayed is now configurable through conf.php

        jobmond)

                a:      now reports 'queued since' (or creation time) of jobs

        documentation)

                f:      wrong e-mail address in INSTALL (doh!)


        - First public release