source: branches/1.1/CHANGELOG @ 942

Last change on this file since 942 was 939, checked in by ramonb, 11 years ago

        LEGEND  f: fixed - c: changed - a: added - r: removed

1.1.1:

    packaging)

        f: correctly set the JOBARCHIVE_RRDS in both jobarchived.conf and
           web/conf.php.in
        f: debian init.d script names in post/pre pkg corrected to new name
        f: debian postrm was incorrectly trying redhat conditional restart

    web)

        c: column nodes renamed to: hosts
        a: sorting by hosts now implemented

    jobarchived)

        f: now properly exits on fatal xml errors
        f: prevent an exception from occurring when no timed-out jobs are found
           during Housekeeping

    jobmond)

        f: BATCH_HOST_TRANSLATE no longer required in jobmond.conf

1.1:

    web)

        a: archive search now has "include running jobs" option
        c: rewritten short versus FQDN hostname detection: now works properly
           with ganglia hosts not using FQDN hostnames
        f: display of xml parsetime for overview. no longer display parsetime
           for archive (no parsing done)
        f: down/offline nodes are now properly marked in cluster image again
        f: bug where "Unavailable" row would not be shown in overview summary
           table

    packaging)

        c: completely redone and rewritten by Olivier Lahaye - thanks!

    jobmond)

        a: now supports SLURM Workload Manager!
        a: warning if connecting to remote BATCH_SERVER is not supported by
           selected BATCH_API
        f: bug where incorrect commandline option would trigger traceback in
           usage()

    jobarchived)

        a: now performs regular database Housekeeping every 20 job XML
           iterations (previously only once at startup)
        a: now checks if ARCHIVE_DATASOURCES are present in gmetad.conf
        f: prevent an Exception from occurring when determining datasource
           polling interval
        f: bug where config file handle was not closed

1.0:

    jobmond)

        a: now supports multiple udp send channels
        a: now supports job arrays
        c: updated Gmetric XDR protocol to version 3.1+ compatible

        c: gmond.conf parsing has been rewritten to handle includes and
           multiple send channels
        c: METRIC_MAX_VAL_LEN is now determined from gmond.conf
        c: utilize new job monarch protocol

        f: can now handle new PBSQuery / pbs_python versions
        f: default gmond.conf search location is now /etc/ganglia/gmond.conf
        f: fatal errors are now printed to shell upon startup, not just syslog
        f: more error checking and miscellaneous bugfixes

    jobarchived)

        r: no longer use pyPgSQL for postgres database
        c: now use psycopg2 module for postgres database

        a: job thread now utilizes db commits and rollbacks
        a: now use USER/PASS authentication to database (instead of host-based)

        c: database schema: changed job_id to varchar to support job arrays
        c: database schema: changed job_name max length to 255, just like
           torque
        c: database schema: added username/password role authentication
        c: utilize new job monarch protocol

        f: job thread no longer hangs when insert/update of a job in database
           fails
        f: rewrite of job (finished) detection: all finished jobs again
           properly detected
        f: job checking now done post-parsing not while parsing
        f: more error checking and miscellaneous bugfixes

    web)

        r: removed Pie chart
        r: removed TemplatePower
        r: removed php ini_set's and time limit directive: should be handled in
           php.ini
        r: removed "Get Fresh Data" button: served no purpose anymore
        a: now utilize Dwoo templates for html output

        a: now use USER/PASS authentication to database (instead of host-based)
        a: ClusterImage now drops a shadow below nodes
        a: RRDs now show "Last: Min: Avg: Max:" values in legend

        c: utilize new job monarch protocol
        c: all templates rewritten from TemplatePower to Dwoo
        c: graph.php now used for overview and archive
        c: RRDs job start/finish line is now dashed green/red line with legend

        f: some dbase fields are now CAST to INT for php since postgres now
           requires explicit casts
        f: sort order descending/ascending is now correct
        f: many, many speed and memory improvements
        f: more error checking and miscellaneous bugfixes

0.4:

        jobmond)
                a:      SGE support
                        thanks to: Dave Love - d(d.o.t)love(a.t)liverpool(d.o.t)ac(d.o.t)uk
                        for writing it!
                a:      LSF support
                        thanks to: Mahmoud Hanafi - mhanafi(a.t)csc(d.o.t)com
                        for writing it!
                a:      GMETRIC_TARGET is now parsed from gmond.conf
                a:      GMETRIC_BINARY is now looked for in PATH
                f:      queue selection support is now working
                        thanks to: Craig West - cwest(a.t)astro(d.o.t)umass(d.o.t)edu
                        for the patch
        web)
                a:      large graphs link for job report
                        thanks to: Craig West - cwest(a.t)astro(d.o.t)umass(d.o.t)edu
                a:      SHOW_EMPTY_COLUMN, SHOW_EMPTY_ROW options for ClusterImage hostname parsing

0.3.1:

        other)
                f:      updated INSTALL since "addons" directory is not included by default anymore in Ganglia
                        thanks to: Steven DuChene linux(d.a.s.h)clusters(a.t)mindspring(d.o.t)com
                        for reporting it

        rpm)
                f:      add "addons" directory since it's not included by default anymore in Ganglia
                f:      properly rewrite WEBDIR path in %files when rebuilding rpms with Makefile

        web)
                f:      typo in empty_cpu variable: causing incorrect 'free cpu' count and similar errors
                        thanks to: Craig West - cwest(a.t)astro(d.o.t)umass(d.o.t)edu
                        for reporting it
                f:      changed erroneous domain detection a little
                        thanks to: Craig West - cwest(a.t)astro(d.o.t)umass(d.o.t)edu
                        for reporting it
                a:      now properly detects whether or not to use FQDN or short hostnames w/o domain
                        thanks to: Craig West - cwest(a.t)astro(d.o.t)umass(d.o.t)edu
                        thanks to: Jeffrey Sarlo - JSarlo(a.t)Central(d.o.t)UH(d.o.t)EDU
                        for the extensive testing and for reporting it

                        SPECIAL THANKS to the University of Houston for sending me a shirt!

        jobarchived)
                f:      properly catch postgres exception
                f:      don't use debug_message while loading config file

0.3:

        web)
                a:      allow per-cluster settings/override options: see CLUSTER_CONFS option
                a:      clusterimage can now draw nodes at x,y position parsed from hostname
                        see SORTBY_HOSTNAME for this in clusterconf/example.php
                a:      clusterimage nodes are now clickable: has link to all jobs from that host
                a:      clusterimage nodes now have a tooltip: displays hostname and jobids for now
                a:      jobmonarch logo image
                        thanks to: Robin Day
                        for the design
                a:      rrd graph of running/queued jobs to overview
                a:      per-cluster settings for archive database
                        thanks to: Alexis Michon - alexis(d.o.t)michon(a.t)ibcp(d.o.t)fr
                        for the patch

                c:      host archive view is now more complete and detailed in the same manner as
                        Ganglia's own host view
                c:      host archive view available metric list is now compiled from disk,
                        so that the detailed archive host view works even when the node is currently down.
                c:      removed size restrictions from detailed host archive view

                f:      compatibility: removed php5 call
                        thanks to: Alexis Michon - alexis(d.o.t)michon(a.t)ibcp(d.o.t)fr
                        for the patch
                f:      prevent negative cpu/node calculation
                        thanks to: aloga(a.t)ifca(d.o.t)unican(d.o.t)es
                        for the patch
                f:      archive search not properly resetting nodes list
                        thanks to: Alexis Michon - alexis(d.o.t)michon(a.t)ibcp(d.o.t)fr
                        for the patch
                f:      detailed host view from jobarchive was broken since hostbased support of 0.2
                        now host view is properly set and parsed again
                        thanks to: Alexis Michon - alexis(d.o.t)michon(a.t)ibcp(d.o.t)fr
                        for reporting the bug and suggesting a patch
                f:      bug where jobstart redline indicator in host detail graphs was set incorrectly
                        or not at all due to a miscalculation in job times
                f:      bug where hostimage headertext xoffset was miscalculated, causing the column names
                        to overlap their position when the columnname was longer than the columnvalues

        jobmond)

                a:      syslog support
                a:      report number of running/queued jobs as separate metrics
                a:      native gmetric support, much faster and cleaner!
                        thanks to: Nick Galbreath - nickg(a.t)modp(d.o.t)com
                        for writing it and allowing inclusion in jobmond

                f:      crashing jobmond when multiple node amounts are requested in
                        a queued job: numeric_node variable not initialized properly
                        thanks to: aloga(a.t)ifca(d.o.t)unican(d.o.t)es
                        for supplying the patch
                        and many others for reporting and helping debug this
                f:      hanging/blocked, increased cpu usage and halted reporting
                        thanks to: Bas van der Vlies - basv(a.t)sara(d.o.t)nl
                        for discovering the origin of the bug
                        thanks to: Mickael Gastineau - gastineau(a.t)imcce(d.o.t)fr
                        for reporting it and testing the fix
                        thanks to: Craig West - cwest(a.t)astro(d.o.t)umass(d.o.t)edu
                        for reporting it and testing the fix
                f:      uninitialized variable in checkGmetricVersion()
                        thanks to: Peter Kruse - pk(a.t)q-leap(d.o.t)com
                        for the patch
                f:      undefined PBSError
                        thanks to: Peter Kruse - pk(a.t)q-leap(d.o.t)com
                        for reporting it

                r:      SGE support broken

        jobarchived)

                a:      can now use py-rrdtool api instead of pipes, much faster!
                        install py-rrdtool to use this
                        backwards compatible: falls back to pipes if module not installed

                c:      all XML input was unicode-encoded, which could cause errors,
                        now all properly converted to normal strings

                f:      when XML data source (gmetad) is unavailable parsethread didn't return correctly
                        which caused a large number of threads to spawn while consuming large amounts of memory
                f:      autocreate clusterdirs in archivedir
                f:      unhandled gather exception
                f:      incorrect stop_timestamping when jobs finished
                        thanks to: Alexis Michon - alexis(d.o.t)michon(a.t)ibcp(d.o.t)fr
                        for finding and debugging/testing it

0.2:

        web)
                f:      misc. optimization and bugfixes
                f:      now fully compatible with latest PHP5 and PHP4

                c:      cluster image now incorporates small text descr.
                c:      monarch (cluster/host) images no longer displayed
                        for clusters that are not jobmond enabled
                c:      pie chart percentages are now cpu-based instead of node-based

                a:      host template for Ganglia
                        adds an extra monarch host image to Ganglia's host overview
                        which displays/links to the jobs on that host
                        NOTE!: be sure to copy/install new template from addons/templates
                a:      (optional) nodes hostnames column
                        thanks to: Daniel Barthel - daniel(d.o.t)barthel(a.t)nottingham(d.o.t)ac(d.o.t)uk
                        for the suggestion

        jobmond)

                f:      when a job metric is longer than maximum metric length,
                        the info is split up amongst multiple metrics
                f:      no longer exit when batch server is unavailable
                        thanks to: Peter Kruse - pk(a.t)q-leap(d.o.t)com
                        for the patch
                f:      fd closure bug causing stderr/stdout to remain open after daemonizing

                c:      rearranged code to allow support for other batch systems

                a:      (experimental) SGE (Sun Grid Engine) support as batch server
                        thanks to: Babu Sundaram - babu(a.t)cs(d.o.t)uh(d.o.t)edu
                        who developed it for OSCAR's Google-SoC project
                a:      pidfile support
                        thanks to: Michael Jeanson - michael(a.t)ccs(d.o.t)usherbrooke(d.o.t)ca
                        for the patch
                a:      usage display
                        thanks to: Michael Jeanson - michael(a.t)ccs(d.o.t)usherbrooke(d.o.t)ca
                        for the patch
                a:      queue selection support: ability to specify which QUEUEs to get jobinfo from
                        thanks to: Michael Jeanson - michael(a.t)ccs(d.o.t)usherbrooke(d.o.t)ca
                        for the patch

        jobarchived)

                f:      XML retrieval for Ganglia version >= 3.0.3 working properly again
                f:      database storing for Ganglia version >= 3.0.3 working properly again
                f:      fd closure bug causing stderr/stdout to remain open after daemonizing

                c:      misc. bugfixes to optimize XML connections
                c:      misc. bugfixes for misc. minor issues

                a:      cleaning of stale jobs in dbase: see JOB_TIMEOUT option

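Note on the 0.2 jobmond fix for over-long job metrics: the splitting can be pictured with the Python sketch below. It is a minimal illustration, not the jobmond code. The function name split_metric, the max_len parameter and the numeric-suffix naming scheme are assumptions made for this example.

```python
def split_metric(name, value, max_len):
    """Split value into pieces of at most max_len characters.

    Returns a list of (metric_name, metric_value) pairs. A value that
    already fits is returned unchanged under its original name.
    The "<name>_<n>" naming scheme is hypothetical.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if len(value) <= max_len:
        return [(name, value)]
    pieces = [value[i:i + max_len] for i in range(0, len(value), max_len)]
    return [("%s_%d" % (name, n), piece) for n, piece in enumerate(pieces)]


if __name__ == "__main__":
    for metric in split_metric("job_123", "nodes=a,b,c queue=long", 8):
        print(metric)
```

A receiver can rebuild the original value by joining the pieces in suffix order.
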
0.1.1:

        web)

                f:      misc. layout bugs for overview & search
                f:      bug that occurred when calculating the number of nodes when there
                        was more than one job running on a machine

                c:      column requested memory is now optional through conf.php
                c:      search and overview tables are now full screen (100%)
                c:      overview jobnames are now cutoff at max 9 characters
                        to prevent (layout) skews in the tables
                c:      overview graphs are no longer downsized

                a:      (optional) column 'queued' (since) in overview
                a:      search results (can) now have a SEARCH_RESULT_LIMIT
                        this increases performance of the queries significantly!
                a:      date/time format as displayed is now configurable through conf.php

        jobmond)

                a:      now reports 'queued since' (or creation time) of jobs

        documentation)

                f:      wrong e-mail address in INSTALL (doh!)

0.1:

        - First public release