source: branches/1.0/CHANGELOG @ 852

Last change on this file since 852 was 827, checked in by ramonb, 11 years ago
  • updated for 1.0

        LEGEND  f: fixed - c: changed - a: added - r: removed

1.0:

    jobmond)

        a: now supports multiple udp send channels
        a: now supports job arrays
        c: updated Gmetric XDR protocol to version 3.1+ compatible

        c: gmond.conf parsing has been rewritten to handle includes and
           multiple send channels
        c: METRIC_MAX_VAL_LEN is now determined from gmond.conf
        c: utilize new job monarch protocol

        f: can now handle new PBSQuery / pbs_python versions
        f: default gmond.conf search location is now /etc/ganglia/gmond.conf
        f: fatal errors are now printed to shell upon startup, not just syslog
        f: more error checking and miscellaneous bugfixes

    jobarchived)

        r: no longer use pyPgSQL for postgres database
        c: now use psycopg2 module for postgres database

        a: job thread now utilizes db commits and rollbacks
        a: now use USER/PASS authentication to database (instead of hostbased)

        c: database schema: changed job_id to varchar to support job arrays
        c: database schema: changed job_name max length to 255, just like
           torque
        c: database schema: added username/password role authentication
        c: utilize new job monarch protocol

        f: job thread no longer hangs when insert/update of a job in database
           fails
        f: rewrite of job (finished) detection: all finished jobs are properly
           detected again
        f: job checking is now done after parsing, not while parsing
        f: more error checking and miscellaneous bugfixes
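
A minimal sketch of the commit/rollback pattern mentioned for the jobarchived job thread above. It is an illustration only, not the jobarchived code: it uses the standard-library sqlite3 module as a stand-in for psycopg2 (both follow the Python DB-API, so commit() and rollback() are used the same way), and the table layout and the store_job helper are assumptions made for this example.

```python
import sqlite3

def store_job(conn, job_id, job_name):
    """Insert one job; commit on success, roll back on any database error.

    Hypothetical helper: a failed insert must not leave the connection
    stuck in an open transaction.
    """
    try:
        conn.execute(
            "INSERT INTO jobs (job_id, job_name) VALUES (?, ?)",
            (job_id, job_name),
        )
        conn.commit()
        return True
    except sqlite3.Error:
        conn.rollback()
        return False

# In-memory database; job_id is text so job array ids such as "123[1]" fit.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (job_id TEXT PRIMARY KEY, job_name TEXT)")
conn.commit()
```

Calling store_job twice with the same job_id makes the second insert fail on the primary key; the rollback leaves the connection usable for the next job.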

    web)

        r: removed Pie chart
        r: removed TemplatePower
        r: removed php ini_set calls and time limit directive: should be handled in
           php.ini
        r: removed "Get Fresh Data" button: served no purpose anymore
        a: now utilize Dwoo templates for html output

        a: now use USER/PASS authentication to database (instead of hostbased)
        a: ClusterImage now drops a shadow below nodes
        a: RRDs now show "Last: Min: Avg: Max:" values in legend

        c: utilize new job monarch protocol
        c: all templates rewritten from TemplatePower to Dwoo
        c: graph.php now used for overview and archive
        c: RRDs job start/finish line is now dashed green/red line with legend

        f: some database fields are now CAST to INT for php since postgres now
           requires explicit casts
        f: sort order descending/ascending is now correct
        f: many, many speed and memory improvements
        f: more error checking and miscellaneous bugfixes

0.4:

        jobmond)
                a:      SGE support
                        thanks to: Dave Love - d(d.o.t)love(a.t)liverpool(d.o.t)ac(d.o.t)uk
                        for writing it!
                a:      LSF support
                        thanks to: Mahmoud Hanafi - mhanafi(a.t)csc(d.o.t)com
                        for writing it!
                a:      GMETRIC_TARGET is now parsed from gmond.conf
                a:      GMETRIC_BINARY is now looked for in PATH
                f:      queue selection support is now working
                        thanks to: Craig West - cwest(a.t)astro(d.o.t)umass(d.o.t)edu
                        for the patch
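
A minimal sketch of a PATH lookup like the one described for GMETRIC_BINARY above. It is an illustration only: find_gmetric is a hypothetical helper, and shutil.which is a modern standard-library call that the jobmond release from this period would not have had.

```python
import os
import shutil

def find_gmetric(binary="gmetric", search_path=None):
    """Return the full path of an executable found on PATH, or None.

    search_path is an os.pathsep-separated list of directories; None means
    "use the PATH environment variable".
    """
    return shutil.which(binary, path=search_path)
```

A caller would typically fall back to an explicitly configured location when this returns None.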
        web)
                a:      large graphs link for job report
                        thanks to: Craig West - cwest(a.t)astro(d.o.t)umass(d.o.t)edu
                a:      SHOW_EMPTY_COLUMN, SHOW_EMPTY_ROW options for ClusterImage hostname parsing

0.3.1:

        other)
                f:      updated INSTALL since "addons" directory is not included by default anymore in Ganglia
                        thanks to: Steven DuChene linux(d.a.s.h)clusters(a.t)mindspring(d.o.t)com
                        for reporting it

        rpm)
                f:      add "addons" directory since it's not included by default anymore in Ganglia
                f:      properly rewrite WEBDIR path in %files when rebuilding rpms with Makefile

        web)
                f:      typo in empty_cpu variable: causing incorrect 'free cpu' count and similar errors
                        thanks to: Craig West - cwest(a.t)astro(d.o.t)umass(d.o.t)edu
                        for reporting it
                f:      changed erroneous domain detection a little
                        thanks to: Craig West - cwest(a.t)astro(d.o.t)umass(d.o.t)edu
                        for reporting it
                a:      now properly detects whether to use FQDN or short hostnames without domain
                        thanks to: Craig West - cwest(a.t)astro(d.o.t)umass(d.o.t)edu
                        thanks to: Jeffrey Sarlo - JSarlo(a.t)Central(d.o.t)UH(d.o.t)EDU
                        for extensive testing and for reporting it

                        SPECIAL THANKS to the University of Houston for sending me a shirt!

        jobarchived)
                f:      properly catch postgres exception
                f:      don't use debug_message while loading config file

0.3:

        web)
                a:      allow per-cluster settings/override options: see CLUSTER_CONFS option
                a:      clusterimage can now draw nodes at x,y position parsed from hostname
                        see SORTBY_HOSTNAME for this in clusterconf/example.php
                a:      clusterimage nodes are now clickable: has link to all jobs from that host
                a:      clusterimage nodes now have a tooltip: displays hostname and jobids for now
                a:      jobmonarch logo image
                        thanks to: Robin Day
                        for the design
                a:      rrd graph of running/queued jobs to overview
                a:      per-cluster settings for archive database
                        thanks to: Alexis Michon - alexis(d.o.t)michon(a.t)ibcp(d.o.t)fr
                        for the patch
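
A minimal sketch of deriving an x,y position from a hostname, as in the clusterimage entry above. It is an illustration only: the "r<rack>n<node>" naming scheme, the HOST_XY pattern and host_position are assumptions made for this example, not the SORTBY_HOSTNAME format from clusterconf/example.php.

```python
import re

# Hypothetical naming scheme: "r<rack>n<node>", optionally followed by a domain.
HOST_XY = re.compile(r"^r(?P<x>\d+)n(?P<y>\d+)(?:\.|$)")

def host_position(hostname):
    """Return (x, y) parsed from the hostname, or None if it does not match."""
    match = HOST_XY.match(hostname)
    if match is None:
        return None
    return int(match.group("x")), int(match.group("y"))
```

Hosts that do not follow the scheme return None, so a caller can place them with a default layout instead.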

                c:      host archive view is now more complete and detailed in the same manner as
                        Ganglia's own host view
                c:      host archive view available metric list is now compiled from disk,
                        so that the detailed archive host view works even when the node is currently down.
                c:      removed size restrictions from detailed host archive view

                f:      compatibility: removed php5 call
                        thanks to: Alexis Michon - alexis(d.o.t)michon(a.t)ibcp(d.o.t)fr
                        for the patch
                f:      prevent negative cpu/node calculation
                        thanks to: aloga(a.t)ifca(d.o.t)unican(d.o.t)es
                        for the patch
                f:      archive search not properly resetting nodes list
                        thanks to: Alexis Michon - alexis(d.o.t)michon(a.t)ibcp(d.o.t)fr
                        for the patch
                f:      detailed host view from jobarchive was broken since hostbased support of 0.2
                        now host view is properly set and parsed again
                        thanks to: Alexis Michon - alexis(d.o.t)michon(a.t)ibcp(d.o.t)fr
                        for reporting the bug and suggesting a patch
                f:      bug where jobstart redline indicator in host detail graphs was set incorrectly
                        or not at all due to a miscalculation in job times
                f:      bug where hostimage headertext xoffset was miscalculated, causing the column names
                        to overlap their position when the columnname was longer than the columnvalues

        jobmond)

                a:      syslog support
                a:      report number of running/queued jobs as separate metrics
                a:      native gmetric support, much faster and cleaner!
                        thanks to: Nick Galbreath - nickg(a.t)modp(d.o.t)com
                        for writing it and allowing inclusion in jobmond

                f:      jobmond crashed when multiple node amounts were requested in
                        a queued job: numeric_node variable not initialized properly
                        thanks to: aloga(a.t)ifca(d.o.t)unican(d.o.t)es
                        for supplying the patch
                        and many others for reporting and helping debug this
                f:      hanging/blocked, increased cpu usage and halted reporting
                        thanks to: Bas van der Vlies - basv(a.t)sara(d.o.t)nl
                        for discovering the origin of the bug
                        thanks to: Mickael Gastineau - gastineau(a.t)imcce(d.o.t)fr
                        for reporting it and testing the fix
                        thanks to: Craig West - cwest(a.t)astro(d.o.t)umass(d.o.t)edu
                        for reporting it and testing the fix
                f:      uninitialized variable in checkGmetricVersion()
                        thanks to: Peter Kruse - pk(a.t)q-leap(d.o.t)com
                        for the patch
                f:      undefined PBSError
                        thanks to: Peter Kruse - pk(a.t)q-leap(d.o.t)com
                        for reporting it

                r:      SGE support broken

        jobarchived)

                a:      can now use py-rrdtool api instead of pipes, much faster!
                        install py-rrdtool to use this
                        backwards compatible: falls back to pipes if module not installed
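
A minimal sketch of the "use the module if installed, otherwise fall back to pipes" behaviour described above. It is an illustration only: load_rrd_backend and the backend labels are hypothetical, and only the import-time choice is shown, not the actual RRD updates.

```python
import importlib

def load_rrd_backend(module_name="rrdtool"):
    """Return ("module", mod) if the Python binding imports, else ("pipe", None).

    A caller would talk to the rrdtool command over a pipe when the second
    form is returned.
    """
    try:
        return "module", importlib.import_module(module_name)
    except ImportError:
        return "pipe", None
```

Making the choice once at startup means every later update can use the selected backend without repeating the import attempt.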

                c:      all XML input was unicode encoded, which could cause errors,
                        now all properly converted to normal strings

                f:      when XML data source (gmetad) is unavailable parsethread didn't return correctly
                        which caused a large number of threads to spawn while consuming large amounts of memory
                f:      autocreate clusterdirs in archivedir
                f:      unhandled gather exception
                f:      incorrect stop_timestamping when jobs finished
                        thanks to: Alexis Michon - alexis(d.o.t)michon(a.t)ibcp(d.o.t)fr
                        for finding and debugging/testing it

0.2:

        web)
                f:      misc. optimization and bugfixes
                f:      now fully compatible with latest PHP5 and PHP4

                c:      cluster image now incorporates a small text description
                c:      monarch (cluster/host) images no longer displayed
                        for clusters that are not jobmond enabled
                c:      pie chart percentages are now cpu-based instead of node-based

                a:      host template for Ganglia
                        adds an extra monarch host image to Ganglia's host overview
                        which displays/links to the jobs on that host
                        NOTE!: be sure to copy/install new template from addons/templates
                a:      (optional) nodes hostnames column
                        thanks to: Daniel Barthel - daniel(d.o.t)barthel(a.t)nottingham(d.o.t)ac(d.o.t)uk
                        for the suggestion

        jobmond)

                f:      when a job metric is longer than maximum metric length,
                        the info is split up amongst multiple metrics
                f:      no longer exit when batch server is unavailable
                        thanks to: Peter Kruse - pk(a.t)q-leap(d.o.t)com
                        for the patch
                f:      fd closure bug causing stderr/stdout to remain open after daemonizing
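
A minimal sketch of splitting an over-long metric value across several metrics, as in the first jobmond fix above. It is an illustration only: split_metric and the "<name>_<n>" naming of the parts are assumptions made for this example, not jobmond's actual scheme.

```python
def split_metric(name, value, max_len):
    """Split value into chunks of at most max_len characters.

    Returns a list of (metric_name, chunk) pairs; the numbered suffix keeps
    the chunks in order so a reader can join them back together.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    chunks = [value[i:i + max_len] for i in range(0, len(value), max_len)]
    if not chunks:
        chunks = [""]  # an empty value still yields one metric
    return [("%s_%d" % (name, n), chunk) for n, chunk in enumerate(chunks)]
```

Joining the chunks in suffix order reproduces the original value.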

                c:      rearranged code to allow support for other batch systems

                a:      (experimental) SGE (Sun Grid Engine) support as batch server
                        thanks to: Babu Sundaram - babu(a.t)cs(d.o.t)uh(d.o.t)edu
                        who developed it for an OSCAR Google-SoC project
                a:      pidfile support
                        thanks to: Michael Jeanson - michael(a.t)ccs(d.o.t)usherbrooke(d.o.t)ca
                        for the patch
                a:      usage display
                        thanks to: Michael Jeanson - michael(a.t)ccs(d.o.t)usherbrooke(d.o.t)ca
                        for the patch
                a:      queue selection support: ability to specify which QUEUEs to get jobinfo from
                        thanks to: Michael Jeanson - michael(a.t)ccs(d.o.t)usherbrooke(d.o.t)ca
                        for the patch

        jobarchived)

                f:      XML retrieval for Ganglia version >= 3.0.3 working properly again
                f:      database storing for Ganglia version >= 3.0.3 working properly again
                f:      fd closure bug causing stderr/stdout to remain open after daemonizing

                c:      misc. bugfixes to optimize XML connections
                c:      misc. bugfixes for misc. minor issues

                a:      cleaning of stale jobs in database: see JOB_TIMEOUT option

0.1.1:

        web)

                f:      misc. layout bugs for overview & search
                f:      bug that occurred when calculating the number of nodes when there
                        was more than one job running on a machine

                c:      column requested memory is now optional through conf.php
                c:      search and overview tables are now full screen (100%)
                c:      overview jobnames are now cut off at max 9 characters
                        to prevent (layout) skews in the tables
                c:      overview graphs are no longer downsized

                a:      (optional) column 'queued' (since) in overview
                a:      search results (can) now have a SEARCH_RESULT_LIMIT
                        this increases performance of the queries significantly!
                a:      date/time format as displayed is now configurable through conf.php

        jobmond)

                a:      now reports 'queued since' (or creation time) of jobs

        documentation)

                f:      wrong e-mail address in INSTALL (doh!)

0.1:

        - First public release