LEGEND f: fixed - c: changed - a: added - r: removed

1.1:

  web)

    a: archive search now has "include running jobs" option
    c: rewritten short versus FQDN hostname detection: now works properly
       with ganglia hosts not using FQDN hostnames
    f: display of xml parsetime for overview. no longer display parsetime
       for archive (no parsing done)
    f: down/offline nodes are now properly marked in cluster image again
    f: bug where "Unavailable" row would not be shown in overview summary
       table

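The short versus FQDN hostname handling mentioned in the 1.1 web entries can be illustrated with a small sketch. This is not the web frontend's actual code (which is PHP); it is a hypothetical Python illustration of one way to match hostnames when Ganglia reports short names and another source reports FQDNs. The helper names `short_name` and `same_host` and the example hostnames are assumptions made for this sketch.

```python
def short_name(hostname):
    """Return the first label of a hostname (text before the first dot), lowercased."""
    return hostname.split(".", 1)[0].lower()


def same_host(a, b):
    """Compare two hostnames, tolerating a mix of short and FQDN forms.

    If both names carry a domain they must match in full; otherwise
    only the short (first label) parts are compared.
    """
    a, b = a.strip().lower(), b.strip().lower()
    if "." in a and "." in b:
        return a == b
    return short_name(a) == short_name(b)


# Hypothetical hostnames, for illustration only.
print(same_host("node01", "node01.cluster.example.org"))
print(same_host("node01.a.example.org", "node01.b.example.org"))
```

Comparing only the first label when either side lacks a domain is a deliberate simplification: it assumes short names are unique within one cluster.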
  packaging)

    c: completely redone and rewritten by Olivier Lahaye - thanks!

  jobmond)

    a: now supports SLURM Workload Manager!
    a: warning if connecting to remote BATCH_SERVER is not supported by
       selected BATCH_API
    f: bug where incorrect commandline option would trigger traceback in
       usage()

  jobarchived)

    a: now performs regular database housekeeping every 20 job XML
       iterations (previously only once at startup)
    a: now checks if ARCHIVE_DATASOURCES are present in gmetad.conf
    f: prevent an exception from occurring when determining datasource
       polling interval
    f: bug where config file handle was not closed

1.0:

  jobmond)

    a: now supports multiple udp send channels
    a: now supports job arrays
    c: updated Gmetric XDR protocol to be compatible with version 3.1+

    c: gmond.conf parsing has been rewritten to handle includes and
       multiple send channels
    c: METRIC_MAX_VAL_LEN is now determined from gmond.conf
    c: utilize new job monarch protocol

    f: can now handle new PBSQuery / pbs_python versions
    f: default gmond.conf search location is now /etc/ganglia/gmond.conf
    f: fatal errors are now printed to shell upon startup, not just syslog
    f: more error checking and miscellaneous bugfixes

  jobarchived)

    r: no longer use pyPgSQL for postgres database
    c: now use psycopg2 module for postgres database

    a: job thread now utilizes db commits and rollbacks
    a: now use USER/PASS authentication to database (instead of hostbased)

    c: database schema: changed job_id to varchar to support job arrays
    c: database schema: changed job_name max length to 255, just like
       torque
    c: database schema: added username/password role authentication
    c: utilize new job monarch protocol

    f: job thread no longer hangs when insert/update of a job in database
       fails
    f: rewrite of job (finished) detection: all finished jobs again
       properly detected
    f: job checking now done post-parsing not while parsing
    f: more error checking and miscellaneous bugfixes

  web)

    r: removed Pie chart
    r: removed TemplatePower
    r: removed php ini_set calls and time limit directive: should be
       handled in php.ini
    r: removed "Get Fresh Data" button: served no purpose anymore
    a: now utilize Dwoo templates for html output

    a: now use USER/PASS authentication to database (instead of hostbased)
    a: ClusterImage now drops a shadow below nodes
    a: RRDs now show "Last: Min: Avg: Max:" values in legend

    c: utilize new job monarch protocol
    c: all templates rewritten from TemplatePower to Dwoo
    c: graph.php now used for overview and archive
    c: RRDs job start/finish line is now dashed green/red line with legend

    f: some dbase fields are now CAST to INT for php since postgres now
       requires explicit casts
    f: sort order descending/ascending is now correct
    f: many, many speed and memory improvements
    f: more error checking and miscellaneous bugfixes

0.4:

  jobmond)
    a: SGE support
       thanks to: Dave Love - d(d.o.t)love(a.t)liverpool(d.o.t)ac(d.o.t)uk
                  for writing it!
    a: LSF support
       thanks to: Mahmoud Hanafi - mhanafi(a.t)csc(d.o.t)com
                  for writing it!
    a: GMETRIC_TARGET is now parsed from gmond.conf
    a: GMETRIC_BINARY is now looked for in PATH
    f: queue selection support is now working
       thanks to: Craig West - cwest(a.t)astro(d.o.t)umass(d.o.t)edu
                  for the patch

  web)
    a: large graphs link for job report
       thanks to: Craig West - cwest(a.t)astro(d.o.t)umass(d.o.t)edu
    a: SHOW_EMPTY_COLUMN, SHOW_EMPTY_ROW options for ClusterImage hostname parsing

0.3.1:

  other)
    f: updated INSTALL since "addons" directory is not included by default anymore in Ganglia
       thanks to: Steven DuChene linux(d.a.s.h)clusters(a.t)mindspring(d.o.t)com
                  for reporting it

  rpm)
    f: add "addons" directory since it's not included by default anymore in Ganglia
    f: properly rewrite WEBDIR path in %files when rebuilding rpms with Makefile

  web)
    f: typo in empty_cpu variable: causing incorrect 'free cpu' count and similar errors
       thanks to: Craig West - cwest(a.t)astro(d.o.t)umass(d.o.t)edu
                  for reporting it
    f: changed erroneous domain detection a little
       thanks to: Craig West - cwest(a.t)astro(d.o.t)umass(d.o.t)edu
                  for reporting it
    a: now properly detects whether to use FQDN or short hostnames w/o domain
       thanks to: Craig West - cwest(a.t)astro(d.o.t)umass(d.o.t)edu
       thanks to: Jeffrey Sarlo - JSarlo(a.t)Central(d.o.t)UH(d.o.t)EDU
                  for the extensive testing and for reporting it

  SPECIAL THANKS to the University of Houston for sending me a shirt!

  jobarchived)
    f: properly catch postgres exception
    f: don't use debug_message while loading config file

0.3:

  web)
    a: allow per-cluster settings/override options: see CLUSTER_CONFS option
    a: clusterimage can now draw nodes at x,y position parsed from hostname
       see SORTBY_HOSTNAME for this in clusterconf/example.php
    a: clusterimage nodes are now clickable: has link to all jobs from that host
    a: clusterimage nodes now have a tooltip: displays hostname and jobids for now
    a: jobmonarch logo image
       thanks to: Robin Day
                  for the design
    a: rrd graph of running/queued jobs to overview
    a: per-cluster settings for archive database
       thanks to: Alexis Michon - alexis(d.o.t)michon(a.t)ibcp(d.o.t)fr
                  for the patch

    c: host archive view is now more complete and detailed in the same manner as
       Ganglia's own host view
    c: host archive view available metric list is now compiled from disk,
       so that the detailed archive host view works even when the node is currently down.
    c: removed size restrictions from detailed host archive view

    f: compatibility: removed php5 call
       thanks to: Alexis Michon - alexis(d.o.t)michon(a.t)ibcp(d.o.t)fr
                  for the patch
    f: prevent negative cpu/node calculation
       thanks to: aloga(a.t)ifca(d.o.t)unican(d.o.t)es
                  for the patch
    f: archive search not properly resetting nodes list
       thanks to: Alexis Michon - alexis(d.o.t)michon(a.t)ibcp(d.o.t)fr
                  for the patch
    f: detailed host view from jobarchive was broken since hostbased support of 0.2
       now host view is properly set and parsed again
       thanks to: Alexis Michon - alexis(d.o.t)michon(a.t)ibcp(d.o.t)fr
                  for reporting the bug and suggesting a patch
    f: bug where jobstart redline indicator in host detail graphs was set incorrectly
       or not at all due to a miscalculation in job times
    f: bug where hostimage headertext xoffset was miscalculated, causing the column names
       to overlap their position when the columnname was longer than the columnvalues

  jobmond)

    a: syslog support
    a: report number of running/queued jobs as separate metrics
    a: native gmetric support, much faster and cleaner!
       thanks to: Nick Galbreath - nickg(a.t)modp(d.o.t)com
                  for writing it and allowing inclusion in jobmond

    f: jobmond crashing when multiple node amounts are requested in
       a queued job: numeric_node variable not initialized properly
       thanks to: aloga(a.t)ifca(d.o.t)unican(d.o.t)es
                  for supplying the patch
                  and many others for reporting and helping debug this
    f: hanging/blocked, increased cpu usage and halted reporting
       thanks to: Bas van der Vlies - basv(a.t)sara(d.o.t)nl
                  for discovering the origin of the bug
       thanks to: Mickael Gastineau - gastineau(a.t)imcce(d.o.t)fr
                  for reporting it and testing the fix
       thanks to: Craig West - cwest(a.t)astro(d.o.t)umass(d.o.t)edu
                  for reporting it and testing the fix
    f: uninitialized variable in checkGmetricVersion()
       thanks to: Peter Kruse - pk(a.t)q-leap(d.o.t)com
                  for the patch
    f: undefined PBSError
       thanks to: Peter Kruse - pk(a.t)q-leap(d.o.t)com
                  for reporting it

    r: SGE support broken

  jobarchived)

    a: can now use py-rrdtool api instead of pipes, much faster!
       install py-rrdtool to use this
       backwards compatible: falls back to pipes if module not installed

    c: all XML input was unicode-encoded, which could cause errors,
       now all properly converted to normal strings

    f: when XML data source (gmetad) is unavailable parsethread didn't return correctly
       which caused a large number of threads to spawn while consuming large amounts of memory
    f: autocreate clusterdirs in archivedir
    f: unhandled gather exception
    f: incorrect stop_timestamping when jobs finished
       thanks to: Alexis Michon - alexis(d.o.t)michon(a.t)ibcp(d.o.t)fr
                  for finding and debugging/testing it

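The py-rrdtool fallback described in the 0.3 jobarchived entries follows a common optional-dependency pattern, sketched below. This is an illustration, not jobarchived's actual code: the names `HAVE_RRD_API`, `rrd_backend` and `rrd_update` are hypothetical, and the fallback here simply runs the `rrdtool` command-line tool through `subprocess` rather than reproducing the original pipe handling.

```python
import subprocess

try:
    import rrdtool  # Python bindings, available when py-rrdtool is installed
    HAVE_RRD_API = True
except ImportError:
    rrdtool = None
    HAVE_RRD_API = False


def rrd_backend():
    """Report which backend rrd_update() will use: 'api' or 'pipes'."""
    return "api" if HAVE_RRD_API else "pipes"


def rrd_update(rrd_file, timestamp, value):
    """Store one sample, preferring the in-process API over an external process."""
    update = "%d:%s" % (timestamp, value)
    if HAVE_RRD_API:
        rrdtool.update(rrd_file, update)
    else:
        # Fallback: requires the rrdtool binary to be available in PATH.
        subprocess.run(["rrdtool", "update", rrd_file, update], check=True)


print("rrd backend:", rrd_backend())
```

Deciding once at import time keeps the per-sample code path free of repeated import attempts.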
0.2:

  web)
    f: misc. optimization and bugfixes
    f: now fully compatible with latest PHP5 and PHP4

    c: cluster image now incorporates small text descr.
    c: monarch (cluster/host) images no longer displayed
       for clusters that are not jobmond enabled
    c: pie chart percentages are now cpu-based instead of node-based

    a: host template for Ganglia
       adds an extra monarch host image to Ganglia's host overview
       which displays/links to the jobs on that host
       NOTE!: be sure to copy/install new template from addons/templates
    a: (optional) nodes hostnames column
       thanks to: Daniel Barthel - daniel(d.o.t)barthel(a.t)nottingham(d.o.t)ac(d.o.t)uk
                  for the suggestion

  jobmond)

    f: when a job metric is longer than maximum metric length,
       the info is split up amongst multiple metrics
    f: no longer exit when batch server is unavailable
       thanks to: Peter Kruse - pk(a.t)q-leap(d.o.t)com
                  for the patch
    f: fd closure bug causing stderr/stdout to remain open after daemonizing

    c: rearranged code to allow support for other batch systems

    a: (experimental) SGE (Sun Grid Engine) support as batch server
       thanks to: Babu Sundaram - babu(a.t)cs(d.o.t)uh(d.o.t)edu
                  who developed it for OSCAR's Google SoC project
    a: pidfile support
       thanks to: Michael Jeanson - michael(a.t)ccs(d.o.t)usherbrooke(d.o.t)ca
                  for the patch
    a: usage display
       thanks to: Michael Jeanson - michael(a.t)ccs(d.o.t)usherbrooke(d.o.t)ca
                  for the patch
    a: queue selection support: ability to specify which QUEUEs to get jobinfo from
       thanks to: Michael Jeanson - michael(a.t)ccs(d.o.t)usherbrooke(d.o.t)ca
                  for the patch

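The metric splitting mentioned in the first 0.2 jobmond fix can be pictured with the following sketch. It is only an illustration of the idea: the function names, the `_0`, `_1`, ... suffix scheme and the example metric name are assumptions, not jobmond's actual naming or wire format.

```python
def split_metric(name, value, max_len):
    """Split a metric value that exceeds max_len into numbered parts.

    Returns a list of (metric_name, chunk) pairs. A value that already
    fits is returned as a single pair under its original name.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if len(value) <= max_len:
        return [(name, value)]
    chunks = [value[i:i + max_len] for i in range(0, len(value), max_len)]
    return [("%s_%d" % (name, n), chunk) for n, chunk in enumerate(chunks)]


def join_metric(parts):
    """Reassemble the original value from parts in the order given."""
    return "".join(chunk for _, chunk in parts)


# Hypothetical metric name and value, for illustration only.
parts = split_metric("job_123", "abcdefghij", 4)
print(parts)
print(join_metric(parts))
```

A receiver would need the parts in order to rebuild the value, which is why the sketch numbers them; how the real daemons order and reassemble parts is not shown here.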
  jobarchived)

    f: XML retrieval for Ganglia version >= 3.0.3 working properly again
    f: database storing for Ganglia version >= 3.0.3 working properly again
    f: fd closure bug causing stderr/stdout to remain open after daemonizing

    c: misc. bugfixes to optimize XML connections
    c: misc. bugfixes for misc. minor issues

    a: cleaning of stale jobs in dbase: see JOB_TIMEOUT option

0.1.1:

  web)

    f: misc. layout bugs for overview & search
    f: bug that occurred when calculating the number of nodes when there
       was more than one job running on a machine

    c: column requested memory is now optional through conf.php
    c: search and overview tables are now full screen (100%)
    c: overview jobnames are now cut off at max 9 characters
       to prevent (layout) skews in the tables
    c: overview graphs are no longer downsized

    a: (optional) column 'queued' (since) in overview
    a: search results (can) now have a SEARCH_RESULT_LIMIT
       this increases performance of the queries significantly!
    a: date/time format as displayed is now configurable through conf.php

  jobmond)

    a: now reports 'queued since' (or creation time) of jobs

  documentation)

    f: wrong e-mail address in INSTALL (doh!)

0.1:

  - First public release