
Bug 454736

Summary: Locks not working as expected on perftests
Product: [Eclipse Project] Platform
Reporter: David Williams <david_williams>
Component: Releng
Assignee: Platform-Releng-Inbox <platform-releng-inbox>
Status: CLOSED WONTFIX
QA Contact:
Severity: major    
Priority: P3    
Version: 4.5   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
Whiteboard: stalebug
Bug Depends on: 455161    
Bug Blocks:    

Description David Williams CLA 2014-12-10 09:36:51 EST
As of bug 454576 I added "locks" to the performance machine, and 2 executors, so that some tiny fast jobs could run "at any time", while the long-running performance jobs would still run "one at a time" (so as to not interfere with each other's statistics).

But, this morning I looked at status and discovered two performance jobs running! 

ep45I-perf-lin64-baseline and 
ep45ILR-perf-lin64-baseline

Not good. I went ahead and canceled ep45ILR-perf-lin64-baseline, but let ep45I-perf-lin64-baseline continue, mostly since it was nearly done. But even then, by that point both jobs would have stored some data in the database, thus potentially "contaminating" the data.

I looked at the configuration of each, and each had the "lock" set, so I am not sure why they were both allowed to run. Either "locks" don't work quite right ... or I misunderstand what they do!
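
For reference, the behavior expected from the lock setup can be sketched as a toy scheduler. This is only a model under stated assumptions, not Hudson's implementation: the `Job` class, the `simulate` function, the lock name "perf", the "tiny-job" name and the tick durations are all hypothetical. Only the two baseline job names come from this report.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    name: str
    duration: int               # run time in abstract ticks (made-up values)
    lock: Optional[str] = None  # name of the lock the job is configured with


def simulate(queue: List[Job], executors: int = 2) -> List[List[str]]:
    """Return, for each tick, the names of the jobs running during that tick."""
    pending = list(queue)
    running = []   # entries are [job, remaining_ticks]
    timeline = []
    while pending or running:
        held = {job.lock for job, _ in running if job.lock}
        # Start queued jobs while an executor is free and the job's lock is not held.
        for job in list(pending):
            if len(running) >= executors:
                break
            if job.lock is None or job.lock not in held:
                pending.remove(job)
                running.append([job, job.duration])
                if job.lock:
                    held.add(job.lock)
        timeline.append([job.name for job, _ in running])
        # Advance one tick and drop the jobs that have finished.
        for entry in running:
            entry[1] -= 1
        running = [entry for entry in running if entry[1] > 0]
    return timeline


timeline = simulate([
    Job("ep45I-perf-lin64-baseline", 3, lock="perf"),
    Job("ep45ILR-perf-lin64-baseline", 3, lock="perf"),
    Job("tiny-job", 1),
])
for tick, names in enumerate(timeline):
    print(tick, names)
```

In this model the second performance job stays queued until the first releases the lock, while the unlocked job takes the second executor right away. What was observed here (both baseline jobs running together) is exactly what the model rules out.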

TODO: open another bug on the need to be able to "clean up" the database. This might require storing more data to make it easier, such as "job name and number".
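
A minimal sketch of the "job name and number" idea, assuming a simple relational layout. The `samples` table, its columns, the helper functions and the timing values are hypothetical and do not reflect the real performance database schema. An in-memory SQLite database stands in for the real store.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE samples (
           scenario     TEXT,
           elapsed_ms   REAL,
           job_name     TEXT,     -- extra column: which job wrote the row
           build_number INTEGER   -- extra column: which run of that job
       )"""
)


def store_sample(conn, scenario, elapsed_ms, job_name, build_number):
    """Store one measurement, tagged with the job run that produced it."""
    conn.execute(
        "INSERT INTO samples VALUES (?, ?, ?, ?)",
        (scenario, elapsed_ms, job_name, build_number),
    )
    conn.commit()


def purge_run(conn, job_name, build_number):
    """Delete every row written by one job run; return the number removed."""
    cur = conn.execute(
        "DELETE FROM samples WHERE job_name = ? AND build_number = ?",
        (job_name, build_number),
    )
    conn.commit()
    return cur.rowcount


# Made-up measurements for the two runs named in this report.
store_sample(conn, "scenarioA", 120.0, "ep45I-perf-lin64-baseline", 18)
store_sample(conn, "scenarioA", 135.0, "ep45ILR-perf-lin64-baseline", 8)
store_sample(conn, "scenarioB", 80.0, "ep45ILR-perf-lin64-baseline", 8)

# Clean up the contaminated (canceled) run only.
removed = purge_run(conn, "ep45ILR-perf-lin64-baseline", 8)
remaining = conn.execute("SELECT COUNT(*) FROM samples").fetchone()[0]
```

With rows tagged this way, cleaning up after a canceled or overlapping run becomes a single delete keyed on the job name and build number.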
Comment 1 David Williams CLA 2014-12-10 09:41:39 EST
Meant to document the job parameters, so they could be "re-run":

#8 ep45ILR-perf-lin64-baseline
I20141209-1115
4.5.0
3ad3319f1a4aff6a4c74bae2b61f41c08297f55b
otherPerformance

#18 ep45I-perf-lin64-baseline
I20141209-2000
4.5.0
74d2d5354b2a777911b5ce3081cfc010405141a8
selectPerformance


The latter was the one I allowed to continue running; the former I canceled about halfway through.
Comment 2 David Williams CLA 2014-12-10 13:40:03 EST
Happened again, and I killed:

ep45ILR-perf-lin64 #8
I20141208-2000
4.5.0
2ea3d8b168d30d17911c388ba4c8ff2d85c975d5
otherPerformance
Comment 3 David Williams CLA 2014-12-10 16:18:09 EST
I saw a configuration item to "block while upstream/downstream job is building" ... but then saw some jobs that seemed "stuck in queue" because they *thought* a downstream build was running ... but it had finished several minutes earlier.

Not sure how often Hudson checks ... but it seems not often enough.

Plus, I figure part of the complication is that one job "triggers" the next job. Not sure if/how Hudson considers those "separate jobs", or if part of Hudson sees it as "the same job"?
Comment 4 David Williams CLA 2014-12-10 16:20:35 EST
An alternative approach, while more complicated, might be that once we "collect the results", we then trigger the next job from that "collect script" on the build machine.

(It is even a little cleaner, since by then we would have "fetched results", so that small amount of "IO" would not taint any initial performance tests.)
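
A rough sketch of that ordering, assuming the collect step and the trigger are plain callables. `collect_then_trigger`, `fetch_results`, `trigger` and the job name "next-perf-job" are hypothetical, and how the build machine would actually start a Hudson job is left abstract. Skipping the trigger when the fetch fails is also an assumption, not something decided in this bug.

```python
from typing import Callable, List, Optional


def collect_then_trigger(
    fetch_results: Callable[[], bool],
    queue: List[str],
    trigger: Callable[[str], None],
) -> Optional[str]:
    """Fetch the finished job's results first, then start the next queued job.

    Returns the name of the job that was triggered, or None if nothing started.
    """
    if not fetch_results():
        return None   # assumption: do not start more work if collection failed
    if not queue:
        return None   # nothing waiting
    next_job = queue.pop(0)
    trigger(next_job)  # the results IO is already done at this point
    return next_job


# Usage with stand-in callables that only record the order of events.
events = []


def fake_fetch() -> bool:
    events.append("fetched results")
    return True


def fake_trigger(job_name: str) -> None:
    events.append("triggered " + job_name)


waiting = ["next-perf-job"]
started = collect_then_trigger(fake_fetch, waiting, fake_trigger)
```

Because only the collect script starts the next job, performance runs are serialized by construction, independent of how the locks behave.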
Comment 5 Eclipse Genie CLA 2020-03-31 13:31:57 EDT
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're closing this bug.

If you have further information on the current state of the bug, please add it and reopen this bug. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.