| Summary: | Different executors on same slave can share the same display | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | [Technology] Hudson | Reporter: | Bob Foster <bobfoster> | ||||||
| Component: | Plugins | Assignee: | Bob Foster <bobfoster> | ||||||
| Status: | RESOLVED FIXED | QA Contact: | Latha Amujuri <lamujuri> | ||||||
| Severity: | major | ||||||||
| Priority: | P3 | CC: | bobfoster, david_williams, eclipse.org, lamujuri, lidiam, malaperle, mygwaymark, rovarghe | ||||||
| Version: | 3.0.0 | ||||||||
| Target Milestone: | --- | ||||||||
| Hardware: | All | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | |||||||||
| Attachments: |
|
||||||||
|
Description
Bob Foster
Oops, xxx should be Marc-Andre Laperle. Sorry.

Further comment from Marc-Andre Laperle: I also noticed there was a subsequent fix related to this problem: jenkinsci/xvnc-plugin@3627859

Created attachment 248474: xvnc plugin with display allocator per node
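As a rough illustration of the per-node allocation idea behind this attachment, here is a minimal Java sketch. It is not the plugin's code: the class `PerNodeDisplays`, the use of a node name string as the map key, and the lowest-free-number policy are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; names and allocation policy are assumptions, not the plugin's API.
class PerNodeDisplays {

    /** Tracks which display numbers are currently in use on one node. */
    static final class DisplayAllocator {
        private final Set<Integer> inUse = new HashSet<>();

        /** Returns the lowest free display number in [min, max] and marks it as used. */
        synchronized int allocate(int min, int max) {
            for (int display = min; display <= max; display++) {
                if (inUse.add(display)) {
                    return display;
                }
            }
            throw new IllegalStateException("No free display in " + min + ".." + max);
        }

        /** Releases a display number so a later build can reuse it. */
        synchronized void free(int display) {
            inUse.remove(display);
        }
    }

    // One allocator per node, shared by all executors on that node.
    private static final Map<String, DisplayAllocator> ALLOCATORS = new HashMap<>();

    static synchronized DisplayAllocator forNode(String nodeName) {
        return ALLOCATORS.computeIfAbsent(nodeName, n -> new DisplayAllocator());
    }

    public static void main(String[] args) {
        int first = forNode("node-a").allocate(10, 11);
        int second = forNode("node-a").allocate(10, 11);
        int other = forNode("node-b").allocate(10, 11);
        System.out.println(first + " " + second + " " + other); // prints "10 11 10"
    }
}
```

Two executors on the same node get different numbers, while a different node is free to reuse the same number. With the minimum and maximum set to the same value, the second concurrent request on a node fails instead of silently sharing the display.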
The fix is to keep a static map of node to DisplayAllocator, so that active xvnc sessions on the same node never get assigned the same display number. This one might be tricky to test. In the previous code, display numbers were randomly assigned from the allowable range. Rarely, it could happen that two executors on the same node drew the same random number while they were both building.

Hi Bob. What are the next steps? Would you like me to help by testing the new plugin?

Marc, please do. The more eyes the better. Latha, it should be easy to test the end case. Simply set the minimum and maximum display number to the same value, e.g., 10, and build two different xvnc-using jobs at the same time on the same slave or master.

Latha: It's looking good in that case. I just gave it a try after setting min and max display numbers to the same value.

Released as 1.13-h-2.

Mind if I ask, where is the source for this plugin? I looked "on eclipse.org", in http://git.eclipse.org/c/hudson/ but it didn't seem to be there. I ask because on an Eclipse.org Hudson instance, I "suddenly" started to get this error (see bug 455161 for long rambling issues).

FATAL: null
java.lang.NullPointerException
    at hudson.plugins.xvnc.Xvnc.doSetUp(Xvnc.java:83)
    at hudson.plugins.xvnc.Xvnc.setUp(Xvnc.java:73)
    at hudson.model.Build$RunnerImpl.doRun(Build.java:129)
    at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:524)
    at hudson.model.Run.run(Run.java:1493)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:44)
    at hudson.model.ResourceController.execute(ResourceController.java:82)
    at hudson.model.Executor.run(Executor.java:137)

At first I assumed some "config error" or "corruption" due to having "two instances running", but then lo and behold I started to get the same error on my home test machine (after a reboot of the whole machine). So ... just wanted to see the source, to better know what to look at, to debug the issue.

One thing that's common between the two setups (the perf1test machine at eclipse.org, and my home test machine) is that neither has "slaves", only "master". But, obviously, it could be many other things as well. Home system is Hudson 3.2.1, and perf1test at eclipse.org is 3.1.2 -- but, I *think* another thing in common is that I had recently updated both (using "auto update") to update the Xvnc plugin to 1.13-h-2.

Thanks for any pointers (to source, and/or "how to debug").

I am re-opening, because I think the fix is "incomplete" in some way -- perhaps only for "master only" installations? The NPE I mentioned in comment 8 "went away" when I downgraded back to Xvnc plugin level 1.13-h-1.

I wanted to mention, too, that after I back-leveled (and restarted everything) I got a message about "junk" in the config, and it mentioned hudson.model.Hudson CannotResolveClassException: hudson.plugins.xvnc.DisplayAllocator$Property

I suspect this is from the 1.13-h-2 version, but I am surprised plugins do not "clean up" after themselves (if that's the right word), which in turn makes me wonder if the 1.13-h-2 version updates itself properly. Perhaps the bug leading to an NPE is in the "update" process/code, rather than the plugin code, per se?

The source for the plugin is at https://github.com/hudson3-plugins/xvnc-plugin

Seriously, this should be fixed.

(In reply to Bob Foster from comment #11)
> The source for the plugin is at
> https://github.com/hudson3-plugins/xvnc-plugin
>
> Seriously, this should be fixed.

"should" be? Was it tested on a Hudson with "master only"? Any ideas what else might lead to the NPE reported above, after updating to this version? And the NPE going away after down-leveling? Just askin'.

It isn't at all obvious to me how an NPE on line 83 of Xvnc in the current source is possible. Nor that it has anything at all to do with "master only". Please do look at the source.

Looking again at this class:
/*package*/ static final class Property extends NodeProperty<Node> {
    private transient final DisplayAllocator allocator = new DisplayAllocator();

    /*package*/ DisplayAllocator getAllocator() {
        return allocator;
    }
}

"private transient final" is a bad combination. If this were saved and restored by XStream, allocator would be null. And sure enough, this bug is fixed in the latest Jenkins version.
We just started seeing the NPE on our Hudson instance; I created bug 458602.

Created attachment 250315: Xvnc plugin with private transient final bug fix

I have attached an xvnc.hpi with a fix for the private transient final bug. I'm kind of tied up right now. If someone could test this version and verify it fixes the NPE problem, I can release it forthwith. If you want to wait for me to test it, it will take a few days.

The new hpi fixes the issue for me.

We also hit this issue: java.lang.NullPointerException at hudson.plugins.xvnc.Xvnc.doSetUp(Xvnc.java:83) and the attached xvnc plugin patch worked for us too.

Fixed the NPE. See bug 458602. |
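For readers who want to see the "private transient final" pitfall from this thread in isolation, here is a small self-contained Java sketch. It uses built-in Java serialization as a stand-in for XStream (an assumption: both restore an object without running its field initializers, but the mechanisms differ). The classes `Allocator`, `BrokenProperty` and `FixedProperty` are hypothetical, and the `readResolve` repair is one common approach, not necessarily what the attached patch or the Jenkins fix does.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo; Java serialization stands in for XStream here.
class TransientFinalDemo {

    /** Stand-in for the plugin's DisplayAllocator. */
    static final class Allocator {}

    /** Same pattern as the Property class quoted in this bug. */
    static final class BrokenProperty implements Serializable {
        private static final long serialVersionUID = 1L;
        // transient: not written out; the initializer is not re-run on restore.
        private transient final Allocator allocator = new Allocator();

        Allocator getAllocator() {
            return allocator;
        }
    }

    /** One possible repair (an assumption): drop 'final' and rebuild after restore. */
    static final class FixedProperty implements Serializable {
        private static final long serialVersionUID = 1L;
        private transient Allocator allocator = new Allocator();

        Allocator getAllocator() {
            return allocator;
        }

        // Called by the serialization machinery after the object is restored.
        private Object readResolve() {
            if (allocator == null) {
                allocator = new Allocator();
            }
            return this;
        }
    }

    /** Serializes and deserializes a value in memory. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new BrokenProperty()).getAllocator() == null); // prints true
        System.out.println(roundTrip(new FixedProperty()).getAllocator() != null);  // prints true
    }
}
```

A freshly constructed `BrokenProperty` works fine, which is why the problem only shows up after the configuration has been saved and reloaded, for example after a restart.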