The window manager is missing or died.
See e.g. http://download.eclipse.org/eclipse/downloads/drops4/I20130219-1600/testresults/linux.gtk.x86_6.0/org.eclipse.ui.workbench.texteditor.tests.ScreenshotTest.testWindowsTaskManagerScreenshots1.png
http://download.eclipse.org/eclipse/downloads/drops4/I20130219-1600/testresults/consolelogs/linux.gtk.x86_64_6.0_consolelog.txt says:

---------------------------------------------------------------------------
Check if any window managers are running (xfwm|twm|metacity|beryl|fluxbox|compiz):
Window Manager processes:
wtpBuild 25007 1 0 Feb11 ? 00:00:09 metacity --display=:9 --replace --sm-disable
55011 29040 1 0 12:14 ? 00:00:01 metacity --replace --sm-disable
Existing window manager found running, so did not force start of metacity
Current metacity processes running (check for accumulation):
wtpBuild 25007 1 0 Feb11 ? 00:00:09 metacity --display=:9 --replace --sm-disable
55011 29040 1 0 12:14 ? 00:00:01 metacity --replace --sm-disable
---------------------------------------------------------------------------

But later, there are console entries like this:

Xlib: extension "RANDR" missing on display ":100.0".

Maybe you were running tests in parallel on the same machine but different displays (:9 vs. :100)? Does the check at the beginning of the test script ensure the window manager process is running for the right display? If the test script doesn't start a window manager because one is running on another display, then our tests can be executed without a window manager.
> But later, there are console entries like this:
>
> Xlib: extension "RANDR" missing on display ":100.0".
>
> Maybe you were running tests in parallel on the same machine but different
> displays (:9 vs. :100)? Does the check at the beginning of the test script
> ensure the window manager process is running for the right display? If the
> test script doesn't start a window manager because one is running on another
> display, then our tests can be executed without a window manager.

The check will start metacity if none is found, but that logic is really just a holdover from when I used to run UI tests using only xvfb. For xvfb, you did have to start it on a specific display and start your own window manager. Xvnc, which Hudson uses, is supposed to manage all that for you, drawing from a pool of displays. I imagine if we followed it, we'd see a different "display" number each time.

So, my guess is, it just died this run, for some reason?

Do you want me to re-run those linux tests? It should simply "replace" the test results from that I20130219-1600 build.
> Do you want me to re-run those linux tests? It should simply "replace" the
> test results from that I20130219-1600 build.

Since we want a "final PDE build" to be relatively clean, I have restarted those tests. So, if you specifically do NOT want the current test results replaced, be sure to say so in the next 4 or 5 hours!
Either way is fine for me.

I don't know whether other builds are running on the same machine, nor whether you can control that. I still assume this is the root of the problem.

For all other recent builds (I, N, CBI), the GTK log said:
"No window managers processes found running, so will start metacity"

So it looks like it's still necessary to start a window manager manually. Maybe runtests.sh should use "wmctrl -m" to see if a window manager is running. But I don't know whether that also lists all running WMs or just the one for the current $DISPLAY.
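A minimal sketch of what such a "wmctrl -m" check could look like in a script like runtests.sh. The function name wm_running_on_display is hypothetical and not part of the actual script. The sketch assumes that wmctrl talks only to the X server named by $DISPLAY and exits non-zero when that server reports no EWMH-compliant window manager; that matches my understanding of wmctrl, but it is worth verifying on the build machine before relying on it.

```shell
# Hypothetical helper, not taken from the real runtests.sh.
# Returns 0 if a window manager answers on $DISPLAY,
#         1 if wmctrl ran but found no window manager,
#         2 if wmctrl is not installed.
wm_running_on_display() {
    # wmctrl is not part of every standard install, so check for it first.
    command -v wmctrl >/dev/null 2>&1 || return 2

    # Assumption: "wmctrl -m" queries the X server given by $DISPLAY
    # and fails when no window manager is registered there.
    wmctrl -m >/dev/null 2>&1
}
```

Because the question is put to the X server rather than to the process list, a metacity running for display :9 should not be mistaken for one serving display :100.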
(In reply to comment #4)
> > Do you want me to re-run those linux tests? It should simply "replace" the
> > test results from that I20130219-1600 build.
>
> Since we want a "final PDE build" to be relatively clean, I have restarted
> those tests. So, if you specifically do NOT want the current test results
> replaced, be sure to say so in the next 4 or 5 hours!

Did they run? The tests are still indicated as failed.
(In reply to comment #6)
> (In reply to comment #4)
> > > Do you want me to re-run those linux tests? It should simply "replace" the
> > > test results from that I20130219-1600 build.
> >
> > Since we want a "final PDE build" to be relatively clean, I have restarted
> > those tests. So, if you specifically do NOT want the current test results
> > replaced, be sure to say so in the next 4 or 5 hours!
>
> Did they run? The tests are still indicated as failed.

They ran, but did not finish before "build.eclipse.org" (and Hudson) started having trouble due to hardware failure. I plan to try again later today, but unless you say otherwise, I don't plan to delay moving to CBI builds.
(In reply to comment #5)
> So it looks like it's still necessary to start a window manager manually.
> Maybe runtests.sh should use "wmctrl -m" to see if a window manager is
> running. But I don't know whether that also lists all running WMs or just
> the one for the current $DISPLAY.

wmctrl looks interesting and potentially useful, but doesn't seem to be part of "standard installs". So, I hate to ask for it to be installed on each Hudson, unless this turns out to be a frequent problem.
From the number of failures, I see that build I20130222-2000 had a similar problem.
http://download.eclipse.org/eclipse/downloads/drops4/I20130222-2000/testResults.php

For the next build, I tried changing slaves from '2' to '1' to see if it makes a difference ... but
a) that machine seems very slow
b) its initial log indicated one was "already running", so I'd expect it to be a problem again.

I think I'll try two things, in two steps, just to be systematic.

First, I'll put a simple 'env' in that file so all environment variables are captured in the log. Maybe that'll give some insight.

Second, what I think might solve it is not to "check if one is running", but to simply always call it, without the --replace option. That way, if there really is one running, the metacity call should fail, saying there already is one for that display. Otherwise it will start one up.

I suppose the only thing that won't solve is the case where there actually is a window manager running, even for our DISPLAY, such as twm, and twm is simply inadequate for our tests. If that is the case, then we will need to "replace", in which case a check of ENV would help (make sure we have a DISPLAY), and perhaps even let metacity fail once and, if it fails, then call it with --replace.

Besides removing the "if" logic, the call would become

metacity --display=$DISPLAY --sm-disable &

I'm sure "current display" is the default; I'm just including it so it shows up in a later ps query.

Oh, I also turned on "capture screen at end" in Hudson. That only stays on the build machine (currently) but might give some insight if there is something funny about our exiting state ... such as, I wonder if "we" are leaving a window open, even after our tests complete?

Comments welcome. The "env" log should be in the 0224-2000 build.
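The "always start it, and fall back to --replace only if that fails" idea could be sketched roughly as follows. This is not the code from the actual runTests script: the names WM_CMD, WM_SETTLE_SECS and start_window_manager are hypothetical, and the sketch assumes that a metacity started without --replace exits within a couple of seconds when another window manager already owns the display.

```shell
#!/usr/bin/env bash
# Sketch only. WM_CMD, WM_SETTLE_SECS and start_window_manager are
# hypothetical names, not taken from the real runTests script.

WM_CMD="${WM_CMD:-metacity}"
WM_SETTLE_SECS="${WM_SETTLE_SECS:-2}"

start_window_manager() {
    # Make sure we have a DISPLAY before trying anything.
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set; cannot start a window manager" >&2
        return 2
    fi

    # First attempt: no --replace. Assumption: if a window manager already
    # owns this display, the new process exits shortly after starting.
    "$WM_CMD" --display="$DISPLAY" --sm-disable &
    wm_pid=$!
    sleep "$WM_SETTLE_SECS"
    if kill -0 "$wm_pid" 2>/dev/null; then
        echo "started $WM_CMD (pid $wm_pid) on $DISPLAY"
        return 0
    fi

    # Second attempt: something else (e.g. twm) holds the display, so replace it.
    "$WM_CMD" --display="$DISPLAY" --replace --sm-disable &
    wm_pid=$!
    sleep "$WM_SETTLE_SECS"
    if kill -0 "$wm_pid" 2>/dev/null; then
        echo "replaced existing window manager with $WM_CMD (pid $wm_pid) on $DISPLAY"
        return 0
    fi

    echo "could not start $WM_CMD on $DISPLAY" >&2
    return 1
}
```

Note that the fallback replaces any existing window manager, including another running metacity, so on a developer desktop it would have the same "messes up my desktop" effect as an unconditional --replace.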
Meant to mention ... I do see these failures on my home setup too, though it doesn't seem to be every time (I have not really studied that systematically). My "env" is fairly different from build.eclipse.org (Hudson 3.0, Ubuntu 12.04), and from the console log, I think the "check if already running" logic is simply the wrong approach for an Xvnc environment on Hudson. (It's obviously listing my "desktop", but the tests do not run on my desktop ... I never see them :) ... unless I turn off Xvnc.)

If I have time, I'll try the new logic on my home system, and if it seems to work, then put in the fixes for Sunday night's build.

= = = =
Check if any window managers are running (xfwm|twm|metacity|beryl|fluxbox|compiz):
Window Manager processes:
davidw 3334 3262 0 02:32 ? 00:11:08 compiz
davidw 3456 3334 0 02:32 ? 00:00:00 /bin/sh -c /usr/bin/compiz-decorator
Existing window manager found running, so did not force start of metacity
Current metacity processes running (check for accumulation):
Triple check if any window managers are running (at least metacity should be!):
Window Manager processes:
davidw 3334 3262 0 02:32 ? 00:11:08 compiz
davidw 3456 3334 0 02:32 ? 00:00:00 /bin/sh -c /usr/bin/compiz-decorator
The changes I tried didn't help the tests (on my local machine) ...
http://git.eclipse.org/c/platform/eclipse.platform.releng.aggregator.git/commit/?id=bb5ca7de83ecf9e2b3355c533aeec1194b5329a8

But they might help the diagnosis on Hudson. (And ... might even help there?)

The first try, just listing the variables and taking a screenshot at the end, didn't change the tests much (as expected), but the final screenshot showed a "pure console" with a message about "not being able to run the screen saver". I thought that might be a clue, so I turned off the screen saver completely.

The next run, where I also started a WM without --replace, did in fact start one (i.e. none were running) ... but still had the same failures. This time, though, the "final screenshot" had a clear "desktop" in the image, with a modal dialog about "you need to enter your password for keyring". (Often the case for "VNC sessions".) In theory, that might have been "left over" from before, when trying to unlock the screen saver or something.

In both runs, it was using "Display :10" on my local system.

So ... I'll leave the "runTests" script as is, and see if we get better diagnostics, if nothing else. (I do think it's "more correct" the way it is.) The "runTests" script is only the one we use on the production machine, not the one that's "shipped" in the test framework zip.

I also recalled why I wrote code like this to begin with ... In the past, when running tests on my own machine without Xvnc or xvfb, the window manager would always be replaced, even if I already had one, like "unity", running. So even if I could then see the tests run ... it'd "mess up" my desktop until I could restart.

Hope this analysis helps. Advice welcome.
GRRRR ... I forgot, Hudson slaves do not support "take snapshot at end" (bug 389378), and Hudson has the most unfriendly reaction: "fail the build"! (bug 389451).

At this time of night, it is easier for me to restart with that turned off rather than "tweeze out" the test results by hand. But there were 191 fewer failures, so I assume simply starting a WM (without --replace) helped.

You might peek at Hudson's overall log.
https://hudson.eclipse.org/hudson/view/Eclipse%20and%20Equinox/job/ep4-unit-lin64/505/consoleFull

With so many "gnome errors", is there a better window manager to use? (In my experience, though, these are pretty common ... usually not so many ... but then again, I'm not running 1000's of windows :)

You can see our normal "console log" directly on Hudson,
https://hudson.eclipse.org/hudson/view/Eclipse%20and%20Equinox/job/ep4-unit-lin64/505/artifact/workarea/I20130224-2000/eclipse-testing/results/consolelogs/linux.gtk.x86_64_6.0_consolelog.txt

and even the test results,
https://hudson.eclipse.org/hudson/view/Eclipse%20and%20Equinox/job/ep4-unit-lin64/505/testReport/

if you want a "quick peek" before the next run completes and is published in the familiar summary form.
Looks good on latest CBI build. Closing for now. If it happens again, we can find a more durable solution.
For the record, about the problem I was seeing on my local machine (running its own version of Hudson): the default install of Hudson (all I ever use :) "automatically" finds my ordinary VNC, which is configured to be accessible only from localhost (no password) but is also configured to start my normal desktop. That is why the "keyring password" was being required. Noting this just so I better know what to look for in the future.
FYI, for the 0309-1500 build I added a "bit bucket pipe" for metacity, in hopes it will prevent the many (thousands of) warnings written to the Hudson logs.
http://git.eclipse.org/c/platform/eclipse.platform.releng.aggregator.git/commit/?id=bdd50da3b1e3cb00b9c7ab65bf88ade6f5d59c0a
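For readers unfamiliar with the term, a "bit bucket" redirect usually means sending output to /dev/null. The following is only an illustration of that idea, not the content of the commit above; the function name start_wm_quietly and the WM_CMD variable are hypothetical.

```shell
# Illustrative sketch; start_wm_quietly and WM_CMD are hypothetical names.
start_wm_quietly() {
    # Send both stdout and stderr to /dev/null so window manager
    # warnings never reach the build log, then run in the background.
    "${WM_CMD:-metacity}" --display="$DISPLAY" --sm-disable >/dev/null 2>&1 &
}
```

The trade-off is that genuine startup errors from the window manager are discarded as well, so a separate check that the process is still alive may be worth keeping.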