| Summary: | NPE starting nested Eclipse | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [Eclipse Project] Platform | Reporter: | Ed Willink <ed> | ||||
| Component: | UI | Assignee: | Platform-UI-Inbox <Platform-UI-Inbox> | ||||
| Status: | RESOLVED FIXED | QA Contact: | |||||
| Severity: | major | ||||||
| Priority: | P3 | CC: | apupier, borlander, emoffatt, ob1.eclipse, pascal, pwebster, steffen.pingel | ||||
| Version: | 4.2 | ||||||
| Target Milestone: | --- | ||||||
| Hardware: | PC | ||||||
| OS: | Windows Vista | ||||||
| Whiteboard: | stalebug | ||||||
| Attachments: |
|
||||||
|
Description
Ed Willink
After upgrading an installation from RC1 to RC2 I'm now getting this just starting Eclipse. I cannot start. And after reverting back to RC1 that won't open either. Thank you very much Mylyn; I don't use you but you break me again when I can least afford it.
Correction: my RC1 startup problems are due to this problem. My RC2 startup problems don't give a log message so they're just magic. I don't have any mylyn plugin installed on RC2 so they're definitely not mylyn's fault.
Deleting .metadata/.plugins/org.eclipse.e4.workbench/workbench.xmi and/or deltas_42M7migration.xml seems to be the solution to a locked out workspace.
The Mylyn exception is happening on shutdown and is a result of the incomplete startup. As far as I can tell the relevant problem is in GMF:
Caused by: java.lang.NullPointerException
at org.eclipse.ui.internal.Workbench.createWorkbenchWindow(Workbench.java:1228)
at org.eclipse.ui.internal.Workbench.getActiveWorkbenchWindow(Workbench.java:1221)
at org.eclipse.gmf.runtime.common.ui.util.UIModificationValidator$1.run(UIModificationValidator.java:119)
at org.eclipse.swt.widgets.RunnableLock.run(RunnableLock.java:35)
Moving to GMF runtime, I don't see any relationship to GMF-T here.
Effectively, it seems more related to GMF-Runtime than GMF-Tooling. Could it be that the GMF UIModificationValidator is called before the Application Workbench Window has started?
Just looking at the stack makes me think of this issue. Any special interactions with GMF code? Is there a GMF editor open before the Eclipse application exits untidily? Based on:
at org.eclipse.swt.graphics.Device.dispose(Device.java:295)
at org.eclipse.ui.internal.ide.application.IDEApplication.start(IDEApplication.java:140)
Even this is happening during a shutdown event. Are there more logs to indicate what triggered the display.dispose() call in IDEApplication? That dispose is spinning the event loop and calling UIModificationValidator$1.run(*)
PW
I think there may be three problems here.
1) org.eclipse.gmf.runtime.common.ui.util.UIModificationValidator does not comply with e4 startup timing.
2) e4 is intolerant of non-compliant applications, leading to a corrupt workspace.xmi
3) e4 cannot recover from workspace.xmi corruption
I saw this on a variety of workspaces after the RC1 to RC2 upgrade, some probably empty. I think mylyn has a start up timing problem too.
I'm probably being 'nasty' to e4 because in the +1 to EPP limbo, I create new installations with the latest ZIPs and fill in some gaps from juno/releases. So I have a mix of current/previous versions and also some missing products, because my new installation may often omit one or two products that crept into the previous installation. Each installation also evolves with a few uninstalls as some product is too horrible, or because inappropriate P2 dependencies force an uninstall. For each new installation, I migrate my workspace by cloning the Windows shortcut to point at the new installation and the old workspace.
Can you elaborate on what you mean by "start up timing" problems?
It was a traditional 3.x problem to get SWT complaints about trying to use widgets after disposal or get a workspace after closure. e4 seems to introduce a new ability to do things before the workspace has been created, which seems to be able to NPE in a way that stops the start ever happening.
(In reply to comment #10)
> It was a traditional 3.x problem to get SWT complaints about trying to use widgets after disposal or get a workspace after closure.
All of these logs are on shutdown (well, an exception in the app is calling display.dispose()). We need some logs of the problem that's happening while within createAndRunWorkbench(*).
PW
Created attachment 216764 [details]
ZIP of startup failure
The attachment contains a .log demonstrating a startup refusal for a nested Eclipse,
and the workbench.xmi and deltas_42M7migration.xml which, if removed, allow a startup after all.
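The workaround reported in this bug (removing workbench.xmi and deltas_42M7migration.xml from .metadata/.plugins/org.eclipse.e4.workbench) could be scripted roughly as follows. This is a minimal sketch, not an Eclipse tool: the class name WorkbenchStateReset and its reset method are hypothetical, it assumes both files live in that one directory, and it renames them to .bak instead of deleting them so the old state can be inspected or restored. It should only be run while Eclipse is closed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of Eclipse. */
final class WorkbenchStateReset {

    // Directory and file names as reported in this bug, relative to the workspace root.
    // Assumption: both files sit in the same directory.
    static final String STATE_DIR = ".metadata/.plugins/org.eclipse.e4.workbench";
    static final String[] STATE_FILES = { "workbench.xmi", "deltas_42M7migration.xml" };

    /**
     * Renames each state file that exists to "<name>.bak" so that Eclipse
     * no longer finds it at the next start. Returns the backups created.
     */
    static List<Path> reset(Path workspace) throws IOException {
        Path stateDir = workspace.resolve(STATE_DIR);
        List<Path> backups = new ArrayList<>();
        for (String name : STATE_FILES) {
            Path file = stateDir.resolve(name);
            if (Files.isRegularFile(file)) {
                Path backup = stateDir.resolve(name + ".bak");
                Files.move(file, backup, StandardCopyOption.REPLACE_EXISTING);
                backups.add(backup);
            }
        }
        return backups;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: WorkbenchStateReset <workspace-dir>");
            return;
        }
        for (Path backup : reset(Paths.get(args[0]))) {
            System.out.println("moved aside: " + backup);
        }
    }
}
```

Files that are not present are skipped, so the helper is harmless on a workspace that only has one of the two files.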
The workbench.xmi Application doesn't have any windows, so the createAndRunWorkbench(*) loop runs once and exits with no errors. That makes the problem: Why did we write out a workbench.xmi with no windows?
PW
(In reply to comment #13)
> The workbench.xmi Application doesn't have any windows, so the createAndRunWorkbench(*) loop runs once and exits with no errors.
We just exit, but because we went through our loop once we haven't yet spun the event loop. So the display.dispose() clears events and generates the other errors. We should still have an NPE guard in getActiveWorkbenchWindow() ... the MWindow is null, we shouldn't pass it to the create call.
PW
How are we ending up with a workbench.xmi file with no MTrimmedWindows in it? This seems to be the crux of this matter. We fully expect that there be at least one WBW available on startup and its lack could potentially lead to all sorts of unexpected failures. Is there any commonality between this and bug 381555 as far as loaded bundles...? Can anybody give me repeatable steps to get into this state? Ed, you seem to be able to get there, is there a pattern you can identify?
I'm afraid I can't tell how I got there; I have lots of problems. For instance I haven't seen a JDT completion in any of my workspaces since perhaps M2; perhaps my Workspace is 'corrupt'. More probably a combination of modeling packages improves JDT; Xtext is perhaps likely. A variety of modeling applications mis-behave: GMF and Papyrus seem troublesome; many long standing tool bar NPEs. I'm always suspicious of mylyn, even though I only use Wikitext. Too often mylyn is on my crash stack traces. Generally with this problem it's difficult to distinguish cause and effect.
(In reply to comment #15)
> How are we ending up with a workbench.xmi file with no MTrimmedWindows in it?
In the absence of any repro, it would seem sensible to instrument the Save to
a) detect the absence of any MTrimmedWindows and maximize the Console Log content
b) create an MTrimmedWindow so that the user is not screwed
Ed, I'm far more interested in figuring out how we can possibly get *into* this state rather than putting a bandage over the symptoms. Does this happen to you with only particular workspaces?
My workspaces are fairly similar. I have a faint suspicion that the problem arises when problems with over-enthusiastic GIT/modeling builders mean that I cannot exit and go to bed. The normal close down procedures tickle GIT and so shut down becomes very very busy. Variously I or Windows takes brutal action and the following morning I'm in trouble. I'm pretty sure I have related bugs open against GIT, Papyrus, Acceleo and QVTo. I'm also suspicious that Xtext/MWE indexing can be very unfriendly. All in all Eclipse has got really bad in the last couple of years.
I'm just trying to get a handle on why more folks don't run across this; you do seem to have quite a few features installed, can you give me a list? As far as the general comment goes, we're doing everything we can to make it better. Note that the initial reasons for *needing* a 4.0 release are still true, you would have seen the bug fix rate drop to near 0 in the 3.x codebase anyways, at least we now have a code base that we have a chance to improve.
(In reply to comment #20)
> I'm just trying to get a handle on why more folks don't run across this; you do seem to have quite a few features installed, can you give me a list?
Acceleo, EMF, GIT, MWE, OCL, Papyrus, QVTo, Subversion, UML2, Wikitext, Xtext
> As far as the general comment goes, we're doing everything we can to make it better. Note that the initial reasons for *needing* a 4.0 release are still true, you would have seen the bug fix rate drop to near 0 in the 3.x codebase anyways, at least we now have a code base that we have a chance to improve.
Note that I excluded e4 from my earlier list. For a couple of milestones I switched back to 3.x and found little difference; still very poor. From my observations the major e4 issue for me is inconsistency between the combined editor/view activation and the original behaviour. My major performance issue is a build and build and build almost-livelock, for which GIT seems to be the primary trigger, with poor modeling builders and over-enthusiastic indexers a major multiplier. The platform seems to assume that builders can be cancelled at will; far too many builders run to a prolonged completion again and again and again.
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. -- The automated Eclipse Genie.
e4 has come a long way since 4.2. The problem has gone away; probably fixed. |
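For reference, the two defensive measures proposed in the comments (a null guard before the create call in getActiveWorkbenchWindow(), and a check at save time for a model with no windows) might look roughly like this. The types below are plain-Java stand-ins invented for illustration; they are not the Eclipse MWindow/MApplication API, and the real Workbench code is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a model window; hypothetical, NOT the Eclipse MWindow type.
final class ModelWindow {
    final String label;

    ModelWindow(String label) {
        this.label = label;
    }
}

// Stand-in for the application model; hypothetical, NOT the Eclipse MApplication type.
final class ModelApplication {
    final List<ModelWindow> windows = new ArrayList<>();
    ModelWindow selectedWindow; // null when the model holds no windows
}

final class ActiveWindowLookup {

    /**
     * Returns a description of the active window, or null when the model has
     * no selected window. The null check is the guard discussed in this bug:
     * a null window is never passed on to the create call.
     */
    static String activeWindowLabel(ModelApplication app) {
        ModelWindow window = app.selectedWindow;
        if (window == null) {
            return null;
        }
        return createWorkbenchWindow(window);
    }

    // Dereferences its argument, so it would throw a NullPointerException on null.
    static String createWorkbenchWindow(ModelWindow window) {
        return "window:" + window.label;
    }

    /**
     * Save-time check: a model without any window is the state that locked
     * the workspace, so a caller could log loudly or refuse to persist it.
     */
    static boolean hasWindowToPersist(ModelApplication app) {
        return !app.windows.isEmpty();
    }
}
```

Neither check explains how a model with no windows came to be written out in the first place, which the thread identifies as the real open question; they only keep the failure from locking the user out.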