Bug 329871

Summary: [Commands] Stackoverflow in BindingSystem
Product: [Eclipse Project] Platform
Reporter: Thomas Schindl <tom.schindl>
Component: UI
Assignee: Paul Webster <pwebster>
Status: VERIFIED FIXED
QA Contact: Paul Webster <pwebster>
Severity: major
Priority: P3
CC: bokowski, pwebster
Version: 3.6.1
Target Milestone: 3.7 M4
Hardware: PC
OS: Windows XP
Whiteboard:
Bug Depends on:
Bug Blocks: 330285
Attachments:
  The stack (flags: none)
  action set source provider NPE guard v01 (flags: none)

Description Thomas Schindl CLA 2010-11-10 03:40:17 EST
Created attachment 182788 [details]
The stack

We've just upgraded our RCP application from 3.5.1, and on 3.6.1 it fails about 50% of the time with a stack overflow that breaks the command/handler system, leaving our RCP application in a nearly unusable state.

Our observation is that it has something to do with dialogs (Progress Dialog, MessageDialog, Preference Dialog, ...). The moment one is opened, which means the workbench window is losing its focus, the stack overflow is observed in the background.

This is a serious problem because there are upstream dependencies that more or less force us to go with 3.6.x, so we can't easily stay on 3.5.x.

I've currently no clue where I could start debugging this very problem to prevent it from happening - naturally there's the possibility that we are doing something illegal which only worked by chance. 

Nevertheless I've marked the bug as major because I think it marks a regression from 3.5.x.
Comment 1 Thomas Schindl CLA 2010-11-11 05:39:46 EST
We tracked down the root cause that leads to the stack overflow. There are situations where our workbench comes up and hits an NPE:

---------8<---------
!ENTRY org.eclipse.ui.workbench 4 2 2010-11-11 11:33:58.251
!MESSAGE Problems occurred when invoking code from plug-in: "org.eclipse.ui.workbench".
!STACK 0
java.lang.NullPointerException
      at org.eclipse.ui.internal.Workbench.updateActiveWorkbenchWindowMenuManager(Workbench.java:3286)
      at org.eclipse.ui.internal.Workbench.access$0(Workbench.java:3238)
      at org.eclipse.ui.internal.Workbench$2.bindingManagerChanged(Workbench.java:3224)
      at org.eclipse.jface.bindings.BindingManager.fireBindingManagerChanged(BindingManager.java:900)
      at org.eclipse.jface.bindings.BindingManager.setActiveBindings(BindingManager.java:2176)
      at org.eclipse.jface.bindings.BindingManager.recomputeBindings(BindingManager.java:1762)
      at org.eclipse.jface.bindings.BindingManager.contextManagerChanged(BindingManager.java:689)
      at org.eclipse.core.commands.contexts.ContextManager.fireContextManagerChanged(ContextManager.java:165)
      at org.eclipse.core.commands.contexts.ContextManager.addActiveContext(ContextManager.java:109)
      at org.eclipse.ui.internal.contexts.ContextAuthority.updateContext(ContextAuthority.java:756)
      at org.eclipse.ui.internal.contexts.ContextAuthority.activateContext(ContextAuthority.java:173)
      at org.eclipse.ui.internal.contexts.ContextAuthority.checkWindowType(ContextAuthority.java:245)
      at org.eclipse.ui.internal.contexts.ContextAuthority.updateEvaluationContext(ContextAuthority.java:791)
      at org.eclipse.ui.internal.services.ExpressionAuthority.sourceChanged(ExpressionAuthority.java:288)
      at org.eclipse.ui.internal.services.ExpressionAuthority.addSourceProvider(ExpressionAuthority.java:111)
      at org.eclipse.ui.internal.contexts.ContextService.addSourceProvider(ContextService.java:138)
      at org.eclipse.ui.internal.Workbench$57.run(Workbench.java:2344)
      at org.eclipse.core.runtime.SafeRunner.run(SafeRunner.java:42)
      at org.eclipse.ui.internal.Workbench.startSourceProviders(Workbench.java:2336)
      at org.eclipse.ui.internal.Workbench.access$18(Workbench.java:2321)
      at org.eclipse.ui.internal.Workbench$30.runWithException(Workbench.java:1553)
      at org.eclipse.ui.internal.StartupThreading$StartupRunnable.run(StartupThreading.java:31)
      at org.eclipse.swt.widgets.RunnableLock.run(RunnableLock.java:35)
      at org.eclipse.swt.widgets.Synchronizer.runAsyncMessages(Synchronizer.java:134)
      at org.eclipse.swt.widgets.Display.runAsyncMessages(Display.java:4041)
      at org.eclipse.swt.widgets.Display.readAndDispatch(Display.java:3660)
      at org.eclipse.ui.internal.Workbench.runUI(Workbench.java:2548)
      at org.eclipse.ui.internal.Workbench.access$4(Workbench.java:2438)
      at org.eclipse.ui.internal.Workbench$7.run(Workbench.java:671)
      at org.eclipse.core.databinding.observable.Realm.runWithDefault(Realm.java:332)
      at org.eclipse.ui.internal.Workbench.createAndRunWorkbench(Workbench.java:664)
      at org.eclipse.ui.PlatformUI.createAndRunWorkbench(PlatformUI.java:149)
      at com.bizerba.basic.rcpapp.internal.Application.start(Application.java:81)
      at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:196)
      at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:110)
      at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:79)
      at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:369)
      at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:179)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:619)
      at org.eclipse.equinox.launcher.Main.basicRun(Main.java:574)
      at org.eclipse.equinox.launcher.Main.run(Main.java:1407)
      at org.eclipse.equinox.launcher.Main.main(Main.java:1383)
---------8<---------

The stack overflow is a follow-on problem that happens afterwards whenever a dialog is opened, ...
Comment 2 Paul Webster CLA 2010-11-11 08:17:06 EST
The problem is that the source providers aren't instantiated until the workbench is up, but this code path during startup seems to be activating them.

I can fix this in 3.7 by adding NPE guards or making sure these services are added early.

PW
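
For readers following along, here is a minimal sketch of what an NPE guard of the kind discussed above might look like. It is not the attached patch: `SketchWorkbench`, `SketchSourceProvider`, and all method names except `bindingManagerChanged` (which appears in the stack trace) are hypothetical stand-ins, and the real workbench code is considerably more involved. The sketch only assumes what the comment states, namely that a listener can be called before the source provider exists.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a source provider that is created late in startup.
class SketchSourceProvider {
    final List<String> events = new ArrayList<>();

    void fireChanged(String what) {
        events.add(what);
    }
}

// Hypothetical stand-in for the workbench code that reacts to binding changes.
class SketchWorkbench {
    // Stays null until startup has progressed far enough to create it.
    private SketchSourceProvider provider;

    void startSourceProvider() {
        provider = new SketchSourceProvider();
    }

    SketchSourceProvider getProvider() {
        return provider;
    }

    // Returns true if the change was forwarded, false if it was skipped.
    boolean bindingManagerChanged(String what) {
        if (provider == null) {
            // Guard: nothing to notify yet, so skip instead of throwing an NPE.
            return false;
        }
        provider.fireChanged(what);
        return true;
    }
}
```

With the guard in place, a notification that arrives during startup is dropped quietly, and notifications after `startSourceProvider()` are forwarded as before. The alternative mentioned above, adding the services early, would instead ensure the field is never null when the listener runs.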
Comment 3 Thomas Schindl CLA 2010-11-15 03:22:00 EST
Our first tests show that guarding with NPE checks is fixing the problem and the system behaves like 3.5.2 did.
Comment 4 Thomas Schindl CLA 2010-11-15 07:13:05 EST
We have tested today with the NPE guard in place and can confirm that everything is back in good shape. Paul, can we get the NPE guard fix into 3.6.2?
Comment 5 Paul Webster CLA 2010-11-15 11:12:19 EST
Created attachment 183131 [details]
action set source provider NPE guard v01

Tom, could you please test your use case with this patch? If it's OK, it'll be valid for 3.7 and 3.6.2.

PW
Comment 6 Paul Webster CLA 2010-11-15 13:15:27 EST
(In reply to comment #5)
> Created an attachment (id=183131) [details]
> action set source provider NPE guard v01

I've released it to 3.7, but will wait for feedback before I look at 3.6.2
PW
Comment 7 Thomas Schindl CLA 2010-11-15 13:58:36 EST
Your patch looks identical to what I did locally, but to make sure, I'll have the team test with your patch in place. Thanks.
Comment 8 Boris Bokowski CLA 2010-11-16 14:22:21 EST
(In reply to comment #2)
> The problem is that the source providers aren't instantiated until the
> workbench is up, but this code path during startup seems to be activating them.

With the null checks in place, there is code that might not be executed now. Paul, could this cause other problems later on, or would instantiating the source providers bring them into a good state?
Comment 9 Paul Webster CLA 2010-11-16 14:26:01 EST
(In reply to comment #8)
> With the null checks in place, there is code that might not be executed now.
> Paul, could this cause other problems later on, or would instantiating the
> source providers bring them into a good state?

Yes, the code that won't be run is basically firing events into the missing source provider.  If that code were hit, it would also generate NPEs, and the other code has no side effects.

PW
Comment 10 Thomas Schindl CLA 2010-11-16 14:37:24 EST
We've been running with this fix today in our self-hosting environments and can confirm that it fixes our problem. We have not found any strange things happening (e.g. incorrect editor activations, ...).
Comment 11 Paul Webster CLA 2010-11-16 15:00:06 EST
Released to HEAD
PW
Comment 12 Paul Webster CLA 2010-12-07 14:16:02 EST
In I20101206-1800
PW