Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.
Bug 345298 - [breakpoints] BreakpointManager deadlocks trying to restore breakpoints
Summary: [breakpoints] BreakpointManager deadlocks trying to restore breakpoints
Status: RESOLVED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Debug (show other bugs)
Version: 3.7   Edit
Hardware: PC Windows 7
: P3 major with 3 votes (vote)
Target Milestone: 3.8 M4   Edit
Assignee: Michael Rennie CLA
QA Contact:
URL:
Whiteboard:
Keywords:
: 351832 355182 357152 362399 362613 365256 365392 367150 403897 (view as bug list)
Depends on:
Blocks: 364418 365256
  Show dependency tree
 
Reported: 2011-05-10 12:37 EDT by Jim Garrison CLA
Modified: 2013-03-21 02:46 EDT (History)
18 users (show)

See Also:
pawel.1.piech: review+


Attachments
Screenshot (47.89 KB, image/png)
2011-05-10 12:39 EDT, Jim Garrison CLA
no flags Details
Screenshot (115.12 KB, image/png)
2011-05-10 12:40 EDT, Jim Garrison CLA
no flags Details
Screenshot (36.96 KB, image/png)
2011-05-10 12:41 EDT, Jim Garrison CLA
no flags Details
Thread dump (15.31 KB, text/plain)
2011-05-10 14:51 EDT, Jim Garrison CLA
no flags Details
Thread dump (18.34 KB, text/plain)
2011-05-10 14:52 EDT, Jim Garrison CLA
no flags Details
Patch with suggested fix. (1.89 KB, patch)
2011-10-18 16:27 EDT, Pawel Piech CLA
no flags Details | Diff
Updated patch. (2.20 KB, patch)
2011-10-28 00:16 EDT, Pawel Piech CLA
no flags Details | Diff
patch (3.59 KB, patch)
2011-11-15 14:40 EST, Michael Rennie CLA
no flags Details | Diff
performance tests (13.58 KB, patch)
2011-11-15 15:08 EST, Michael Rennie CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jim Garrison CLA 2011-05-10 12:37:16 EDT
Build Identifier: Indigo 20110505-1223; JSDT 1.3.0.v201103031824

With a .js file in the current editor window, attempting to launch the debugger immediately hangs Eclipse hard.  Eclipse stops responding with the debug menu still displayed.  This happens every time, immediately after either:

1) Click on the drop-down arrow next to the debug icon in the tool bar. As soon as the menu appears Eclipse stops responding (see screenshot)

2) Click on the Run menu, then click "Debug As".  As soon as the submenu appears, Eclipse stops responding (see screenshot).

Clicking on the window "Close" button brings up Win7's "This program has stopped responding" dialog.  Selecting "Close program" pops up a JVM crash window (see screenshot)

Reproducible: Always
Comment 1 Jim Garrison CLA 2011-05-10 12:39:38 EDT
Created attachment 195241 [details]
Screenshot

Eclipse hang state after clicking on toolbar debug drop-down arrow (circled)
Comment 2 Jim Garrison CLA 2011-05-10 12:40:41 EDT
Created attachment 195244 [details]
Screenshot

Eclipse hang state after clicking "Debug As" on Run menu
Comment 3 Jim Garrison CLA 2011-05-10 12:41:35 EDT
Created attachment 195245 [details]
Screenshot

JVM Crash window that appears after forcibly closing Eclipse
Comment 4 Michael Rennie CLA 2011-05-10 12:53:03 EDT
Interesting, can you attach any log entries? and take a stackdump? - http://wiki.eclipse.org/How_to_report_a_deadlock

Do you have any other tooling installed other than Eclipse + JSDT?

Can you provide a sample js file that produces the problem?

Which one of the debug launches were you trying to do (Rhino or Remote JavaScript)?
Comment 5 Jim Garrison CLA 2011-05-10 14:51:53 EDT
Created attachment 195260 [details]
Thread dump

Thread dump before hang
Comment 6 Jim Garrison CLA 2011-05-10 14:52:18 EDT
Created attachment 195261 [details]
Thread dump

Thread dump after hang
Comment 7 Jim Garrison CLA 2011-05-10 15:05:04 EDT
Looks like it's hanging in BreakpointManager. The relevant part of the thread dump is 

"Worker-4" prio=6 tid=0x0000000008b3a000 nid=0x225c waiting for monitor entry [0x000000000f9be000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.eclipse.debug.internal.core.BreakpointManager.getBreakpoints0(BreakpointManager.java:399)
    - waiting to lock <0x0000000111f120c0> (a org.eclipse.debug.internal.core.BreakpointManager)
    at org.eclipse.debug.internal.core.BreakpointManager.getBreakpoints(BreakpointManager.java:385)
    at org.eclipse.debug.core.model.Breakpoint.<clinit>(Breakpoint.java:47)
    at org.eclipse.wst.jsdt.debug.internal.core.JavaScriptPreferencesManager.createSuspendOnException(JavaScriptPreferencesManager.java:168)
    at org.eclipse.wst.jsdt.debug.internal.core.JavaScriptPreferencesManager.access$2(JavaScriptPreferencesManager.java:166)
    at org.eclipse.wst.jsdt.debug.internal.core.JavaScriptPreferencesManager$StartJob.run(JavaScriptPreferencesManager.java:80)
    at org.eclipse.core.internal.jobs.Worker.run(Worker.java:54)

   Locked ownable synchronizers:
    - None


As to whether it's a remote or Rhino config: There is one remote javascript config defined, but I can't seem to touch anything that even loads the launch configuration without hanging.  Even just opening the "Run/Debug Configuration..." dialog hangs Eclipse.
Comment 8 Jim Garrison CLA 2011-05-10 16:49:49 EDT
Additional info.  The problems were being encountered when opening a HELIOS workspace in INDIGO M7.  I created a new workspace in Indigo and re-imported my projects there, and am not seeing the same problem.
Comment 9 Michael Rennie CLA 2011-05-10 17:26:39 EDT
Thanks for reporting back so quickly Jim. I'm glad that the problem went away for you, but we need to look into why the breakpoint manager is dead-locking trying to add a new breakpoint.
Comment 10 Jim Garrison CLA 2011-05-10 17:57:30 EDT
Looks like I spoke too soon.... my Indigo workspace is now exhibiting the same behavior.  Do you have any other requests for debugging information I can provide?
Comment 11 Jim Garrison CLA 2011-05-10 18:09:27 EDT
Here's the wedged worker thread (looks identical to the other ones):

"Worker-10" prio=6 tid=0x000000003abe1800 nid=0x181c waiting for monitor entry [0x0000000040f7e000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.eclipse.debug.internal.core.BreakpointManager.getBreakpoints0(BreakpointManager.java:399)
	- waiting to lock <0x0000000032801358> (a org.eclipse.debug.internal.core.BreakpointManager)
	at org.eclipse.debug.internal.core.BreakpointManager.getBreakpoints(BreakpointManager.java:385)
	at org.eclipse.debug.core.model.Breakpoint.<clinit>(Breakpoint.java:47)
	at org.eclipse.wst.jsdt.debug.internal.core.JavaScriptPreferencesManager.createSuspendOnException(JavaScriptPreferencesManager.java:168)
	at org.eclipse.wst.jsdt.debug.internal.core.JavaScriptPreferencesManager.access$2(JavaScriptPreferencesManager.java:166)
	at org.eclipse.wst.jsdt.debug.internal.core.JavaScriptPreferencesManager$StartJob.run(JavaScriptPreferencesManager.java:80)
	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:54)

   Locked ownable synchronizers:
	- None
Comment 12 Jim Garrison CLA 2011-05-10 18:29:19 EDT
Now it's working again... three hangs in a row, then on the fourth time it again worked.  Tomorrow I will try to duplicate this with some very simple test code (not including all the extra stuff I have loaded in my workspace).
Comment 13 Michael Rennie CLA 2011-09-01 15:11:58 EDT
*** Bug 355182 has been marked as a duplicate of this bug. ***
Comment 14 Michael Rennie CLA 2011-09-12 09:17:51 EDT
*** Bug 351832 has been marked as a duplicate of this bug. ***
Comment 15 Michael Rennie CLA 2011-09-12 09:23:33 EDT
*** Bug 357152 has been marked as a duplicate of this bug. ***
Comment 16 David Woldrich CLA 2011-09-12 13:47:41 EDT
Michael Rennie, thanks for marking my bug as a dupe.

I took a look at the thread dumps here and in the dupe bug reports.  There typically seems to be two threads making concurrent calls to BreakpointManager.getBreakpoints0(), which is a synchronized method.  The first thread, main, always seems to own the lock and is in RUNNABLE state looks like this:

"main" prio=6 tid=0x000000000021b800 nid=0x23e8 in Object.wait() [0x000000000251d000]
   java.lang.Thread.State: RUNNABLE
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
    at java.lang.reflect.Constructor.newInstance(Unknown Source)
    at java.lang.Class.newInstance0(Unknown Source)
    at java.lang.Class.newInstance(Unknown Source)
    at org.eclipse.core.internal.registry.osgi.RegistryStrategyOSGI.createExecutableExtension(RegistryStrategyOSGI.java:184)
    at org.eclipse.core.internal.registry.ExtensionRegistry.createExecutableExtension(ExtensionRegistry.java:905)
    at org.eclipse.core.internal.registry.ConfigurationElement.createExecutableExtension(ConfigurationElement.java:243)
    at org.eclipse.core.internal.registry.ConfigurationElementHandle.createExecutableExtension(ConfigurationElementHandle.java:55)
    at org.eclipse.debug.internal.core.BreakpointManager.createBreakpoint(BreakpointManager.java:527)
    at org.eclipse.debug.internal.core.BreakpointManager.loadBreakpoints(BreakpointManager.java:264)
    at org.eclipse.debug.internal.core.BreakpointManager.initializeBreakpoints(BreakpointManager.java:431)
    at org.eclipse.debug.internal.core.BreakpointManager.getBreakpoints0(BreakpointManager.java:400)
    - locked <0x0000000111f120c0> (a org.eclipse.debug.internal.core.BreakpointManager)

What is happening in that Class.newInstance() in RegistryStrategyOSGI.createExecutableExtension that is stalling, seemingly consumes no cpu, and is still runnable?  Some sort of sleeping spinlock?

I figured I might be able to help unravel this mystery, so I brought my stalled Eclipse into another eclipse debugger and tried to figure out what that main thread was doing.  The FQCN that main thread's newInstance() is trying to instantiate is "org.eclipse.wst.jsdt.debug.internal.core.breakpoints.JavaScriptLineBreakpoint".

I got the sources for that file and its superclasses, and I can't understand what's going on or how it could have even compiled.  So JavaScriptLineBreakpoint extends org.eclipse.wst.jsdt.debug.internal.core.breakpoints.JavaScriptBreakpoint.  And JavaScriptBreakpoint extends org.eclipse.wst.jsdt.debug.internal.rhino.debugger.Breakpoint.  

Thing is, it looks like RegistryStrategyOSGI.createExecutableExtension is calling the no-arg constructor on JavaScriptLineBreakpoint, but there is no no-arg constructor on org.eclipse.wst.jsdt.debug.internal.rhino.debugger.Breakpoint.  Its only constructor looks like:

public Breakpoint(Long breakpointId, ScriptSource script, Integer lineNumber, String functionName, String condition, Long threadId)

No no-arg constructor means that Breakpoint could not be extended and instantiated in this manner?  I almost feel like the main thread is somehow getting zombified inside Java during object construction, still holding the lock, and hard locking the UI?  Something smells wrong with this... 

Sorry I can't provide more insight, I don't know much about OSGI which I suppose is integral to this bug.
Comment 17 Michael Rennie CLA 2011-09-12 14:16:56 EDT
(In reply to comment #16)
> I figured I might be able to help unravel this mystery, so I brought my stalled
> Eclipse into another eclipse debugger and tried to figure out what that main
> thread was doing.  The FQCN that main thread's newInstance() is trying to
> instantiate is
> "org.eclipse.wst.jsdt.debug.internal.core.breakpoints.JavaScriptLineBreakpoint".
> 
> I got the sources for that file and its superclasses, and I can't understand
> what's going on or how it could have even compiled.  So
> JavaScriptLineBreakpoint extends
> org.eclipse.wst.jsdt.debug.internal.core.breakpoints.JavaScriptBreakpoint.  And
> JavaScriptBreakpoint extends
> org.eclipse.wst.jsdt.debug.internal.rhino.debugger.Breakpoint.  

JavaScriptBreakpoint extends org.eclipse.debug.core.model.Breakpoint, not the Rhino breakpoint.

> Thing is, it looks like RegistryStrategyOSGI.createExecutableExtension is
> calling the no-arg constructor on JavaScriptLineBreakpoint, but there is no
> no-arg constructor on

Each of the classes that extend JavaScriptBreakpoint have a no-arg constructor exactly for this purpose (persist / restore), see JavaScriptLineBreakpoint, JavaScriptLoadBreakpoint, JavaScriptExceptionBreakpoint and JavaScriptFunctionBreakpoint for examples.
 
> Sorry I can't provide more insight, I don't know much about OSGI which I
> suppose is integral to this bug.

No problem, thank you for digging into this to try and figure it out!
Comment 18 Michael Rennie CLA 2011-09-12 16:57:42 EDT
It looks like the BreakpointManager is waiting for RegistryStrategyOSGI.createExecutableExtension to return, which never seems to be happening. The JSDT worker thread is simply waiting for the breakpoint manager.

Tom, have you ever seen RegistryStrategyOSGI.createExecutableExtension not return before?
Comment 19 David Woldrich CLA 2011-09-12 18:50:24 EDT
(In reply to comment #17)
 
> JavaScriptBreakpoint extends org.eclipse.debug.core.model.Breakpoint, not the
> Rhino breakpoint.

Could you double check that on your end in the actual debugger rather than looking at source?  I know this sounds crazy, because the source I have disagrees with what the Eclipse Java debugger thinks.  

To traverse the hierarchy, I had eclipse open the .class files of the main thread, and then I used F3 on the class name in the "extends" clause to traverse to the super class.  It would open the corresponding .class file and I would associate a source jar with that .class so I could look at source.

In my eclipse debugger, it apparently thought that the super class for JavaScriptBreakpoint is org.eclipse.wst.jsdt.debug.internal.rhino.debugger.Breakpoint.  Don't ask me how the debugger came to that conclusion, because I agree with you that the sources I downloaded do not support that.  Perhaps some magic class synthesis code that is too smart by half and has a bug somehow?  Or maybe this is an emergent sentient AI is being borne out of Eclipse???  I, for one, welcome my new IDE overlord.

My build ID of eclipse is:  20110615-0604
Comment 20 Terry Parker CLA 2011-10-17 19:58:06 EDT
We are seeing the same deadlock when creating a Java exception breakpoint:

"Worker-40" prio=10 tid=0x09110400 nid=0xf93 waiting for monitor entry [0x77b62000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.eclipse.debug.internal.core.BreakpointManager.getBreakpoints0(BreakpointManager.java:402)
	- waiting to lock <0x82310f08> (a org.eclipse.debug.internal.core.BreakpointManager)
	at org.eclipse.debug.internal.core.BreakpointManager.getBreakpoints(BreakpointManager.java:388)
	at org.eclipse.debug.core.model.Breakpoint.<clinit>(Breakpoint.java:47)
	at (C/C++) [Unknown frame (generated stub/JIT)]([Unknown source])
	at org.eclipse.jdt.debug.core.JDIDebugModel.createExceptionBreakpoint(JDIDebugModel.java:431)
	at org.eclipse.jdt.internal.debug.ui.JavaDebugOptionsManager$InitJob.run(JavaDebugOptionsManager.java:152)
	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:55)


"main" prio=10 tid=0x080b9000 nid=0x7921 in Object.wait() [0xff7fb000]
   java.lang.Thread.State: RUNNABLE
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:532)
	at java.lang.Class.newInstance0(Class.java:372)
	at java.lang.Class.newInstance(Class.java:325)
	at org.eclipse.core.internal.registry.osgi.RegistryStrategyOSGI.createExecutableExtension(RegistryStrategyOSGI.java:170)
	at org.eclipse.core.internal.registry.ExtensionRegistry.createExecutableExtension(ExtensionRegistry.java:874)
	at org.eclipse.core.internal.registry.ConfigurationElement.createExecutableExtension(ConfigurationElement.java:243)
	at org.eclipse.core.internal.registry.ConfigurationElementHandle.createExecutableExtension(ConfigurationElementHandle.java:51)
	at org.eclipse.debug.internal.core.BreakpointManager.createBreakpoint(BreakpointManager.java:530)
	at org.eclipse.debug.internal.core.BreakpointManager.loadBreakpoints(BreakpointManager.java:267)
	at org.eclipse.debug.internal.core.BreakpointManager.initializeBreakpoints(BreakpointManager.java:434)
	at org.eclipse.debug.internal.core.BreakpointManager.getBreakpoints0(BreakpointManager.java:403)
	- locked <0x82310f08> (a org.eclipse.debug.internal.core.BreakpointManager)
	at org.eclipse.debug.internal.core.BreakpointManager.hasBreakpoints(BreakpointManager.java:879)
	at org.eclipse.debug.internal.ui.actions.breakpoints.RemoveAllBreakpointsAction.isEnabled(RemoveAllBreakpointsAction.java:44)


Here is the sequence that goes bad:
1) User creation of a breakpoint starts a background thread.  That background thread starts running a static initialization block in org.eclipse.debug.core.model.Breakpoint.  That block has a single line, invoking DebugPlugin.getDefault().getBreakpointManager().getBreakpoints(), which in turn invokes the synchronized getBreakpoints0() function.
2) Before getBreakpoints0() is reached in the background thread, the UI thread starts running and makes a call to org.eclipse.debug.internal.core.BreakpointManager.hasBreakpoints(), which in turn calls the synchronized getBreakpoints0() function.  This causes existing breakpoints to get loaded and invokes the default JavaExceptionBreakpoint ctor.  Since JavaExceptionBreakpoint derives from Breakpoint, the UI thread hangs waiting for Breakpoint's static initialization block in the background thread to complete.
3) The background thread gets control again and tries to complete the static initialization block by making a call to org.eclipse.debug.internal.core.getBreakpoints0(), but the UI thread is blocked inside that function, so Eclipse is hung.

I'm not familiar enough with the code to know what the best solution is here.  Breakpoint's static initialization block was added to fix https://bugs.eclipse.org/bugs/show_bug.cgi?id=54993.  It is quite a bit of work to do in a static initialization block, but I'm not sure that it is avoidable.

Does checking if markers exist really require loading all of the breakpoint markers, or can it be derived from some other state?
Comment 21 Terry Parker CLA 2011-10-17 20:33:19 EDT
There is nothing JavaScript specific about this bug, it should get moved to platform debug, but I don't see that I can do that through Bugzilla.

Pawel, should I open a new bug there?
Comment 22 Pawel Piech CLA 2011-10-18 00:52:37 EDT
(In reply to comment #21)
> Pawel, should I open a new bug there?

I agree that this is a BreakpointManager bug, though I also don't have a suggestion on how to fix it at the moment.
Comment 23 Terry Parker CLA 2011-10-18 13:13:32 EDT
The fix for https://bugs.eclipse.org/112553 introduced synchronization of the getBreakpoints0() function in 1.85 of BreakpointManager.  The rationale was to guarantee that the breakpoint vector, which is created lazily, should be synchronized to prevent use before initialization.  Using double-checked locking via an internal holder class seems like a more lightweight solution that making getBreakpoints0() synchronized:

private static class BreakpointsHolder {
  // Requires changing initializeBreakpoints to return the vector
  public static Vector fBreakpoints = initializeBreakpoints();
}
 
private Vector getBreakpoints0() {
  return BreakpointsHolder.fBreakpoints;
}	

That reduces the scope of the lock but doesn't solve the deadlock.

The static initialization block in Breakpoint is problematic because loading breakpoints requires creating new breakpoint objects, so two threads in contention to load breakpoints are going to hit the static initialization block's lock.  That lock needs not to happen for breakpoint ctors.  Breakpoint.fMarker is only ever directly accessed through getMarker() and setMarker(), so maybe removing the static initialization block and adding the call to getBreakpointManager().getBreakpoints() in getMarker() would get the initialization to happen at the appropriate time?
Comment 24 Pawel Piech CLA 2011-10-18 14:56:47 EDT
(In reply to comment #23)
> The static initialization block in Breakpoint is problematic because loading
> breakpoints requires creating new breakpoint objects, so two threads in
> contention to load breakpoints are going to hit the static initialization
> block's lock.  That lock needs not to happen for breakpoint ctors. 
> Breakpoint.fMarker is only ever directly accessed through getMarker() and
> setMarker(), so maybe removing the static initialization block and adding the
> call to getBreakpointManager().getBreakpoints() in getMarker() would get the
> initialization to happen at the appropriate time?

I think this is a very reasonable suggestion.  Most breakpoint implementations do create their marker in their constructor.  However, they're also required to have a no-arg constructor that is meant to be called by BreakpointManager's getBreakpoints().  Instead of putting the call to initialize into getMarker(), I'd put it into the default constructor for Breakpoint class.  (I'll make a patch once I conquer my git problem of the day.)
Comment 25 Terry Parker CLA 2011-10-18 16:23:18 EDT
(In reply to comment #24)
> (In reply to comment #23)
> > The static initialization block in Breakpoint is problematic because loading
> > breakpoints requires creating new breakpoint objects, so two threads in
> > contention to load breakpoints are going to hit the static initialization
> > block's lock.  That lock needs not to happen for breakpoint ctors. 
> > Breakpoint.fMarker is only ever directly accessed through getMarker() and
> > setMarker(), so maybe removing the static initialization block and adding the
> > call to getBreakpointManager().getBreakpoints() in getMarker() would get the
> > initialization to happen at the appropriate time?
> 
> I think this is a very reasonable suggestion.  Most breakpoint implementations
> do create their marker in their constructor.  However, they're also required to
> have a no-arg constructor that is meant to be called by BreakpointManager's
> getBreakpoints().  Instead of putting the call to initialize into getMarker(),
> I'd put it into the default constructor for Breakpoint class.  (I'll make a
> patch once I conquer my git problem of the day.)

It isn't clear to me how that would work.  If the no argument ctor is only invoked when loading breakpoints, moving the call inside that ctor doesn't cause the breakpoints to be loaded (and any other external call to getBreakpoints()  would result in infinite recursion.)

Is it possible that Breakpoint's static initialization block is no longer needed?  Bug 54993 is quite old, JavaDebugOptionsManager isn't an IResourceChangeListener anymore.
Comment 26 Pawel Piech CLA 2011-10-18 16:27:29 EDT
Created attachment 205458 [details]
Patch with suggested fix.
Comment 27 Pawel Piech CLA 2011-10-18 16:28:43 EDT
Hi Mike, Do you see any downside to this potentially risky change?
Comment 28 Pawel Piech CLA 2011-10-18 16:57:56 EDT
(In reply to comment #25)
> It isn't clear to me how that would work.  If the no argument ctor is only
> invoked when loading breakpoints, moving the call inside that ctor doesn't
> cause the breakpoints to be loaded (and any other external call to
> getBreakpoints()  would result in infinite recursion.)
Since it's the base class for all breakpoints and currently has no other constructor, it should be invoked by any breakpoint constructor.  Fortunately BreakpointManager.initializeBreakpoints() calls setBreakpoints() on its first statement, so recursion should be avoided.  Although a comment or to this point, or a more explicit locking would be helpful.

> Is it possible that Breakpoint's static initialization block is no longer
> needed?  Bug 54993 is quite old, JavaDebugOptionsManager isn't an
> IResourceChangeListener anymore.

I don't think I'd want to take that chance.  Another fix option may be to force BreakpointManager to be loaded when it's created.  However, if a breakpoint was the first thing to force the plugin to activate, this deadlock would still be a problem.  Plus there's the performance impact.
Comment 29 Pawel Piech CLA 2011-10-18 17:14:56 EDT
(In reply to comment #25)
> It isn't clear to me how that would work. 
I guess the main point is that removing the static section also removes one of the locks involved in the deadlock: the lock to load the class.
Comment 30 Terry Parker CLA 2011-10-18 20:42:00 EDT
(In reply to comment #29)
> (In reply to comment #25)
> > It isn't clear to me how that would work. 
> I guess the main point is that removing the static section also removes one of
> the locks involved in the deadlock: the lock to load the class.

Yes, removing the static section will clear out the deadlock.  I think I've got my head wrapped around the initialization sequence now, thanks.

FYI in the patch you create BreakpointManager.ensureBreakpointsInitialized() but don't use it in the new Breakpoint ctor.
Comment 31 Pawel Piech CLA 2011-10-28 00:16:31 EDT
Created attachment 206104 [details]
Updated patch.
Comment 32 Michael Rennie CLA 2011-11-04 09:28:20 EDT
*** Bug 362613 has been marked as a duplicate of this bug. ***
Comment 33 Pawel Piech CLA 2011-11-04 17:48:01 EDT
I committed the fix.  Mike please review (the change is simple, but the logic is not).  Also, I'm not sure whether you'd like to target this for 3.7.2.
Comment 35 Michael Rennie CLA 2011-11-07 11:01:12 EST
(In reply to comment #34)
> http://git.eclipse.org/c/platform/eclipse.platform.debug.git/commit/?id=366ebde49ed41a56e1684ea7be65cf58cafd72ed

I have no problem with how the fix looks, it makes sense, I am most worried that there was some edge case out there that relied on the static initializer code.
Comment 36 Michael Rennie CLA 2011-11-07 12:45:22 EST
(In reply to comment #35)
> I have no problem with how the fix looks, it makes sense, I am most worried
> that there was some edge case out there that relied on the static initializer
> code.

What if instead of moving the init code into the no-arg constructor, we got rid of the static initializer code and initialized the breakpoint manager when the debug plugins loads - i.e. add startup() / shutdown() to the manager and call them in start() / stop() from DebugPlugin.

We would take a very small performance hit when the plug-in loads, but at least then the manager would be initialized prior to anyone wanting to access it.

Thoughts?
Comment 37 Pawel Piech CLA 2011-11-07 13:18:39 EST
(In reply to comment #36)
I refrained from going this far because I thought moving the initialization to the breakpoint constructor was a smaller and thus less risky change.  But other than that I don't see anything wrong with initializing breakpoint manager upfront (except for the possible performance hit as mentioned).
Comment 38 Michael Rennie CLA 2011-11-07 14:44:30 EST
(In reply to comment #37)
> But other
> than that I don't see anything wrong with initializing breakpoint manager
> upfront (except for the possible performance hit as mentioned).

The first step to test if startup / shutdown is worth it would be to create a performance test to see how long it takes to init the manager.

Reopening.
Comment 39 Dani Megert CLA 2011-11-10 04:07:33 EST
(In reply to comment #36)
> (In reply to comment #35)
> > I have no problem with how the fix looks, it makes sense, I am most worried
> > that there was some edge case out there that relied on the static initializer
> > code.
> 
> What if instead of moving the init code into the no-arg constructor, we got rid
> of the static initializer code and initialized the breakpoint manager when the
> debug plugins loads - i.e. add startup() / shutdown() to the manager and call
> them in start() / stop() from DebugPlugin.
> 
> We would take a very small performance hit when the plug-in loads, but at least
> then the manager would be initialized prior to anyone wanting to access it.
> 
> Thoughts?

I would also prefer this if the overhead is not too big.

BreakpointManager.ensureBreakpointsInitialized() looks wrong to me. This should not be the task of the caller. I would already ensure this inside the BreakpointManager constructor. Also note that as soon as we DebugUI is started we create the BreakpointOrganizerManager which asks for the breakpoint manager.
Comment 40 Michael Rennie CLA 2011-11-15 14:40:28 EST
Created attachment 207048 [details]
patch

This patch changes the solution to have the BreakpointManager initialize when the Debugplugin loads.
Comment 41 Michael Rennie CLA 2011-11-15 15:08:19 EST
Created attachment 207052 [details]
performance tests

This patch provides three new performance tests for the breakpoint manager. The tests assume that markers have been restored in the workspace (the bp manager relies on the platform to persist / restore IMarkers) and run for 6500 iterations. One test is for 50 bps, one for 100 bps and one for 200 bps.
Comment 43 Pawel Piech CLA 2011-11-16 00:14:45 EST
(In reply to comment #42)
The changes look fine, and the tests certainly show that a performance hit is not significant.  I only have one doubt left: is it possible that performing the breakpoint manager during the plugin loading process would reintroduce the very bug that we're trying to fix.  

Suppose that the Breakpoint class is the first class to be loaded out of the debug bundle (as in the stack trace in comment #7).  Then creating the breakpoint will cause the class loader to wait for the debug.core bundle to start up.  This thread may be in a race condition with another thread to load the debug plugin and initialize the breakpoint manager.  At this point, the breakpoint manager will lock waiting for the Breakpoint class to be loaded (as in the analysis in comment #16), but the Breakpoint class will never finish loading as its waiting for the DebugPlugin.start() to complete.

This is at the limit of my knowledge of osgi and java, so I'm not sure if the above scenario is possible, maybe Dani could refute my threory?  

Dani do you know if when a class triggers loading of a plugin, would another thread block before loading the same class?
Comment 44 Michael Rennie CLA 2011-11-16 16:20:01 EST
(In reply to comment #43)
 
> This is at the limit of my knowledge of osgi and java, so I'm not sure if the
> above scenario is possible, maybe Dani could refute my threory?  

I think we should be fine because we have removed the static initializer section, so when the Breakpoint class is loaded for the first time (as per comment#16) we do not try to call back in to getBreakpoints0.
Comment 45 Pawel Piech CLA 2011-11-16 16:28:08 EST
(In reply to comment #44)
After a brief IM clarification from Mike, I'm good :-)
Comment 46 Dani Megert CLA 2011-11-17 09:14:07 EST
> Dani do you know if when a class triggers loading of a plugin, would another
> thread block before loading the same class?
Yes, it would but I think we wouldn't deadlock since we no longer lock the class.
Comment 47 Michael Rennie CLA 2011-11-21 16:46:54 EST
Marking fixed. Opened bug 364418 for back-porting the fix / performance tests to 3.7.x
Comment 48 Michael Rennie CLA 2011-12-01 12:52:57 EST
*** Bug 365256 has been marked as a duplicate of this bug. ***
Comment 49 Michael Rennie CLA 2011-12-05 13:19:26 EST
*** Bug 365392 has been marked as a duplicate of this bug. ***
Comment 50 Michael Rennie CLA 2011-12-14 13:20:35 EST
*** Bug 362399 has been marked as a duplicate of this bug. ***
Comment 51 pjv pjv CLA 2011-12-26 05:06:12 EST
Would it be possible to back-port this to the distro-stable 3.6 series wrt. bug 365256?
Comment 52 Dani Megert CLA 2011-12-27 02:09:38 EST
(In reply to comment #51)
> Would it be possible to back-port this to the distro-stable 3.6 series wrt. bug
> 365256?

There are no further 3.6 maintenance builds.
Comment 53 pjv pjv CLA 2011-12-29 10:21:21 EST
Thanks for the pretty awesome support, people! Keep up the good work. I'll be updating in a few weeks when 3.7.2 appears on http://download.eclipse.org/eclipse/downloads/eclipse3x.php. The reason I have to beg for backports is that - I double-checked - Gentoo Linux only has Eclipse 3.5.1 as latest stable version: https://bugs.gentoo.org/325271?id=325271 . Distro problems with p2 as well as a lot of dependencies (tomcat, jetty) seem to hold off newer versions. Just FYI.
Comment 54 Michael Rennie CLA 2012-01-10 14:38:03 EST
*** Bug 367150 has been marked as a duplicate of this bug. ***
Comment 55 Terry Parker CLA 2012-01-18 13:45:33 EST
I'm backporting this patch to 3.7.1 (using the patch in 364418) and have found a problem.  Currently my team implements an org.eclipse.core.filesystem.filesystems extension point, which gets invoked very early in the startup process, before the workspace is fully loaded.  The plug-in providing the EFS implementation transitively depends on the debug.core plug-in.  When run with the patch for this bug applied, the call to config.createExecutableExtension() in BreakpointManager.createBreakpoint() generates an NPE when trying to load existing Java breakpoints, due to Workspace.propertyManager not being initialized.  Here is the trimmed stack trace for the main thread:

Thread [main] (Suspended)  
  BreakpointManager.createBreakpoint(IMarker) line: 556  
  BreakpointManager.loadBreakpoints(IResource, boolean) line: 262  
  BreakpointManager.initializeBreakpoints() line: 451  
  BreakpointManager.getBreakpoints0() line: 420  
  BreakpointManager.start() line: 406  
  DebugPlugin.start(BundleContext) line: 682  
        ...
  Main.main(String[]) line: 1386  


Here is the trimmed error log:

!ENTRY org.eclipse.osgi 4 0 2012-01-18 09:09:11.594
!MESSAGE An error occurred while automatically activating bundle org.eclipse.jdt.core (275).
!STACK 0
org.osgi.framework.BundleException: Exception in org.eclipse.jdt.core.JavaCore.start() of bundle org.eclipse.jdt.core.
  at org.eclipse.osgi.framework.internal.core.BundleContextImpl.startActivator(BundleContextImpl.java:734)
        ...
  at org.eclipse.jdt.internal.debug.core.JDIDebugPlugin.start(JDIDebugPlugin.java:266)
        ...
  at org.eclipse.debug.internal.core.BreakpointManager.createBreakpoint(BreakpointManager.java:547)
  at org.eclipse.debug.internal.core.BreakpointManager.loadBreakpoints(BreakpointManager.java:262)
  at org.eclipse.debug.internal.core.BreakpointManager.initializeBreakpoints(BreakpointManager.java:451)
  at org.eclipse.debug.internal.core.BreakpointManager.getBreakpoints0(BreakpointManager.java:420)
  at org.eclipse.debug.internal.core.BreakpointManager.start(BreakpointManager.java:406)
  at org.eclipse.debug.core.DebugPlugin.start(DebugPlugin.java:682)
        ...
  at com.google.eclipse.common.core.CommonCorePlugin.start(CommonCorePlugin.java:95)
        ...
  at com.google.eclipse.filesystem.core.FilteredLocalFileSystem.<init>(FilteredLocalFileSystem.java:193)
        ...
  at org.eclipse.ui.internal.ide.application.IDEApplication.start(IDEApplication.java:124)
  at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:196)
  at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:110)
  at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:79)
  at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:344)
  at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:179)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:616)
  at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:622)
  at org.eclipse.equinox.launcher.Main.basicRun(Main.java:577)
  at org.eclipse.equinox.launcher.Main.run(Main.java:1410)
  at org.eclipse.equinox.launcher.Main.main(Main.java:1386)
Caused by: java.lang.NullPointerException
  at org.eclipse.core.internal.resources.Resource.getPersistentProperty(Resource.java:1173)
  at org.eclipse.jdt.internal.core.JavaModelManager.loadVariablesAndContainers(JavaModelManager.java:3220)
  at org.eclipse.jdt.internal.core.JavaModelManager.startup(JavaModelManager.java:4975)
  at org.eclipse.jdt.core.JavaCore.start(JavaCore.java:5144)
  at org.eclipse.osgi.framework.internal.core.BundleContextImpl$1.run(BundleContextImpl.java:711)
  at java.security.AccessController.doPrivileged(Native Method)
  at org.eclipse.osgi.framework.internal.core.BundleContextImpl.startActivator(BundleContextImpl.java:702)
  ... 163 more
Root exception:
java.lang.NullPointerException
  at org.eclipse.core.internal.resources.Resource.getPersistentProperty(Resource.java:1173)
  at org.eclipse.jdt.internal.core.JavaModelManager.loadVariablesAndContainers(JavaModelManager.java:3220)
  at org.eclipse.jdt.internal.core.JavaModelManager.startup(JavaModelManager.java:4975)
  at org.eclipse.jdt.core.JavaCore.start(JavaCore.java:5144)
        ...
Comment 56 Dani Megert CLA 2012-01-19 11:08:19 EST
(In reply to comment #55)
Hi Terry. Can you please file a new bug and if possible attach a demo plug-in that allows to reproduce the NPE? Thanks.
Comment 57 Pawel Piech CLA 2012-01-20 00:07:03 EST
(In reply to comment #55)
> I'm backporting this patch to 3.7.1 (using the patch in 364418) and have found
> a problem. 

Another important question: what is the side effect of the exception?  Does it render breakpoints unusable?  I.e. is it worse than the deadlock?
Comment 58 Terry Parker CLA 2012-01-20 03:14:48 EST
I filed 369177 and added an attachment with projects that recreate the issue.

The issue is bad--it prevents the JDT from loading.  Presumably it would do the same for other language breakpoints, since loading them likely assumes full workbench initialization.  

The condition only gets triggered if an extension point that is invoked before the workbench is loaded manages to activate the DebugPlugin.  I'm not sure what extension points other than org.eclipse.core.filesystem.filesystems fall into that category.  Our EFS plug-in's initialization accesses a utility plug-in whose activator just happens to set up a org.eclipse.debug.core.ILaunchesListener.  We can pretty easily isolate that to work around the issue.

So severity is high but frequency may be limited to a few paths in early platform startup.  We can continue the discussion in 369177.
Comment 59 Pawel Piech CLA 2013-03-21 02:46:45 EDT
*** Bug 403897 has been marked as a duplicate of this bug. ***