| Summary: | [WorkingSets] AIOOBE when trying to remove breakpoint by doubleclick on nearby line | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | [Eclipse Project] Platform | Reporter: | Markus Keller <markus.kell.r> | ||||||||
| Component: | UI | Assignee: | Boris Bokowski <bokowski> | ||||||||
| Status: | VERIFIED FIXED | QA Contact: | Hitesh <hsoliwal> | ||||||||
| Severity: | normal | ||||||||||
| Priority: | P3 | CC: | bokowski, darin.eclipse, emoffatt, Michael_Rennie, ob1.eclipse, pwebster | ||||||||
| Version: | 3.5 | Flags: | ob1.eclipse:
review+
|
||||||||
| Target Milestone: | 3.5 RC1 | ||||||||||
| Hardware: | PC | ||||||||||
| OS: | Windows XP | ||||||||||
| Whiteboard: | |||||||||||
| Attachments: |
|
||||||||||
And when I select the whole static initializer and then doubleclick any of the selected lines, I always get the AIOOBE. looks like this is coming from our call to: PlatformUI.getWorkbench().getWorkingSetManager().getWorkingSets(); sending to Platform UI for comment > PlatformUI.getWorkbench().getWorkingSetManager().getWorkingSets(); Wheew, that sounds scary and potentially hard to reproduce. Since I could reproduce in my setup, I connected a debugger and saw that comparison failed for random pairs of working set names. I verified that ICU works fine for those strings. Then I've set a breakpoint on the static field org.eclipse.ui.internal.WorkingSetComparator.INSTANCE and found that the same comparator is being used concurrently, once via the stack trace from comment 0, and the other one in the UI thread, see below. The problem is that the private Collator fCollator = Collator.getInstance(); in the shared WorkingSetComparator is not thread-safe. Thread [main] (Suspended (access of field INSTANCE in WorkingSetComparator)) WorkingSetManager(AbstractWorkingSetManager).getWorkingSets() line: 261 BreakpointSetOrganizer.getCategories(IBreakpoint) line: 82 BreakpointOrganizerExtension.getCategories(IBreakpoint) line: 97 BreakpointsContentProvider.reorganize() line: 184 BreakpointsViewEventHandler$1.run() line: 79 RunnableLock.run() line: 35 UISynchronizer(Synchronizer).runAsyncMessages(boolean) line: 134 Display.runAsyncMessages(boolean) line: 3855 Display.readAndDispatch() line: 3476 Workbench.runEventLoop(Window$IExceptionHandler, Display) line: 2393 Workbench.runUI() line: 2357 Workbench.access$4(Workbench) line: 2209 Workbench$5.run() line: 499 Realm.runWithDefault(Realm, Runnable) line: 332 Workbench.createAndRunWorkbench(Display, WorkbenchAdvisor) line: 492 PlatformUI.createAndRunWorkbench(Display, WorkbenchAdvisor) line: 149 IDEApplication.start(IApplicationContext) line: 113 EclipseAppHandle.run(Object) line: 194 EclipseAppLauncher.runApplication(Object) line: 110 EclipseAppLauncher.start(Object) line: 79 EclipseStarter.run(Object) line: 368 EclipseStarter.run(String[], Runnable) line: 179 NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method] NativeMethodAccessorImpl.invoke(Object, Object[]) line: 39 DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 25 Method.invoke(Object, Object...) line: 597 Main.invokeFramework(String[], URL[]) line: 559 Main.basicRun(String[]) line: 514 Main.run(String[]) line: 1287 Main.main(String[]) line: 1263 Hitesh, can you investigate this one please? Markus is Created attachment 133734 [details]
snippet to reproduce the bug
It is indeed due to concurrent access to WorkingSetComparator.collator.
I was able to reproduce this, and it has definite steps. Some code (especially the breakpoint removal) in ToggleBreakpointAction and ToggleBreakpointAdapter (See toggleBreakpoints, toggleLineBreakpoints) , under repeated execution within a small enough timespan results in concurrent access to WorkingSetComparator.collator.
See attachment.I get the error in every run, but you may need a few runs. I started out writing a test based on ToggleBreakpointAction, but this turned out to be easier.
The WorkingSetManager is not really thread-safe. Not sure if only serializing access to collator will solve this completely. This might be a good opportunity to add a little more thread safety. Notice add*,remove*,get*,etc. methods. Any thoughts on this ?
(In reply to comment #5) > Markus is > Opps!! Sorry about that, some firefox plug-in or probably an unintentional hit on 'Enter'. I'm trying to determine whether we should fix this this late... How likely is this to be hit under normal eclipse use? What is the downside if it happens (besides he AIOOB being logged)? How safe would the code to add some thread safety here be? I think I saw this problem already a few times before I filed the bug (but mostly after heavy hot code replacing, where I wasn't sure if the problem didn't come from the VM). I never had issues like lost or corrupted working sets, so I guess the AIOOBE doesn't cause additional harm (at least not in this scenario). If a simple fix like putting WorkingSetComparator#fCollator into a ThreadLocal is enough to fix this problem, then I would put it in. For more elaborate synchronization, it's too late IMO. Rather than making things thread safe on our side, would it be possible to change the code in Debug to not call methods on the working set manager outside of the UI thread? Tentatively marking this bug for RC1. We'd have to do synch exec's for this. The debug platform provides breakpoint working sets for all debuggers - and they can call us from any thread. Sync exec's are always a bad idea. I'd prefere to see a thread safe working set manager. Do you have other cases like this, where you call Platform UI API from a non-UI thread? The implicit assumption on our side is that clients can only make calls on the UI thread, unless the JavaDoc explicitly allows calls from a non-UI thread. (In reply to comment #12) > Do you have other cases like this, where you call Platform UI API from a > non-UI thread. First I'll file a feature request against API tools... get back to you soon :-) Off the top of my head I do not know. I know we have found cases in the past and addressed them on a case by case basis. The code in question here has been around since 3.1, so it hasn't caused too much trouble. (In reply to comment #13) > The code in question here has been > around since 3.1, so it hasn't caused too much trouble. Good to know! :) Created attachment 134784 [details]
patch
This should fix the problem, using a ThreadLocal.
Oleg, could you please review the patch? Created attachment 134785 [details] Another possible patch (In reply to comment #15) > Created an attachment (id=134784) [details] > patch > > This should fix the problem, using a ThreadLocal. > IMHO, threadlocal is a bit expensive(, more so for a comparator in a sort). It would a good idea to use a new instance of the comparator - as done in this patch. The patch addresses the AIOOB part or the comparator part of the problem, but the method still remains unsafe and prone to other errors or a malfunction. Thoughts? > IMHO, threadlocal is a bit expensive
Why do you think that's expensive? It's basically just a lookup in a hash map, isn't it?
I actually proposed ThreadLocal because I think creation of (ICU?) collators has once been identified as a performance bottleneck. I didn't do any measurements, though.
(In reply to comment #18) > > IMHO, threadlocal is a bit expensive > > Why do you think that's expensive? It's basically just a lookup in a hash map, > isn't it? > I'm I can't take that question fully right now- getting late here -my apologies. It certainly adds to the cost. I am not too sure about the complete internals, but do remember having a discussion sometime back and also reading it somewhere. > I actually proposed ThreadLocal because I think creation of (ICU?) collators > has once been identified as a performance bottleneck. I didn't do any > measurements, though. > I understand from data posted on Bug 275170 and code inspection, the creation of a new instance is hardly a concern, it is the compare. We are better off using a new instance in the current context. Later. Hitesh I cannot imagine that using ThreadLocal is more expensive than going through the code behind Collator.getInstance(). Note that the overhead of ThreadLocal is only incurred whenever WorkingSetComparator.getInstance() is called. Once we are in the call to sort (or in the loop that fills the TreeSet), we are using a particular instance of Collator. (In reply to comment #18) (In reply to comment #20) > Note that the overhead of ThreadLocal > is only incurred whenever WorkingSetComparator.getInstance() is called. > A hasty glance at the patch, and for some inexplicable reason I had been seeing the collator as being ThreadLocal-ed all this while instead of the instance. Sorry for the noise. My mistake. The patch is good. But this alone is insufficient. This should be addressed fully.In case the method executes incorrectly and an error is not thrown, we may have a really hard-to-track, hard-to-reproduce bug. If only the AIOOBE is to be fixed, as a temporary measure, then I suggest opening a new bug for the break point action related code or keeping this bug open.What do you think? +1 for the Boris's patch. On the more general matter, it seems that this problem comes from the RuleBasedCollator from ICU not being thread-safe. Looking at its Javadoc I don't see any mention if it is intended to be thread-safe. Anybody knows if ICU classes are supposed to be thread-safe? > Anybody knows if ICU classes are supposed to be thread-safe? They're not, unless explicitly specified. See e.g. http://userguide.icu-project.org/collation/architecture : "ICU Collator instances cannot be shared among threads." Patch (attachment 134748 [details]) released.
(In reply to comment #24) > Patch (attachment 134748 [details]) released. Sorry, that should have been: attachment 134784 [details] (In reply to comment #21) > But this alone is insufficient. This should be addressed fully.In case the > method executes incorrectly and an error is not thrown, we may have a really > hard-to-track, hard-to-reproduce bug. If only the AIOOBE is to be fixed, as a > temporary measure, then I suggest opening a new bug for the break point action > related code or keeping this bug open.What do you think? Hitesh, I am not sure I understand. The AIOOBE should not happen anymore because we are no longer using ICU4J's collator from multiple threads. If you think there is a remaining issue, could you please open a new bug? Thanks. (In reply to comment #23) > > Anybody knows if ICU classes are supposed to be thread-safe? > They're not, unless explicitly specified. See e.g. > http://userguide.icu-project.org/collation/architecture : > "ICU Collator instances cannot be shared among threads." Thanks Markus, that's good to know. Verified using I20090514-2000. |
I20090415-1348 - have: public class Try { String s= "Hello" + "World"; static { String foo= "bar"; System.out.println(foo); } } - doubleclick in vertical ruler on line ' + "World";' - doubleclick again - ... and on the third doubleclick, I get the AIOOBE below. My Breakpoints view is visible and grouped by Breakpoint Working Sets. When I do the same on the empty line before 'String foo= "bar";', I get a similar exception. !ENTRY org.eclipse.debug.core 4 2 2009-04-16 18:43:15.127 !MESSAGE Problems occurred when invoking code from plug-in: "org.eclipse.debug.core". !STACK 0 java.lang.ArrayIndexOutOfBoundsException: -1 at com.ibm.icu.text.RuleBasedCollator.append(RuleBasedCollator.java:3492) at com.ibm.icu.text.RuleBasedCollator.doPrimaryCompare(RuleBasedCollator.java:3280) at com.ibm.icu.text.RuleBasedCollator.compareRegular(RuleBasedCollator.java:2299) at com.ibm.icu.text.RuleBasedCollator.compare(RuleBasedCollator.java:1292) at org.eclipse.ui.internal.WorkingSetComparator.compare(WorkingSetComparator.java:48) at java.util.TreeMap.put(TreeMap.java:530) at java.util.TreeSet.add(TreeSet.java:238) at org.eclipse.ui.internal.AbstractWorkingSetManager.getWorkingSets(AbstractWorkingSetManager.java:265) at org.eclipse.debug.internal.ui.views.breakpoints.BreakpointSetOrganizer.breakpointsRemoved(BreakpointSetOrganizer.java:242) at org.eclipse.debug.internal.core.BreakpointManager$BreakpointsNotifier.run(BreakpointManager.java:992) at org.eclipse.core.runtime.SafeRunner.run(SafeRunner.java:37) at org.eclipse.debug.internal.core.BreakpointManager$BreakpointsNotifier.notify(BreakpointManager.java:1014) at org.eclipse.debug.internal.core.BreakpointManager.fireUpdate(BreakpointManager.java:868) at org.eclipse.debug.internal.core.BreakpointManager.removeBreakpoints(BreakpointManager.java:476) at org.eclipse.debug.internal.core.BreakpointManager.removeBreakpoint(BreakpointManager.java:455) at org.eclipse.jdt.internal.debug.ui.actions.BreakpointLocationVerifierJob.manageLineBreakpoint(BreakpointLocationVerifierJob.java:234) at org.eclipse.jdt.internal.debug.ui.actions.BreakpointLocationVerifierJob.run(BreakpointLocationVerifierJob.java:176) at org.eclipse.core.internal.jobs.Worker.run(Worker.java:55)