Community
Participate
Working Groups
On the latest build I got the following error when shutting down the PE resource manager. I don't think this is PE specific however. java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$ValueIterator.next(HashMap.java:822) at org.eclipse.ptp.internal.core.elements.PJob.removeProcessesByJobRanks(PJob.java:381) at org.eclipse.ptp.internal.core.elements.PQueue.removeJobs(PQueue.java:181) at org.eclipse.ptp.rmsystem.AbstractResourceManager.removeQueues(AbstractResourceManager.java:1145) at org.eclipse.ptp.rmsystem.AbstractResourceManager.cleanUp(AbstractResourceManager.java:815) at org.eclipse.ptp.rmsystem.AbstractResourceManager.shutdown(AbstractResourceManager.java:580) at org.eclipse.ptp.ui.actions.StopResourceManagersObjectActionDelegate.run(StopResourceManagersObjectActionDelegate.java:60) at org.eclipse.ptp.ui.actions.AbstractResourceManagerSelectionActionDelegate.runWithEvent(AbstractResourceManagerSelectionActionDelegate.java:51) at org.eclipse.ui.internal.PluginAction.runWithEvent(PluginAction.java:241) at org.eclipse.jface.action.ActionContributionItem.handleWidgetSelection(ActionContributionItem.java:584) at org.eclipse.jface.action.ActionContributionItem.access$2(ActionContributionItem.java:501) at org.eclipse.jface.action.ActionContributionItem$5.handleEvent(ActionContributionItem.java:411) at org.eclipse.swt.widgets.EventTable.sendEvent(EventTable.java:84) at org.eclipse.swt.widgets.Display.sendEvent(Display.java:3764) at org.eclipse.swt.widgets.Widget.sendEvent(Widget.java:1343) at org.eclipse.swt.widgets.Widget.sendEvent(Widget.java:1366) at org.eclipse.swt.widgets.Widget.sendEvent(Widget.java:1351) at org.eclipse.swt.widgets.Widget.notifyListeners(Widget.java:1163) at org.eclipse.swt.widgets.Display.runDeferredEvents(Display.java:3610) at org.eclipse.swt.widgets.Display.readAndDispatch(Display.java:3265) at org.eclipse.ui.internal.Workbench.runEventLoop(Workbench.java:2624) at org.eclipse.ui.internal.Workbench.runUI(Workbench.java:2588) at org.eclipse.ui.internal.Workbench.access$4(Workbench.java:2422) at org.eclipse.ui.internal.Workbench$7.run(Workbench.java:670) at org.eclipse.core.databinding.observable.Realm.runWithDefault(Realm.java:332) at org.eclipse.ui.internal.Workbench.createAndRunWorkbench(Workbench.java:663) at org.eclipse.ui.PlatformUI.createAndRunWorkbench(PlatformUI.java:149) at org.eclipse.ui.internal.ide.application.IDEApplication.start(IDEApplication.java:115) at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:196) at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:110) at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:79) at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:369) at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:179) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:619) at org.eclipse.equinox.launcher.Main.basicRun(Main.java:574) at org.eclipse.equinox.launcher.Main.run(Main.java:1407) at org.eclipse.equinox.launcher.Main.main(Main.java:1383)
Is this reliably reproducible? I can try to fix this, but with ConcurentModificationException's it can be hit or miss.
(In reply to comment #1) > Is this reliably reproducible? > > I can try to fix this, but with ConcurentModificationException's it > can be hit or miss. I cannot reproduce with OpenMPI RM.
I only saw it once.
The only way I can see this occurring is if getAttributeIndexSet is called during removeProcessesByJobRanks
So you probably need to synchronize processAttributesMap
(In reply to comment #5) > So you probably need to synchronize processAttributesMap I've synchronized the following PJob methods getAttributeIndexSet(D) getProcessAttributeKeys(BitSet) getProcessAttributes(BitSet) getProcessAttributes(String, BitSet) removeProcessesByJobRanks(BitSet) I've committed these.
Thanks. Let's hope that fixes the problem. I'll reopen if I see the problem again.
Unfortunately this now causes a deadlock in the model when the debugger launches. Currently launching more than a one process debug job deadlocks the UI in HEAD. The problem is that the resource manager calls PJob#addProcessesByJobRanks to add new processes to the job. This in turn calls PJob#fireNewProcesses to indicate the new processes have been created. The debug view has registered a handler on the job to get notified of process changes and is called when this happens. The handler in turn calls getAttribute() to find the status of the process (in order to display the correct icon). This calls PProcess#getAttribute which calls PJob#getProcessAttribute. Since both PJob#addProcessesByJobRanks and PJob#getProcessAttribute are synchronized, we get a deadlock. The synchronization used on PJob (and the other model element classes) needs to be checked to ensure that this kind of deadlock can't occur. I'd say you need to use finer grain synchronization than the whole method, particularly on methods that invoke handlers.
(In reply to comment #8) Do you think I can just exclude the firing of the handlers from the synchronization? > Unfortunately this now causes a deadlock in the model when the debugger > launches. Currently launching more than a one process debug job deadlocks the > UI in HEAD. > > The problem is that the resource manager calls PJob#addProcessesByJobRanks to > add new processes to the job. This in turn calls PJob#fireNewProcesses to > indicate the new processes have been created. The debug view has registered a > handler on the job to get notified of process changes and is called when this > happens. The handler in turn calls getAttribute() to find the status of the > process (in order to display the correct icon). This calls > PProcess#getAttribute which calls PJob#getProcessAttribute. Since both > PJob#addProcessesByJobRanks and PJob#getProcessAttribute are synchronized, we > get a deadlock. > > The synchronization used on PJob (and the other model element classes) needs to > be checked to ensure that this kind of deadlock can't occur. I'd say you need > to use finer grain synchronization than the whole method, particularly on > methods that invoke handlers.
(In reply to comment #9) If so I will just unsynchronize PJob#addProcessAttributes(BitSet processIds, AttributeManager attributes) PJob#removeProcessesByJobRanks(BitSet processJobRanks) and PJob#addProcessesByJobRanks(BitSet newProcessJobRanks, AttributeManager attrs) and do the main stuff in a synchronized (this) { block } followed by an unsynchronized call to the respective fireXXXProcesses(...) call. > (In reply to comment #8) > > Do you think I can just exclude the firing of the handlers > from the synchronization? >
I would have thought that you'd only want to synchronize access to the data structures, so calling the handlers outside the synchronization is definitely a good idea. It seems to have fixed the problem anyway.
Superceded by new RM framework.