|
Description
Paul Pazderski
New Gerrit change created: https://git.eclipse.org/r/141087 Created attachment 278386 [details] Not closable consoles Today's build shows 3 hanging consoles and two (not three!) hanging input jobs: eclipse.platform/ant/org.eclipse.ant.core/buildfiles/buildExtraJAR.xml eclipse.platform.resources/bundles/org.eclipse.core.resources/scripts/buildExtraJAR.xml rt.equinox.p2/bundles/org.eclipse.equinox.p2.jarprocessor/localbuild.xml There is definitely something wrong with short living process jobs / consoles for them. Additionally, the consoles can't be closed by clicking on "Remove launch" button, and "Remove all" is disabled. "Worker-13: Process Console Input Job" #57 prio=5 os_prio=0 cpu=50.03ms elapsed=1393.68s tid=0x00007ffe48001000 nid=0x1a9a9 in Object.wait() [0x00007ffe638ee000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(java.base@11.0.2/Native Method) - waiting on <no object reference available> at java.lang.Object.wait(java.base@11.0.2/Object.java:328) at org.eclipse.ui.console.IOConsoleInputStream.waitForData(IOConsoleInputStream.java:142) at org.eclipse.ui.console.IOConsoleInputStream.read(IOConsoleInputStream.java:93) - waiting to re-lock in wait() <0x000000042552e8a0> (a org.eclipse.ui.console.IOConsoleInputStream) at org.eclipse.ui.console.IOConsoleInputStream.read(IOConsoleInputStream.java:115) at org.eclipse.debug.internal.ui.views.console.ProcessConsole$InputReadJob.run(ProcessConsole.java:752) at org.eclipse.core.internal.jobs.Worker.run(Worker.java:63) "Worker-38: Process Console Input Job" #82 prio=5 os_prio=0 cpu=45.77ms elapsed=1393.20s tid=0x00007ffecc05d800 nid=0x1a9c3 in Object.wait() [0x00007ffe61bd5000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(java.base@11.0.2/Native Method) - waiting on <no object reference available> at java.lang.Object.wait(java.base@11.0.2/Object.java:328) at org.eclipse.ui.console.IOConsoleInputStream.waitForData(IOConsoleInputStream.java:142) at org.eclipse.ui.console.IOConsoleInputStream.read(IOConsoleInputStream.java:93) - waiting to re-lock in wait() <0x0000000425603af0> (a org.eclipse.ui.console.IOConsoleInputStream) at org.eclipse.ui.console.IOConsoleInputStream.read(IOConsoleInputStream.java:115) at org.eclipse.debug.internal.ui.views.console.ProcessConsole$InputReadJob.run(ProcessConsole.java:752) at org.eclipse.core.internal.jobs.Worker.run(Worker.java:63) Thanks to https://git.eclipse.org/r/#/c/141013/ I was able to cancel the two input jobs. Note: this did not had any effect for hanging consoles, so the "hanging" consoles are unrelated to "hanging" jobs, common is only the root cause of fast terminated launches. Gerrit change https://git.eclipse.org/r/141087 was merged to [master]. Commit: http://git.eclipse.org/c/platform/eclipse.platform.debug.git/commit/?id=f0cc33023dda08a2bae26b32c8e22a2863729724 I still could not reproduce this problem but still searched in the code what could break your remove buttons. I think I found a race condition which could explain the disabled "remove all" button. (but you must have some bad luck if you actually hit it) If you see the disabled "remove all" button again I would also be interested in the enable state of the "Remove All Terminated" action in Console context menu and the "Remove All Terminated Launches" action in Debug View. For the not working "remove single" button I have still no clue who this can get broken at all. New Gerrit change created: https://git.eclipse.org/r/141174 Gerrit change https://git.eclipse.org/r/141174 was merged to [master]. Commit: http://git.eclipse.org/c/platform/eclipse.platform.debug.git/commit/?id=33e75e385ad0cf1f967ae7cf69c8b5cfd4adc257 (In reply to Paul Pazderski from comment #5) > I still could not reproduce this problem but still searched in the code what > could break your remove buttons. > > I think I found a race condition which could explain the disabled "remove > all" button. (but you must have some bad luck if you actually hit it) If you > see the disabled "remove all" button again I would also be interested in the > enable state of the "Remove All Terminated" action in Console context menu > and the "Remove All Terminated Launches" action in Debug View. I've setup my SDK IDE to start with debugger port open. Next time it happens I should be able to look into it. > For the not working "remove single" button I have still no clue who this can > get broken at all. I've also tried to understand what could it be, I assume it could be wrong process implementation of AntProcess. Somehow it probably manages to answer that it is not terminated, even if it is. (In reply to Andrey Loskutov from comment #8) > I assume it could be wrong process implementation of AntProcess. Had the same idea. But the only problem I could found was bug 546755. (In reply to Paul Pazderski from comment #9) > (In reply to Andrey Loskutov from comment #8) > > I assume it could be wrong process implementation of AntProcess. > > Had the same idea. But the only problem I could found was bug 546755. Note: the problems I observed are caused by ant launched in the same JVM, you patch seem to touch separated JVM case. (In reply to Andrey Loskutov from comment #10) > Note: the problems I observed are caused by ant launched in the same JVM, > you patch seem to touch separated JVM case. Correct. Also the other bug would had blocked the entire build to finish. OK, now I've got 5 consoles (and 3 input stream jobs waiting in org.eclipse.ui.console.IOConsoleInputStream.waitForData()), but in debugger I see that there is only one (!) launch known in org.eclipse.debug.internal.core.LaunchManager.getLaunches().
After removing this one, there are no launches left, which explains why ConsoleRemoveAllTerminatedAction is disabled and ConsoleRemoveLaunchAction doesn't work anymore: the code below simply does nothing, even if *current* launch of the console is not null:
public synchronized void run() {
ILaunch launch = getLaunch();
if (launch != null) {
ILaunchManager launchManager = DebugPlugin.getDefault().getLaunchManager();
launchManager.removeLaunch(launch);
}
}
The console removal code waits for the removal event from LaunchManager, which is of course not sent because LaunchManager doesn't remove any thing.
So the problem is in the way how launches are created / managed, not in the console action updates.
OK, I think I've got the root cause. It is not launch management issue, it is console management issue. All starts with this two: org.eclipse.debug.internal.ui.launchConfigurations.LaunchConfigurationManager.removeTerminatedLaunches(ILaunch) org.eclipse.debug.internal.ui.views.console.ProcessConsoleManager 1) If the preference to remove terminated launches is set (default), it will remove terminated launches on starting a new one. 2) If a org.eclipse.debug.ui plugin is activated (opened Breakpoints or Debug views), ProcessConsoleManager will watch for launch events to open/close consoles. 3) If a launch is *added*, ProcessConsoleManager listener which will also create the associated console. 4) If a launch is *removed*, ProcessConsoleManager listener which will also remove the associated console. The code to add/remove consoles in ProcessConsoleManager runs asynchronously to the launch / process lifecycle. This means: if the console creation thread is behind the launch execution, we will have consoles opened *after* their process/launch is gone and *removed* from LaunchManager list. Those consoles will never be removed! I have workspace with few projects using ant builders, so they started pretty much one after each other. The last one "wins" and can be closed, because no one runs after him who can trigger removeTerminatedLaunches(). All before are potential "victims" so they consoles may be opened after all events are already dispatched and they launches are gone. To reproduce, at least two projects are needed, each one containing ant build launch, and a breakpoint at ProcessConsoleManager.getConsole(IProcess) line: 166. Once breakpoint is set, trigger "Project->Build all" and let only the build job to access to consoles, but do not let "create console" job to do so. After the build, let the "create console" jobs proceed. Kaboom. Consoles are opened and will stay forever. Thread [Worker-32: Creating console for /data/runtime-New configuration/A/build.xml] (Suspended (breakpoint at line 166 in ProcessConsoleManager)) ProcessConsoleManager.getConsole(IProcess) line: 166 ProcessConsoleManager.getConsoleDocument(IProcess) line: 216 ProcessConsoleManager$ConsoleCreation.run(IProgressMonitor) line: 70 Worker.run() line: 63 Thread [Worker-30: Building Workspace] (Suspended (breakpoint at line 166 in ProcessConsoleManager)) ProcessConsoleManager.getConsole(IProcess) line: 166 ProcessConsoleManager.removeProcess(IProcess) line: 150 ProcessConsoleManager.removeLaunch(ILaunch) line: 137 ProcessConsoleManager.launchRemoved(ILaunch) line: 130 LaunchManager$LaunchNotifier.run() line: 430 SafeRunner.run(ISafeRunnable) line: 45 LaunchManager$LaunchNotifier.notify(ILaunch, int) line: 417 LaunchManager.fireUpdate(ILaunch, int) line: 1039 LaunchManager.removeLaunch(ILaunch) line: 2223 LaunchConfigurationManager.removeTerminatedLaunches(ILaunch) line: 334 LaunchConfigurationManager.launchAdded(ILaunch) line: 320 LaunchManager$LaunchNotifier.run() line: 427 SafeRunner.run(ISafeRunnable) line: 45 LaunchManager$LaunchNotifier.notify(ILaunch, int) line: 417 LaunchManager.fireUpdate(ILaunch, int) line: 1039 LaunchManager.addLaunch(ILaunch) line: 679 LaunchConfiguration.launch(String, IProgressMonitor, boolean, boolean) line: 811 LaunchConfiguration.launch(String, IProgressMonitor, boolean) line: 720 LaunchConfiguration.launch(String, IProgressMonitor) line: 715 ExternalToolBuilder.launchBuild(int, ILaunchConfiguration, Map<String,String>, IProgressMonitor) line: 183 ExternalToolBuilder.doBuildBasedOnScope(IResource[], int, ILaunchConfiguration, Map<String,String>, IProgressMonitor) line: 171 ExternalToolBuilder.build(int, Map<String,String>, IProgressMonitor) line: 90 BuildManager$2.run() line: 833 SafeRunner.run(ISafeRunnable) line: 45 BuildManager.basicBuild(int, IncrementalProjectBuilder, Map<String,String>, MultiStatus, IProgressMonitor) line: 220 BuildManager.basicBuild(IBuildConfiguration, int, IBuildContext, ICommand[], MultiStatus, IProgressMonitor) line: 263 BuildManager$1.run() line: 316 SafeRunner.run(ISafeRunnable) line: 45 BuildManager.basicBuild(IBuildConfiguration, int, IBuildContext, MultiStatus, IProgressMonitor) line: 319 BuildManager.basicBuildLoop(IBuildConfiguration[], IBuildConfiguration[], int, MultiStatus, IProgressMonitor) line: 371 BuildManager.build(IBuildConfiguration[], IBuildConfiguration[], int, IProgressMonitor) line: 392 Workspace.buildInternal(IBuildConfiguration[], int, boolean, IProgressMonitor) line: 504 Workspace.build(int, IProgressMonitor) line: 404 GlobalBuildAction$1.run(IProgressMonitor) line: 180 Worker.run() line: 63 Great! Now I finally could reproduce it. I was so close but failed to imagine this specific thread order. Also I tested multiple ant targets, multiple ant builders but not multiple projects... Should I work on a fix or will you do it yourself? (In reply to Paul Pazderski from comment #14) > Great! Now I finally could reproduce it. I was so close but failed to > imagine this specific thread order. Also I tested multiple ant targets, > multiple ant builders but not multiple projects... > > Should I work on a fix or will you do it yourself? If you have time & interst, feel free to push a patch ;) New Gerrit change created: https://git.eclipse.org/r/141288 New Gerrit change created: https://git.eclipse.org/r/141289 Writing the patch was quite easy. Writing a test too. But figuring out why my test was so unstable was hard. All this started in bug 546641 with the JobManager complaining about a still running input read job on shutdown. On shutdown the LaunchManager removes all launches and as consequence all consoles and therefore all input read jobs. We found a few problems there but also on shutdown the ConsoleManager removes all consoles (again) and therefore ends all input read jobs. Now I found a new combination. To initialize and dispose consoles the AbstractConsole registers a listener on ConsoleManager. The managers listener list is initialized lazy without synchronization. If two consoles are created simultaneous (which is possible due to creation done in a job) it can happen that only one of the listeners is added. Then one of the consoles had started the input read job in constructor but never stops due to not received notifications. Gerrit change https://git.eclipse.org/r/141289 was merged to [master]. Commit: http://git.eclipse.org/c/platform/eclipse.platform.debug.git/commit/?id=72f672fe7e6becae36c083ab3f9162f47476b128 Gerrit change https://git.eclipse.org/r/141288 was merged to [master]. Commit: http://git.eclipse.org/c/platform/eclipse.platform.debug.git/commit/?id=f09f1e9f05795fb820e3c859273faae755ce94e2 Not closing yet, will check tomorrow if my Linux setup (which was affected) is fine or if there is still some leaks. (In reply to Andrey Loskutov from comment #21) > Not closing yet, will check tomorrow if my Linux setup (which was affected) > is fine or if there is still some leaks. Looks good. Thanks Paul. @Paul, Please verify this bug with the new I build. Verified with Build id: I20190520-1805 |