Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 339005

Summary: [Pin&Clone] TimeoutException when pinning with GDB >= 7.1
Product: [Tools] CDT Reporter: Marc Khouzam <marc.khouzam>
Component: cdt-debug-dsf-gdbAssignee: Project Inbox <cdt-debug-dsf-gdb-inbox>
Status: NEW --- QA Contact: Jonah Graham <jonah>
Severity: normal    
Priority: P3 CC: anders, bruno.do.medeiros, cdtdoug, jonah, marc.dumais, pawel.1.piech, torbjorn.svensson, vinod.appu
Version: 8.0   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
See Also: https://git.eclipse.org/r/c/cdt/org.eclipse.cdt/+/180690
Whiteboard:
Attachments:
Description Flags
Hack for the issue none

Description Marc Khouzam CLA 2011-03-05 07:25:57 EST
I get a timeoutException in GdbPinProvider when using GDB >= 7.1:

java.util.concurrent.TimeoutException
	at org.eclipse.cdt.dsf.concurrent.Query.get(Query.java:128)
	at org.eclipse.cdt.dsf.gdb.internal.ui.GdbPinProvider.getData(GdbPinProvider.java:165)
	at org.eclipse.cdt.dsf.gdb.internal.ui.GdbPinProvider.getCombinedLabels(GdbPinProvider.java:190)
	at org.eclipse.cdt.dsf.gdb.internal.ui.GdbPinProvider.isPinnedTo(GdbPinProvider.java:288)
	at org.eclipse.cdt.debug.internal.ui.pinclone.PinCloneUtils.getPinElementColorDescriptor(PinCloneUtils.java:219)
	at org.eclipse.cdt.dsf.gdb.internal.ui.viewmodel.launch.ThreadVMNode.updatePropertiesInSessionThread(ThreadVMNode.java:222)
	at org.eclipse.cdt.dsf.debug.ui.viewmodel.launch.AbstractThreadVMNode$3.run(AbstractThreadVMNode.java:167)
	at org.eclipse.cdt.dsf.concurrent.DefaultDsfExecutor$TracingWrapperRunnable.run(DefaultDsfExecutor.java:371)

To reproduce:
1- run a program with two threads with GDB >= 7.1 and non-stop
2- pin to the second thread
3- select the process and resume it

GdbPinProvider#getData() calls IProcesses.getExecutionData in a Query.  We were aware that this method is usually called outside the executor, but could be called within the executor, which is why we use the ImmediateInDsfExecutor.  In turns out that this is still not safe.

The implementation of IProcesses.getExecutionData() for GDB <= 7.0 is entirely local so does not need to send a command to GDB.  However, starting with GDB 7.1, we send a command to GDB to obtain the information about cores.  Sending a command to the backend will use the Dsf executor.  Therefore, GdbPinProvider calls a Query on the executor and blocks the executor, while the method the query is waiting for, IProcesses.getExecutionData() also needs the executor complete.  Deadlock.  Thanks to the timeout of 2 seconds in the query, we actually recover, but it takes 6 seconds for the Debug view to properly display.  

I have to think about the best way to fix this.
Comment 1 Marc Dumais CLA 2017-02-03 14:25:19 EST
Alternate way to reproduce this issue: 

1- run a program that creates threads, with GDB >= 7.1 and non-stop
2- pin the variable view to a thread
3- step over a line of code that creates a new thread 

Note: a very similar exception happens if we instead pin another debug view: Expression and Registers, at least. The mechanism is probably exactly the same as described in first comment, just triggered in a different VMNode. For example, if the Expressions view is pinned instead of the variables view: 


java.util.concurrent.TimeoutException
	at org.eclipse.cdt.dsf.concurrent.Query.get(Query.java:126)
	at org.eclipse.cdt.dsf.gdb.internal.ui.GdbPinProvider.getData(GdbPinProvider.java:156)
	at org.eclipse.cdt.dsf.gdb.internal.ui.GdbPinProvider.getCombinedLabels(GdbPinProvider.java:181)
	at org.eclipse.cdt.dsf.gdb.internal.ui.GdbPinProvider.isPinnedTo(GdbPinProvider.java:283)
	at org.eclipse.cdt.debug.internal.ui.pinclone.PinCloneUtils.isPinnedTo(PinCloneUtils.java:254)
	at org.eclipse.cdt.dsf.gdb.internal.ui.viewmodel.launch.ContainerVMNode.updatePropertiesInSessionThread(ContainerVMNode.java:312)
	at org.eclipse.cdt.dsf.debug.ui.viewmodel.launch.AbstractContainerVMNode$2.run(AbstractContainerVMNode.java:138)
	at org.eclipse.cdt.dsf.concurrent.DefaultDsfExecutor$TracingWrapperRunnable.run(DefaultDsfExecutor.java:374)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Comment 2 Vinod Appu CLA 2021-05-05 13:37:50 EDT
Is this still an open issue?
Comment 3 Jonah Graham CLA 2021-05-05 14:14:49 EDT
(In reply to Vinod Appu from comment #2)
> Is this still an open issue?

I would expect so. Have you hit a similar issue?
Comment 4 Vinod Appu CLA 2021-05-05 14:23:31 EDT
Yes, I'm looking into it. I can see due to some reason the below if is not always good enough to return null (Inside GdbPinProvider.java, line 129) when terminate button is pressed. At times the session is active and it goes further down to a wait(2 second) which won't be notified.


	private IThreadDMData getData(final IThreadDMContext threadDmc) {
		if (threadDmc == null || !fSession.isActive())
			return null;

		............
                ............


Any thoughts?
Comment 5 Jonah Graham CLA 2021-05-05 15:05:33 EDT
I am not 100% sure I follow - Is the case at line 130 that the session is active, but by the time the query runs the session has terminated? 

If so, then this should just be a handled error, not even logged.

Are you getting a TimeoutException or a RejectedExecutionException? 

Are you hitting the deadlock described in Comment 0? That deadlock is not at session termination, but during the session IIUC.
Comment 6 Jonah Graham CLA 2021-05-05 15:14:27 EDT
(In reply to Jonah Graham from comment #5)
> Are you getting a TimeoutException or a RejectedExecutionException? 

There is a race condition between immediateExecutor.execute(query); and data = query.get(2, TimeUnit.SECONDS); that means you may not get the RejectedExecutionException and will just timeout when you do the get.
Comment 7 Vinod Appu CLA 2021-05-06 05:28:23 EDT
I'm getting timeout exception only. I got confused since I couldn't able to reproduce when I tried with only Eclipse + CDT where as I'm always getting it from our own custom product. Now I understand that it's a matter of timing to get the issue.

The problem we are facing is this issue is hitting us in a multicore debugging session with view pinned => End user has to wait ~6seconds or more on any of the debug operations.

Any way out you think?
Comment 8 Vinod Appu CLA 2021-05-06 12:52:16 EDT
Created attachment 286336 [details]
Hack for the issue

I'm planning the above hack to get rid of the issue at the moment. Am I going to miss something with this?
Comment 9 Jonah Graham CLA 2021-05-07 12:41:32 EDT
(In reply to Vinod Appu from comment #8)
> Created attachment 286336 [details]
> Hack for the issue
> 
> I'm planning the above hack to get rid of the issue at the moment. Am I
> going to miss something with this?

Please provide this as a gerrit and I can review it properly. As a picture I can't even copy&paste it into Eclipse.
Comment 10 Vinod Appu CLA 2021-05-14 11:35:04 EDT
Getting "fatal: http://git.eclipse.org:29418/cdt/org.eclipse.cdt.git/info/refs not valid: is this a git repository?" when try git push gerrit HEAD:refs/for/master

I've tried to follow https://kichwacoders.com/2017/05/03/getting-started-with-gerrit-on-eclipse-cdt/

This is my first ever commit, please guide me.
Comment 11 Jonah Graham CLA 2021-05-14 12:48:07 EDT
I answered a longer answer with screenshots/etc in https://mattermost.eclipse.org/eclipse/

Quick answer is it looks like the URL is incorrect. Get the correct URL from https://git.eclipse.org/r/admin/repos/cdt/org.eclipse.cdt while logged on.
Comment 12 Eclipse Genie CLA 2021-05-17 11:25:02 EDT
New Gerrit change created: https://git.eclipse.org/r/c/cdt/org.eclipse.cdt/+/180690
Comment 13 Jonah Graham CLA 2021-05-17 16:34:42 EDT
(In reply to Marc Khouzam from comment #0)
> The implementation of IProcesses.getExecutionData() for GDB <= 7.0 is
> entirely local so does not need to send a command to GDB.  However, starting
> with GDB 7.1, [...]

This is no longer true - since Bug 497592 even GDB 7.0 can make requests. However the change in 497592 ameliorates the case because the command is likely to be cached, meaning no further roundtrip to GDB needed and that means the Query can be run in the executor thread.

BTW I cannot reliably reproduce this bug, I think because of Bug 497592. I have seen an occasional timeout when fast stepping or similar, but even then I am not getting a UI freeze, only a Executor freeze. (Although an executor freeze can  cause a UI freeze if any Query is made on the main thread while the executor is frozen.)
Comment 14 Jonah Graham CLA 2021-05-17 16:34:58 EDT
It generally is a bad idea to use Query (or anything blocking) in executor thread. 

I believe the proper fix is to refactor the code so that the Query is always outside the executor thread or otherwise removed. This is somewhat easy to do, simply create a new IDsfPinProvider that has all the methods with an async style that may eventually call getExecutionData. The methods in IPinProvider then effectively become ThreadSafeAndProhibitedFromDsfExecutor (can't actually apply that annotation because of cyclic dependency) and the methods in IDsfPinProvider become ConfinedToDsfExecutor. Only the methods called in the executor thread need to be added to IDsfPinProvider -- AFAICT that is only isPinnedTo.

@Vinod - does that make sense?
Comment 15 Vinod Appu CLA 2021-05-17 23:59:04 EDT
Yes, I thought of get rid of Query first then I go for the hack for a quick solution. I'll try your suggestion and give an update.