
Bug 314447

Summary: Lockup in CSourceNotFoundDescriptionFactory
Product: [Tools] CDT
Reporter: James Blackburn <jamesblackburn+eclipse>
Component: cdt-debug-dsf-gdb
Assignee: Marc Khouzam <marc.khouzam>
Status: RESOLVED FIXED
QA Contact: Marc Khouzam <marc.khouzam>
Severity: critical
Priority: P3
CC: john.cortell, pawel.1.piech
Version: 7.0
Flags: john.cortell: review+
Target Milestone: 7.0.1
Hardware: PC
OS: Linux-GTK
Whiteboard:
Attachments:
backtrace (flags: none)
assertions to stderr (flags: none)
Stack trace and gdb traces (flags: marc.khouzam: iplog-)
Fix (flags: marc.khouzam: iplog-)

Description James Blackburn CLA 2010-05-26 08:27:04 EDT
Created attachment 169976 [details]
backtrace

While using DSF it locked up completely :(

Build based on 2010-05-01 19:09:26

Backtrace attached.
Comment 1 Pawel Piech CLA 2010-05-26 11:29:46 EDT
Hi James,
Is the problem reproducible?
If so, how?
If you can reproduce it, could you enable assertions (-ea) and see if there's anything in the error log?
Comment 2 James Blackburn CLA 2010-05-26 12:01:06 EDT
Created attachment 170029 [details]
assertions to stderr

(In reply to comment #1)
> Is the problem reproducible?
> If so, how?
> If you can reproduce it, could you enable assertions (-ea) and see if there's
> anything in the error log?

I'll have a quick go. I was attempting to use the DSF GDB process attach launch to debug a gdb server implementation being used as a backend for an already running debug session... 

I haven't yet been able to reproduce it readily. I switched back to CDI to debug the issue because DSF doesn't appear to be setting all my breakpoints: I have more than 10 breakpoints in the BP view, but DSF only seems to -break-insert one of them, whereas CDI inserts all of them. (They're all ordinary line-number breakpoints.)

I do run with -ea, and have been getting a bunch of assertions to stderr (attached).
Comment 3 Pawel Piech CLA 2010-05-26 12:20:19 EDT
Thanks. The exceptions in stderr indicate a problem when launching; I don't think they are directly responsible for the hanging call to CSourceNotFoundDescriptionFactory$1.getDescription. If you manage to reproduce the problem, please capture the errors. Also, if you had -ea on while the bug occurred, maybe the exception is still in your .log file?
Comment 4 James Blackburn CLA 2010-05-26 12:44:26 EDT
(In reply to comment #3)
> Also if you had -ea on while the bug
> occurred, maybe the exception is still in your .log file?

Unfortunately, there's nothing interesting in the error log from the time of the crash.
Comment 5 Marc Khouzam CLA 2010-05-26 21:14:30 EDT
(In reply to comment #2)
> I do run with -ea, and have been getting a bunch of assertions to stderr
> (attached).

Do you have the corresponding 'gdb traces'?  Maybe some of my assumptions about interrupting the target were wrong.

Were both your host and target Linux? 
I gather your target was not running the FSF gdbserver, but your own implementation?

As Pawel said, this is probably not related to the lockup though.
Comment 6 James Blackburn CLA 2010-05-27 03:25:48 EDT
(In reply to comment #5)
> Do you have the corresponding 'gdb traces'?  Maybe some of my assumptions about
> interrupting the target were wrong.

It was a complete UI lockup. Is there any way to get the GDB traces without using the console?

It was a runtime Eclipse being run under the PDE, but the session is now long gone :(.  I looked at the backtrace and couldn't make much sense of it to gather more detail. Is there anything else to grab should this happen again?

> Were both your host and target Linux? 
> I gather your target was not running the FSF gdbserver, but your own
> implementation?
> As Pawel said, this is probably not related to the lockup though.

I agree this is likely unrelated. Everything was running locally; I was using CDI with my GDB server (remote simulator) and was using DSF attach to debug the remote server. Seemed like a good idea at the time :)

If there's not enough information to reproduce or track this down, then do close it; I can always reopen if I see it again. UI lockups are scary; it would be nice if they weren't possible even if things go badly wrong in external processes.
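
A minimal sketch of one general way to keep a waiting thread from blocking forever, in the spirit of the remark above about UI lockups. It assumes the blocking call goes through a java.util.concurrent.Future. The BoundedWait class and the getOrFallback method are hypothetical names invented for illustration; they are not part of CDT or DSF, and this is not the fix attached to this bug.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper for illustration only; not part of CDT or DSF. */
final class BoundedWait {
    private BoundedWait() {}

    /**
     * Waits for the future's result for at most the given timeout.
     * Returns the fallback value if the wait times out, is interrupted,
     * or the computation fails or is cancelled.
     */
    static <T> T getOrFallback(Future<T> future, long timeout, TimeUnit unit, T fallback) {
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            // Give up instead of blocking the caller indefinitely.
            future.cancel(true);
            return fallback;
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still observe it.
            Thread.currentThread().interrupt();
            return fallback;
        } catch (ExecutionException | CancellationException e) {
            return fallback;
        }
    }
}
```

A caller on a UI thread might pass a short timeout and show a placeholder description whenever the fallback comes back.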
Comment 7 Marc Khouzam CLA 2010-05-27 09:05:10 EDT
(In reply to comment #6)
> (In reply to comment #5)
> > Do you have the corresponding 'gdb traces'?  Maybe some of my assumptions about
> > interrupting the target were wrong.
> 
> It was a complete UI lockup. Is there any way to get the GDB traces without
> using the console?


The assert error should probably happen without the UI lockup. But just in case, you can start Eclipse with "-debug $HOME/dsf.debug.options" and have the file dsf.debug.options contain these lines:

org.eclipse.cdt.dsf/debugCache = true
org.eclipse.cdt.dsf.gdb/debug = true

Chasing those errors is not a high priority right now since they are not the cause of the UI lockup, so let's wait for that part until after Helios.
Comment 8 Marc Khouzam CLA 2010-06-15 09:59:09 EDT
Created attachment 171931 [details]
Stack trace and gdb traces

I can reproduce the deadlock!

I run a multi-threaded program in non-stop mode.
I first resume the program after main(), then interrupt the first thread, select a couple of stack frames, resume the thread, then interrupt the last thread, and BOOM!

I'm not sure which of those steps are really necessary, but it does reproduce the problem.

Attached is the stack trace and gdb traces.
Comment 9 Marc Khouzam CLA 2010-06-15 10:16:45 EDT
Created attachment 171935 [details]
Fix

My bad.
This should fix the deadlock.
Comment 10 Marc Khouzam CLA 2010-06-15 10:20:14 EDT
Can someone review?  Pawel, John?
Comment 11 John Cortell CLA 2010-06-15 10:31:38 EDT
(In reply to comment #10)
> Can someone review?  Pawel, John?

Looks good to me. It also poses no chance of regression, IMO, so safe to put in at the last second.
Comment 12 Marc Khouzam CLA 2010-06-17 11:55:21 EDT
Committed to both HEAD and 7.0.1