Bug 348043

Summary: GDB (DSF) Hardware Debugging Launcher fails to complete launch
Product: [Tools] CDT
Reporter: John Dallaway <john>
Component: cdt-debug-dsf-gdb
Assignee: Marc Khouzam <marc.khouzam>
Status: RESOLVED FIXED
QA Contact: Marc Khouzam <marc.khouzam>
Severity: major
Priority: P3
CC: ajin, cdtdoug, elaskavaia.cdt, pawel.1.piech
Version: 8.0
Flags: marc.khouzam: review? (elaskavaia.cdt)
Target Milestone: 8.0
Hardware: PC
OS: Linux
Whiteboard:
Attachments (Description | Flags):
Proposed fix | marc.khouzam: iplog-
Improved fix | marc.khouzam: iplog-

Description John Dallaway CLA 2011-06-01 23:48:06 EDT
Build Identifier: Eclipse I20110526-1708 (3.7RC3) + CDT HEAD

When attempting to launch a GDB (DSF) Hardware Debugging session with Eclipse 3.7RC3 and CDT HEAD, the final launch sequence defined by GDBJtagDSFFinalLaunchSequence is executed but the debug session then does nothing. The various views are not populated and debugging is not possible. The "Debug" view shows only the root node of the debug session and the "gdb" process node as a child. The last few lines of the GDB trace:

080,846 9tbreak cyg_user_start
080,846 (gdb) 
080,846 &"\n"
080,846 ^done
080,846 (gdb) 
080,848 &"tbreak cyg_user_start\n"
080,849 ~"Temporary breakpoint 1 at 0xc568: file ../twothreads.c, line 25.\n"
080,850 9^done
080,850 (gdb) 
080,850 &"\n"
080,851 ^done
080,852 (gdb) 

I can launch the same code for debugging using the same launch settings with the Standard GDB Hardware Debugging Launcher (CDI) without issue in 3.7RC3. I can also launch without problems using the GDB (DSF) Hardware Debugging Launcher in Helios SR2.

Perhaps there is an additional step now required somewhere in the final launch sequence which has not been added to GDBJtagDSFFinalLaunchSequence?

Reproducible: Always
Comment 1 Marc Khouzam CLA 2011-06-02 10:11:27 EDT
Can you post the entire GDB traces?
If you could post the ones from Helios too, that would help me compare the two.

I don't have a setup for hardware debugging so I need help to fix this.

Thanks.
Comment 2 John Dallaway CLA 2011-06-02 11:23:42 EDT
(In reply to comment #1)

> Can you post the entire GDB traces?

I will try to generate soon, but cannot right now. I can tell you that there is a one-to-one correspondence in GDB/MI traffic during launch with two exceptions:

a) Indigo RC3 does not issue "-environment-directory"
b) Indigo RC3 does not issue "-thread-list-ids",
     this is issued immediately before "load" with Helios SR2
     [ I have "Load Image" and "Load Symbols" checked ]

Otherwise, the GDB/MI traffic looks sane up to the end of the launch sequence. The issue is with what happens next. Helios SR2 CDT starts to populate various views resulting in further GDB/MI traffic. Indigo RC3 CDT just sits there. Perhaps the missing "-thread-list-ids" is the key. Is it possible that CDT does nothing because it thinks there are no threads?

> I don't have a setup for hardware debugging so I need help to fix this.

I have both Helios SR2 CDT and Indigo RC3 CDT running under debug and am happy to help with diagnosis, but it's not (yet) clear to me what triggers the _initial_ refresh of views.
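
The comparison described above (which commands one launch issues that the other does not) can be automated once both traces are available. The following is a minimal Python sketch, not part of CDT. It assumes trace lines have the shape shown in this report: a timestamp such as 080,846, a space, then either a numeric token followed by the command, or GDB output. The function names are hypothetical.

```python
import re

# A command line is: timestamp, space, numeric token, command text.
# Replies such as "9^done" also start with a token, so the first character
# after the token must be neither "^" nor another digit.
COMMAND_RE = re.compile(r"^\d+,\d+ \d+([^\^\d].*)$")


def issued_commands(trace):
    """Return the commands sent to GDB, in order, without their tokens."""
    commands = []
    for line in trace.splitlines():
        match = COMMAND_RE.match(line.rstrip())
        if match:
            commands.append(match.group(1))
    return commands


def command_names(trace):
    """Return the set of command names (first word of each command)."""
    return {cmd.split()[0] for cmd in issued_commands(trace)}


def only_in_first(trace_a, trace_b):
    """Command names that appear in trace_a but not in trace_b."""
    return sorted(command_names(trace_a) - command_names(trace_b))
```

Given the two full traces as strings, only_in_first(helios_trace, indigo_trace) would list the command names that only the Helios launch issued.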
Comment 3 Marc Khouzam CLA 2011-06-02 11:33:06 EDT
(In reply to comment #2)
> (In reply to comment #1)
> 
> > Can you post the entire GDB traces?
> 
> I will try to generate soon, but cannot right now. I can tell you that there is
> a one-to-one correspondence in GDB/MI traffic during launch with two
> exceptions:
> 
> a) Indigo RC3 does not issue "-environment-directory"

I don't expect that to be the problem.

> b) Indigo RC3 does not issue "-thread-list-ids",
>      this is issued immediately before "load" with Helios SR2
>      [ I have "Load Image" and "Load Symbols" checked ]

Which version of GDB are you using?  -thread-list-ids is only used for older versions (<= 6.8).  It has been replaced by -list-thread-groups.
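
As a rough illustration of version-dependent command selection, here is a Python sketch. It assumes the cut-over is at GDB 7.0, which is how this thread describes it; the helper names are hypothetical and this is not how CDT itself is implemented.

```python
import re


def parse_version(version):
    """Turn a version string like '6.8.50.20080706' into (6, 8, 50, 20080706)."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def thread_listing_command(gdb_version):
    """Pick the MI command used to discover threads (assumed cut-over: 7.0)."""
    if parse_version(gdb_version) >= (7, 0):
        return "-list-thread-groups"
    return "-thread-list-ids"
```

Comparing integer tuples avoids the pitfalls of comparing version strings character by character, for example "6.10" sorting before "6.8".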

> Otherwise, the GDB/MI traffic looks sane up to the end of the launch sequence.
> The issue is with what happens next. Helios SR2 CDT starts to populate various
> views resulting in further GDB/MI traffic. Indigo RC3 CDT just sits there.
> Perhaps the missing "-thread-list-ids" is the key. Is it possible that CDT does
> nothing because it thinks there are no threads?
> 
> > I don't have a setup for hardware debugging so I need help to fix this.
> 
> I have both Helios SR2 CDT and Indigo RC3 CDT running under debug and am happy
> to help with diagnosis, but it's not (yet) clear to me what triggers the
> _initial_ refresh of views.

I believe it is the sending of the DataModelInitializedEvent from the launch sequence.  Does that code get hit?
Comment 4 John Dallaway CLA 2011-06-02 11:49:43 EDT
(In reply to comment #3)

> > b) Indigo RC3 does not issue "-thread-list-ids",
> >      this is issued immediately before "load" with Helios SR2
> >      [ I have "Load Image" and "Load Symbols" checked ]
> 
> Which version of GDB are you using?  -thread-list-ids is only used for older
> versions (<= 6.8).  It has been replaced by -list-thread-groups.

arm-eabi-gdb 6.8.50.20080706
 
> > Otherwise, the GDB/MI traffic looks sane up to the end of the launch sequence.
> > The issue is with what happens next. Helios SR2 CDT starts to populate various
> > views resulting in further GDB/MI traffic. Indigo RC3 CDT just sits there.
> > Perhaps the missing "-thread-list-ids" is the key. Is it possible that CDT does
> > nothing because it thinks there are no threads?
> > 
> > > I don't have a setup for hardware debugging so I need help to fix this.
> > 
> > I have both Helios SR2 CDT and Indigo RC3 CDT running under debug and am happy
> > to help with diagnosis, but it's not (yet) clear to me what triggers the
> > _initial_ refresh of views.
> 
> I believe it is the sending of the DataModelInitializedEvent from the launch
> sequence.  Does that code get hit?

Yes, that step in the final launch sequence is executed.
Comment 5 John Dallaway CLA 2011-06-02 12:03:37 EDT
(In reply to comment #4)

> > > b) Indigo RC3 does not issue "-thread-list-ids",
> > >      this is issued immediately before "load" with Helios SR2
> > >      [ I have "Load Image" and "Load Symbols" checked ]
> > 
> > Which version of GDB are you using?  -thread-list-ids is only used for older
> > versions (<= 6.8).  It has been replaced by -list-thread-groups.
> 
> arm-eabi-gdb 6.8.50.20080706

If I switch to arm-eabi-gdb 7.0 the reported problem is not observed. I can launch and debug normally. It looks like support for GDB Hardware Debugging with DSF and GDB 6.8.x is broken in CDT HEAD.
Comment 6 Marc Khouzam CLA 2011-06-02 13:01:21 EDT
(In reply to comment #5)
> (In reply to comment #4)
> 
> > > > b) Indigo RC3 does not issue "-thread-list-ids",
> > > >      this is issued immediately before "load" with Helios SR2
> > > >      [ I have "Load Image" and "Load Symbols" checked ]
> > > 
> > > Which version of GDB are you using?  -thread-list-ids is only used for older
> > > versions (<= 6.8).  It has been replaced by -list-thread-groups.
> > 
> > arm-eabi-gdb 6.8.50.20080706
> 
> If I switch to arm-eabi-gdb 7.0 the reported problem is not observed. I can
> launch and debug normally. It looks like support for GDB Hardware Debugging
> with DSF and GDB 6.8.x is broken in CDT HEAD.

Can you get the GDB traces for both so we can see what the difference is?
Comment 7 John Dallaway CLA 2011-06-02 13:18:01 EDT
(In reply to comment #6)

> Can you get the GDB traces for both so we can see what the difference is?

With GDB 7.0, there is a "-list-thread-groups" and "-list-thread-groups --available" immediately after the connection to the target and before the user-supplied initialisation commands:

400,268 6target remote 192.168.0.149:3333
400,271 &"target remote 192.168.0.149:3333\n"
400,272 ~"Remote debugging using 192.168.0.149:3333\n"
400,300 7-list-thread-groups
400,305 =thread-group-created,id="42000"
400,306 =thread-created,id="1",group-id="42000"
400,306 8-list-thread-groups --available
400,321 ~"cyg_user_start () at ../twothreads.c:25\n"
400,324 ~"25\t  printf(\"Entering twothreads' cyg_user_start() function\\n\");\n"
400,324 *stopped,frame={addr="0x0000c568",func="cyg_user_start",args=[],file="../twothreads.c",fullname="/home/jld/runtime-helios-cdt/twothreads-eb40a-jtag/twothreads.c",line="25"},thread-id="1",stopped-threads="all"
400,324 ~"Current language:  auto\n"
400,324 ~"The current source language is \"auto; currently c\".\n"
400,324 6^done
400,324 (gdb) 
400,324 &"\n"
400,324 ^done
400,324 (gdb) 
400,324 7^done,groups=[{id="42000",type="process",pid="42000"}]
400,325 (gdb) 
400,325 8^error,msg="Can not fetch data now.\n"
400,326 (gdb) 
400,342 9monitor reset init
400,343 &"monitor reset init\n"

With GDB 6.8.50.x, the "-list-thread-groups" commands are absent, but there is a message back from GDB following "target remote" that there is a single thread in the system (=thread-created):

800,744 5target remote 192.168.0.149:3333

800,748 &"target remote 192.168.0.149:3333\n"
800,750 ~"Remote debugging using 192.168.0.149:3333\n"
800,772 =thread-created,id="1"
800,794 ~"cyg_user_start () at ../twothreads.c:25\n"
800,795 ~"25\t  printf(\"Entering twothreads' cyg_user_start() function\\n\");\n"
800,796 *stopped
800,796 ~"Current language:  auto; currently c\n"
800,796 5^done
800,797 (gdb) 
800,797 &"\n"
800,797 ^done
800,797 (gdb) 
800,800 6thread
800,800 7monitor reset init
800,802 &"thread\n"
800,802 ~"[Current thread is 1 (Thread <main>)]\n"
800,803 6^done
800,803 (gdb) 
800,805 &"monitor reset init\n"

Perhaps the launcher should be responding to the "=thread-created" message?
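
A front end that wants to react to "=thread-created" or "=thread-group-created" first has to recognise those notification records in GDB's output. The following is a rough Python sketch, assuming only the flat name,key="value" shape seen in the traces above. It does not handle nested tuples or lists, and parse_notification and strip_timestamp are hypothetical helpers, not CDT APIs.

```python
import re

# Matches key="value" pairs; values are taken verbatim, with no unescaping.
PAIR_RE = re.compile(r'([A-Za-z-]+)="([^"]*)"')


def strip_timestamp(trace_line):
    """Drop the leading 'NNN,NNN ' timestamp used in the traces in this report."""
    return re.sub(r"^\d+,\d+ ", "", trace_line)


def parse_notification(record):
    """Split a notification record such as '=thread-created,id="1"'.

    Returns (name, fields), or None if the record does not start with '='.
    """
    if not record.startswith("="):
        return None
    name, _, rest = record[1:].partition(",")
    return name, dict(PAIR_RE.findall(rest))
```

With this, the GDB 7.0 trace yields both a thread-group-created and a thread-created notification, while the GDB 6.8.50 trace yields only thread-created, with no group-id field.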
Comment 8 Marc Khouzam CLA 2011-06-02 14:09:00 EDT
ok, I can re-produce it on my laptop.
More to come
Comment 9 John Dallaway CLA 2011-06-02 14:38:59 EDT
(In reply to comment #8)

> ok, I can re-produce it on my laptop.
> More to come

Great. I guess an understanding of why the Helios SR2 implementation issues "-thread-list-ids" with GDB 6.8.x will lead to the solution.

Thanks for looking at this, Marc.
Comment 10 Marc Khouzam CLA 2011-06-02 14:40:18 EDT
Created attachment 197261 [details]
Proposed fix

For GDB >= 7.0, GDB tells us a process has been created with an MI event.
400,305 =thread-group-created,id="42000"
In turn, we use that event to mark that we are connected to a process.

However, GDB <= 6.8 does not have that event, so we are never told that we are connected to a process.  This causes a problem in GDBProcesses.getProcessesBeingDebugged, where fConnected is false and we don't show any processes in the Debug view.

This used to work, because the fConnected flag was defaulting to true in Helios.

However, this is not a problem for normal remote debugging, which got me wondering what the difference was.  In MIRunControlEventProcessor.commandDone, when getting a ^connected answer, we trigger the ContainerStarted event ourselves (just like we do when we get =thread-group-created), which lets GDBProcesses know that we are connected to a process.

Why doesn't this work for hardware debug?  It turns out that when using "-target-select remote", GDB answers with ^connected, but when using "target remote", GDB answers with ^done.  We cannot know a new process has been started without the ^connected answer.

I have tested this behavior all the way back to GDB 6.5 (I don't have any older versions).  Therefore, this patch changes "target remote" for "-target-select remote" and everything seems to work.
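
The reasoning above can be condensed into a small model. The Python sketch below is a simplification for illustration only, not the CDT implementation: it treats the front end as connected once it sees either a "=thread-group-created" notification or a result record of class ^connected. The function names and the initially_connected parameter are hypothetical.

```python
def result_class(record):
    """Return the class of an MI result record ('done', 'connected', ...).

    Accepts an optional numeric token prefix, e.g. '5^done'. Returns None
    for anything that is not a result record.
    """
    body = record.lstrip("0123456789")
    if not body.startswith("^"):
        return None
    return body[1:].split(",", 1)[0]


def connected_after(records, initially_connected=False):
    """Simplified model: would the front end consider itself connected
    to a process after seeing these GDB output records?"""
    connected = initially_connected
    for record in records:
        if record.startswith("=thread-group-created"):
            connected = True  # GDB >= 7.0 announces the process itself
        elif result_class(record) == "connected":
            connected = True  # reply to -target-select remote
    return connected
```

Fed the GDB 6.8.50 records from comment 7 (=thread-created, *stopped, 5^done), the model stays disconnected unless initially_connected is true, which mirrors the Helios default described above.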

John, can you try it?

I don't know if this is a safe solution though because it also affects the CDI hardware debug.  John, can you try that too?
Comment 11 John Dallaway CLA 2011-06-02 15:13:10 EDT
Marc, your patch fixes the reported issue but, unfortunately, it breaks the CDI hardware debugging launcher:

!ENTRY org.eclipse.cdt.debug.gdbjtag.core 4 150 2011-06-02 20:06:50.131
!MESSAGE Failed command
!STACK 0
org.eclipse.cdt.debug.mi.core.MIException: Undefined command: "-target-select".  Try "help".
	at org.eclipse.cdt.debug.mi.core.command.Command.throwMIException(Command.java:105)
	at org.eclipse.cdt.debug.mi.core.command.Command.getMIInfo(Command.java:79)
	at org.eclipse.cdt.debug.gdbjtag.core.GDBJtagDebugger.executeGDBScript(GDBJtagDebugger.java:342)
	at org.eclipse.cdt.debug.gdbjtag.core.GDBJtagDebugger.doStartSession(GDBJtagDebugger.java:187)
	at org.eclipse.cdt.debug.mi.core.AbstractGDBCDIDebugger.createSession(AbstractGDBCDIDebugger.java:84)
	at org.eclipse.cdt.debug.gdbjtag.core.GDBJtagDebugger.createSession(GDBJtagDebugger.java:64)
	at org.eclipse.cdt.debug.gdbjtag.core.GDBJtagLaunchConfigurationDelegate.launch(GDBJtagLaunchConfigurationDelegate.java:52)
	at org.eclipse.debug.internal.core.LaunchConfiguration.launch(LaunchConfiguration.java:854)
	at org.eclipse.debug.internal.core.LaunchConfiguration.launch(LaunchConfiguration.java:703)
	at org.eclipse.debug.internal.ui.DebugUIPlugin.buildAndLaunch(DebugUIPlugin.java:928)
	at org.eclipse.debug.internal.ui.DebugUIPlugin$8.run(DebugUIPlugin.java:1132)
	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:54)
Comment 12 Marc Khouzam CLA 2011-06-02 16:10:06 EDT
(In reply to comment #11)
> Marc, your patch fixes the reported issue but, unfortunately, it breaks the CDI
> hardware debugging launcher:
> 
> !ENTRY org.eclipse.cdt.debug.gdbjtag.core 4 150 2011-06-02 20:06:50.131
> !MESSAGE Failed command
> !STACK 0
> org.eclipse.cdt.debug.mi.core.MIException: Undefined command: "-target-select".
>  Try "help".
>     at
> org.eclipse.cdt.debug.mi.core.command.Command.throwMIException(Command.java:105)
>     at org.eclipse.cdt.debug.mi.core.command.Command.getMIInfo(Command.java:79)
>     at
> org.eclipse.cdt.debug.gdbjtag.core.GDBJtagDebugger.executeGDBScript(GDBJtagDebugger.java:342)

The problem is that GDBJtagDebugger.executeGDBScript treats every command as a CLICommand and adds a space between the token and the command.  So, we end up with
5 -target-select
instead of
5-target-select
and GDB can't handle that.

I'm trying to see how we can fix this.
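
To make the difference concrete, here is a minimal Python sketch of building the line sent to GDB. It assumes the only distinction that matters is the leading "-" of an MI command, and it keeps the space for CLI commands as described above. format_command is a hypothetical helper, not the actual CDT code.

```python
def format_command(token, command):
    """Build the line sent to GDB for a numbered command.

    MI commands (starting with '-') must follow the token directly.
    For CLI commands this sketch keeps a space after the token.
    """
    command = command.strip()
    if command.startswith("-"):
        return f"{token}{command}"
    return f"{token} {command}"
```

For example, format_command(5, "-target-select remote 192.168.0.149:3333") produces a line beginning with "5-target-select", the form GDB accepts.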
Comment 13 Marc Khouzam CLA 2011-06-02 16:22:57 EDT
Created attachment 197270 [details]
Improved fix

This patch is the same as before, but it also makes GDBJtagDebugger.executeGDBScript able to handle MI commands.

The problem is that the CDI hardware launch for GDB 6.8 does not seem to work anyway.  I get two threads instead of one.  With GDB 7.2 it looks good.

John, can you try the patch for DSF and CDI?
And can you let me know if with HEAD only (no patches), CDI/GDB 6.8 works?
Comment 14 John Dallaway CLA 2011-06-03 04:37:01 EDT
(In reply to comment #13)

> The problem is that the CDI hardware launch for GDB 6.8 does not seem to work
> anyway.  I get two threads instead of one.  With GDB 7.2 it looks good.

CDI hardware launch works OK for me with GDB 6.8.x unpatched. Just one thread seen. Ref: comment #0 (original description)
 
> John, can you try the patch for DSF and CDI?

With your improved patch, both DSF and CDI hardware launchers now work for me with both GDB 6.8.x and GDB 7.0. All 4 permutations tested.
Comment 15 Marc Khouzam CLA 2011-06-03 05:35:17 EDT
The last build for Indigo is this morning (barring any re-spins).
Although the solution changes some CDI code, DSF-GDB being the default debugger, I feel that we really should get this fixed for Indigo.

I've committed the solution to HEAD
Comment 16 Marc Khouzam CLA 2011-06-03 05:50:08 EDT
Elena can you review?

Thanks John for reporting the bug and your quick tests to get it resolved!
Comment 18 John Dallaway CLA 2011-06-03 17:06:38 EDT
(In reply to comment #15)

> The last build for Indigo is this morning (barring any re-spins).
> Although the solution changes some CDI code, DSF-GDB being the default
> debugger, I feel that we really should get this fixed for Indigo.
> 
> I've committed the solution to HEAD

Thanks again, Marc.
Comment 19 Elena Laskavaia CLA 2011-06-03 22:30:23 EDT
Sorry, I am kind of late here. Does this command, -target-remote, exist in gdb 6.8?
I have kind of lost track of why you are changing the CLI command to MI.
Comment 20 Marc Khouzam CLA 2011-06-04 20:41:25 EDT
(In reply to comment #19)
> Sorry, I am kind of late here. Does this command, -target-remote, exist in gdb
> 6.8?
> I have kind of lost track of why you are changing the CLI command to MI.

From comment #10:
> Why doesn't this work for hardware debug?  It turns out that when using
> "-target-select remote", GDB answers with ^connected, but when using "target
> remote", GDB answers with ^done.  We cannot know a new process has been started
> without the ^connected answer.

So I changed to the MI version to get the ^connected reply.

> I have tested this behavior all the way back to GDB 6.5 (I don't have any older
> versions).  Therefore, this patch changes "target remote" for "-target-select
> remote" and everything seems to work.

We're ok at least back to 6.5, maybe even more.