Bug 340697

Summary: [multi-process] Auto attach to processes forked by the process being debugged
Product: [Tools] CDT
Reporter: Sergey Prigogin <eclipse.sprigogin>
Component: cdt-debug-dsf-gdb
Assignee: Marc Khouzam <marc.khouzam>
Status: RESOLVED FIXED
QA Contact: Marc Khouzam <marc.khouzam>
Severity: enhancement
Priority: P3
CC: cdtdoug, marc.khouzam, pawel.1.piech
Version: 8.0
Flags: marc.khouzam: review? (eclipse.sprigogin)
Target Milestone: 8.0
Hardware: PC
OS: Mac OS X - Carbon (unsup.)
Whiteboard:
Attachments:
  Prototype (flags: marc.khouzam: iplog-)
  gdb traces showing some bp errors (flags: marc.khouzam: iplog-)
  Potential fix (flags: marc.khouzam: iplog-)
  Improved fix (flags: marc.khouzam: iplog-)

Description Sergey Prigogin CLA 2011-03-22 14:59:21 EDT
Some programs, e.g. the Chromium web browser, spawn a large number of processes that typically have to be debugged together.  It would be convenient to have an option to automatically attach the debugger to all processes spawned by the process being debugged.
Comment 1 Marc Khouzam CLA 2011-04-05 23:13:02 EDT
Ok, so things look promising.

GDB supports this feature through the following setting:
  set detach-on-fork off

I tried it out with CDT by typing this command on the GDB console in Eclipse right after we launch a GDB 7.2 non-stop session.  After the fork, I automatically got both processes in the debug view!  That was excellent.

However, since we didn't trigger the attach ourselves, we are still in an inconsistent state.  For example, we are not able to set breakpoints on the new process because we never told the MIBreakpointsManager about this new process.

This comes down to supporting auto-attaching to processes, which we will need for the Global Breakpoints feature anyway.  I'll look into it.

The question now is: do we always set the 'detach-on-fork' option to off by default, or do we give the user the choice through a preference?  One can imagine debugging a process that forks hundreds of sub-processes and not caring about those sub-processes.  So, I'm leaning towards a preference, a launch preference in fact.

Opinions?

Also, when not attaching to forked processes, GDB has a 'follow-fork-mode' option to decide if we should keep debugging the parent or the child process.  I haven't tried this in CDT yet.
Comment 2 Sergey Prigogin CLA 2011-04-05 23:27:07 EDT
(In reply to comment #1)
> So, I'm leaning towards a preference, a launch preference in fact.

+1 for the launch preference.
Comment 3 Marc Khouzam CLA 2011-04-06 22:17:55 EDT
Created attachment 192688 [details]
Prototype

Here is a prototype that I expected would work.
However, I'm seeing issues when setting breakpoints.
Either I have a bug in the fix to Bug 337893, or GDB is not behaving right.
The issues seem to be the same as what Sergey reported in Bug 337893 comment 13 (except that instead of SIGPWR, I saw SIGHUP).

I'll have to look more into it.

To try out the prototype, one should enable the new launch option in the debugger tab, use GDB 7.2 and set non-stop mode.
Comment 4 Marc Khouzam CLA 2011-04-06 22:24:44 EDT
Created attachment 192690 [details]
gdb traces showing some bp errors

Attached are the gdb traces of a session when I had the problem of setting breakpoints after the fork.

I use the following test program:

#include <stdio.h>
#include <unistd.h>

int main() {
    pid_t child;
    printf("Starting\n");

    child = fork();

    if (child == 0) {
        printf("I am the child\n");
        sleep(10);
        printf("child dying\n");
    } else {
        printf("I am the parent\n");
        sleep(5);
        printf("parent dying\n");
    }
    return 0;
}

And for the traces, I had breakpoints set in my workspace, then launched the program which stopped at main; then I resumed it past the fork().  GDB attaches to the new forked process and we try to set breakpoints on it.  This sometimes causes errors.  GDB is definitely partly at fault because I can see an -exec-continue return with an error that it could not set the breakpoint.  Again, that looks like the problem Sergey saw as well.
Comment 5 Sergey Prigogin CLA 2011-04-06 22:28:36 EDT
(In reply to comment #3)

I've found the cause of SIGPWR. The program had a watchdog thread that was killing it when execution of another thread was delayed due to a breakpoint.
Comment 6 Marc Khouzam CLA 2011-04-11 06:20:16 EDT
Here is my theory that I will have to confirm with the GDB folks.

1- We cannot set a breakpoint into a process that is running.  This I knew, and DSF-GDB handles that.

2- If we try to set a bp in a running process, GDB gives an error like:
  ^error,msg="Warning:\nCannot insert breakpoint 8.\n
  Error accessing memory address 0x8048b24: Input/output error.\n"
but seems to get stuck on that error so that all other bp commands and even runControl commands fail with that same error.  This is a GDB bug it seems.

3- We cannot set a breakpoint into a running _binary_.  I just noticed this.  It is more restrictive than #1 above.  In my tests, I always ran two different binaries, but with the fork() example, it is the same binary that is being used for both processes (parent and child).  It seems that if I want to set a bp for one of the two processes, then there can be no other running process using that same binary.

I'm going to have to find out if #3 is a GDB bug or not.  I tried running the same binary under two different GDBs and I was able to set breakpoints independently in each one, even if one process was running.  This makes me wonder why this does not work when two processes using the same binary are run under the same GDB.

I'll post a question to the GDB mailing list.

The workaround, I think, would be to interrupt a thread from _every_ process before trying to do a breakpoint operation.  This is not as bad as it sounds actually.  Currently, if I have multiple processes running and I create a new breakpoint, that breakpoint needs to be set for each process, which means we interrupt each process and try to set the breakpoint.  The problem is that when we create or attach to a new process, we only need to set breakpoints for that process, and that is when we leave the other ones running and run into this problem.  We could simply stop all processes in that case also.
Comment 7 Marc Khouzam CLA 2011-04-11 14:30:01 EDT
(In reply to comment #6)

> 3- We cannot set a breakpoint into a running _binary_.  I just noticed this. 
> It is more restrictive than #1 above.  In my tests, I always ran two different
> binaries, but with the fork() example, it is the same binary that is being used
> for both processes (parent and child).  It seems that if I want to set a bp for
> one of the two processes, then there can be no other running process using that
> same binary.

I've confirmed this is a current GDB limitation.  See http://sourceware.org/ml/gdb/2011-04/msg00032.html

I've reopened Bug 337893 to interrupt all processes before doing a breakpoint operation.
Comment 8 Marc Khouzam CLA 2011-04-11 21:34:04 EDT
Created attachment 192995 [details]
Potential fix

(In reply to comment #3)
> Here is a prototype that I expected would work.
> However, I'm seeing issues when setting breakpoints.
> Either I have a bug in the fix to Bug 337893, or GDB is not behaving right.
> The issues seem to be the same as what Sergey reported in Bug 337893 comment 13
> (except that instead of SIGPWR, I saw SIGHUP).

Actually, Sergey reported these problems in Bug 333284 comment 13.

The first problem is about setting breakpoints and, in my case, was fixed when I reopened and re-fixed Bug 337893.

The second was a SIGHUP in my case, which turned out to be caused by the parent dying before the child.

So, this patch seems to work.  I have to run the JUnit tests and do some more manual tests before committing.

Sergey, if you have a chance to try it out, it would confirm it works as expected.  Thanks.
Comment 9 Marc Khouzam CLA 2011-04-19 16:23:35 EDT
Created attachment 193620 [details]
Improved fix

I'm using the fixed GDB to keep trying this Eclipse fix, and I'm having more trouble with GDB and breakpoints.

What I'm seeing now is that GDB automatically installs all existing breakpoints when a new process is started with the same binary (this includes forked processes).  If CDT also tries to set these breakpoints then, on the third process with the same binary, GDB goes crazy.

So, I think CDT should not do anything in this auto-attach case.  The attached patch removes the auto-attach change.

I still have more tests to run.
Comment 10 Marc Khouzam CLA 2011-04-20 14:09:52 EDT
(In reply to comment #9)
> Created attachment 193620 [details]
> Improved fix
> 
> I'm using the fixed GDB to keep trying this Eclipse fix, and I'm having more
> trouble with GDB and breakpoints.
> 
> What I'm seeing now is that GDB automatically installs all existing breakpoints
> when a new process is started with the same binary (this includes forked
> processes).  If CDT also tries to set these breakpoints then, on the third
> process with the same binary, GDB goes crazy.
> 
> So, I think CDT should not do anything in this auto-attach case.  The attached
> patch removes the auto-attach change.

Overall, this last patch seems to work.  I did manage to crash GDB more than once, but that will need to be fixed in GDB.  I was forking repeatedly and setting breakpoints at the same time, so I'm hoping this won't come up too often for our users.

I don't think there is much more I can do in CDT to improve on this patch at this time.  As we see GDB issues, we can try to report them; but as GDB is still evolving for multi-process/multi-core, I'm hoping some of these issues will get resolved anyway.

There is one semi-serious limitation of this solution.  CDT only sets breakpoints for the process which it starts; it does not set them for forked processes.  This is necessary because GDB automatically sets the breakpoints on the forked processes.  When we create a new breakpoint, CDT will set it only on the first process, and GDB will set it on all the forked ones.  The problem that I expect (but didn't try) is that if the process that CDT started actually terminates, new breakpoints will not be set by Eclipse (since the process is gone) and therefore will not be set by GDB on the forked processes either.  I don't have a simple solution for this.

As the current solution is still an improvement, I'm going to commit this patch.
Comment 11 Marc Khouzam CLA 2011-04-20 14:14:35 EDT
Sergey, if you can try out this solution, we could confirm it is working well enough for your needs.  However, until you have a newer GDB (not released yet), you will keep seeing the original breakpoint problem you reported.  Options are:

1- wait for the GDB maintenance release.  This means we won't have time to fix any other issues you may find for the Indigo release.
2- work around the problem by having more than one thread stopped in a process for which you are doing a step over.
3- check out a GDB from sources and build it yourself.  It is actually pretty easy, so if you decide to do that, I can give you pointers.
Comment 12 CDT Genie CLA 2011-04-20 14:23:11 EDT
*** cdt cvs genie on behalf of mkhouzam ***
Bug 340697: Auto attach to processes forked by the process being debugged

[*] CommandFactory.java 1.26 http://dev.eclipse.org/viewcvs/index.cgi/org.eclipse.cdt/dsf-gdb/org.eclipse.cdt.dsf.gdb/src/org/eclipse/cdt/dsf/mi/service/command/CommandFactory.java?root=Tools_Project&r1=1.25&r2=1.26
[*] MIRunControlEventProcessor_7_0.java 1.13 http://dev.eclipse.org/viewcvs/index.cgi/org.eclipse.cdt/dsf-gdb/org.eclipse.cdt.dsf.gdb/src/org/eclipse/cdt/dsf/mi/service/command/MIRunControlEventProcessor_7_0.java?root=Tools_Project&r1=1.12&r2=1.13

[*] IGDBLaunchConfigurationConstants.java 1.6 http://dev.eclipse.org/viewcvs/index.cgi/org.eclipse.cdt/dsf-gdb/org.eclipse.cdt.dsf.gdb/src/org/eclipse/cdt/dsf/gdb/IGDBLaunchConfigurationConstants.java?root=Tools_Project&r1=1.5&r2=1.6

[+] MIGDBSetDetachOnFork.java  http://dev.eclipse.org/viewcvs/index.cgi/org.eclipse.cdt/dsf-gdb/org.eclipse.cdt.dsf.gdb/src/org/eclipse/cdt/dsf/mi/service/command/commands/MIGDBSetDetachOnFork.java?root=Tools_Project&revision=1.1&view=markup

[+] FinalLaunchSequence_7_2.java  http://dev.eclipse.org/viewcvs/index.cgi/org.eclipse.cdt/dsf-gdb/org.eclipse.cdt.dsf.gdb/src/org/eclipse/cdt/dsf/gdb/launching/FinalLaunchSequence_7_2.java?root=Tools_Project&revision=1.1&view=markup

[*] GDBControl_7_2.java 1.2 http://dev.eclipse.org/viewcvs/index.cgi/org.eclipse.cdt/dsf-gdb/org.eclipse.cdt.dsf.gdb/src/org/eclipse/cdt/dsf/gdb/service/command/GDBControl_7_2.java?root=Tools_Project&r1=1.1&r2=1.2

[*] GdbDebuggerPage.java 1.8 http://dev.eclipse.org/viewcvs/index.cgi/org.eclipse.cdt/dsf-gdb/org.eclipse.cdt.dsf.gdb.ui/src/org/eclipse/cdt/dsf/gdb/internal/ui/launching/GdbDebuggerPage.java?root=Tools_Project&r1=1.7&r2=1.8
[*] LaunchUIMessages.properties 1.9 http://dev.eclipse.org/viewcvs/index.cgi/org.eclipse.cdt/dsf-gdb/org.eclipse.cdt.dsf.gdb.ui/src/org/eclipse/cdt/dsf/gdb/internal/ui/launching/LaunchUIMessages.properties?root=Tools_Project&r1=1.8&r2=1.9
Comment 13 Sergey Prigogin CLA 2011-04-20 16:43:27 EDT
(In reply to comment #11)
> 3- check out a GDB from sources and build it yourself.  It is actually pretty
> easy, so if you decide to do that, I can give you pointers.

I don't have time for it right now, but I may come back to it in May or June.
Comment 14 Sergey Prigogin CLA 2011-05-25 17:46:16 EDT
(In reply to comment #11)
> Sergey, if you can try out this solution, we could confirm it is working well
> enough for your needs.  However, until you have a newer GDB (not released yet),
> you will keep seeing the original breakpoint problem you reported.

Marc, do you have the GDB bug number for this?
Comment 15 Marc Khouzam CLA 2011-05-26 09:33:26 EDT
(In reply to comment #14)
> (In reply to comment #11)
> > Sergey, if you can try out this solution, we could confirm it is working well
> > enough for your needs.  However, until you have a newer GDB (not released yet),
> > you will keep seeing the original breakpoint problem you reported.
> 
> Marc, do you have the GDB bug number for this?

It was done on the mailing list.

http://sourceware.org/ml/gdb/2011-04/msg00060.html (1 email)
http://sourceware.org/ml/gdb-patches/2011-04/msg00261.html (3 emails)