To reproduce:

1. Compile and link the following program and run it from a terminal window:

    #include <assert.h>
    #include <pthread.h>
    #include <unistd.h>
    #include <cstdio>

    void* RunInThread(void* arg) {
      for (int i = 0; i < 3600; i++) {
        sleep(1);
        printf("secondary thread: %d\n", i);
      }
      return NULL;
    }

    int main() {
      pthread_attr_t attr;
      assert(pthread_attr_init(&attr) == 0);
      pthread_t tid;
      assert(pthread_create(&tid, &attr, &RunInThread, NULL) == 0);
      assert(pthread_attr_destroy(&attr) == 0);
      for (int i = 0; i < 3600; i++) {
        sleep(1);
        printf("main thread: %d\n", i);
      }
      return 0;
    }

2. Create a C/C++ Attach to Application debug configuration and enable non-stop mode.

3. Start a debugging session. The Debug window shows that the main thread is suspended and the second thread is still running.

    test attach [C/C++ Attach to Application]
      tmp/test [11329] [cores: 0]
        Thread [2] 11332 [core: 0] (Running)
        Thread [1] 11329 [core: 0] (Suspended : User Request)
          nanosleep() at 0x7fe170f6effd
          sleep() at 0x7fe170ffe9e0
          main() at test.cc:22 0x460642
      gdb

   The state of the secondary thread is shown incorrectly since the thread doesn't print anything to the terminal window and therefore must be suspended.

4. Resume the main thread. Debug window shows that both threads are running, but, in fact, the secondary thread is still suspended.

5. Detach from the program. Now both threads are running again.

6. Repeat step #3

7. Select the program node (tmp/test) in the Debug view and click Resume. Both threads are resumed as indicated by their output to the terminal window.

Attaching to a program without the non-stop mode appears to work fine.
Here is a fragment of the gdb trace that indicates that gdb correctly reported both threads as "stopped":

887,383 17^done,groups=[{id="i2",type="process",pid="11979",executable="tmp/test",cores=["0","1"]},{id="i1",type="process"}]
887,383 (gdb)
887,417 18-list-thread-groups
887,417 18^done,groups=[{id="i2",type="process",pid="11979",executable="tmp/test",cores=["0","1"]},{id="i1",type="process"}]
887,417 (gdb)
887,424 19-list-thread-groups i2
887,425 19^done,threads=[{id="2",target-id="Thread 0x7f0bc8f50710 (LWP 11980)",frame={level="0",addr="0x00007f0bc918affd",func="nanosleep",args=[],from="/usr/lib64/libc.so.6"},state="stopped",core="1"},{id="1",target-id="Thread 0x7f0bca0c1740 (LWP 11979)",frame={level="0",addr="0x00007f0bc918affd",func="nanosleep",args=[],from="/usr/lib64/libc.so.6"},state="stopped",core="0"}]
887,426 (gdb)
887,434 20-stack-info-depth --thread 1 11
887,434 20^done,depth="3"
887,435 (gdb)
887,435 21-stack-list-frames --thread 1
887,436 21^done,stack=[frame={level="0",addr="0x00007f0bc918affd",func="nanosleep",from="/usr/grte/v2/lib64/libc.so.6"},frame={level="1",addr="0x00007f0bc921a9e0",func="sleep",from="/usr/lib64/libc.so.6"},frame={level="2",addr="0x0000000000460642",func="main",file="test.cc",line="22"}]
887,436 (gdb)
887,457 22-thread-info 1
887,458 22^done,threads=[{id="1",target-id="Thread 0x7f0bca0c1740 (LWP 11979)",frame={level="0",addr="0x00007f0bc918affd",func="nanosleep",args=[],from="/usr/lib64/libc.so.6"},state="stopped",core="0"}]
887,458 (gdb)
887,587 23-thread-info 2
887,588 23^done,threads=[{id="2",target-id="Thread 0x7f0bc8f50710 (LWP 11980)",frame={level="0",addr="0x00007f0bc918affd",func="nanosleep",args=[],from="/usr/lib64/libc.so.6"},state="stopped",core="1"}]
887,588 (gdb)
887,590 24-stack-list-frames --thread 1 0 2
887,591 24^done,stack=[frame={level="0",addr="0x00007f0bc918affd",func="nanosleep",from="/usr/grte/v2/lib64/libc.so.6"},frame={level="1",addr="0x00007f0bc921a9e0",func="sleep",from="/usr/lib64/libc.so.6"},frame={level="2",addr="0x0000000000460642",func="main",file="test.cc",line="22"}]
887,591 (gdb)
Does anybody with debugging foo have spare cycles to take a look at this bug? This bug is pretty serious since it's almost guaranteed to confuse the hell out of an unsuspecting user.
I'll have a look sometime next week.

Did you see the problem with CDT 7.0.1?
Which GDB version were you using?
(In reply to comment #3)
> I'll have a look sometime next week.
>
> Did you see the problem with CDT 7.0.1?

I haven't tried with 7.0.1.

> Which GDB version were you using?

7.2
The problem seems to be that we don't get a *stopped event for each of the threads when we attach to the process. Currently, we rely solely on the *stopped events to mark a thread as suspended. We really should also take into account the state that gdb reports in -list-thread-groups i2 and in -thread-info, to make sure we know the real current state.

The problem never came up before because *stopped events are normally reliable, but on the initial attach they seem not to be.

I'll look into a solution.
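
A minimal sketch of the reconciliation described above, written in Python rather than the Java used by DSF-GDB: read the state field out of a -thread-info or -list-thread-groups reply and let it override whatever the front end inferred from *stopped events. The function names and the regular expression are hypothetical, not CDT code, and the parser assumes every thread tuple carries a state field.

```python
import re

# One thread tuple in an MI reply looks like
#   {id="2",target-id="...",frame={...},state="stopped",core="1"}
# Thread ids are numeric, so thread-group tuples such as {id="i2",...}
# are not matched. Assumption: every thread tuple contains a state field.
_THREAD_STATE = re.compile(r'\{id="(\d+)".*?state="(\w+)"')


def thread_states(mi_reply):
    """Map thread id -> state string as reported by GDB in `mi_reply`."""
    return dict(_THREAD_STATE.findall(mi_reply))


def reconcile(assumed, mi_reply):
    """Return a copy of `assumed` in which GDB's reported states win."""
    merged = dict(assumed)
    merged.update(thread_states(mi_reply))
    return merged


if __name__ == "__main__":
    # Reply 23 from the gdb trace, shortened to the fields used here.
    reply = ('23^done,threads=[{id="2",target-id="Thread 0x7f0bc8f50710 '
             '(LWP 11980)",state="stopped",core="1"}]')
    # The front end assumed thread 2 was running: no *stopped event arrived.
    print(reconcile({"1": "stopped", "2": "running"}, reply))
```

Treating the explicit query result as authoritative keeps the event-driven path unchanged and only corrects it when the two disagree, which is the situation seen right after the attach.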
Note that we may only need to fix this for the non-stop case, since it is the only case where threads don't all have the same state. For reference, that would be in GDBRunControl_7_0_NS.
Created attachment 186869
Prototype for a new approach to attaching in non-stop mode

Instead of a fix of the current problem, I wondered if we should take a different approach altogether. In non-stop mode, GDB allows the use of an asynchronous attach:

-target-attach <pid>&

This would cause all threads to remain running when we attach to an application, which is much less intrusive than the current behavior. The application would only stop once it hits a breakpoint, which looks nicer.

This patch illustrates the result. It _must_ be run with assertions off (no -ea flag) because there is a problem with setting the first breakpoints when launching such a session. But if this approach is the way we want to go, we can fix this problem.

Sergey, what do you think of this idea? Anyone else?

P.S. This asynchronous attach is even allowed in all-stop mode if we enable target-async mode. We may want to do that if we like this approach.
(In reply to comment #7) I really like this approach. Non-stop is supposed to be non-stop, isn't it :-)?
(In reply to comment #8)
> (In reply to comment #7)
>
> I really like this approach. Non-stop is supposed to be non-stop, isn't it :-)?

Right :-)

But if you think about it, even in all-stop, when the user attaches to a process, why would they want to interrupt it immediately? Doesn't it make more sense to only stop the process once we actually reach a point of interest (a breakpoint)? But that is a separate enhancement bug.

This will actually be the approach for 'global breakpoints', where the user will be able to set a breakpoint in a piece of code, without even attaching to a process at all.
Created attachment 192429
Fix

Now that Bug 337893 is resolved, which was causing breakpoint problems in this case, we can fix the current bug.

This patch uses -target-attach <pid> & when in non-stop mode, to avoid interrupting the target when attaching. This is of course for GDB >= 7.0, since those are the only versions that support non-stop. A side-effect of this fix is that the thread states are now shown properly, which was the real issue with this bug.

We cannot use this form of -target-attach for all-stop mode because we currently don't use 'target-async on' when using all-stop, which is a prerequisite. But that is ok, because we didn't have any thread-state issues for all-stop.

There are some minor backwards-compatible API changes which I feel are worth adding to fix this.

Committed to HEAD.
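
The choice described above can be sketched as follows. This is an illustration in Python, not the code in MITargetAttach.java; the function name and its parameters are hypothetical, and only the two command forms come from this bug.

```python
def target_attach_command(pid, non_stop):
    """Build the MI attach command for `pid` (hypothetical helper)."""
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    command = "-target-attach {}".format(pid)
    # The trailing '&' requests an asynchronous attach, so the threads of
    # the target keep running. It is only used in non-stop mode here,
    # because all-stop sessions do not enable 'target-async on'.
    return command + " &" if non_stop else command


if __name__ == "__main__":
    print(target_attach_command(11979, non_stop=True))
    print(target_attach_command(11979, non_stop=False))
```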
Fixed. Sergey, can you review?
*** cdt cvs genie on behalf of mkhouzam ***
Bug 333284: Thread state is shown incorrectly after attaching to an app in non-stop mode. Use non-interrupting attach when in non-stop to fix this issue, and to improved behavior when attaching.

[*] MITargetAttach.java 1.5
http://dev.eclipse.org/viewcvs/index.cgi/org.eclipse.cdt/dsf-gdb/org.eclipse.cdt.dsf.gdb/src/org/eclipse/cdt/dsf/mi/service/command/commands/MITargetAttach.java?root=Tools_Project&r1=1.4&r2=1.5

[*] GDBProcesses_7_2.java 1.9
http://dev.eclipse.org/viewcvs/index.cgi/org.eclipse.cdt/dsf-gdb/org.eclipse.cdt.dsf.gdb/src/org/eclipse/cdt/dsf/gdb/service/GDBProcesses_7_2.java?root=Tools_Project&r1=1.8&r2=1.9

[*] GDBProcesses_7_0.java 1.45
http://dev.eclipse.org/viewcvs/index.cgi/org.eclipse.cdt/dsf-gdb/org.eclipse.cdt.dsf.gdb/src/org/eclipse/cdt/dsf/gdb/service/GDBProcesses_7_0.java?root=Tools_Project&r1=1.44&r2=1.45

[*] CommandFactory.java 1.25
http://dev.eclipse.org/viewcvs/index.cgi/org.eclipse.cdt/dsf-gdb/org.eclipse.cdt.dsf.gdb/src/org/eclipse/cdt/dsf/mi/service/command/CommandFactory.java?root=Tools_Project&r1=1.24&r2=1.25
I can't judge the code, but attaching to a multi-threaded process in non-stop mode now works flawlessly.

A few issues popped up in a subsequent debugging session. Please let me know if I should file separate bugs for them.

1. Attempts to step over function calls behaved as Step Into and triggered error messages like:
   Warning:
   Cannot insert breakpoint 0.
   Error accessing memory address 0x6ac2287: Input/output error.

2. Over time all worker threads stopped on
   (Suspended : Signal : SIGPWR:Power fail/restart)

   Suspension on SIGPWR does not happen until the first stepping action.

   Is there a way to disable thread suspension on SIGPWR?
(In reply to comment #13)
> I can't judge the code, but attaching to a multi-threaded process in non-stop
> mode now works flawlessly.

Excellent. I'm happy about the bug fix, but also about the new behavior. I think it is much nicer for non-stop to not interrupt the process.

> A few issues popped up in a subsequent debugging session. Please let me know if I
> should file separate bugs for them.
>
> 1. Attempts to step over function calls behaved as Step Into and triggered
> error messages like:
> Warning:
> Cannot insert breakpoint 0.
> Error accessing memory address 0x6ac2287: Input/output error.

Hm, setting a breakpoint when doing a step over? That may be GDB trying to set the breakpoint implicitly to step past the function. If that is the case, it would be a GDB error.

Can you file a new bug, attach the 'gdb traces' console logs and, if possible, a simple way to reproduce the problem?

> 2. Over time all worker threads stopped on
> (Suspended : Signal : SIGPWR:Power fail/restart)
>
> Suspension on SIGPWR does not happen until the first stepping action.
>
> Is there a way to disable thread suspension on SIGPWR?

I never heard of SIGPWR. Why are threads getting that signal?
(In reply to comment #14)
> I never heard of SIGPWR. Why are threads getting that signal?

This could be caused by the application shutting itself down because it was not happy with something. Is there a way to disable suspension on signals in general?
(In reply to comment #15)
> (In reply to comment #14)
> > I never heard of SIGPWR. Why are threads getting that signal?
>
> This could be caused by the application shutting itself down because it was not
> happy with something. Is there a way to disable suspension on signals in
> general?

http://sourceware.org/gdb/onlinedocs/gdb/Signals.html

(gdb) help handle
Specify how to handle a signal.
Args are signals and actions to apply to those signals.
Symbolic signals (e.g. SIGSEGV) are recommended but numeric signals from 1-15 are allowed for compatibility with old versions of GDB.
Numeric ranges may be specified with the form LOW-HIGH (e.g. 1-5).
The special arg "all" is recognized to mean all signals except those used by the debugger, typically SIGTRAP and SIGINT.
Recognized actions include "stop", "nostop", "print", "noprint", "pass", "nopass", "ignore", or "noignore".
Stop means reenter debugger if this signal happens (implies print).
Print means print a message if this signal happens.
Pass means let program see this signal; otherwise program doesn't know.
Ignore is a synonym for nopass and noignore is a synonym for pass.
Pass and Stop may be combined.

Note that this is not supported in DSF-GDB, but you can type it in the gdb console. It could be an enhancement request.
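
As an illustration of the help text above, here is a small Python sketch that composes a handle command from the listed actions. The helper is hypothetical and is not part of DSF-GDB or GDB; it only builds the text one would type in the gdb console, for example to let SIGPWR through without suspending the thread.

```python
# Actions listed by "help handle" in the comment above.
_ACTIONS = frozenset(
    ["stop", "nostop", "print", "noprint", "pass", "nopass", "ignore", "noignore"]
)


def handle_command(signal, *actions):
    """Return the gdb console command applying `actions` to `signal`."""
    if not actions:
        raise ValueError("at least one action is required")
    unknown = [a for a in actions if a not in _ACTIONS]
    if unknown:
        raise ValueError("unknown handle action(s): " + ", ".join(unknown))
    return " ".join(["handle", signal] + list(actions))


if __name__ == "__main__":
    # Do not stop or print on SIGPWR, but still deliver it to the program.
    print(handle_command("SIGPWR", "nostop", "noprint", "pass"))
```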
(In reply to comment #13)
> 1. Attempts to step over function calls behaved as Step Into and triggered
> error messages like:
> Warning:
> Cannot insert breakpoint 0.
> Error accessing memory address 0x6ac2287: Input/output error.

I found a minor problem in my fix to Bug 337893 which I committed a change for just now. I don't know if it might be the cause of the error you saw or not, but you may want to try it out again after updating your DSF-GDB code.
(In reply to comment #9)
> (In reply to comment #8)
> > (In reply to comment #7)
> >
> > I really like this approach. Non-stop is supposed to be non-stop, isn't it :-)?
>
> Right :-)
>
> But if you think about it, even in all-stop, when the user attaches to a
> process, why would they want to interrupt it immediately? Doesn't it make more
> sense to only stop the process once we actually reach a point of interest (a
> breakpoint)? But that is a separate enhancement bug.
>
> This will actually be the approach for 'global breakpoints', where the user
> will be able to set a breakpoint in a piece of code, without even attaching to
> a process at all.

I would love to see such a feature, where even in all-stop mode attaching to a process does not interrupt any of its running threads until one of them hits a breakpoint.

Is this feature/enhancement planned or already covered?

Many thanks,
Tim Jiang
(In reply to comment #18)
> I would love to see such a feature, where even in all-stop mode attaching to a
> process does not interrupt any of its running threads until one of them hits a
> breakpoint.
>
> Is this feature/enhancement planned or already covered?

In Eclipse, we could automatically resume all threads after attaching, which would give the user the impression that nothing stopped. I personally have no plans to work on that since I am waiting for GDB's global breakpoints:
http://sourceware.org/ml/gdb-patches/2011-06/msg00163.html

But if someone else wants to contribute this feature, that would be fine.
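
A rough sketch of the "resume right after attaching" idea for all-stop mode, in Python. The helper is hypothetical and not a proposed CDT patch: it simply lists a synchronous -target-attach, which stops the process, followed by the standard MI -exec-continue command to resume it. Whether DSF-GDB would issue the commands exactly this way is an assumption.

```python
def attach_then_resume_commands(pid):
    """MI commands for an all-stop attach that resumes immediately.

    Hypothetical helper: the caller would send each command in order and
    wait for its ^done / ^running result before sending the next one.
    """
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    return [
        "-target-attach {}".format(pid),  # synchronous attach stops the process
        "-exec-continue",                 # resume so the user sees no lasting stop
    ]


if __name__ == "__main__":
    for command in attach_then_resume_commands(11979):
        print(command)
```

The process is still interrupted briefly between the two commands, so this only gives the impression of a non-interrupting attach, as the comment above notes.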