Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 302927

Summary: RxThread leaks memory in oobList
Product: [Tools] CDT Reporter: James Blackburn <jamesblackburn+eclipse>
Component: cdt-debug-cdiAssignee: James Blackburn <jamesblackburn+eclipse>
Status: RESOLVED FIXED QA Contact: John Cortell <john.cortell>
Severity: normal    
Priority: P3 CC: marc.khouzam, nobody, pawel.1.piech
Version: 7.0Flags: jamesblackburn+eclipse: review? (nobody)
Target Milestone: 7.0   
Hardware: PC   
OS: All   
Whiteboard:
Attachments:
Description Flags
patch 1
jamesblackburn+eclipse: iplog-
DSF version of the patch.
none
patch 2 jamesblackburn+eclipse: iplog-

Description James Blackburn CLA 2010-02-16 04:43:02 EST
With the fix for bug 119370 the MI target output is added to the oobList for later processing by CDI (to handle in-band monitor responses). Unfortunately, in the interim, this leaks memory.

We have some long running tests which are verbose in their printed output. When launched in Eclipse and left running the session dies with OutOfMemory.
Comment 1 James Blackburn CLA 2010-02-16 05:45:11 EST
Created attachment 159162 [details]
patch 1

Simple patch which clears the oobList if we're not actually waiting for a response from a command (rxQueue.isEmpty()) (and the output wasn't parsed as a MIResult in the first place).
Comment 2 Pawel Piech CLA 2010-02-16 14:09:59 EST
Created attachment 159230 [details]
DSF version of the patch.
Comment 3 Pawel Piech CLA 2010-02-16 14:13:31 EST
I attached what I think is a DSF-equivalent of the fix.  Marc, could you have a look at it?
Comment 4 Marc Khouzam CLA 2010-02-16 16:05:13 EST
(In reply to comment #3)
> I attached what I think is a DSF-equivalent of the fix.  Marc, could you have a
> look at it?

Thanks Pawel for paying attention to this.
I think the fix is fine.  In fact, I always thought it was weird that we would accumulate all the OOB records, even the ones that did not belong to a CLI command.  It's nice to have this fixed.

Thanks to James also for the original fix.

I committed the DSF-GDB fix to HEAD. (I don't really think it is necessary for 6.0)
Comment 5 James Blackburn CLA 2010-02-17 02:41:38 EST
I've committed the CDI fix to HEAD.  

Thanks Pawel and Marc for fixing this in DSF before anyone has hit it!
Comment 6 Marc Khouzam CLA 2010-02-18 10:40:10 EST
(In reply to comment #5)
> I've committed the CDI fix to HEAD.  
> 
> Thanks Pawel and Marc for fixing this in DSF before anyone has hit it!

Since this was committed to CDI I'm seeing it always stop on Library loaded events.  I'll try to find the problem
Comment 7 Marc Khouzam CLA 2010-02-18 10:58:16 EST
It seems that the oobList is not only needed for CLI commands that are received, but also for out-of-band events.

For example, GDB will give the following:

[1,266,508,228,615] ~"Stopped due to shared library event\n"
[1,266,508,272,158] 111*stopped,thread-id="0"

and since the stopped event does not have a 'reason' field, CDI uses the oobList to look for the reason in the printout above.  Now that this bug clears those printouts, we don't know it is a shared library stop and we don't automatically resume.

I'm not sure how to fix this because I'm not familiar with James's usecase.  I will comment out the fix temporarily because I believe James is away for a couple of days.

Re-opening the bug.
Comment 8 James Blackburn CLA 2010-02-22 05:05:23 EST
(In reply to comment #7)
> I'm not sure how to fix this because I'm not familiar with James's usecase.  

Hmm... that's unfortunate, good catch Marc. The problem we've got is that we're leaving long running launches going, and this leak uses megabytes of memory per minute.

How much context do we need? Perhaps an alternative fix would be to keep the last 2(?) items of context when we're not waiting for an oob reply?
Comment 9 James Blackburn CLA 2010-02-22 05:15:17 EST
Created attachment 159758 [details]
patch 2

Attempt 2.

This keeps the last 4 oobList entries around; does this fix the shared library issue for you Marc?
Comment 10 Marc Khouzam CLA 2010-02-22 09:36:54 EST
(In reply to comment #9)
> Created an attachment (id=159758) [details]
> patch 2
> 
> Attempt 2.
> 
> This keeps the last 4 oobList entries around; does this fix the shared library
> issue for you Marc?

Yes, things work now.  It may be a bit of a risk to only keep 4 records.  I'd suggest keeping 10 or 20, just to be safe.  As it used to keep all of them, it is still better, and it will still fix your out-of-memory problem.  What do you think?

Also, if you could add a couple of comments to explain the problem or at least point to this bug.

Thanks
Comment 11 Marc Khouzam CLA 2010-02-22 09:37:41 EST
Oh, and we don't have the same problem in DSF-GDB because we don't stop on share-lib events.  But when we will (soon), we may have to pay attention to this.
Comment 12 James Blackburn CLA 2010-02-22 09:42:44 EST
(In reply to comment #10)
> Yes, things work now.  It may be a bit of a risk to only keep 4 records.  I'd
> suggest keeping 10 or 20, just to be safe.  

Sounds good. I've made this change and added a comment pointing at the bug. Resolving...
Comment 13 Nobody - feel free to take it CLA 2010-02-22 10:36:39 EST
To be honest I am not very happy with this solution.(In reply to comment #11)
> Oh, and we don't have the same problem in DSF-GDB because we don't stop on
> share-lib events.  But when we will (soon), we may have to pay attention to
> this.
You can use notify-async-output records instead. I can see that GDB 6.8 supports "library-loaded" events. It didn't work for GDB 5.2 :(
Comment 14 Nobody - feel free to take it CLA 2010-02-22 10:57:42 EST
I have been away last week and a bit rusty at the moment, but why we can't cleanup the oobList everytime the session stops?
Comment 15 Marc Khouzam CLA 2010-02-22 10:58:48 EST
(In reply to comment #13)
> To be honest I am not very happy with this solution.(In reply to comment #11)
> > Oh, and we don't have the same problem in DSF-GDB because we don't stop on
> > share-lib events.  But when we will (soon), we may have to pay attention to
> > this.
> You can use notify-async-output records instead. I can see that GDB 6.8
> supports "library-loaded" events. It didn't work for GDB 5.2 :(

I believe those were only introduced in GDB 7.0.  Are you sure you see it in 6.8?
Comment 16 James Blackburn CLA 2010-02-22 11:04:34 EST
(In reply to comment #14)
> I have been away last week and a bit rusty at the moment, but why we can't
> cleanup the oobList everytime the session stops?

Because the session isn't stopping.  For example we reuse GDB + built-in simulator for the backend of our Run mode launches. If we leave a run going overnight, in the morning 700Mb of heap's disappeared and Eclipse has crashed with OOM.
Comment 17 Nobody - feel free to take it CLA 2010-02-22 11:55:46 EST
(In reply to comment #15)
> I believe those were only introduced in GDB 7.0.  Are you sure you see it in
> 6.8?
Yes, you're right. I have been using CodeSourcery gdb 6.8 which apparently has these features.
Comment 18 Nobody - feel free to take it CLA 2010-02-22 12:02:03 EST
(In reply to comment #16)
> Because the session isn't stopping.  For example we reuse GDB + built-in
> simulator for the backend of our Run mode launches. If we leave a run going
> overnight, in the morning 700Mb of heap's disappeared and Eclipse has crashed
> with OOM.
Yes, the solution for https://bugs.eclipse.org/bugs/show_bug.cgi?id=119370 wasn't good. Does it make sense to have an attribute for the size of oobList? At least the clients can control it. Your simulator is a good example.