Bug 332949 - File being edited seems to be stored 8 times in memory
Summary: File being edited seems to be stored 8 times in memory
Status: NEW
Alias: None
Product: TMF
Classification: Modeling
Component: Xtext Backlog
Version: 2.0.0
Hardware: PC Linux
Importance: P3 normal
Target Milestone: ---
Assignee: Project Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-12-20 09:47 EST by Mark Christiaens CLA
Modified: 2012-11-19 08:44 EST
CC List: 5 users

See Also:


Attachments
Dump of profiling results (190.67 KB, text/html)
2010-12-20 09:49 EST, Mark Christiaens CLA
Dump of profiling results (442.23 KB, text/html)
2010-12-20 09:50 EST, Mark Christiaens CLA

Description Mark Christiaens CLA 2010-12-20 09:47:59 EST
Build Identifier: I20101028-1441

I've opened a moderately sized file (114561 B) and, using my profiler, I see that there are 8 copies of this file in memory.  In the attachment (DuplicateChars.html) you can see the paths from the GC roots to the character arrays.  I know that Strings are often backed by the same array, but I suspect that this is not the case here, since the profiler does seem to detect such sharing (see the reference from the "divmod" and "numeric_std" Strings).

I'm testing with files up to 2 MB.  That would take 4 MB to store one copy (chars are 2 B each).  That times 8 is 32 MB.  Quite a lot.
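
The estimate above can be reproduced with a small sketch. It assumes one char per byte of file content and ignores object and array headers; the class and method names are made up for illustration and are not part of Xtext.

```java
class CopyFootprint {
    // Java chars are UTF-16 code units, 2 bytes each.
    static final int BYTES_PER_CHAR = 2;

    /**
     * Rough heap estimate for `copies` full copies of a text with
     * `charCount` characters. Headers and padding are ignored.
     */
    static long estimateBytes(long charCount, int copies) {
        return charCount * BYTES_PER_CHAR * copies;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long chars = 2L * mb; // 2 MB file, assuming one char per byte

        System.out.println(estimateBytes(chars, 1) / mb + " MB per copy");
        System.out.println(estimateBytes(chars, 8) / mb + " MB for 8 copies");
    }
}
```

This prints 4 MB per copy and 32 MB for 8 copies, matching the figures above.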

In the same vein, I see 16 large ConcurrentHashMaps (CHMHashEntry.html) that are identical according to the profiler.  I don't yet understand what they are for, but to me they are smelly.  You may want to take a look at those.

Reproducible: Always
Comment 1 Mark Christiaens CLA 2010-12-20 09:49:44 EST
Created attachment 185545 [details]
Dump of profiling results

Note that I did remove part of the string representing the file content.  If not, this HTML would have been over a MB.
Comment 2 Mark Christiaens CLA 2010-12-20 09:50:46 EST
Created attachment 185546 [details]
Dump of profiling results
Comment 3 Samantha Chan CLA 2011-02-10 19:00:34 EST
I am seeing the same thing with Xtext 1.0.2.
Comment 4 Sebastian Zarnekow CLA 2011-04-19 10:46:24 EDT
There is not much that can be done about this one with reasonable effort.
A freshly opened editor causes the complete string to be held 3 times in memory.
The three holders are the dirty state manager, which refers to the content of the resource; the resource's parse result; and the last document event that the text viewer refers to.

A second editor with another resource that has a cross link to the first one will cause the content of the first one to be copied one more time.

The document itself will hold references to a number of substrings for each line of the input.

I could implement something (modifying existing APIs) which would save exactly one of the copies of the entire input. If the fact that the string is stored 4 times is not a show stopper on your side, I'm inclined to postpone this ticket. Please let me know if that is not an option for your use case.
Comment 5 Mark Christiaens CLA 2011-04-19 10:57:09 EDT
I think that the duplicate files are not yet a show stopper.  Removing 1 of X duplicates is probably not useful anyway so I'm fine with postponing.