| Summary: | [scalability] Parser needs a lot of memory parsing large initializer expressions | ||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | [Tools] CDT | Reporter: | Anton Leherbauer <aleherb+eclipse> | ||||||||||||||||
| Component: | cdt-parser | Assignee: | Markus Schorn <mschorn.eclipse> | ||||||||||||||||
| Status: | RESOLVED FIXED | QA Contact: | |||||||||||||||||
| Severity: | normal | ||||||||||||||||||
| Priority: | P3 | CC: | cdtdoug, eclipse.sprigogin, filip.gustafsson, mikekucera, wbprio | ||||||||||||||||
| Version: | 5.0 | Flags: | cdtdoug:
iplog-
|
||||||||||||||||
| Target Milestone: | 5.0.2 | ||||||||||||||||||
| Hardware: | PC | ||||||||||||||||||
| OS: | Windows XP | ||||||||||||||||||
| Whiteboard: | |||||||||||||||||||
| Attachments: |
|
||||||||||||||||||
Created attachment 116932 [details]
Sample file
The problem is two-fold: * the parser keeps a reference to the first token in order to backtrack, if the initializer cannot be parsed, therefore all the tokens need to be held in memory at once. * The ast contains a lot of literal expressions. The indexer does not need to look at these expressions, we need an option for the ast creation that allows skipping trivial expressions in compound-initializers. Created attachment 117176 [details]
proposed fix
The fix consists of multiple changes:
(1) parsers no longer hold on to tokens when parsing aggregate initializers
(2) an additional option to skip literal expressions in aggregate initializers,
option is used by ASTProvider, CModelBuilder2 and indexer.
(3) changed ASTProvider such that it does not attempt to provided ast for
closed translation units, such that we don't generate AST in scalability
mode. New name for flag: WAIT_YES -> WAIT_IF_OPEN
(4) Reviewed and reduced the usage of WAIT_IF_OPEN flag, such that attempts to
generate AST for non-active editors are minimized.
Especially for the changes related to (3) and (4) it would be good to have
my changes reviewed:
2: CModelBuilder2, ASTCache
3: ASTProvider
4: CElementHyperlinkDetector, InactiveCodeHighliting,
SemanticHighlightingReconciler, CorrectionCommandHandler,
QuickAssistLightBulbUpdater, LinkedNamesAssistProposal
Some UI tests fail because the AST is requested when the working copy is not yet open, e.g. the HyperlinkTests. A simple fix for the tests is to make sure the reconciler has run: EditorTestHelper.joinReconciler(EditorTestHelper.getSourceViewer(editor), 100, 500, 10); Otherwise the changes look good. Created attachment 117193 [details]
the fix
Thanks Toni, I have updated the testcases to wait for the reconciler.
Created attachment 117194 [details]
the fix
patch with minor cleanup work, such as updated copy right headers
Created AST2Tests.testScalabilityOfLargeTrivialInitializer_Bug252970(), Fixed in 5.0.2 > 20081106. Created attachment 117293 [details]
follow up on the fix
I must avoid using a visitor on the ast before ambiguity resolution.
Created attachment 117297 [details]
further optimization for CDT 6.0
I improved interface + implementation of IASTLiteralExpression to avoid creation of superfluous string-objects. This patch cannot be applied to 5.0.x
Created attachment 117439 [details]
follow up: fixes detection of names in designators
*** Bug 258944 has been marked as a duplicate of this bug. *** (In reply to comment #7) > Created AST2Tests.testScalabilityOfLargeTrivialInitializer_Bug252970(), That's the wrong bug number. Fixed in Head. testScalabilityOfLargeTrivialInitializer_Bug253690 consistently fails in HEAD. On Windows with JRE 1.6.0-11 it gives 3254464 expected < 3120512, on Linux with JRE 1.6.0-1 - 3120552 expected < 3120512 (In reply to comment #13) > testScalabilityOfLargeTrivialInitializer_Bug253690 consistently fails in HEAD. > On Windows with JRE 1.6.0-11 it gives 3254464 expected < 3120512, on Linux with > JRE 1.6.0-1 - 3120552 expected < 3120512 Thanks, I use a weaker assertion now. Sorry for the delay but I have now tested this in CDT 6.0.0 and it works fine. |
Parsing the following sample char large_array[] = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, [repeat 100000 times] }; requires more than 256M of heap space.