Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 106279

Summary: DOM Parser parses parenthesized expressions incorrectly
Product: [Tools] CDT Reporter: mcabusao
Component: cdt-parserAssignee: Project Inbox <cdt-core-inbox>
Status: RESOLVED DUPLICATE QA Contact:
Severity: normal    
Priority: P3 CC: brownjasonj, john.camelon
Version: 3.0   
Target Milestone: ---   
Hardware: PC   
OS: Windows XP   
Whiteboard:
Attachments:
Description Flags
patch to fix incorrect identification of cast none

Description mcabusao CLA 2005-08-06 10:35:39 EDT
Hello,
There's a bug in the DOM parser that parses a parenthesized expression as a 
cast expression.  The following code shows a simple example:

<code>
int foux=0, bhar=0;
foux = (foux) - bhar;
</code>

The parenthesized "foux" gets parsed as an IASTCastExpression rather than an 
IASTUnaryExpression, and thus creates a ProblemBinding for the IASTName 
reference.
Comment 1 Jason Brown CLA 2006-10-02 03:34:25 EDT
Created attachment 51238 [details]
patch to fix incorrect identification of cast
Comment 2 Jason Brown CLA 2006-10-02 03:39:32 EDT
the patch i attached is for the latest head.  if someone wants the patch for 3.0 also I could provide it.
Comment 3 Andrew Niefer CLA 2006-10-02 11:40:58 EDT
Comment on attachment 51238 [details]
patch to fix incorrect identification of cast

After a quick look at this patch, I have to vote -1 :
1) the .qualifier needs to stay on the Bundle-Version in the manifest.  At most, the version would change to 3.1.1.qualifier
2) This TypeScopeStack would not work for C++.  This patch only applies to C and will break C++.

There already exists an ambiguity resolution mechanism that should be used here.  See IASTAmbiguityParent and CAmbiguity/CPPAmbiguity.  John would know the details of how to use this better than I.

See also bug 100641 which is the same kind of problem.
Comment 4 Doug Schaefer CLA 2006-10-02 12:00:31 EDT
I agree with Andrew. Without looking up what foux is, this expression is ambiguous. And, yes, John C is the only one who knows about this stuff.
Comment 5 Jason Brown CLA 2006-10-02 16:00:45 EDT
Is John C still working on the project.  the last time I  spoke to him he wasn't.  i'm certainly willing to learn here.

so how does one disambiguate in this case and bug 100641?  Should these two bugs be changed to be closed or not bugs?

alternatively, why does the stack not work in c++?  i've not used it in vain there.
Comment 6 Doug Schaefer CLA 2006-10-02 16:51:01 EDT
John chips in once in a while but not very regulary. Unfortulately you may have to learn directly from the source code. :(
Comment 7 Jason Brown CLA 2006-10-02 17:20:22 EDT
i took a look at IASTAmbiguityParent which just enables one to substitute AST nodes.  The problem with trying to fix this once the AST has been constructed is that it could require a lot of rewriting of the AST with re-intepretation. The example given requires the cast expression (foux) to be changed to a unary bracket expression and the unary '- bhar' to a binary expression!  This is a simple case.

Why is ambiguity easier to disambiguate after AST construction?  Or is this a question of complexity?

The CAmbiguity/CPPAmbiguity are not used in the case of cast, so the parser does not detect the ambiguity itself.  The parser interperets (foux) as a cast.

Would a version of the patch I posted that supports c++ be acceptable?
Comment 8 Doug Schaefer CLA 2006-10-02 21:22:49 EDT
Ambiguity nodes were created because we wanted to minimize the processing required during AST construction. Parsing has historically been too slow, and it still is. To speed things up, we moved all semantic processing to post parse time, and then only when necessary. If no one cares about the details, the ambiguity nodes do not need to be resolved.

Ambiguity nodes are hard to construct and resolve but they do save us in the end.
Comment 9 Andrew Niefer CLA 2006-10-03 11:06:36 EDT
> Would a version of the patch I posted that supports c++ be acceptable?
This strategy would not work for C++.  Scopes don't stack in the same way, you need full semantic resolution for the relevant nodes.  To get it right you would essentially be duplicating a lot of the binding resolution and merging it with the syntax parse.  We split semantic resolution out for performance and complexity reasons, this is the biggest difference between the old parser and the DOM parser.

> Why is ambiguity easier to disambiguate after AST construction?
See for example bug 90606, sometimes the required information appears later in the parse.

>The problem with trying to fix this once the AST has been constructed
is that it could require a lot of rewriting of the AST with re-intepretation.
I'm not sure you got this right.  We parse both possibilities and create full AST subtrees for each.  Any complexities with choosing the right subtree and replacing the ambiguity have already been solved.  Step through, for example, AST2CPPTests.testAmbiguity(), to get an idea. (or try the typeid vs unary expression from 90606).
Comment 10 Jason Brown CLA 2006-10-06 06:05:36 EDT
ok, i understand that this solution won't ever work for C++.

so, could you confirm for the follow (based on the example code)

1. an ambiguity node needs to be constructed for the cast (foux)
2. an ambiguity node is not currently constructed for the cast (in the example code)

if this is the case, what is still unclear to me is how to detect the ambiguity? it doesn't seem reasonable to create an ambiguity node for every cast.

i am very happy to resolve this problem, but the resolution is still unclear to me. i also haven't got a response back from JohnC yet.
Comment 11 Doug Schaefer CLA 2007-08-21 11:00:30 EDT
Future means you commit to fix it in the Future. Inboxes can't make committments. Moving to '--'.
Comment 12 Markus Schorn CLA 2008-02-13 08:36:26 EST
Created testcase AST2Tests.testBug100641_106279_castAmbiguity().
Fixed with bug 168924.

*** This bug has been marked as a duplicate of bug 168924 ***