Bug 288601 - [parser] Lexer grammar parses Integer Dot incorrectly
Status: CLOSED FIXED
Alias: None
Product: OCL
Classification: Modeling
Component: Core
Version: 1.3.0
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: ---
Assignee: OCL Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-09-04 03:13 EDT by Ed Willink CLA
Modified: 2011-05-27 02:47 EDT

See Also:


Attachments
Elimination of parser/lexer workarounds (6.97 KB, patch)
2009-09-04 03:14 EDT, Ed Willink CLA

Description Ed Willink CLA 2009-09-04 03:13:33 EDT
The attached patch fixes a problem with lexing Integer Dot (and also eliminates the unused octal and hex digit definitions).

The problem is that

"1. "

gets two thirds of the way through "Integer Dot Integer" as a Decimal but cannot complete the match, so the lexer has no token to return and returns a token of kind 0.

This was worked around by a special INTEGER_RANGE_START for Integer.. and various NUMERIC_OPERATION workarounds for Integer.-> etc.

The attached patch ensures that Integer Dot produces the INTEGER_LITERAL, DOT token sequence and so removes all the parser grammar workarounds.

[It is now possible to remove start/endOffset since there are no tokens that do not represent sensible ranges.]
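
For readers unfamiliar with the issue, the intended behaviour can be sketched outside LPG. The following is a minimal illustration only, not the patch itself: it assumes a regex-based tokenizer, and every token kind other than INTEGER_LITERAL and DOT (REAL_LITERAL, DOT_DOT, ARROW, IDENTIFIER, and so on) is a hypothetical name chosen for the example. The point it shows is that the real-literal rule must match all three parts or nothing, so that an incomplete "Integer Dot" falls back to INTEGER_LITERAL followed by DOT.

```python
import re

# Hypothetical token kinds; only INTEGER_LITERAL and DOT are named in this report.
TOKEN_SPEC = [
    ("REAL_LITERAL", r"\d+\.\d+"),    # Integer Dot Integer: all three parts required
    ("INTEGER_LITERAL", r"\d+"),      # fallback when no digits follow the dot
    ("DOT_DOT", r"\.\."),
    ("ARROW", r"->"),
    ("DOT", r"\."),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("WS", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return (kind, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"no token at offset {pos}")
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("1. "))
print(tokenize("1..5"))
print(tokenize("1. oclIsTypeOf(Integer)"))
```

Because the alternatives are tried in order and the regex engine backtracks out of a failed REAL_LITERAL attempt, no special range-start or numeric-operation token is needed in this sketch.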
Comment 1 Ed Willink CLA 2009-09-04 03:14:48 EDT
Created attachment 146476
Elimination of parser/lexer workarounds

Try again.
Comment 2 Ed Willink CLA 2009-09-04 03:29:53 EDT
AbstractOCLParser.createRangeStart should also be deleted.
Comment 3 Ed Willink CLA 2009-10-07 16:18:24 EDT
Ping.

This has been waiting for review for over a month.

It is a very simple grammar misunderstanding cleanup. It still applies successfully to HEAD.
Comment 4 Adolfo Sanchez-Barbudo Herrera CLA 2009-10-09 05:42:11 EDT
+1.

Trivial: Is there any real need to create a dotToken/dotDotToken? If so, why not exploit it in the rest of the lexer grammar?

Cheers,
Adolfo.
Comment 5 Alexander Igdalov CLA 2009-10-09 05:56:01 EDT
Ed, your patch is extremely important. Not only does it remove the ugly workarounds but it seems to solve the problem below as well (I didn't check though):

`1. oclIsTypeOf(Integer)` couldn't be parsed since there has been no workaround for numeric operations starting with whitespace.

We seem to have had a problem no one has noticed before. E.g. for any other type:
`true.oclIsTypeOf(Boolean)` and `true. oclIsTypeOf(Boolean)` were both parsed successfully.
But though `1.oclIsTypeOf(Integer)` could be parsed well, its variant with a whitespace before the operation name produced a lexer error.
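
The failure mode can be reproduced with a small hand-written scanner. This is a hypothetical sketch, not the LPG-generated lexer: the function name `scan_number_committed` and the REAL_LITERAL kind are assumptions made for illustration. It commits to the Decimal rule as soon as it sees the dot and does not remember that the integer prefix was already a valid token, so an input with no digits after the dot yields kind 0.

```python
def scan_number_committed(text, pos=0):
    """Scan Integer [Dot Integer] without remembering the last accepting state.

    Returns (kind, lexeme); kind 0 means no usable token was produced.
    """
    start = pos
    while pos < len(text) and text[pos].isdigit():
        pos += 1
    if pos == start:
        return (0, "")  # input does not start with a digit
    if pos < len(text) and text[pos] == ".":
        pos += 1  # commit to the Decimal rule
        fraction_start = pos
        while pos < len(text) and text[pos].isdigit():
            pos += 1
        if pos == fraction_start:
            # Decimal is incomplete and the integer prefix was not kept as a fallback
            return (0, text[start:pos])
        return ("REAL_LITERAL", text[start:pos])
    return ("INTEGER_LITERAL", text[start:pos])


print(scan_number_committed("1. oclIsTypeOf(Integer)"))
print(scan_number_committed("1.5"))
```

In this sketch the no-whitespace form fails in the same way; in the real grammar that case was covered only by the dedicated workarounds, which is why the whitespace variant went unnoticed.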

My +1.
Comment 6 Ed Willink CLA 2009-10-10 07:58:06 EDT
Changes committed to HEAD.

Re: #4 Trivial: Is there any real need to create a dotToken/dotDotToken? If so, why not exploit it in the rest of the lexer grammar?

There is no dotDotToken; DotDotToken is used as in

	Token ::= IntegerLiteral DotDotToken
		/.$NoAction
		./

The location of the makeAction is slightly unusual because lexer actions cannot produce two tokens from one reduction. Once it is moved, other reductions get difficult, which perhaps explains why the original author resorted to workarounds rather than solving the true problem.
Comment 7 Ed Willink CLA 2009-10-16 03:51:05 EDT
Resolved
Comment 8 Ed Willink CLA 2011-05-27 02:47:54 EDT
Closing after over 18 months in resolved state.