Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.
Bug 331931 - [formatter/HiddenTokenRules] BaseFormatter assumes WS rule exists and is terminal
Summary: [formatter/HiddenTokenRules] BaseFormatter assumes WS rule exists and is term...
Status: CLOSED DUPLICATE of bug 335965
Alias: None
Product: TMF
Classification: Modeling
Component: Xtext (show other bugs)
Version: 1.0.1   Edit
Hardware: All All
: P3 normal (vote)
Target Milestone: ---   Edit
Assignee: Project Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-12-06 13:23 EST by Henrik Lindberg CLA
Modified: 2011-03-14 19:01 EDT (History)
1 user (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Henrik Lindberg CLA 2010-12-06 13:23:33 EST
The BaseFormatter assumes that if WS rule exists that it is a TerminalRule. 
In my case, WS is not a terminal rule and the result is a class cast exception (line 26).

The code has a comment that the name of the WS rule should be configurable.
My workaround was to rename my (non terminal WS) rule. It seems to work fine without a WS rule at all.
Comment 1 Moritz Eysholdt CLA 2011-01-11 05:26:21 EST
This can be configured by overriding the method getWSRule(). Overriding should be easy, since BaseFormatter is the super class of AbstractDeclarativeFormatter, which again is the base class for every language's formatter. I don't think we need a more prominent place for configuration. And there is no way that I know of to automatically determine the grammar's rule for whitespace.

However, we can
- remove the "promising" TODO comment ;)
- raise a more meaningful error message in case something goes wrong.
Comment 2 Henrik Lindberg CLA 2011-01-11 09:06:05 EST
Overriding does not really work since it is assumed that it is a TerminalRule that is returned. In the grammar in question WS and COMMENTS can not be terminal due to overlap.

I have (after much trial and error) come up with a "hack" around this. I configured the hidden token helper to report my non terminal WS/COMMENT rule as being "whitespace" (needed because I could not return it from getWSRule() and if not having this at all, the configurable formatter would not insert spaces (understandable since it does not know which rule to use). With this configuration I also needed to make sure that whitespace gets processed correctly - to do that, I did an override of the formatter's token stream checking in the writeSemantic method, if it got my semantic WS/Comment rule, and if so, split it up into WS and COMMENT tokens and instead calling writeHidden. 

At first I tried defining formatting rules for my WS/COMMENTS but that created a formatter that was >20x slower.

What I think I am asking for is an easier (and supported) way of dealing with a grammar where WS and COMMENTS are semantic (non hidden) while still being able to use the declarative formatter.

I can imagine that support for fragment terminals in 2.0 could be used if it is allowed to set fragments as hidden in the grammar.

Here is the relevant part of my grammar (showing WS, COMMENTs and use of OWS (optional whitespace/comments) in an expression. All single character rules (e.g. SP, CR, HASH, ASTERISK...) are terminals, and various character classes like SOURCE_CHARACTER_EXCEPT_xxx are data type rules built up from single character terminals, and character ranges for the parts where there are no "syntactic characters".

OWS 	: WHITE* ;
WHITE 	: ( WSR | ML_COMMENT | SL_COMMENT) ;
WSR		: WS_CHAR+ ;
WS_CHAR : ( SP | CR | NL | TAB ) ;

ML_COMMENT 
	: SLASH ASTERISK SLASH*
	  ( SOURCE_CHARACTER_EXCEPT__ASTERISK_SLASH 
	  | SOURCE_CHARACTER_EXCEPT__ASTERISK  SLASH 
	  | ASTERISK SOURCE_CHARACTER_EXCEPT__SLASH)*
	  ASTERISK+ SLASH
	;
	
SL_COMMENT 
	: HASH 
		( ANY_DIGIT
		| ANY_LETTER
		| PUNCTUATION
		| SP | TAB
		)*
	  CR? NL 	
	;


AdditiveOperator : PLUS | MINUS ;	
AdditiveExpression returns pp::Expression:
	MultiplicativeExpression ({pp::AdditiveExpression.leftExpr=current} 
		opName = AdditiveOperator OWS
		rightExpr = MultiplicativeExpression)*
	;
Comment 3 Henrik Lindberg CLA 2011-03-14 18:45:59 EDT
I am fine with this issue being closed as a won't fix, as the general approach I was taking was a big workaround and the problems described in this issue was only the tip of an iceberg of other issues anyway.
Comment 4 Henrik Lindberg CLA 2011-03-14 19:01:24 EDT
Bug 335965 pretty much states what would be needed to deal with the real issue. I am closing this issue as a duplicate even if that issue is newer as it states more clearly what would be a useful feature (than the workaround that I tried in this issue).

*** This bug has been marked as a duplicate of bug 335965 ***