| Summary: | support terminal fragments | ||
|---|---|---|---|
| Product: | [Modeling] TMF | Reporter: | Henrik Lindberg <henrik.lindberg> |
| Component: | Xtext | Assignee: | Project Inbox <tmf.xtext-inbox> |
| Status: | CLOSED DUPLICATE | QA Contact: | |
| Severity: | enhancement | ||
| Priority: | P3 | CC: | sebastian.zarnekow |
| Version: | 2.0.0 | ||
| Target Milestone: | --- | ||
| Hardware: | All | ||
| OS: | All | ||
| Whiteboard: | |||
Terminal fragments are available since 2.0M4. Please reopen if I missed your point.

*** This bug has been marked as a duplicate of bug 327002 ***

Great! This was a miscommunication in the forum. I misread the comment about "not being planned" and thought it referred to terminal fragments. Reading it again, I see it was about "lexer predicates". Sorry for the noise.

From the forum:

----

Henrik, Marc,

they are currently not planned since we never faced the need for them. Please file a feature request with a sample terminal rule that cannot be expressed with the currently available abstractions.

Thanks,
Sebastian

-- 
Need professional support for Eclipse Modeling?
Go visit: http://xtext.itemis.com

On 20.01.11 23:25, Henrik Lindberg wrote:
> I forgot to ask earlier, but I think lexer fragments are also in 2.0,
> and I assume that those can be used with semantic predicates (like any
> rule call) - if that is the case then a data rule with semantic
> predicate and a lexer fragment would probably work.
>
> - henrik
>
> On 1/20/11 9:51 PM, Mark Christiaens wrote:
>> Are predicates for the lexer planned? I have a lexer rule that could use
>> it.
Support for ANTLR's terminal fragments would be very useful for dealing with grammars whose lexical elements are not context free. It is my understanding that a terminal fragment would result in a more efficient implementation.

As an example, here is a typical 'hash' single-line comment as a terminal or terminal fragment:

    terminal SL_COMMENT : '#' .* ('\r'? '\n') ;

The same as a data rule:

    SL_COMMENT
        : HASH ( ANY_DIGIT | ANY_LETTER | PUNCTUATION | SP | TAB | NBSP )*
          CR? // CR optional depending on file format
          NL
        ;

where ANY_DIGIT, ANY_LETTER, PUNCTUATION etc. list all possible characters.

Strings, and strings with complex escapes and expression interpolation, become quite complex to handle because there is no ability to use NOT ('.' and ranges) to efficiently "slurp" characters. Characters therefore have to be organized in various groups that include or exclude the 'terminating' characters as appropriate. The result is quite inefficient, as the parser needs to deal with character tokens individually.

I may have misunderstood how terminal fragments can be used, but to me they look like "getting a token on demand from the lexer" and seem ideal for solving the issue.
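
For reference, a minimal sketch of how the comment and string cases above might look with the terminal fragments mentioned in the resolution, assuming Xtext 2.0 or later. The rule names `STRING` and `ESCAPE` and the set of escape characters are illustrative assumptions and do not come from this report. Note that a fragment is only inlined into other terminal rules; it does not produce a token of its own.

```xtext
// Sketch only: STRING and ESCAPE are hypothetical names chosen for illustration.

// Hash comment using a negated token instead of listing every allowed character.
terminal SL_COMMENT:
    '#' !('\r' | '\n')* ('\r'? '\n')?;

// String that reuses a fragment for its escape sequences.
terminal STRING:
    '"' (ESCAPE | !('\\' | '"'))* '"';

// A fragment never becomes a token itself; it can only be called from terminal rules.
terminal fragment ESCAPE:
    '\\' ('b' | 't' | 'n' | 'f' | 'r' | '"' | "'" | '\\');
```

With rules of this shape the lexer consumes the whole comment or string as one token, so the parser no longer has to handle the characters individually.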