Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 341815

Summary: [serializer] NodeModel's nodes sometimes not correctly associated with ParseTreeConstructors tokens.
Product: [Modeling] TMF Reporter: Moritz Eysholdt <moritz.eysholdt>
Component: XtextAssignee: Project Inbox <tmf.xtext-inbox>
Status: CLOSED WONTFIX QA Contact:
Severity: normal    
Priority: P3 CC: knut.wannheden, sven.efftinge
Version: unspecified   
Target Milestone: ---   
Hardware: PC   
OS: Mac OS X - Carbon (unsup.)   
Whiteboard:
Bug Depends on: 333976    
Bug Blocks:    

Description Moritz Eysholdt CLA 2011-04-04 12:02:09 EDT
As reported in bug 340166, NodeModel's nodes are sometimes not correctly associated with the Tokens created by the ParseTreeConstructor. This seems to occur with complex expression grammars. 

The consequence of this bug is that when you run resource.save() with formatting being turned off, some whitespaces and comments may not be preserved. Furhthermore, intentionally ambiguous tokens (e.g. referenced objects that are visible under two names) may be replaced with their default.

Formatting in the editor (CTRL+SHIFT+F) is not affected by this.
Comment 1 Knut Wannheden CLA 2011-05-11 07:38:28 EDT
I have a case (using Xtext 2.0M7) where the serialization (with preserved whitespace enabled) results in spurious linewraps in the serialized result. The formatter has in these cases configured optional linewraps using e.g. setLinewrap(0, 1, 1).

I traced the problem to AbstractPrseTreeConstructor$WsMergerStream#writeComment(ILeafNode). Here writeWhitespacesSince(INode) is called, but it seems like at this point the nodeIterator field is out of sync: It's next node is a whitespace node two nodes earlier in the stream (with a non whitespace node inbetween). As a result no preserved whitespace is written and the configured default linewrap kicks in during formatting. I didn't take the analysis any further.

I don't quite understand what I'm describing here is what this bug is about. If it helps I can try to boil the grammar and formatting config down to a simple and easy to reproduce example. I can also file a new bug if you like.
Comment 2 Moritz Eysholdt CLA 2011-05-11 08:10:54 EDT
Hi Knut,

thx for the feedback, but the ParseTreeConstructur is a component that I hope to deprecate pretty soon. One reason is that association the old node model with the newly serialized text got too complex because we've been using "the wrong" abstractions. That's why I started to re-implement the serializer from ground up. The new one still has some glitches but it manages to serialize Xbase and Xtend2. Further details in bug 333976 and bug 342823.

If you feel like trying it, you can replace the "ParseTreeConstructorFragment" with the "SerializerFragment" in your workflow. And if you have feedback on it, I'd be happy to know about it and/or help out.
Comment 3 Knut Wannheden CLA 2011-05-11 09:32:45 EDT
Hi Moritz

I've just updated my workspace to the latest nightly build (201105101404; since then you've already made some more changes) and replaced the ParseTreeConstructorFragment with the new SerializerFragment.

There are a few minor problems with the code generation for which I will file bugs later today.

Also it seems like there is some kind of performance problem (or possibly infinite loop): My grammar is not overly complex, but when I call ISerlializer#serialize() it seems to get stuck in GenericSemanticSequencer#applydeterministicQuantities() (note the lowercase d in the method name) forever (I left the computer running a few minutes...).

Is that something you already know about? Should I open an issue for that with an example grammar?
Comment 4 Knut Wannheden CLA 2011-05-12 05:09:51 EDT
(In reply to comment #2)
> 
> thx for the feedback, but the ParseTreeConstructur is a component that I hope
> to deprecate pretty soon. One reason is that association the old node model
> with the newly serialized text got too complex because we've been using "the
> wrong" abstractions. That's why I started to re-implement the serializer from
> ground up. The new one still has some glitches but it manages to serialize
> Xbase and Xtend2. Further details in bug 333976 and bug 342823.
> 
> If you feel like trying it, you can replace the "ParseTreeConstructorFragment"
> with the "SerializerFragment" in your workflow. And if you have feedback on it,
> I'd be happy to know about it and/or help out.

I suppose the ParseTreeConstructor will be dereleased after the Xtext 2.0 release. Is that correct? If yes, do you intend to make bug fixes (such as for the regression described in comment #1) to it as part of Xtext 2.0?

The reason I'm asking is that the new serializer still seems to be quite flaky. Apart from the infinite looping (as described in comment #3) there also seem to be other differences which can cause formatting regressions. I will of course file corresponding issues for these problems.
Comment 5 Sven Efftinge CLA 2011-05-15 03:45:12 EDT
(In reply to comment #4)
> I suppose the ParseTreeConstructor will be dereleased after the Xtext 2.0
> release. Is that correct?

We will mark the old one deprecated as soon as the new implementation is ready for production.
Likely next summer.

> If yes, do you intend to make bug fixes (such as for
> the regression described in comment #1) to it as part of Xtext 2.0?

Generally yes, if they are show stoppers and a fix wouldn't mean to overhaul the whole architecture.
@Moritz Could you please estimate what a fix for the mentioned regression would take?

> The reason I'm asking is that the new serializer still seems to be quite flaky.

Yes, it still heavily under works and also for the 2.0 release it is meant to be something like an alpha version. But it's getting better and better and fixes in that area will get much higher precedence though ;-) I hope you can switch to the new serializer in a couple of months.

> Apart from the infinite looping (as described in comment #3) there also seem to
> be other differences which can cause formatting regressions. I will of course
> file corresponding issues for these problems.

This is very helpful. Thank you Knut!
Comment 6 Moritz Eysholdt CLA 2011-05-16 05:02:57 EDT
(In reply to comment #1)
> I have a case (using Xtext 2.0M7) where the serialization (with preserved
> whitespace enabled) results in spurious linewraps in the serialized result. The
> formatter has in these cases configured optional linewraps using e.g.
> setLinewrap(0, 1, 1).


After reading your comment again, I noticed that you specified "1" as the second parameter for setLineWrap(...). This tells the formatter to insert a line warp in case there is no node model (or in case the nodes are not associated correctly with the new tokens). So the behavior of the formatter is as expected. Is setLinewrap(0, 0, 1) and option for you?
Comment 7 Moritz Eysholdt CLA 2011-05-16 05:13:12 EDT
(In reply to comment #3)
> Hi Moritz
> 
> I've just updated my workspace to the latest nightly build (201105101404; since
> then you've already made some more changes) and replaced the
> ParseTreeConstructorFragment with the new SerializerFragment.
> 
> There are a few minor problems with the code generation for which I will file
> bugs later today.

Thanks a lot for the feedback - this is really helpful!

> Also it seems like there is some kind of performance problem (or possibly
> infinite loop): My grammar is not overly complex, but when I call
> ISerlializer#serialize() it seems to get stuck in
> GenericSemanticSequencer#applydeterministicQuantities() (note the lowercase d
> in the method name) forever (I left the computer running a few minutes...).
> 
> Is that something you already know about? Should I open an issue for that with
> an example grammar?

Yes, that would be great if you could attach an example grammar for this. It might be sufficient to know the value of "constraint.toString()" from org.eclipse.xtext.serializer.impl.GenericSequencer.createSequence(...). This is right below in the stack trace before GenericSemanticSequencer#applydeterministicQuantities() is hit.
Comment 8 Moritz Eysholdt CLA 2011-05-16 05:17:06 EDT
(In reply to comment #4)

> I suppose the ParseTreeConstructor will be dereleased after the Xtext 2.0
> release. Is that correct? If yes, do you intend to make bug fixes (such as for
> the regression described in comment #1) to it as part of Xtext 2.0?

If you can provide a grammar snippet that allows to reproduce this I'll take a look at it.
Comment 9 Knut Wannheden CLA 2011-05-16 05:19:37 EDT
(In reply to comment #6)
> After reading your comment again, I noticed that you specified "1" as the
> second parameter for setLineWrap(...). This tells the formatter to insert a
> line warp in case there is no node model (or in case the nodes are not
> associated correctly with the new tokens). So the behavior of the formatter is
> as expected. Is setLinewrap(0, 0, 1) and option for you?

The result is not quite as expected: There is a node model available. I assume the problem is that it isn't correctly associated (as you say). This however works fine in Xtext 1.0.

Still I find my observations concerning AbstractParseTreeConstructor$WsMergerStream#writeComment(ILeafNode) curious. It looked like the parse tree constructor was out of sync wrt the tokens already written.

Let me know if a grammar to reproduce the problem would help or if I should file a new ticket for this.
Comment 10 Moritz Eysholdt CLA 2011-05-16 05:23:08 EDT
(In reply to comment #9)
> The result is not quite as expected: There is a node model available. I assume
> the problem is that it isn't correctly associated (as you say). This however
> works fine in Xtext 1.0.
> 
> Still I find my observations concerning
> AbstractParseTreeConstructor$WsMergerStream#writeComment(ILeafNode) curious. It
> looked like the parse tree constructor was out of sync wrt the tokens already
> written.
> 
> Let me know if a grammar to reproduce the problem would help or if I should
> file a new ticket for this.

Could you open a new bug and attach a grammar? Because I assume that even if I fix this regression this won't fix the issue in general.
Comment 11 Moritz Eysholdt CLA 2011-05-16 06:11:58 EDT
(In reply to comment #3)

> Also it seems like there is some kind of performance problem (or possibly
> infinite loop): My grammar is not overly complex, but when I call
> ISerlializer#serialize() it seems to get stuck in
> GenericSemanticSequencer#applydeterministicQuantities() (note the lowercase d
> in the method name) forever (I left the computer running a few minutes...).


this issue turn out to be of the same complexity as 
org.eclipse.xtext.xbase.serializer.XbaseSemanticSequencer.sequence_XAdditiveExpression_XBinaryOperation(EObject,
XBinaryOperation)
Comment 12 Moritz Eysholdt CLA 2011-05-18 13:57:13 EDT
(In reply to comment #1)
> I have a case (using Xtext 2.0M7) where the serialization (with preserved
> whitespace enabled) results in spurious linewraps in the serialized result. The
> formatter has in these cases configured optional linewraps using e.g.
> setLinewrap(0, 1, 1).
> 
> I traced the problem to
> AbstractPrseTreeConstructor$WsMergerStream#writeComment(ILeafNode). Here
> writeWhitespacesSince(INode) is called, but it seems like at this point the
> nodeIterator field is out of sync: It's next node is a whitespace node two
> nodes earlier in the stream (with a non whitespace node inbetween). As a result
> no preserved whitespace is written and the configured default linewrap kicks in
> during formatting. I didn't take the analysis any further.

http://git.eclipse.org/c/tmf/org.eclipse.xtext.git/commit/?id=a5a8f5f61706c29e8c1f50855f48671b2d200d24
fixes the bug described by Knut.

This bug reamins open since it is of a more general issue.
Comment 13 Moritz Eysholdt CLA 2011-10-18 04:33:14 EDT
Closing as "won't fix" since the new serializer handles the node model properly and I do not indent to invest more time into the ParseTreeConstructor which will be deprecated soon.