Bug 355175

Summary: [Serializer] Wrong Generation of Whitespace from EMF Models
Product: [Modeling] TMF
Reporter: bbraatz <benjamin.braatz>
Component: Xtext
Assignee: Project Inbox <tmf.xtext-inbox>
Status: CLOSED WORKSFORME
QA Contact:
Severity: normal
Priority: P3
CC: sebastian.zarnekow
Version: unspecified
Target Milestone: ---
Hardware: PC
OS: Linux
Whiteboard:

Description bbraatz 2011-08-18 18:36:40 EDT
Build Identifier: I20110613-1736

When serialising a model that was generated programmatically rather than parsed from a textual DSL file (and that therefore contains no information about hidden whitespace), Xtext's serialiser generates syntactically wrong whitespace.

Consider the simple DSL specified by the following MyDsl.xtext:
grammar org.xtext.example.mydsl.MyDsl

hidden(WS)
import "http://www.eclipse.org/emf/2002/Ecore" as ecore
generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"

Model:
        function_call+=function_call*;
        
function_call:
        'fun' parameter NEWLINE;

parameter:
        {parameter} '(' par=('par')? ')';

terminal NEWLINE:
        '\r'|'\n'|'\r\n'|'\n\r';

terminal WS:
        (' '|'\t'|'\\\n'|'\\\r'|'\\\r\n'|'\\\n\r')?;

We use the following small Java program to read in the XMI representation in input.xmi and serialise it to output.mydsl:
package org.xtext.example.mydsl;

import java.io.IOException;

import org.eclipse.emf.common.util.URI;
import org.eclipse.emf.ecore.resource.Resource;
import org.eclipse.emf.ecore.resource.ResourceSet;
import org.eclipse.emf.ecore.resource.impl.ResourceSetImpl;
import org.eclipse.emf.mwe.utils.StandaloneSetup;

import org.xtext.example.mydsl.MyDslStandaloneSetup;
import org.xtext.example.mydsl.myDsl.Model;

public class Test {
        public static void main(String[] args) {
                String input = "input.xmi";
                String output = "output.mydsl";
                String uriPath = "platform:/resource/org.xtext.example.mydsl/";
                URI inURI = URI.createURI(uriPath + input);
                URI outURI = URI.createURI(uriPath + output);
                
                new StandaloneSetup().setPlatformUri("../");
                new MyDslStandaloneSetup().createInjectorAndDoEMFRegistration();
                ResourceSet resourceSet = new ResourceSetImpl();
                
                Resource inres = resourceSet.getResource(inURI, true);
                Model model = (Model) inres.getContents().get(0);
                
                Resource outres = resourceSet.createResource(outURI);
                outres.getContents().add(model);
                try {
                        outres.save(null);
                } catch (IOException e) {
                        e.printStackTrace();
                }
        }
}

input.xmi:
<?xml version="1.0" encoding="ASCII"?>
<myDsl:Model xmi:version="2.0" xmlns:xmi="http://www.omg.org/XMI" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:myDsl="http://www.xtext.org/example/mydsl/MyDsl">
  <function_call xsi:type="myDsl:parameter"/>
  <function_call xsi:type="myDsl:parameter" par="par"/>
  <function_call xsi:type="myDsl:parameter"/>
  <function_call xsi:type="myDsl:parameter" par="par"/>
  <function_call xsi:type="myDsl:parameter"/>
  <function_call xsi:type="myDsl:parameter" par="par"/>
  <function_call xsi:type="myDsl:parameter"/>
  <function_call xsi:type="myDsl:parameter" par="par"/>
  <function_call xsi:type="myDsl:parameter"/>
  <function_call xsi:type="myDsl:parameter" par="par"/>
  <function_call xsi:type="myDsl:parameter"/>
  <function_call xsi:type="myDsl:parameter" par="par"/>
  <function_call xsi:type="myDsl:parameter"/>
  <function_call xsi:type="myDsl:parameter" par="par"/>
</myDsl:Model>

output.mydsl:
fun ( ) 
 fun ( par ) 
 fun ( ) 
 fun ( par ) 
 fun ( ) 
 fun ( par ) 
 fun ( )

 fun ( par ) 
 fun ( ) 
 fun ( par ) 
 fun ( ) 
 fun ( par ) 
 fun ( ) 
 fun (
par ) 


Whitespace is generated at every possible position. In the last function call, a newline is inserted inside the parentheses of the parameter, where the grammar does not allow it. The extra newline between the seventh and eighth function calls is not allowed by the grammar either.
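
As a rough cross-check outside Xtext, the claim about the newline inside the parentheses can be tested with a regular expression that approximates the grammar above. This is only a sketch: the class name WhitespaceCheck and the method conforms are hypothetical, the regex is not the generated Xtext parser, and the sample strings are illustrative rather than the exact bytes of output.mydsl.

```java
import java.util.regex.Pattern;

/** Rough regex approximation of the MyDsl grammar (assumption: not the Xtext parser). */
final class WhitespaceCheck {
    // Hidden WS tokens: a space, a tab, or a backslash followed by a line break.
    private static final String WS = "(?:[ \\t]|\\\\(?:\\r\\n|\\n\\r|\\r|\\n))*";
    // NEWLINE terminal: CRLF, LFCR, CR, or LF.
    private static final String NEWLINE = "(?:\\r\\n|\\n\\r|\\r|\\n)";
    // function_call: 'fun' '(' 'par'? ')' NEWLINE, with hidden WS allowed between tokens.
    private static final String CALL =
            WS + "fun" + WS + "\\(" + WS + "(?:par" + WS + ")?\\)" + WS + NEWLINE;
    // Model: any number of function calls, optionally followed by trailing WS.
    private static final Pattern MODEL = Pattern.compile("(?:" + CALL + ")*" + WS);

    /** Returns true if the whole text matches the approximated grammar. */
    static boolean conforms(String text) {
        return MODEL.matcher(text).matches();
    }

    public static void main(String[] args) {
        // Spaces between tokens, CR as NEWLINE: accepted.
        System.out.println(conforms("fun ( ) \r fun ( par ) \r"));
        // Bare LF inside the parentheses: rejected, since a bare LF is not WS.
        System.out.println(conforms("fun (\npar ) \r"));
        // Backslash followed by LF inside the parentheses: accepted as hidden WS.
        System.out.println(conforms("fun (\\\npar ) \r"));
    }
}
```

Under these assumptions, a bare line break between '(' and 'par' is rejected, while a backslash-escaped line break at the same position is accepted as hidden whitespace.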

Oddly enough, this does not happen when we change the definition of the terminal NEWLINE from "'\r'|'\n'|'\r\n'|'\n\r'" to "'\n'|'\r'|'\r\n'|'\n\r'" (although spaces are still inserted at every possible position). FWIW, the serialised NEWLINE that is required at the end of every function call then changes from CR to LF (i.e., the first alternative in NEWLINE is used), whereas the wrong line breaks were already LFs with the original definition.
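
Since CR and LF look the same in most editors, a small helper that makes them visible can help when comparing the two variants. The following is a minimal sketch; the class name LineBreakDump and the method visible are hypothetical, and the fallback sample string is made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Makes line-break and tab characters visible (hypothetical helper). */
final class LineBreakDump {
    /** Replaces CR, LF, and tab with the markers <CR>, <LF>, and <TAB>. */
    static String visible(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            switch (c) {
                case '\r': sb.append("<CR>"); break;
                case '\n': sb.append("<LF>"); break;
                case '\t': sb.append("<TAB>"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // With a file path as first argument, dump that file;
        // otherwise fall back to a made-up sample string.
        String content = args.length > 0
                ? new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.ISO_8859_1)
                : "fun ( ) \r fun (\npar ) \r";
        System.out.println(visible(content));
    }
}
```

Running it with the path to output.mydsl as the only argument would show which of the line breaks are CRs and which are LFs.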

In my opinion, unnecessary hidden whitespace should not be generated during serialisation at all, but generating whitespace that breaks the grammar is a bug in any case, isn't it?

Reproducible: Always

Steps to Reproduce:
1. Use the above Xtext grammar in a new Xtext project and generate.
2. Copy the above Java class Test into the org.xtext.example.mydsl package and input.xmi into the root of the project org.xtext.example.mydsl.
3. Run the Test Java class.

Comment 1 Sebastian Zarnekow 2011-10-17 16:47:36 EDT
Did you implement the UnassignedTokenSerializer for the NEWLINE usage in function_call?

Comment 2 Sven Efftinge 2012-11-23 02:39:05 EST
Closing it - no feedback.

Comment 3 Eclipse Webmaster 2017-10-31 10:46:22 EDT
Requested via bug 522520.

-M.

Comment 4 Eclipse Webmaster 2017-10-31 10:57:36 EDT
Requested via bug 522520.

-M.