Bug 369476 - [Texo] Encoding of generated code is not correct
Summary: [Texo] Encoding of generated code is not correct
Status: RESOLVED FIXED
Alias: None
Product: EMFT
Classification: Modeling
Component: Texo
Version: unspecified
Hardware: Macintosh Mac OS X
Importance: P3 normal
Target Milestone: ---
Assignee: Martin Taal CLA
 
Reported: 2012-01-24 02:30 EST by Peter Kullmann CLA
Modified: 2012-01-31 02:43 EST (History)

Attachments
example model (594 bytes, application/octet-stream)
2012-01-30 05:41 EST, Martin Taal CLA
java file with umlauts (587 bytes, text/x-java)
2012-01-30 05:42 EST, Martin Taal CLA

Description Peter Kullmann CLA 2012-01-24 02:30:59 EST
I'm using Texo 0.1.0.v201112171308. Some German umlauts in my model are not encoded correctly in the generated code. For example, the character "ü" in the model is represented by "√º" in the generated code (my project uses UTF-8 encoding). I think it worked correctly not long ago (about two months), but I'm not sure.
Comment 1 Martin Taal CLA 2012-01-26 14:11:56 EST
Hi Peter,
Where is this text generated? This is an old issue which should be solved. I have a test case which generates Russian text in a comment, and that works fine.

Do you have an example ecore I can use?

gr. Martin
Comment 2 Peter Kullmann CLA 2012-01-27 03:22:38 EST
Hi Martin, I have seen now that it only happens after the initial generation:
1. ecore with comment "äöü"  
2. Texo generates <!-- begin-model-doc -->äöü
3. The next generation produces garbage from the umlauts
Every subsequent generation adds more garbage to the umlauts.

Starting with "ü" it goes like this: 1st generation: "ü", 2nd generation: "√º", 3rd generation: "‚àö¬∫", 4th generation: "‚Äö√†√∂¬¨‚à´", and so on.

So I assume it is not a Texo problem at all but a general EMF generator problem.
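
The growth pattern above is what one would expect if the file is written as UTF-8 but read back with a different single-byte charset on every run. The following is only a minimal Java sketch of that effect, assuming MacRoman (the workspace default mentioned later in this report) as the wrong charset. The class and method names are hypothetical, and this is not the actual Texo or EMF merge code.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    /**
     * Encodes the text as UTF-8 and decodes the resulting bytes with another
     * charset, as a reader that ignores the file encoding would do.
     */
    static String misread(String text, Charset wrongCharset) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, wrongCharset);
    }

    public static void main(String[] args) {
        // Assumption: the JDK ships the extended "x-MacRoman" charset;
        // otherwise fall back to ISO-8859-1, which shows the same kind of growth.
        Charset wrong = Charset.isSupported("x-MacRoman")
                ? Charset.forName("x-MacRoman")
                : StandardCharsets.ISO_8859_1;

        String text = "\u00FC"; // "ü", written as an escape so the source encoding does not matter
        for (int generation = 1; generation <= 4; generation++) {
            System.out.println("generation " + generation + ": " + text
                    + " (" + text.length() + " chars)");
            // Each regeneration re-encodes the damaged text and misreads it again.
            text = misread(text, wrong);
        }
    }
}
```

Because every non-ASCII character takes two or more bytes in UTF-8, and each byte is decoded as a separate character, the text gets longer with every pass. What the console actually displays depends on the terminal encoding.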
Comment 3 Martin Taal CLA 2012-01-30 05:41:10 EST
Created attachment 210255 [details]
example model
Comment 4 Martin Taal CLA 2012-01-30 05:41:56 EST
Hi Peter,
I could not reproduce it using Texo model generation. See the attached model. I regenerated the model several times but the umlauts are still there, correctly placed in the source code.

gr. Martin
Comment 5 Martin Taal CLA 2012-01-30 05:42:17 EST
Created attachment 210256 [details]
java file with umlauts
Comment 6 Martin Taal CLA 2012-01-30 05:43:16 EST
Could it be the file encoding specified for your workspace? See the workspace preferences; mine are set to UTF-8. Texo generates its output in UTF-8.
Comment 7 Peter Kullmann CLA 2012-01-30 05:53:38 EST
My workspace encoding is set to MacRoman (default), but the project encoding is set to UTF-8. I have tried setting the workspace encoding to UTF-8 but it didn't help.

I'm going to try out your example in my workspace.
Comment 8 Peter Kullmann CLA 2012-01-30 06:31:27 EST
Your example behaves the same way as my ecore in my workspace: the first generation was OK and the second generation produced this:

/**
 * A representation of the model object '<em><b>Test</b></em>'. <!--
 * begin-user-doc --> <!-- end-user-doc --> <!-- begin-model-doc --> äöü
 * äöü äöü äöü <!-- end-model-doc -->
 * 
 * @generated
 */
Comment 9 Martin Taal CLA 2012-01-30 06:45:04 EST
I really have no idea why this fails :-(

For the regeneration, normally the file should not be touched again, as nothing changed. But again, I really have no idea why this fails for you and works for me.

Can you somehow check the encoding of the file itself (it should be UTF-8)? Through another editor, maybe.

gr. Martin
Comment 10 Peter Kullmann CLA 2012-01-30 11:03:23 EST
I checked the file. It really is UTF-8 encoded.

I did another test with your ecore: in a new workspace, I set the workspace encoding to UTF-8, created a new empty EMF project, copied your ecore file to the model folder and generated the model code with the Texo menu. The generated Test.java had the umlauts. Another generation destroyed the umlauts again.

I then created a genmodel for the ecore, generated the classical EMF Java sources and regenerated them. Here the umlauts are preserved.

I think the merging generator needs to read the file and decide what to write. Perhaps in the reading step the encoding is not respected?
Comment 11 Martin Taal CLA 2012-01-30 14:07:36 EST
Indeed, the merger read the original file without specifying an encoding. I changed this and published a new build. Can you retry with the latest build?

gr. Martin
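
The kind of fix described in this comment amounts to decoding the existing file with an explicit charset before merging, rather than relying on the platform default. The following is a minimal sketch using plain `java.nio`. The class and method names are hypothetical and the actual Texo change is not shown in this report. Inside Eclipse, the charset would more likely be taken from the resource itself (for example via `IFile.getCharset()`) than hard-coded.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class EncodingAwareReader {
    /** Reads the whole file and decodes its bytes with the given charset. */
    static String read(Path file, Charset charset) throws IOException {
        return new String(Files.readAllBytes(file), charset);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("Test", ".java");
        try {
            // Umlauts written as escapes so the source file encoding does not matter.
            String original = "/** <!-- begin-model-doc --> \u00E4\u00F6\u00FC <!-- end-model-doc --> */";
            Files.write(file, original.getBytes(StandardCharsets.UTF_8));

            // Reading with the charset the file was written in preserves the text.
            System.out.println(original.equals(read(file, StandardCharsets.UTF_8)));      // prints true

            // Reading with a different charset silently corrupts the umlauts.
            System.out.println(original.equals(read(file, StandardCharsets.ISO_8859_1))); // prints false
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Passing the charset explicitly keeps the read step consistent with the write step regardless of the workspace or operating system default.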
Comment 12 Peter Kullmann CLA 2012-01-31 02:43:37 EST
Thanks a lot. It's working perfectly now.
Best regards,
Peter