Bug 483690

Summary: IdeJavaSourceOutputStream always uses default charset
Product: [Eclipse Project] JDT
Component: APT
Reporter: Christian Stein <sormuras>
Assignee: Jay Arthanareeswaran <jarthana>
Status: VERIFIED FIXED
QA Contact:
Severity: normal
Priority: P3
CC: eclipse, jarthana, sormuras
Version: 4.4.2
Target Milestone: 4.6 M7
Hardware: PC
OS: Windows 10
See Also: https://git.eclipse.org/r/70723
          https://git.eclipse.org/c/jdt/eclipse.jdt.core.git/commit/?id=95a8299e9a4dec93fbd5f1afcde73dc1781505f6
Whiteboard:

Description Christian Stein CLA 2015-12-04 11:53:01 EST
IdeJavaSourceOutputStream always decodes its collected bytes with the platform default charset, via the inherited ByteArrayOutputStream#toString() call in line 68: "this.toString()".

That discards whatever character encoding the caller applied on top of the stream returned by IdeOutputJavaFileObject#openOutputStream(), for example: new OutputStreamWriter(openOutputStream(), encoding);

Possible solution: write the collected bytes "as is" to the file?
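The effect described above can be reproduced outside Eclipse. The following is a minimal sketch that assumes only the JDK; the class and method names are hypothetical, and ISO-8859-1 stands in for a platform default charset that differs from the encoding the processor chose.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of JDT.
final class CharsetMismatchDemo {

    // Encodes text the way a processor might: a Writer with an explicit
    // charset layered over a byte buffer.
    static ByteArrayOutputStream encode(String text, Charset charset) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(buffer, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer;
    }

    // Decodes the collected bytes with an explicit charset. The no-argument
    // ByteArrayOutputStream#toString() does the same with the platform default.
    static String decode(ByteArrayOutputStream buffer, Charset charset) {
        return new String(buffer.toByteArray(), charset);
    }

    public static void main(String[] args) {
        String source = "class Gr\u00fc\u00dfe {}";
        ByteArrayOutputStream buffer = encode(source, StandardCharsets.UTF_8);

        // Same charset on both sides: the text survives.
        System.out.println(source.equals(decode(buffer, StandardCharsets.UTF_8)));
        // Different charset on the decoding side: non-ASCII characters are corrupted.
        System.out.println(source.equals(decode(buffer, StandardCharsets.ISO_8859_1)));
    }
}
```

ASCII-only sources are unaffected, which may explain why the problem only shows up with non-ASCII identifiers or string literals.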
Comment 1 Christian Stein CLA 2015-12-04 12:00:44 EST
Link to source: https://git.eclipse.org/c/jdt/eclipse.jdt.core.git/tree/org.eclipse.jdt.apt.pluggable.core/src/org/eclipse/jdt/internal/apt/pluggable/core/filer/IdeJavaSourceOutputStream.java#n68

Workaround: adding "-Dfile.encoding=UTF-8" to eclipse.ini solved the problem here.
Comment 2 Walter Harley CLA 2015-12-04 14:51:55 EST
It'd be nice to actually make use of the encoding info.  I think a fix for this would be worth considering.
Comment 3 Christian Stein CLA 2015-12-04 15:31:49 EST
There is no encoding info left at that level, iirc. 

The final write process should not re-encode the bytes it got from the processor. They are already encoded by an OutputStreamWriter.
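To illustrate what writing the bytes "as is" would mean, here is a minimal sketch using plain java.nio. It is not the JDT implementation: the class and method names are hypothetical, and it does not model the Eclipse workspace APIs that the real stream has to go through.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch; not part of JDT.
final class RawBytesWriteSketch {

    // Writes the buffer's bytes unchanged: no decoding to String, no re-encoding.
    static void writeAsIs(ByteArrayOutputStream buffer, Path target) {
        try {
            Files.write(target, buffer.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Encodes text, writes it to a temporary file as raw bytes, and reads it back.
    static byte[] roundTrip(String text, Charset charset) {
        byte[] encoded = text.getBytes(charset);
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        buffer.write(encoded, 0, encoded.length);
        try {
            Path target = Files.createTempFile("raw-bytes", ".java");
            try {
                writeAsIs(buffer, target);
                return Files.readAllBytes(target);
            } finally {
                Files.deleteIfExists(target);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because no charset is involved between the buffer and the file, the bytes on disk are exactly those the processor's writer produced, whatever the platform default is.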
Comment 4 Christian Stein CLA 2016-03-01 12:54:51 EST
As I can't see a quick way to write the collected bytes "as is", maybe using the default charset from IContainer might be a solution:

http://git.eclipse.org/c/platform/eclipse.platform.resources.git/tree/bundles/org.eclipse.core.resources/src/org/eclipse/core/resources/IContainer.java#n246

It could be retrieved via: "_env.getAptProject()..." right?
Comment 5 Jay Arthanareeswaran CLA 2016-04-14 13:23:43 EDT
(In reply to Christian Stein from comment #4)
> It could be retrieved via: "_env.getAptProject()..." right?

That would work. The only question is whether to take it from the AptProject or from getJavaProject().
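A rough sketch of the idea, assuming the charset name has already been obtained from the container (IContainer#getDefaultCharset() returns it as a String). How the name is looked up from the project is left out, and the class and method names below are hypothetical rather than taken from the actual fix.

```java
import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;

// Hypothetical sketch; not the committed change.
final class ContainerCharsetDecodeSketch {

    // Decodes the collected bytes with the container's charset name when it is
    // usable, and falls back to the platform default otherwise.
    static String decode(ByteArrayOutputStream buffer, String containerCharsetName) {
        if (containerCharsetName != null) {
            try {
                return buffer.toString(containerCharsetName);
            } catch (UnsupportedEncodingException e) {
                // Unknown or malformed charset name: fall through to the default.
            }
        }
        return buffer.toString();
    }
}
```

This only gives correct results when the processor encoded its output with the same charset as the container default; it does not recover the encoding the processor actually used.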
Comment 6 Eclipse Genie CLA 2016-04-15 02:54:03 EDT
New Gerrit change created: https://git.eclipse.org/r/70723
Comment 8 Jay Arthanareeswaran CLA 2016-04-16 03:03:50 EDT
Writing a consistent testcase proved to be extremely difficult for two reasons. One, setting file.encoding in the middle of testcase execution doesn't have any impact. Two, all system I/O operations that a testcase would depend on when writing an encoding-specific String would now be affected and would have to be encoded accordingly. I have verified that this works for a simple case in the IDE.
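The first point can be checked with a few lines of plain Java. This is a minimal sketch with hypothetical names; it relies on the default charset being fixed once the JVM has resolved it, which is the behaviour of the OpenJDK-based runtimes I am aware of, not something verified against every JVM.

```java
import java.nio.charset.Charset;

// Hypothetical demo class; not part of the JDT test suite.
final class DefaultCharsetCacheDemo {

    // Returns true if Charset.defaultCharset() is unchanged after the
    // file.encoding system property is modified at run time.
    static boolean defaultCharsetSurvivesPropertyChange() {
        Charset before = Charset.defaultCharset();
        String original = System.getProperty("file.encoding");
        // Pick a value that differs from the current default.
        String other = "UTF-8".equalsIgnoreCase(before.name()) ? "ISO-8859-1" : "UTF-8";
        try {
            System.setProperty("file.encoding", other);
            return before.equals(Charset.defaultCharset());
        } finally {
            // Restore the property so later code sees the original value.
            if (original != null) {
                System.setProperty("file.encoding", original);
            } else {
                System.clearProperty("file.encoding");
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(defaultCharsetSurvivesPropertyChange());
    }
}
```

A test that needs a different default charset therefore has to launch a separate JVM with -Dfile.encoding set on the command line.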
Comment 9 Christian Stein CLA 2016-04-16 03:29:01 EDT
Thanks Jay for fixing this. Having this work-around should solve most of the charset problems - but still does not honor the actual character encoding used via Annotation Processing API. See initial issue description.
Comment 10 Jay Arthanareeswaran CLA 2016-04-18 00:01:32 EDT
(In reply to Christian Stein from comment #9)
> Thanks Jay for fixing this. Having this work-around should solve most of the
> charset problems - but still does not honor the actual character encoding
> used via Annotation Processing API. See initial issue description.

Yes, I agree. While trying to write a testcase I found that there are a few other areas that have this problem as well. I will try to summarize this in a separate bug.
Comment 11 Sasikanth Bharadwaj CLA 2016-04-27 05:23:59 EDT
Verified for Neon M7 using I20160426-1615 build