Bug 314735 - package.bld.xml should support more than just UTF-8 encodings
Summary: package.bld.xml should support more than just UTF-8 encodings
Status: CLOSED FIXED
Alias: None
Product: RTSC
Classification: Technology
Component: Tools
Version: unspecified
Hardware: All All
Importance: P3 normal
Target Milestone: ---
Assignee: Dave Russo CLA
QA Contact:
URL:
Whiteboard: target:3.30
Keywords:
Depends on:
Blocks:
 
Reported: 2010-05-27 13:41 EDT by Dave Russo CLA
Modified: 2015-02-14 22:39 EST

See Also:


Attachments
the full package.bld.xml that illustrates the problem (2.36 KB, text/xml)
2010-05-27 13:41 EDT, Dave Russo CLA

Description Dave Russo CLA 2010-05-27 13:41:01 EDT
Created attachment 170233
the full package.bld.xml that illustrates the problem

I want the charset declared on line 1 of package.bld.xml to be set automatically when the source code is compiled. At present the declared charset is always UTF-8:

    <?xml version="1.0" encoding="UTF-8"?>

This breaks the compile process because my source code path name contains multibyte characters.
If I manually change the declared encoding to my local charset, the compile succeeds.

The full error message is as follows:

[Fatal Error] package.bld.xml:10:49: Invalid byte 2 of 2-byte UTF-8 sequence.
org.xml.sax.SAXParseException: Invalid byte 2 of 2-byte UTF-8 sequence.
        at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:264)
        at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:292)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:146)
        at xdc.services.intern.gen.JClass.genPkgBuild(JClass.java:852)
        at xdc.services.intern.gen.JClass.genPkgValues(JClass.java:1009)
        at xdc.services.intern.gen.JClass.gen(JClass.java:194)
        at xdc.services.intern.cmd.Builder.gen(Builder.java:235)
        at xdc.services.intern.cmd.Builder.main(Builder.java:139)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.mozilla.javascript.MemberBox.invoke(MemberBox.java:155)
        at org.mozilla.javascript.NativeJavaMethod.call(NativeJavaMethod.java:243)
        at org.mozilla.javascript.Interpreter.interpretLoop(Interpreter.java:3237)
        at org.mozilla.javascript.Interpreter.interpret(Interpreter.java:2394)
        at org.mozilla.javascript.InterpretedFunction.exec(InterpretedFunction.java:176)
        at org.mozilla.javascript.Context.evaluateReader(Context.java:1227)
        at config.Shell.evaluateLoad(Shell.java:789)
        at config.Shell.processLoad(Shell.java:672)
        at config.Shell.load(Shell.java:1229)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.mozilla.javascript.MemberBox.invoke(MemberBox.java:155)
        at org.mozilla.javascript.FunctionObject.call(FunctionObject.java:411)
        at org.mozilla.javascript.Interpreter.interpretLoop(Interpreter.java:3237)
        at org.mozilla.javascript.Interpreter.interpret(Interpreter.java:2394)
        at org.mozilla.javascript.InterpretedFunction.call(InterpretedFunction.java:162)
        at org.mozilla.javascript.ContextFactory.doTopCall(ContextFactory.java:393)
        at org.mozilla.javascript.ScriptRuntime.doTopCall(ScriptRuntime.java:2834)
        at org.mozilla.javascript.InterpretedFunction.exec(InterpretedFunction.java:173)
        at org.mozilla.javascript.Context.evaluateReader(Context.java:1227)
        at config.Shell.evaluateReader(Shell.java:830)
        at config.Shell.processReader(Shell.java:499)
        at config.Shell.processFile(Shell.java:561)
        at config.Shell.exec(Shell.java:773)
        at config.Shell.main(Shell.java:1376)
gmake: *** [package/package.xdc.inc] Error 1
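
The failure above can be reproduced outside XDCtools with a minimal sketch like the one below. It is only an illustration: the class name, the `package` element, and the path are hypothetical, and it uses ISO-8859-1 with a single accented character rather than the reporter's Chinese locale, because ISO-8859-1 is guaranteed to be available in every JVM. The mechanism is the same: the bytes are written in one charset while the XML declaration claims UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

public class EncodingMismatchDemo {
    /**
     * Builds a small XML document whose bytes are encoded with `actual`
     * but whose declaration names `declared`. The element name and path
     * are made up for this example.
     */
    static byte[] document(String declared, Charset actual) {
        String xml = "<?xml version=\"1.0\" encoding=\"" + declared + "\"?>\n"
                   + "<package path=\"/home/caf\u00e9/pkg\"/>\n";
        return xml.getBytes(actual);
    }

    /** Returns true if the JAXP DOM parser accepts the bytes, false otherwise. */
    static boolean parses(byte[] xml) throws ParserConfigurationException {
        try {
            DocumentBuilderFactory.newInstance()
                                  .newDocumentBuilder()
                                  .parse(new ByteArrayInputStream(xml));
            return true;
        } catch (SAXException | IOException e) {
            // Malformed byte sequences surface as one of these two types.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        Charset local = StandardCharsets.ISO_8859_1;

        // Declaration says UTF-8, bytes are ISO-8859-1: the mismatch from this bug.
        System.out.println("mismatched: " + parses(document("UTF-8", local)));

        // Declaration names the charset that was really used to write the bytes.
        System.out.println("matched:    " + parses(document(local.name(), local)));
    }
}
```

The first document is rejected because the byte for the accented character is not a valid UTF-8 sequence; the second parses because the declaration matches the bytes, which mirrors the manual workaround described above.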
Comment 1 Dave Russo CLA 2010-05-27 13:42:16 EDT
The user's local settings:

# locale
LANG=zh_CN
LC_CTYPE="zh_CN"
LC_NUMERIC="zh_CN"
LC_TIME="zh_CN"
LC_COLLATE="zh_CN"
LC_MONETARY="zh_CN"
LC_MESSAGES="zh_CN"
LC_PAPER="zh_CN"
LC_NAME="zh_CN"
LC_ADDRESS="zh_CN"
LC_TELEPHONE="zh_CN"
LC_MEASUREMENT="zh_CN"
LC_IDENTIFICATION="zh_CN"
LC_ALL=
Comment 2 Dave Russo CLA 2010-05-28 13:03:09 EDT
The generation of XML files always writes the encoding as UTF-8 but should always write the encoding that is used by FileWriter.  See FileWriter.getEncoding().
Comment 3 Dave Russo CLA 2010-07-12 21:00:33 EDT
The following files need to be reviewed and "fixed":
    xdc/bld/_gen.xs
    xdc/bld/_xml.xs
    xdc/bld/rel.tci
    xdc/cfg/Main.xs
    xdc/services/intern/gen/Doc.java
    xdc/tools/cdoc/Toc.xs
    xdc/tools/cdoc/files/toc.xsl
Comment 4 Dave Russo CLA 2010-07-14 15:09:48 EDT
"Fixed" everything except the cdoc-related XML files by using
    var encoding = java.nio.charset.Charset.defaultCharset().name();
instead of getEncoding(), because some parsers do not understand or support the "historical" names that getEncoding() can return.
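
The difference between the two names can be seen with a small Java sketch. This is not the XDCtools code; the class and method names are hypothetical. It compares the name reported by a writer's getEncoding() with the canonical name from Charset.name(), and builds the XML declaration from the latter.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class XmlDeclarationDemo {
    /**
     * Name reported by a writer for the given charset. FileWriter inherits
     * getEncoding() from OutputStreamWriter, so an in-memory
     * OutputStreamWriter is used here to avoid touching the file system.
     */
    static String historicalName(Charset cs) {
        return new OutputStreamWriter(new ByteArrayOutputStream(), cs).getEncoding();
    }

    /** XML declaration built from the canonical charset name. */
    static String declaration(Charset cs) {
        return "<?xml version=\"1.0\" encoding=\"" + cs.name() + "\"?>";
    }

    public static void main(String[] args) {
        Charset utf8 = StandardCharsets.UTF_8;
        System.out.println(historicalName(utf8));   // historical name, e.g. UTF8
        System.out.println(utf8.name());            // canonical name, UTF-8

        // What a generator following the fix would write for this JVM's default charset.
        System.out.println(declaration(Charset.defaultCharset()));
    }
}
```

For UTF-8 the writer reports the historical name "UTF8", which is not the registered IANA name, so writing Charset.name() into the declaration is the safer choice for other XML parsers.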

The cdoc files 
    xdc/services/intern/gen/Doc.java
    xdc/tools/cdoc/Toc.xs
    xdc/tools/cdoc/files/toc.xsl
should be self contained and can be handled separately (if necessary).

References:
Getting Java default charset encoding:
    http://www.rgagnon.com/javadetails/java-0505.html
XML encoding FAQ:
    http://www.opentag.com/xfaq_enc.htm

The java.nio.charset.Charset docs that explain this:
  http://download.oracle.com/docs/cd/E17409_01/javase/6/docs/api/java/nio/charset/Charset.html

JVM supported encodings and map between official and historical names:
  http://download.oracle.com/docs/cd/E17409_01/javase/6/docs/technotes/guides/intl/encoding.doc.html

Official names for encodings:
  http://www.iana.org/assignments/character-sets
Comment 5 Dave Russo CLA 2010-07-14 15:15:36 EDT
fixed in r1055 (xdctools 3.21.x)
Comment 6 Dave Russo CLA 2014-01-22 17:25:22 EST
this bug was fixed some time ago and needs to be validated.
Comment 7 Sasha Slijepcevic CLA 2014-03-17 14:51:16 EDT
I verified the fix by setting LANG to "zh_CN". To replicate the bug, I used XDCtools 3.20.08.88 to build a package. That version set the encoding in package.bld.xml to "UTF-8". Then I tried XDCtools 3.21.00.55, and the encoding in that file was set to "GB2312", which is a character set for the Chinese language.
Comment 8 Dave Russo CLA 2015-02-14 22:39:14 EST
clean out old verified bugs