Created attachment 170233
the full package.bld.xml that illustrates the problem

I would like the charset declared on line 1 of package.bld.xml to be set automatically when the source code is compiled. At present the declared charset is always UTF-8:

<?xml version="1.0" encoding="UTF-8"?>

This breaks the compile process, because my source code path name contains multibyte characters. If I manually change the declared encoding to my local charset, the compile succeeds.

The full error message is as follows:

[Fatal Error] package.bld.xml:10:49: Invalid byte 2 of 2-byte UTF-8 sequence.
org.xml.sax.SAXParseException: Invalid byte 2 of 2-byte UTF-8 sequence.
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:264)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:292)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:146)
    at xdc.services.intern.gen.JClass.genPkgBuild(JClass.java:852)
    at xdc.services.intern.gen.JClass.genPkgValues(JClass.java:1009)
    at xdc.services.intern.gen.JClass.gen(JClass.java:194)
    at xdc.services.intern.cmd.Builder.gen(Builder.java:235)
    at xdc.services.intern.cmd.Builder.main(Builder.java:139)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at org.mozilla.javascript.MemberBox.invoke(MemberBox.java:155)
    at org.mozilla.javascript.NativeJavaMethod.call(NativeJavaMethod.java:243)
    at org.mozilla.javascript.Interpreter.interpretLoop(Interpreter.java:3237)
    at org.mozilla.javascript.Interpreter.interpret(Interpreter.java:2394)
    at org.mozilla.javascript.InterpretedFunction.exec(InterpretedFunction.java:176)
    at org.mozilla.javascript.Context.evaluateReader(Context.java:1227)
    at config.Shell.evaluateLoad(Shell.java:789)
    at config.Shell.processLoad(Shell.java:672)
    at config.Shell.load(Shell.java:1229)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at org.mozilla.javascript.MemberBox.invoke(MemberBox.java:155)
    at org.mozilla.javascript.FunctionObject.call(FunctionObject.java:411)
    at org.mozilla.javascript.Interpreter.interpretLoop(Interpreter.java:3237)
    at org.mozilla.javascript.Interpreter.interpret(Interpreter.java:2394)
    at org.mozilla.javascript.InterpretedFunction.call(InterpretedFunction.java:162)
    at org.mozilla.javascript.ContextFactory.doTopCall(ContextFactory.java:393)
    at org.mozilla.javascript.ScriptRuntime.doTopCall(ScriptRuntime.java:2834)
    at org.mozilla.javascript.InterpretedFunction.exec(InterpretedFunction.java:173)
    at org.mozilla.javascript.Context.evaluateReader(Context.java:1227)
    at config.Shell.evaluateReader(Shell.java:830)
    at config.Shell.processReader(Shell.java:499)
    at config.Shell.processFile(Shell.java:561)
    at config.Shell.exec(Shell.java:773)
    at config.Shell.main(Shell.java:1376)
gmake: *** [package/package.xdc.inc] Error 1
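The failure can be reproduced outside XDCtools. The following is a minimal sketch, not the XDCtools code: the class name MislabeledXmlDemo, the element name "package" and the /home/<name>/pkg path are made up for illustration. It builds a small document that declares UTF-8 but stores a two-character Chinese directory name as GB2312 bytes (0xD6 0xD0 0xCE 0xC4), then passes it to javax.xml.parsers.DocumentBuilder.parse(), the call that appears in the stack trace. The same document with the name stored as real UTF-8 is parsed for comparison.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

public class MislabeledXmlDemo {
    // GB2312 encoding of the characters U+4E2D U+6587, hard-coded so the
    // sketch does not depend on the JVM shipping a GB2312 charset.
    static final byte[] GB2312_NAME = {(byte) 0xD6, (byte) 0xD0, (byte) 0xCE, (byte) 0xC4};

    // The same two characters encoded as UTF-8.
    static final byte[] UTF8_NAME = "\u4e2d\u6587".getBytes(StandardCharsets.UTF_8);

    // Builds a document that always declares UTF-8, whatever nameBytes holds.
    // The element and path are hypothetical, not the real package.bld.xml layout.
    static byte[] buildXml(byte[] nameBytes) {
        byte[] head = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<package path=\"/home/"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] tail = "/pkg\"/>\n".getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(head, 0, head.length);
        out.write(nameBytes, 0, nameBytes.length);
        out.write(tail, 0, tail.length);
        return out.toByteArray();
    }

    // Returns true if the bytes parse as XML, false if the parser rejects them.
    static boolean parses(byte[] xml) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml));
            return true;
        } catch (SAXException | IOException e) {
            return false; // malformed input, e.g. an invalid UTF-8 byte sequence
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no XML parser available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("GB2312 bytes declared as UTF-8 parse: " + parses(buildXml(GB2312_NAME)));
        System.out.println("UTF-8 bytes declared as UTF-8 parse:  " + parses(buildXml(UTF8_NAME)));
    }
}
```

The first document is rejected because 0xD6 is a valid lead byte of a 2-byte UTF-8 sequence but 0xD0 is not a valid continuation byte, which matches the "Invalid byte 2 of 2-byte UTF-8 sequence" message reported above. The parser's default error handler may also print a "[Fatal Error]" line to stderr.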
The user's locale settings:

# locale
LANG=zh_CN
LC_CTYPE="zh_CN"
LC_NUMERIC="zh_CN"
LC_TIME="zh_CN"
LC_COLLATE="zh_CN"
LC_MONETARY="zh_CN"
LC_MESSAGES="zh_CN"
LC_PAPER="zh_CN"
LC_NAME="zh_CN"
LC_ADDRESS="zh_CN"
LC_TELEPHONE="zh_CN"
LC_MEASUREMENT="zh_CN"
LC_IDENTIFICATION="zh_CN"
LC_ALL=
The generated XML files always declare the encoding as UTF-8, but they should declare the encoding that the FileWriter actually uses to write them. See FileWriter.getEncoding().
The following files need to be reviewed and "fixed":

    xdc/bld/_gen.xs
    xdc/bld/_xml.xs
    xdc/bld/rel.tci
    xdc/cfg/Main.xs
    xdc/services/intern/gen/Doc.java
    xdc/tools/cdoc/Toc.xs
    xdc/tools/cdoc/files/toc.xsl
"Fixed" everything except the cdoc-related XML files by using

    var encoding = java.nio.charset.Charset.defaultCharset().name();

instead of getEncoding(), because some parsers do not understand or support the "historical" encoding names.

The cdoc files

    xdc/services/intern/gen/Doc.java
    xdc/tools/cdoc/Toc.xs
    xdc/tools/cdoc/files/toc.xsl

should be self-contained and can be handled separately (if necessary).

References:

    Getting Java default charset encoding:
        http://www.rgagnon.com/javadetails/java-0505.html
    XML encoding FAQ:
        http://www.opentag.com/xfaq_enc.htm
    The java.nio.charset.Charset docs that explain this:
        http://download.oracle.com/docs/cd/E17409_01/javase/6/docs/api/java/nio/charset/Charset.html
    JVM supported encodings and the map between official and historical names:
        http://download.oracle.com/docs/cd/E17409_01/javase/6/docs/technotes/guides/intl/encoding.doc.html
    Official names for encodings:
        http://www.iana.org/assignments/character-sets
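To illustrate the difference between the two names, here is a minimal Java sketch. It is not the XDCtools change itself (that lives in the .xs and .tci files listed earlier); the class XmlDeclarationDemo and its helpers are hypothetical names. It shows the name reported by OutputStreamWriter.getEncoding(), which FileWriter inherits, next to Charset.name(), and writes an XML declaration that names the charset the writer actually encodes with.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class XmlDeclarationDemo {
    // Name as reported by the writer. FileWriter inherits getEncoding() from
    // OutputStreamWriter; the result may be a historical name such as "UTF8".
    static String historicalName(Charset cs) {
        return new OutputStreamWriter(new ByteArrayOutputStream(), cs).getEncoding();
    }

    // Writes a declaration that names cs by its canonical name and encodes
    // the body with that same charset, so declaration and bytes agree.
    static byte[] writeXml(Charset cs, String body) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, cs)) {
            w.write("<?xml version=\"1.0\" encoding=\"" + cs.name() + "\"?>\n");
            w.write(body);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // The default charset depends on the platform and locale settings.
        Charset cs = Charset.defaultCharset();
        System.out.println("canonical name:  " + cs.name());
        System.out.println("historical name: " + historicalName(cs));
        System.out.write(writeXml(cs, "<package/>\n"));
        System.out.flush();
    }
}
```

For UTF-8, getEncoding() reports "UTF8" while Charset.name() reports "UTF-8"; only the latter is the form an XML parser is expected to recognize in an encoding declaration.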
Fixed in r1055 (XDCtools 3.21.x).
This bug was fixed some time ago and needs to be validated.
I verified the fix by setting LANG to "zh_CN". To replicate the bug, I used XDCtools 3.20.08.88 to build a package; that version set the encoding in package.bld.xml to "UTF-8". I then built with XDCtools 3.21.00.55, and the encoding in that file was set to "GB2312", which is a character set for the Chinese language.
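A check like this one can also be scripted. The sketch below is an assumption about how one might do it, not part of XDCtools: the class DeclaredEncodingCheck is a hypothetical name. It parses an XML file and reports the encoding named in its declaration through the standard DOM call Document.getXmlEncoding().

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class DeclaredEncodingCheck {
    // Returns the encoding named in the XML declaration, or null if the
    // document has no encoding declaration.
    static String declaredEncoding(byte[] xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml));
            return doc.getXmlEncoding();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) throws Exception {
        // With a path argument, inspect that file (for example a generated
        // package.bld.xml); otherwise fall back to a small built-in sample.
        byte[] xml = args.length > 0
                ? Files.readAllBytes(Paths.get(args[0]))
                : "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><package/>"
                        .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println("declared encoding: " + declaredEncoding(xml));
    }
}
```

Parsing a file whose declaration names an encoding the local JVM does not support will fail, so the check is best run on the machine that produced the file.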
clean out old verified bugs