| Summary: | DBCS3.7 DBCS (shift JIS) characters are corrupted in Outline view of ANT |
|---|---|
| Product: | [Eclipse Project] Platform |
| Reporter: | Kentaroh Noji <kennoji> |
| Component: | Ant |
| Assignee: | Michael Rennie <Michael_Rennie> |
| Status: | RESOLVED FIXED |
| QA Contact: | |
| Severity: | normal |
| Priority: | P3 |
| CC: | camle, curtis.windatt.public, kitlo, Michael_Rennie, pwebster |
| Version: | 4.1 |
| Flags: | curtis.windatt.public: review+ |
| Target Milestone: | 3.7 RC1 |
| Hardware: | PC |
| OS: | Windows 7 |
| Whiteboard: | |
| Attachments: | |
|
Description
Kentaroh Noji
Created attachment 194225 [details]
Screen capture
Created attachment 194226 [details]
Ant source file in Shift JIS
Created attachment 194227 [details]
Ant source file in UTF-8
I reproduced this on 3.7 as well as 4.1.
The problem comes from this code in ProjectHelper:
    if (source instanceof String) {
        stream = new ByteArrayInputStream(((String) source).getBytes("UTF-8")); //$NON-NLS-1$
        inputSource = new InputSource(stream);
    }
The string is always converted to bytes as UTF-8, regardless of the build file's own encoding, so when we parse the XML file the Ant model ends up with wrongly encoded strings as well.
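The mismatch described above can be reproduced outside Eclipse. The following is a minimal sketch, not code from ProjectHelper: the class name `EncodingMismatchDemo` and the method `reinterpret` are hypothetical, and it assumes the running JDK ships the `Shift_JIS` charset. It writes a string as UTF-8 bytes and then reads those bytes back with a different charset, which is roughly what happens when the bytes are UTF-8 but the XML declaration names Shift JIS.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingMismatchDemo {

    /** Encodes text with one charset and decodes the resulting bytes with another. */
    static String reinterpret(String text, Charset writtenAs, Charset readAs) {
        byte[] bytes = text.getBytes(writtenAs);
        return new String(bytes, readAs);
    }

    public static void main(String[] args) {
        String label = "\u65E5\u672C\u8A9E"; // a Japanese target name, for illustration
        // Assumption: this JDK provides Shift_JIS (it is not one of the guaranteed charsets).
        Charset declared = Charset.forName("Shift_JIS");

        // Written as UTF-8 but read as Shift JIS: the text no longer matches.
        String corrupted = reinterpret(label, StandardCharsets.UTF_8, declared);
        System.out.println("mismatched charsets, text intact: " + label.equals(corrupted));

        // Written and read with the same charset: the text survives.
        String intact = reinterpret(label, declared, declared);
        System.out.println("matching charsets, text intact: " + label.equals(intact));
    }
}
```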
Created attachment 194990 [details]
proposed fix
Created a new utility method that tries to compute the encoding of the build file before it is parsed into the Ant model. If detection fails, it falls back to UTF-8.
Smoke tested against workspace and remote (out-of-workspace) build files.
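The detect-then-fall-back idea can be sketched in plain Java. This is not the attached patch, whose code is not shown here and which appears from later comments to obtain the encoding through text file buffers. As a stand-in technique, this sketch reads the `encoding` pseudo-attribute of the XML declaration. The class name `BuildFileEncodingSketch` and the method `detectEncoding` are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BuildFileEncodingSketch {

    // Matches the encoding pseudo-attribute of a leading XML declaration.
    private static final Pattern XML_DECL_ENCODING = Pattern.compile(
            "^\\s*<\\?xml[^>]*?encoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    /** Returns the declared charset if it is present and supported, otherwise UTF-8. */
    static Charset detectEncoding(String buildFileText) {
        Matcher m = XML_DECL_ENCODING.matcher(buildFileText);
        if (m.find()) {
            String name = m.group(1);
            try {
                if (Charset.isSupported(name)) {
                    return Charset.forName(name);
                }
            } catch (IllegalArgumentException e) {
                // Illegal charset name: fall through to the default.
            }
        }
        return StandardCharsets.UTF_8; // fallback when detection fails
    }

    public static void main(String[] args) {
        String withDecl = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><project/>";
        System.out.println(detectEncoding(withDecl));
        System.out.println(detectEncoding("<project/>"));
    }
}
```

With a charset in hand, the bytes handed to the parser could be produced with `text.getBytes(detectEncoding(text))` instead of a hard-coded `"UTF-8"`, so that the bytes and the XML declaration agree.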
Steps to test:
1. Open the Shift JIS file from any project in your workspace; the outline should match the editor.
2. Open the Shift JIS file from any location outside of your workspace; the outline should match the editor.
Curtis, please try the patch.
+1. The approach is good, and the problem is fixed for me with both workspace and external files.
Created attachment 195093 [details]
better
Thinking about the other patch, I found that we end up creating and disconnecting a text file buffer each time we want either the backing IDocument for a build file or its encoding (in the external-to-workspace case). So I moved the fix into AntModel, where the encoding can be cached. This saves a lot of work and avoids creating unnecessary text file buffers.
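The caching idea can be illustrated with a small stand-alone sketch. It is not the AntModel code: the class `CachedEncodingModel` is hypothetical, and the `Supplier` stands in for whatever expensive lookup (for example, connecting a text file buffer) the real model performs. The lookup runs at most once, and later calls reuse the stored result.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;

final class CachedEncodingModel {

    // Hypothetical stand-in for the expensive encoding lookup.
    private final Supplier<Charset> lookup;
    private Charset cached;

    CachedEncodingModel(Supplier<Charset> lookup) {
        this.lookup = lookup;
    }

    /** Computes the encoding on first use and returns the cached value afterwards. */
    synchronized Charset getEncoding() {
        if (cached == null) {
            Charset detected = lookup.get();
            // Fall back to UTF-8 when the lookup cannot determine an encoding.
            cached = detected != null ? detected : StandardCharsets.UTF_8;
        }
        return cached;
    }

    public static void main(String[] args) {
        CachedEncodingModel model = new CachedEncodingModel(() -> {
            System.out.println("expensive lookup");
            return StandardCharsets.ISO_8859_1;
        });
        model.getEncoding();
        model.getEncoding(); // served from the cache, no second lookup
    }
}
```

A real model would also need to decide when to invalidate the cached value, for instance if the file's encoding changes; this sketch leaves that out.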
Curtis, please re-verify the new patch. The logic is the same, just in AntModel and cached.
+1. The new fix operates the same, and the caching should be beneficial.
Applied patch to HEAD.
|