Bug 344066 - DBCS3.7 DBCS (shift JIS) characters are corrupted in Outline view of ANT
Status: RESOLVED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Ant
Version: 4.1
Hardware: PC Windows 7
Importance: P3 normal
Target Milestone: 3.7 RC1
Assignee: Michael Rennie CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-04-27 23:06 EDT by Kentaroh Noji CLA
Modified: 2011-05-09 18:53 EDT
5 users

See Also:
curtis.windatt.public: review+


Attachments
Screen capture (96.89 KB, image/png)
2011-04-27 23:15 EDT, Kentaroh Noji CLA
Ant source file in Shift JIS (386 bytes, text/xml)
2011-04-27 23:15 EDT, Kentaroh Noji CLA
Ant source file in UTF-8 (410 bytes, text/xml)
2011-04-27 23:16 EDT, Kentaroh Noji CLA
proposed fix (6.63 KB, patch)
2011-05-06 16:54 EDT, Michael Rennie CLA
better (7.24 KB, patch)
2011-05-09 11:49 EDT, Michael Rennie CLA

Description Kentaroh Noji CLA 2011-04-27 23:06:43 EDT
Build Identifier: I20110412-2200

If an Ant source file is encoded in Shift_JIS, DBCS characters in the Outline view are corrupted. If the encoding is UTF-8, the DBCS characters display correctly. I will attach Ant files in Shift_JIS and UTF-8.

JDK: java full version "JRE 1.6.0 IBM Windows AMD 64 build pwa6460sr9fp1-20110208_03 (SR9 FP1)"
OS: Windows 7 SP7 Japanese edition. 

Reproducible: Always

Steps to Reproduce:
1. Copy the attached Ant files into a project.
2. Open the Ant files.
3. Open the Outline view for the Ant files: Window > Show View > Outline.
Comment 1 Kentaroh Noji CLA 2011-04-27 23:15:02 EDT
Created attachment 194225
Screen capture
Comment 2 Kentaroh Noji CLA 2011-04-27 23:15:59 EDT
Created attachment 194226
Ant source file in Shift JIS
Comment 3 Kentaroh Noji CLA 2011-04-27 23:16:46 EDT
Created attachment 194227
Ant source file in UTF-8
Comment 4 Michael Rennie CLA 2011-05-06 10:56:30 EDT
I reproduced this on 3.7 as well as 4.1.

The problem comes from this code in ProjectHelper:

if (source instanceof String) {
  stream = new ByteArrayInputStream(((String)source).getBytes("UTF-8")); //$NON-NLS-1$
  inputSource = new InputSource(stream);
}

So when we parse the XML file, the Ant model ends up with incorrectly decoded strings as well.
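
The mismatch can be reproduced outside Eclipse. The following is a minimal sketch, not code from ProjectHelper: the class and method names are hypothetical, and it assumes the build file declares encoding="Shift_JIS" in its XML declaration and that the JDK ships the Shift_JIS charset. Because no encoding is set on the InputSource, the parser relies on the declaration, so UTF-8 bytes are decoded as Shift_JIS.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; it is not part of Eclipse or Ant.
final class EncodingMismatchDemo {

    /** Encodes the source with the given charset, parses it, and returns the project name. */
    static String parseProjectName(String source, String charsetName) {
        try {
            byte[] bytes = source.getBytes(Charset.forName(charsetName));
            // No encoding is set on the InputSource, so the parser uses the XML declaration.
            InputSource input = new InputSource(new ByteArrayInputStream(bytes));
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(input);
            return doc.getDocumentElement().getAttribute("name");
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse build file", e);
        }
    }

    public static void main(String[] args) {
        String name = "\u65E5\u672C"; // two kanji, written as escapes to keep this file ASCII
        String source = "<?xml version=\"1.0\" encoding=\"Shift_JIS\"?>"
                + "<project name=\"" + name + "\" default=\"build\"/>";

        // UTF-8 bytes under a Shift_JIS declaration: the name comes back corrupted.
        System.out.println(name.equals(parseProjectName(source, "UTF-8")));
        // Bytes that match the declaration: the name survives the round trip.
        System.out.println(name.equals(parseProjectName(source, "Shift_JIS")));
    }
}
```

With a UTF-8 declaration the hard-coded getBytes("UTF-8") happens to agree with the parser, which is consistent with the UTF-8 attachment displaying correctly.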
Comment 5 Michael Rennie CLA 2011-05-06 16:54:32 EDT
Created attachment 194990
proposed fix

Created a new utility method that tries to compute the encoding of the build file before it is parsed into the Ant model; if detection fails, it falls back to UTF-8.

Smoke tested against workspace and remote (out-of-workspace) build files.
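
For illustration only, a detect-then-fall-back helper could look roughly like the sketch below. This is not the attached patch, whose contents are not shown in this report. The class name, method name, and the approach of reading the encoding from the XML declaration are all assumptions; the real fix may rely on Eclipse's own encoding facilities instead. The sketch also ignores UTF-16 and other non-ASCII-compatible encodings.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names here do not come from the actual patch.
final class BuildFileEncoding {

    // Matches the encoding pseudo-attribute of an XML declaration at the start of the file.
    private static final Pattern DECLARED_ENCODING = Pattern.compile(
            "^<\\?xml[^>]*?\\bencoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    /** Returns the charset declared in the first bytes of the file, or UTF-8 if none is usable. */
    static Charset detect(byte[] head) {
        // ISO-8859-1 maps every byte to one char, so the ASCII declaration is readable as-is.
        int length = Math.min(head.length, 256);
        String prolog = new String(head, 0, length, StandardCharsets.ISO_8859_1);
        Matcher matcher = DECLARED_ENCODING.matcher(prolog);
        if (matcher.find()) {
            String name = matcher.group(1);
            try {
                if (Charset.isSupported(name)) {
                    return Charset.forName(name);
                }
            } catch (IllegalArgumentException e) {
                // Illegal charset name: fall through to the default below.
            }
        }
        return StandardCharsets.UTF_8; // fallback when detection fails
    }
}
```

The detected charset would then replace the hard-coded "UTF-8" when turning the source string into bytes, so that the bytes agree with what the parser expects.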

Steps to test:
1. Open the Shift_JIS file from any project in your workspace - the outline should match the editor.
2. Open the Shift_JIS file from any location outside of your workspace - the outline should match the editor.
Comment 6 Michael Rennie CLA 2011-05-06 16:55:47 EDT
Curtis, please try the patch.
Comment 7 Curtis Windatt CLA 2011-05-06 17:15:24 EDT
+1. The approach is good, and the problem is fixed for me with both workspace and external files.
Comment 8 Michael Rennie CLA 2011-05-09 11:49:18 EDT
Created attachment 195093
better

Thinking about the previous patch, I found that we end up creating and disconnecting a text file buffer each time we want either the backing IDocument for a build file or its encoding (in the external-to-workspace case). I moved the fix into AntModel so that the encoding can be cached, which saves a lot of work and avoids creating unnecessary text file buffers.
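
The caching idea can be sketched as follows. This is a hedged illustration with hypothetical names, not the AntModel code: the expensive lookup (in Eclipse, connecting a text file buffer) is represented by a plain Supplier, and the result is computed once and reused.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;

// Hypothetical stand-in for a model that caches its build file's encoding.
final class CachingModelSketch {

    private final Supplier<Charset> expensiveLookup; // assumed to be costly to call
    private Charset cachedEncoding;                   // null until first requested
    private int lookups;                              // counts calls, for demonstration only

    CachingModelSketch(Supplier<Charset> expensiveLookup) {
        this.expensiveLookup = expensiveLookup;
    }

    /** Returns the encoding, running the expensive lookup at most once. */
    synchronized Charset getEncoding() {
        if (cachedEncoding == null) {
            lookups++;
            Charset found = expensiveLookup.get();
            // Same fallback as described for the first patch: default to UTF-8.
            cachedEncoding = (found != null) ? found : StandardCharsets.UTF_8;
        }
        return cachedEncoding;
    }

    synchronized int lookupCount() {
        return lookups;
    }
}
```

A real model would also need to drop the cached value if the file's encoding changes; the sketch leaves that out.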
Comment 9 Michael Rennie CLA 2011-05-09 11:50:24 EDT
Curtis, please re-verify with the new patch. The logic is the same, just in AntModel and cached.
Comment 10 Curtis Windatt CLA 2011-05-09 17:11:08 EDT
+1. The new fix behaves the same, and the caching should be beneficial.
Comment 11 Michael Rennie CLA 2011-05-09 18:53:03 EDT
Applied patch to HEAD.