
Bug 331394

Summary: On linux, filenames containing Japanese characters are not viewable in navigator
Product: [Eclipse Project] Platform Reporter: Jeffrey Wexler <bsgcic1776>
Component: UI Assignee: Platform-UI-Inbox <Platform-UI-Inbox>
Status: RESOLVED INVALID QA Contact:
Severity: major    
Priority: P3 CC: daniel_megert, francisu, harendra, kennoji, pwebster, remy.suen
Version: 3.6.1   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
Whiteboard:
Attachments:
Description | Flags
Package Explorer | none
1st Batch of QA output screenshots | none
2nd Batch of QA output screenshots, tail of log file, .log file | none
Test files. The two with question marks were created within Eclipse. | none
Screenshots from test using default LANG (jp) when launching Eclipse | none

Description Jeffrey Wexler CLA 2010-11-30 02:22:17 EST
Build Identifier: 20100917-0705

Files whose names contain Japanese text are not visible in the navigator, and performing a refresh does not solve the issue. Further, if one tries to move a folder that contains a file with Japanese characters in its name, Eclipse displays the following error message: "Resource 'myProject/myFolder/mySubFolder' is out of sync with file system." The only workaround is to remove the file or rename it so that its name contains no Japanese characters (i.e., I change them to English ASCII).

Environment details: Ubuntu 10.04 LTS - Japanese locale; Gnome version 2.30.2 with build date 25 June 2010; Eclipse install was the one for Java EE development; HP TouchSmart tx2 laptop with AMD Turion X2 64-bit (dual processor). 8GB RAM. Approx. 80GB free disk space.

I am going to select "Major" as the severity of this bug given that it is indeed major for the Japanese market.

Reproducible: Always

Steps to Reproduce:
1. Create a file with one or more Japanese characters in its name inside a directory within an Eclipse project.
2. Refresh the project.
3. The file is still not shown in Eclipse.
Comment 1 Francis Upton IV CLA 2010-11-30 02:24:41 EST
There are 3 navigator-like views that come with Eclipse: 1) Navigator, 2) Project Explorer, 3) Package Explorer

Can you confirm which of the views has this problem?
Comment 2 Jeffrey Wexler CLA 2010-11-30 02:51:07 EST
All 3 (Navigator, Project Explorer, Package Explorer) have the problem.
Comment 3 Dani Megert CLA 2010-11-30 06:48:47 EST
>1.Create a file with one or more Japanese characters in the filename inside a
>directory within a project of eclipse
Where exactly are you creating the file: inside Eclipse or outside?

Anything in the .log?
Comment 4 Dani Megert CLA 2010-11-30 06:49:21 EST
Also: is the version (4.1) correct?
Comment 5 Paul Webster CLA 2010-11-30 07:20:42 EST
Could you give me some of the Unicode chars to create a Japanese word I can use as a filename? Is U+3071 U+3072 U+3073 sufficient to cause the problem?

PW
Comment 6 Paul Webster CLA 2010-11-30 07:25:26 EST
Created attachment 184113 [details]
Package Explorer

I created a file in the filesystem with the title "&#12401;&#12402;&#12403;.txt" and then did a refresh in the Package Explorer.  I got the above picture, which seems like it was read in correctly.

If not, as Dani mentioned, could you please attach your .log: <workspace>/.metadata/.log

PW
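
For readers unsure how the code points in comment 5 relate to the file name in comment 6, here is a minimal Python sketch. It is not part of the original exchange and the variable names are purely illustrative. It shows that U+3071 U+3072 U+3073 and the numeric character references &#12401;&#12402;&#12403; denote the same three hiragana characters:

```python
import html

# Code points quoted in comment 5.
CODE_POINTS = [0x3071, 0x3072, 0x3073]

# Build the file name directly from the code points.
name_from_code_points = "".join(chr(cp) for cp in CODE_POINTS) + ".txt"

# Decode the HTML numeric character references quoted in comment 6.
name_from_references = html.unescape("&#12401;&#12402;&#12403;.txt")

# Both spellings yield the same string.
print(name_from_code_points == name_from_references)

# In UTF-8 each of these characters occupies three bytes,
# so the name is longer in bytes than in characters.
print(len(name_from_code_points), len(name_from_code_points.encode("utf-8")))
```

Any file name typed with a Japanese input method should exercise the same code path, so a name such as 日本語1.txt is an equally valid test case.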
Comment 7 Dani Megert CLA 2010-11-30 07:26:58 EST
Paul, while you're at it: it might be a refresh bug, i.e. one can create the file outside Eclipse but refresh doesn't load the file.
Comment 8 Paul Webster CLA 2010-11-30 07:34:59 EST
(In reply to comment #7)
> Paul, while you're at it: it might be a refresh bug, i.e. one can create the file
> outside Eclipse but refresh doesn't load the file.

I followed his 1,2,3 steps, creating the file outside of eclipse and then refreshing the project.  The file appeared no problem.

Other tests:

Outside of eclipse, I created <proj>/myFolder/mySubFolder and moved my file in there.   Then in eclipse I did a refresh.  The directories and file appeared.

Within eclipse, I created a <proj>/dest folder, and then used the Move menu item to move mySubFolder into dest.  It appears to have worked.


I'll also mention, I'm on RHEL 5.5 with gtk2-2.10.4-20.el5

PW
Comment 9 Jeffrey Wexler CLA 2010-11-30 22:30:25 EST
Please see attached two files:
eclipseHeliosJPFileNameIssue.tar.gz (QA results: Screenshots and log file)
tstJPFilenames.tar.gz (QA test files)

These contain QA Screen shots and log file for the two QA tests that I just performed. Below are details of the steps taken in the two QA tests and the output screenshots and log file.

In terms of the version of Eclipse, the following is the output of Help->About Eclipse:
Eclipse Java EE IDE for Web Developers.
Version: Helios Service Release 1
Build id: 20100917-0705

The following are the steps that I did in these 2 QA tests:

Tst A:
1. Within Eclipse, created a new folder with an English name called tstJPFilenames
2. Within Eclipse, created a new file with a Japanese name called 日本語1.txt
   -> Please refer to screen shot: tstA_001_CreatedJPFileDirectlyViaEclipseNewCreateFile.png
3. Viewed the newly created file in a linux console. The file is displayed as ????.txt
   -> Please refer to screen shot: tstA_002_JPTextOfFileJustCreatedDisplaysAsQMInLinuxConsole.png
4. Viewed the newly created file in a Linux file browser. Here, too, the file is displayed with question marks: ????.txt
   -> Please refer to screen shot: tstA_003_JPTextOfFileJustCreatedDisplaysAsQMInGnomeFileBrowser.png

Tst B:
1. Within the linux console, created a new file called 日本語2.txt in the same folder via the following command:
echo "コンソールで直接作成したファイル。" >日本語2.txt
   -> Please refer to screen shot: tstB_001_CreatedJPFileInConsole.png
2. Viewed the folder in Eclipse prior to performing a refresh on that folder. As expected, the newly created file 日本語2.txt is not yet displayed.
   -> Please refer to screen shot: tstB_002_EclipsePriorToRefresh.png
3. Within Eclipse, performed a refresh on the folder. The 2nd file still is not viewable.
   -> Please refer to screen shot: tstB_003_AfterPerformedRefreshOrigJPFileNowQMAnd2ndCreatedFileNotShown.png
4. Rechecked the contents of the folder in the linux console. No change, the 1st file is still ????.txt and the second is still 日本語2.txt
   -> Please refer to screen shot: tstB_004_LinuxConsoleStillShows1stJPFileAsQMAnd2ndAsCorrectName.png
5. Within Eclipse, right-clicked on the folder and selected Move. Received an out-of-sync error. The file is still not viewable after clicking OK on the error message.
   -> Please refer to screen shot: tstB_005_RightClickFolderAndSelectMoveErrorMsg-OutOfSync.png
6. In the linux console, still no change from that of 4 above.
   -> Please refer to screen shot: tstB_006_LinuxConsoleStillUnchanged.png
7. Viewed the end of <workspace>/.metadata/.log
   The contents were:
!SESSION 2010-12-01 07:56:46.743 -----------------------------------------------
eclipse.buildId=M20100909-0800
java.version=1.6.0_22
java.vendor=Sun Microsystems Inc.
BootLoader constants: OS=linux, ARCH=x86_64, WS=gtk, NL=en_US
Framework arguments:  -product org.eclipse.epp.package.jee.product
Command-line arguments:  -os linux -ws gtk -arch x86_64 -product org.eclipse.epp.package.jee.product

!ENTRY de.anbos.eclipse.logviewer.plugin 4 0 2010-12-01 09:39:40.499
!MESSAGE
!STACK 0
java.nio.charset.MalformedInputException: Input length = 1
        at java.nio.charset.CoderResult.throwException(CoderResult.java:260)
        at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:781)
        at de.anbos.eclipse.logviewer.plugin.file.FileTail.read(FileTail.java:157)
        at de.anbos.eclipse.logviewer.plugin.file.FileTail.run(FileTail.java:91)
        at java.lang.Thread.run(Thread.java:662)

   -> Please refer to the text file: tstB_007_TailOfLogFile.txt
   -> And please refer to the following file which is a copy of <workspace>/.metadata/.log: tstB_008_LogFileFromMetadataFolder.log


In terms of the unicode characters asked in the comments, I am actually not quite sure how to generate and use them. This is the Japanese locale of Ubuntu. There is a key on my keyboard (I am in Japan and the computer was made by HP Japan) which switches between Japanese and English text. Thus, I am not sure what to do with the following sequences asked: "U+3071 U+3072 U+3073" and "&#12401;&#12402;&#12403;.txt". I would suggest though, that my QA may be considered fairly representative of a Japanese user because my setup is with everything in Japanese (computer, os, etc.). I will include the two files as well in the tar and will create a new Japanese file called 日本語3より長いファイル名.txt prior to a refresh in Eclipse (i.e., prior to it turning to question marks). I named this file longer so that when it turns to question marks, hopefully it will not overwrite the earlier 日本語1.txt
Please refer to screen shots:
tstC_001_longerFileNamePriorToRefreshToIncludeInTarball.png
tstC_002_LinuxFileBrowserNewFileJPTextNowQM.png

Please also note that there is a separate major bug related to Japanese in Eclipse (in this Linux Helios version, in Windows Galileo, and previously in Windows Ganymede) which may be related to this one: Japanese text within files in Eclipse becomes garbled when working with them. This is obviously a major issue for the Japanese market. It does not happen every time but eventually does happen. I have not yet pinpointed the exact trigger and plan to log a separate bug for it, but thought I should briefly mention it here in case the two bugs are related.
Comment 10 Jeffrey Wexler CLA 2010-11-30 22:36:36 EST
Created attachment 184214 [details]
1st Batch of QA output screenshots
Comment 11 Jeffrey Wexler CLA 2010-11-30 22:37:15 EST
Created attachment 184215 [details]
2nd Batch of QA output screenshots, tail of log file, .log file
Comment 12 Jeffrey Wexler CLA 2010-11-30 22:38:21 EST
Created attachment 184216 [details]
Test files. The two with question marks were created within Eclipse.
Comment 13 Jeffrey Wexler CLA 2010-11-30 22:41:02 EST
I had to attach the files in eclipseHeliosJPFileNameIssue.tar.gz as two separate tar files due to the 2 MB non-patch attachment limit. Thus:
eclipseHeliosJPFileNameIssue-batchA.tar.gz and
eclipseHeliosJPFileNameIssue-batchB.tar.gz
Comment 14 Dani Megert CLA 2010-12-01 04:45:31 EST
From the .log:
>de.anbos.eclipse.logviewer.plugin.file.FileTail.read(FileTail.java:157)
==> this indicates you have some additional plug-ins installed. Please try with a fresh Eclipse SDK from here:
http://download.eclipse.org/eclipse/downloads/drops/R-3.6.1-201009090800/index.php

Also, are you using the same code page on your desktop as when launching eclipse? To me it looks like different code pages are in use.
Comment 15 Jeffrey Wexler CLA 2010-12-01 22:30:37 EST
Created attachment 184321 [details]
Screenshots from test using default LANG (jp) when launching Eclipse
Comment 16 Jeffrey Wexler CLA 2010-12-01 22:55:25 EST
WOW!! Looks like the code page was the source of the problem.
I have been starting eclipse via a script file with the following contents:
#!/bin/sh
LANG=en
/oadev/eclipse/eclipse-current $*

(where eclipse-current is a symbolic link that points to helios)

Per your comments, I planned to do two tests: 1) Use the same version of Eclipse but with a new workspace and simply starting eclipse from the console thereby using the default LANG setting of jp. 2) Downloading and testing with http://download.eclipse.org/eclipse/downloads/drops/R-3.6.1-201009090800/index.php

However, test (1) above seems to have identified the problem and thus I have not done test (2) yet assuming that it is not necessary anymore.

The reason for launching eclipse with different LANG= values is that I set up an Eclipse environment in Japanese (with a Japanese plugin, etc.) for Japanese menus, etc. for Japanese workers, and one in English for myself because it is just easier for me (Eclipse has enough complexity that it is much easier for me to see menus, messages, etc. in English than Japanese and vice-versa for Japanese workers.) The ability to set the LANG attribute in Linux for this is very convenient.

Attached are the screen shots from the test.

Any ideas as to why the launching of eclipse with LANG=en would cause these issues? One should be able to switch the LANG on launch without having the file issues occur, correct? I would vote for this still being worthy of investigating as a bug but perhaps with a lower severity?
Comment 17 Harendra CLA 2010-12-02 01:31:19 EST
Hi,
Can you check your project's encoding settings?
Please follow these steps:
1. In the Project Explorer, right-click your project and select Properties.
2. The project properties dialog should open.
3. Click Resource in the left panel.
4. Under "Text file encoding", check whether "Inherited from container (UTF-8)" is shown.
5. If the encoding is not set to UTF-8, select the "Other" radio button and set
   the encoding to UTF-8.
6. Click Apply, then OK.
7. Check whether the problem persists.
Comment 18 Dani Megert CLA 2010-12-02 02:07:40 EST
I don't think it's an Eclipse issue but rather "LANG=en" affecting the code page. Hence you would see a similar issue if you do this before launching any other app.
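
A rough illustration of the code page mismatch suggested here, written in Python rather than Java. It assumes (this is not confirmed anywhere in the report) that the unsupported `LANG=en` value left the process with an ASCII-only character set while the desktop used UTF-8. The helper names are hypothetical:

```python
def encode_name(name: str, charset: str) -> bytes:
    """Encode a file name, substituting '?' for characters the charset lacks."""
    return name.encode(charset, errors="replace")


def can_decode(raw: bytes, charset: str) -> bool:
    """Report whether raw file-name bytes are valid in the given charset."""
    try:
        raw.decode(charset)
        return True
    except UnicodeDecodeError:
        return False


# A name created by a process limited to ASCII loses its Japanese characters.
print(encode_name("日本語1.txt", "ascii"))

# A name written as UTF-8 by the console cannot be read back as ASCII...
utf8_bytes = "日本語2.txt".encode("utf-8")
print(can_decode(utf8_bytes, "ascii"))

# ...but is read correctly once both sides agree on UTF-8.
print(can_decode(utf8_bytes, "utf-8"))
```

The JVM's own handling of file names differs in detail; the sketch only shows why mismatched character sets can produce question marks in one direction and unreadable names in the other.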
Comment 19 Harendra CLA 2010-12-02 22:02:21 EST
This may be happening because the locale you specified is not
installed on your system.
Check the installed locales by typing
$ locale -a
On my default Ubuntu installation I had the following locales:
C
en_AG
en_AU.utf8
en_BW.utf8
en_CA.utf8
en_DK.utf8
en_GB.utf8
en_HK.utf8
en_IE.utf8
en_IN
en_NG
en_NZ.utf8
en_PH.utf8
en_SG.utf8
en_US.utf8
en_ZA.utf8
en_ZW.utf8
ja_JP.eucjp
ja_JP.utf8
POSIX
As you can see, there is no locale named "en".
If you use en_US.utf8 instead, your
problem should be solved.
On my system I did it using the following commands:
$ export LANG=en_US.utf8
$ /eclipsepath/eclipse
That way, the default code page in Eclipse was set to UTF-8.
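
The check described above can also be sketched in Python. This is only an illustration: the function names are hypothetical, and `INSTALLED` is a hand-copied excerpt of the `locale -a` listing above rather than something read from the system. The normalization step assumes that spellings such as `en_US.UTF-8` and `en_US.utf8` refer to the same locale, as they do on glibc systems:

```python
def normalize(name: str) -> str:
    """Lower-case the codeset and drop hyphens: 'en_US.UTF-8' -> 'en_US.utf8'."""
    base, sep, codeset = name.partition(".")
    return base + sep + codeset.lower().replace("-", "")


def is_installed(requested: str, installed: list[str]) -> bool:
    """Return True if the requested locale appears in the installed list."""
    wanted = normalize(requested)
    return any(normalize(entry) == wanted for entry in installed)


# Excerpt of the `locale -a` output listed above (not the full list).
INSTALLED = ["C", "en_US.utf8", "ja_JP.eucjp", "ja_JP.utf8", "POSIX"]

for candidate in ("en", "en_US.utf8", "en_US.UTF-8"):
    print(candidate, is_installed(candidate, INSTALLED))
```

A launch script could run a check like this before starting Eclipse and refuse to continue when the requested LANG value is not installed.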

(In reply to comment #16)
Comment 20 Jeffrey Wexler CLA 2010-12-03 01:33:29 EST
Looks like the LANG=en was indeed the problem.
The following are the results of locale -a:
C
POSIX
en_AG
en_AU.utf8
en_BW.utf8
en_CA.utf8
en_DK.utf8
en_GB.utf8
en_HK.utf8
en_IE.utf8
en_IN
en_NG
en_NZ.utf8
en_PH.utf8
en_SG.utf8
en_US.utf8
en_ZA.utf8
en_ZW.utf8
ja_JP.utf8

I modified the Eclipse launch script by changing:
LANG=en
to:
export LANG=en_US.utf8

And it worked. I can create files with Japanese text in eclipse which are viewable in the console and can create files with Japanese text in the console which then become viewable in Eclipse.
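
For anyone who prefers a launcher that sets the locale explicitly instead of relying on the calling shell, the following is a minimal Python sketch of the same fix. The function names are hypothetical, and removing `LC_ALL` is an assumption of this sketch, not something done in the script above; it is included because `LC_ALL`, when set, takes precedence over `LANG`:

```python
import subprocess


def utf8_env(base_env, lang="en_US.utf8"):
    """Return a copy of base_env with LANG set to an installed UTF-8 locale."""
    env = dict(base_env)
    env["LANG"] = lang
    # Assumption of this sketch: drop LC_ALL so that LANG actually takes effect.
    env.pop("LC_ALL", None)
    return env


def launch_eclipse(executable, args, env):
    """Start Eclipse with the given environment and return the process handle."""
    return subprocess.Popen([executable, *args], env=env)
```

With the path from the launch script quoted earlier, a call would look like `launch_eclipse("/oadev/eclipse/eclipse-current", [], utf8_env(os.environ))` after `import os`. The environment is copied rather than modified in place so that the launcher's own process keeps its original locale.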

As for the text file encoding settings on the project (and on all of the files as well), I make an effort to select "Other: UTF-8" as much as possible, even when the default "Inherited from container" already shows (UTF-8). The reason is that we have been having a fairly tough time with Japanese content within text files becoming corrupt at some point down the line. We create files in Eclipse on both Ubuntu and Windows and move them back and forth.

Thank you everyone for your help on this and, in retrospect, apologies for your time given that it looks like it ended up being a user error.
Comment 21 Dani Megert CLA 2010-12-03 01:51:12 EST
> Thank you everyone for your help on this and, in retrospect, apologies for your
> time given that it looks like it ended up being a user error.
np. Glad it works now!