In the Eclipse Help System, I made a help plugin that includes a Japanese HTML file. The HTML file declares the ISO-2022-JP charset in its <META> tag. If the charset (iso-2022-jp) is written in lowercase, the search system recognizes the charset correctly and builds the search database. But if the charset (ISO-2022-JP) is written in uppercase, Eclipse cannot recognize the charset, and the search result list includes a broken DBCS title.
Sorry, I didn't describe this problem exactly. Example: the HTML file at http://java.sun.com/j2se/1.4/ja/docs/ja/index.html has "CHARSET=ISO-2022-JP", and Eclipse cannot determine the charset. But the HTML file at http://java.sun.com/j2se/1.4/ja/docs/ja/api/overview-summary.html has "charset=iso-2022-jp", and Eclipse can determine the charset.
Strange. Which JDK are you using? Could you try with the newest JDK 1.4?
Created attachment 6554: Sample plugin
Created attachment 6555: Screenshot of invalid search result
I tried again with J2SDK 1.4.2_02 (the newest one), but I get the same result. I made a sample plugin to reproduce this bug, and I have attached the plugin and a screenshot to this page.
1. Install the attached plugin.
2. Open the help contents.
3. Search for the word "ABCDEFG".
4. Compare your result with the attached screenshot.
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-2022-jp"> works.
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-2022-JP"> works.
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=ISO-2022-JP"> does not work.
The problem is in our parser that extracts the charset. The media type parameter name should be case-insensitive, but our parser requires it to be "charset", not "CHARSET", and fails to extract the charset in the latter case.
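For illustration only, here is a minimal sketch of case-insensitive extraction of the charset parameter from a Content-Type value. This is not the actual Eclipse Help parser code or the released fix; the class name CharsetExtractor and the method extractCharset are hypothetical, and the regular expression is just one plausible way to match the parameter name in any letter case.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not the Eclipse Help parser.
final class CharsetExtractor {
    // Matches the parameter name "charset" in any letter case and captures
    // its value, with or without surrounding quotes.
    private static final Pattern CHARSET_PARAM = Pattern.compile(
            "charset\\s*=\\s*[\"']?([^\\s;\"'>]+)",
            Pattern.CASE_INSENSITIVE);

    /** Returns the charset value as written, or null if none is present. */
    static String extractCharset(String contentType) {
        if (contentType == null) {
            return null;
        }
        Matcher m = CHARSET_PARAM.matcher(contentType);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // The three CONTENT values from the comment above.
        System.out.println(extractCharset("text/html; charset=iso-2022-jp")); // iso-2022-jp
        System.out.println(extractCharset("text/html; charset=ISO-2022-JP")); // ISO-2022-JP
        System.out.println(extractCharset("text/html; CHARSET=ISO-2022-JP")); // ISO-2022-JP
    }
}
```

With a match like this, all three META variants yield a usable charset name, since Java's Charset.forName also treats charset names case-insensitively.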
I have released a fix for 3.0M5. If you are using 2.1.x builds, you can write the charset value in upper case; just make sure that the attribute name "charset" is typed in lower case.