Bug 45553 - Search engine cannot recognize a charset written in uppercase.
Summary: Search engine cannot recognize a charset written in uppercase.
Status: RESOLVED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: User Assistance
Version: 2.1.1
Hardware: PC Windows 2000
Importance: P3 normal
Target Milestone: 3.0 M5
Assignee: Konrad Kolosowski CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2003-10-25 00:24 EDT by Ryuichiro Isobe CLA
Modified: 2003-10-28 15:44 EST
0 users

See Also:


Attachments
Sample plugin (2.29 KB, application/octet-stream)
2003-10-26 11:05 EST, Ryuichiro Isobe CLA
screenshot of invalid search result. (27.74 KB, image/jpeg)
2003-10-26 11:13 EST, Ryuichiro Isobe CLA

Description Ryuichiro Isobe CLA 2003-10-25 00:24:40 EDT
In the Eclipse Help System,
I made a help plugin that includes a Japanese HTML file.
The HTML file declares the ISO-2022-JP charset in its <META> tag.

If the charset (iso-2022-jp) is written in lowercase,
the search system recognizes the charset correctly and builds the
database of search results.

But if the charset declaration (ISO-2022-JP) is written in uppercase,
Eclipse cannot recognize the charset, and the search result list
includes a broken DBCS title.
Comment 1 Ryuichiro Isobe CLA 2003-10-25 00:56:27 EDT
Sorry, I didn't describe this problem exactly.

Example:

http://java.sun.com/j2se/1.4/ja/docs/ja/index.html
The HTML file at the above URL has "CHARSET=ISO-2022-JP";
Eclipse cannot determine the correct charset.

But
http://java.sun.com/j2se/1.4/ja/docs/ja/api/overview-summary.html
The HTML file at the above URL has "charset=iso-2022-jp";
Eclipse can determine the charset.
Comment 2 Konrad Kolosowski CLA 2003-10-26 01:57:18 EST
Strange.  Which JDK are you using?  Could you try with newest JDK 1.4?
Comment 3 Ryuichiro Isobe CLA 2003-10-26 11:05:45 EST
Created attachment 6554
Sample plugin
Comment 4 Ryuichiro Isobe CLA 2003-10-26 11:13:57 EST
Created attachment 6555
screenshot of invalid search result.
Comment 5 Ryuichiro Isobe CLA 2003-10-26 11:17:15 EST
I tried again with J2SDK 1.4.2_02 (the newest one),
but I get the same result.

I made a sample plugin to reproduce this bug, and I have attached the plugin
and a screenshot to this page.

1. Install the attached plugin.
2. Open the help contents.
3. Search for the word "ABCDEFG".
4. Compare your result with the screenshot above.
Comment 6 Konrad Kolosowski CLA 2003-10-28 15:41:26 EST
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-2022-jp"> works
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-2022-JP"> works
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=ISO-2022-JP"> does not work.
The problem is in our parser that extracts the charset.  The media type
parameter name should be case-insensitive, but our parser requires it to be
"charset", not "CHARSET", and fails to extract the charset for the latter.
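
For illustration only, here is a minimal sketch of extracting the charset with a case-insensitive parameter name, as described above. It is not the fix released for this bug; the class name CharsetSniffer, the method extractCharset, and the regular expression are all hypothetical, and the sketch assumes the CONTENT attribute value has already been pulled out of the <META> tag.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the Eclipse Help parser.
final class CharsetSniffer {
    // The parameter name "charset" is matched in any letter case.
    // The value is captured as written, with optional quotes skipped.
    private static final Pattern CHARSET_PARAM = Pattern.compile(
            "charset\\s*=\\s*[\"']?([^\\s;\"'>]+)",
            Pattern.CASE_INSENSITIVE);

    // Returns the charset value from a Content-Type string,
    // or null when no charset parameter is present.
    static String extractCharset(String content) {
        Matcher m = CHARSET_PARAM.matcher(content);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // All three spellings from this report yield a charset name.
        System.out.println(extractCharset("text/html; charset=iso-2022-jp"));
        System.out.println(extractCharset("text/html; charset=ISO-2022-JP"));
        System.out.println(extractCharset("text/html; CHARSET=ISO-2022-JP"));
    }
}
```

The value is returned unchanged because, per the cases listed above, the charset name itself was already accepted in either case; only the parameter name needed case-insensitive matching.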
Comment 7 Konrad Kolosowski CLA 2003-10-28 15:44:08 EST
I have released a fix for 3.0 M5.

If you are using 2.1.x builds, you can write the charset value in upper case;
just make sure that the parameter name "charset" is typed in lower case.