Bug 346793 - [misc] XHTML: content-type meta tag is ignored during charset detection
Status: RESOLVED WORKSFORME
Alias: None
Product: WTP Source Editing
Classification: WebTools
Component: wst.html
Version: unspecified
Hardware: PC Windows 7
Importance: P3 normal
Target Milestone: ---
Assignee: wst.html
QA Contact: Nick Sandonato
Reported: 2011-05-21 13:16 EDT by Sven Köhler
Modified: 2012-11-08 10:12 EST

Description Sven Köhler 2011-05-21 13:16:10 EDT
Build Identifier: M20110210-1200

Hi,

In Bug 318768, I reported that the document charset was not properly detected. In fact, I believe the W3C suggests that the default charset of an XHTML document is UTF-8, and that you may specify an alternative character set via an XML declaration.

However, for backwards compatibility, the W3C suggests that people who serve XHTML as text/html may also use a meta tag to specify the content type.

http://www.w3.org/International/O-charset.en.php?changelang=en

I believe the order of detection for XHTML documents should be the following:
- look for a BOM
- look for an XML declaration
- look for a meta tag


I believe the last step is currently missing.
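
To make the proposed order concrete, here is a minimal sketch in Python. It is only an illustration of the three steps listed above, not the WTP implementation: the function name detect_xhtml_charset, the two regular expressions, and the 1024-byte scan window are all assumptions made for this example, and the UTF-8 fallback reflects the default described earlier in this report.

```python
import codecs
import re

# Byte-order marks. UTF-32 is checked before UTF-16 because the UTF-32LE BOM
# begins with the same two bytes as the UTF-16LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]

# Hypothetical, simplified patterns; a real parser would be stricter.
_XML_DECL = re.compile(
    rb'<\?xml[^>]*?\bencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)
_META_CHARSET = re.compile(
    rb'<meta\b[^>]*?charset\s*=\s*["\']?([A-Za-z0-9._-]+)', re.IGNORECASE
)


def detect_xhtml_charset(data: bytes, default: str = "UTF-8") -> str:
    """Return a charset name using the order: BOM, XML declaration, meta tag."""
    # Step 1: a BOM, if present, wins over everything else.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name

    # Only the start of the document is scanned (assumed window size).
    head = data[:1024]

    # Step 2: the XML declaration must sit at the very start of the document.
    match = _XML_DECL.match(head)
    if match:
        return match.group(1).decode("ascii")

    # Step 3: fall back to a meta tag carrying a charset.
    match = _META_CHARSET.search(head)
    if match:
        return match.group(1).decode("ascii")

    # Nothing declared: use the default.
    return default
```

With this sketch, a document that has no BOM and no XML declaration but contains `<meta http-equiv="Content-Type" content="text/html; charset=windows-1252" />` would be reported as windows-1252, which is the case I believe is currently not handled.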


Reproducible: Always
Comment 1 Nick Sandonato 2012-11-08 10:12:29 EST
Thanks for the bug report. I'm seeing the proper encoding being picked up from the meta tag in the absence of the XML declaration for XHTML files. This may have been resolved in the meantime. If you have a different scenario that we're not covering, please reopen the defect with a scenario we should try.