Bug 254522 - XMLContentDescriber guesses charset incorrectly for unterminated encoding value
Status: RESOLVED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Resources
Version: 3.5
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: 3.5 M4
Assignee: Platform-Resources-Inbox CLA
Blocks: 254504
Reported: 2008-11-06 16:43 EST by Nick Sandonato CLA
Modified: 2009-06-03 10:08 EDT

Attachments
patch for encoding guessing (999 bytes, patch)
2008-11-06 16:43 EST, Nick Sandonato CLA
Szymon.Brandys: iplog+
Nick's fix + test (9.71 KB, patch)
2008-11-12 09:37 EST, Szymon Brandys CLA

Description Nick Sandonato CLA 2008-11-06 16:43:39 EST
Created attachment 117265
patch for encoding guessing

The patch for bug 251748 changed the validity of the content for the content type when the value of the XML encoding attribute is not terminated by a closing quote. Previously, if the encoding value was incomplete, the charset lookup returned null and the charset was not set on the content description, but the content was still VALID.

Now the algorithm returns the remainder of the line after the opening quote, if one is present. This yields charsets such as UTF-8?>, which fail the isCharsetValid test and cause the content to be declared INVALID for the content type. The best-guess logic appears to be only slightly off.

I've attached a patch that makes the determination of the encoding a little more flexible by reading only up to the end of the XML declaration (?>) rather than to the end of the line.
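
To illustrate the idea, here is a minimal sketch of that kind of guessing. It is not the attached patch and not the actual XMLContentDescriber code: the class name EncodingGuessSketch and both method names are hypothetical, and the validity check is assumed to behave roughly like java.nio.charset.Charset.isSupported.

```java
import java.nio.charset.Charset;

// Hypothetical sketch, not the real XMLContentDescriber implementation.
final class EncodingGuessSketch {
    private static final String ENCODING = "encoding";
    private static final String DECL_END = "?>";

    // Returns the declared charset name, or null if none can be found.
    static String guessCharset(String declaration) {
        int encodingPos = declaration.indexOf(ENCODING);
        if (encodingPos == -1) {
            return null;
        }
        int searchFrom = encodingPos + ENCODING.length();
        int doubleQuote = declaration.indexOf('"', searchFrom);
        int singleQuote = declaration.indexOf('\'', searchFrom);
        if (doubleQuote == -1 && singleQuote == -1) {
            return null; // no opening quote at all
        }
        // Use whichever quote character appears first after "encoding".
        boolean useDouble = singleQuote == -1
                || (doubleQuote != -1 && doubleQuote < singleQuote);
        int open = useDouble ? doubleQuote : singleQuote;
        char quote = useDouble ? '"' : '\'';

        int close = declaration.indexOf(quote, open + 1);
        if (close != -1) {
            return declaration.substring(open + 1, close);
        }
        // Unterminated value: stop before the declaration end instead of
        // taking the rest of the line.
        int declEnd = declaration.indexOf(DECL_END, open + 1);
        String rest = declEnd == -1
                ? declaration.substring(open + 1)
                : declaration.substring(open + 1, declEnd);
        rest = rest.trim();
        return rest.isEmpty() ? null : rest;
    }

    // Assumed stand-in for the isCharsetValid test mentioned above.
    static boolean isCharsetValid(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            // Thrown for null or syntactically illegal names such as "UTF-8?>".
            return false;
        }
    }
}
```

With this sketch, a declaration such as <?xml version="1.0" encoding="UTF-8?> yields UTF-8 instead of UTF-8?>, so the validity check passes. A later quote inside the declaration (for example from a standalone attribute) would still be mistaken for the closing quote, so this is only a best guess.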
Comment 1 Szymon Brandys CLA 2008-11-12 09:37:35 EST
Created attachment 117663
Nick's fix + test
Comment 2 Szymon Brandys CLA 2008-11-12 09:42:03 EST
I released the fix and changes to the test. Thanks, Nick.