Created attachment 117265 [details] patch for encoding guessing
The patch for bug 251748 changed how content validity is determined for the content type when the value of the XML encoding attribute is not terminated by a closing quote. Previously, if the encoding was incomplete, the algorithm returned null and did not set the charset for the content description, but the content was still VALID. Now the algorithm returns the remainder of the line after the opening quote, if one is present. This yields charsets such as UTF-8?>, which fail the isCharsetValid test and cause the content to be declared INVALID for the content type. The guess appears to be off by only a little. I've attached a patch that determines the encoding a little more flexibly by reading only up to the end of the XML declaration.
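For illustration only, here is a minimal sketch of the idea described above: when the closing quote is missing, stop at the `?>` that ends the XML declaration instead of taking the rest of the line. This is not the attached patch; the class name `XmlEncodingSketch` and the method `getCharset` are hypothetical, and the sketch assumes the declaration sits on a single line.

```java
public class XmlEncodingSketch {
    // Hypothetical constant; the attribute name comes from the XML declaration syntax.
    private static final String ENCODING = "encoding";

    /**
     * Returns the declared encoding from the first line of an XML document,
     * or null if no encoding can be determined.
     */
    static String getCharset(String firstLine) {
        int encodingPos = firstLine.indexOf(ENCODING);
        if (encodingPos == -1) {
            return null; // no encoding attribute at all
        }
        // Look for the opening quote (single or double) after the attribute name.
        int firstQuote = -1;
        char quoteChar = 0;
        for (int i = encodingPos + ENCODING.length(); i < firstLine.length(); i++) {
            char c = firstLine.charAt(i);
            if (c == '"' || c == '\'') {
                firstQuote = i;
                quoteChar = c;
                break;
            }
        }
        if (firstQuote == -1) {
            return null; // attribute present but no value started
        }
        int start = firstQuote + 1;
        int end = firstLine.indexOf(quoteChar, start);
        if (end == -1) {
            // Unterminated value: read only up to the end of the XML declaration,
            // so that "UTF-8?>" is guessed as "UTF-8".
            end = firstLine.indexOf("?>", start);
            if (end == -1) {
                end = firstLine.length();
            }
        }
        String charset = firstLine.substring(start, end).trim();
        return charset.isEmpty() ? null : charset;
    }

    public static void main(String[] args) {
        System.out.println(getCharset("<?xml version=\"1.0\" encoding=\"UTF-8\"?>"));
        System.out.println(getCharset("<?xml version=\"1.0\" encoding=\"UTF-8?>"));
    }
}
```

With this approach a well-formed declaration and one missing its closing quote both produce the same guess, so the result can still pass a charset validity check rather than marking the content INVALID.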
Created attachment 117663 [details] Nick's fix + test
I released the fix and changes to the test. Thanks Nick.