After the work on Bug 333925, I noticed another inefficiency. Right now, the HTMLResourceEncodingDetector always tries to read in bytes for the EncodingGuesser to work on; however, the EncodingGuesser only ever works for Japanese encodings. So it seems like we're doing a lot of extra work.
Created attachment 188617: patch
The patch makes it so that the detector only bothers reading from the stream if the EncodingGuesser is capable of making a guess. In a large enough workspace (~1020 HTML files) I saw IFile#getContentDescription() times go from ~10 ms per file down to about 0.66 ms per file.
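For anyone reading along without the attachment, here is a rough sketch of the idea, not the patch itself. All names in it (LazyGuessSketch, Guesser, canGuess, CountingStream, detect) are hypothetical stand-ins and not the actual WTP classes; the only point is the guard: ask the guesser whether it can guess at all before touching the stream.

```java
import java.io.IOException;
import java.io.InputStream;

public class LazyGuessSketch {

    /** Hypothetical stand-in for an encoding guesser; NOT the real WTP API. */
    interface Guesser {
        /** Cheap check that needs no I/O, e.g. "is a Japanese guess applicable?". */
        boolean canGuess();

        /** Guess an encoding name from the first {@code length} bytes of {@code head}. */
        String guess(byte[] head, int length);
    }

    /** Wraps a stream and counts how many bytes were actually read from it. */
    static final class CountingStream extends InputStream {
        private final InputStream in;
        long bytesRead = 0;

        CountingStream(InputStream in) {
            this.in = in;
        }

        @Override
        public int read() throws IOException {
            int b = in.read();
            if (b >= 0) {
                bytesRead++;
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = in.read(buf, off, len);
            if (n > 0) {
                bytesRead += n;
            }
            return n;
        }
    }

    /**
     * Returns the guessed encoding, or null if the guesser cannot help.
     * The stream is only read when the guesser says it can make a guess.
     */
    static String detect(InputStream in, Guesser guesser, int limit) throws IOException {
        if (!guesser.canGuess()) {
            return null; // skip the read entirely; this is the saving
        }
        byte[] head = new byte[limit];
        int total = 0;
        while (total < limit) {
            int n = in.read(head, total, limit - total);
            if (n < 0) {
                break; // end of stream
            }
            total += n;
        }
        return guesser.guess(head, total);
    }
}
```

With a guesser whose canGuess() returns false, detect() returns null and the counting stream reports zero bytes read, which is the case the patch is meant to make cheap.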
Code checked in. Thanks, Nitin.