Bug 333925

Summary: Inefficiency in HTMLResourceEncodingDetector
Product: [WebTools] WTP Source Editing
Reporter: Nick Sandonato <nsand.dev>
Component: wst.html
Assignee: Nick Sandonato <nsand.dev>
Status: RESOLVED FIXED
QA Contact: Nick Sandonato <nsand.dev>
Severity: normal
Priority: P3
CC: thatnitind
Version: 3.2.2
Flags: thatnitind: review+
Target Milestone: 3.2.3
Hardware: PC
OS: Windows XP
Whiteboard: WI60406
Attachments:
Description                       Flags
patch                             none
patch with end-of-stream check    none

Description Nick Sandonato CLA 2011-01-10 19:01:28 EST
Currently, the HTMLResourceEncodingDetector does a ready() check before reading a byte. For sufficiently large files, or when the detector has to run over several files, this can cause the checkheuristics() method to take up a considerable amount of time. I don't think anything is gained by bailing out when a resource isn't ready, and doing so may be more troublesome, since we then can't check the heuristics to determine the encoding.

In running tests, I noticed a performance gain of nearly 82% from just not doing the ready() check.
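
For illustration, the two reading styles discussed above might look roughly like this. This is a minimal sketch, not the detector's actual code: the class ReadyCheckSketch, its method names, and the neverReady() helper are all hypothetical, and it assumes a plain java.io.Reader as the input.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical illustration only; not taken from HTMLResourceEncodingDetector.
class ReadyCheckSketch {

    /** Reads up to max chars, calling ready() before every read. */
    static String readWithReadyCheck(Reader reader, int max) {
        StringBuilder sb = new StringBuilder();
        try {
            for (int i = 0; i < max; i++) {
                // ready() == false only means the next read might block,
                // yet this style gives up on the resource anyway.
                if (!reader.ready()) {
                    break;
                }
                int c = reader.read();
                if (c == -1) {
                    break; // end of stream
                }
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    /** Reads up to max chars without consulting ready() at all. */
    static String readWithoutReadyCheck(Reader reader, int max) {
        StringBuilder sb = new StringBuilder();
        try {
            for (int i = 0; i < max; i++) {
                int c = reader.read();
                if (c == -1) {
                    break; // end of stream
                }
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    /** Hypothetical helper: a reader that has data but always reports "not ready". */
    static Reader neverReady(String text) {
        return new FilterReader(new StringReader(text)) {
            @Override
            public boolean ready() {
                return false;
            }
        };
    }
}
```

With a reader that reports "not ready", the first method returns nothing even though data is available, while the second still returns the content that the heuristics would need.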
Comment 1 Nick Sandonato CLA 2011-01-10 19:03:13 EST
Created attachment 186446: patch
Comment 2 Nitin Dahyabhai CLA 2011-01-12 14:52:09 EST
This isn't new, but what if fReader.read() returns -1?
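
To make the concern concrete: Reader.read() returns -1 at end of stream, and casting that value to char yields U+FFFF instead of a real character. The sketch below is hypothetical (the class EndOfStreamSketch and its methods are not from the patch) and assumes the reader is a java.io.Reader, as the fReader name suggests.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical illustration only; not the code from either attachment.
class EndOfStreamSketch {

    /** Casts every read() result to char, including the -1 end-of-stream marker. */
    static String readUnchecked(Reader reader, int count) {
        StringBuilder sb = new StringBuilder();
        try {
            for (int i = 0; i < count; i++) {
                // (char) -1 is U+FFFF, so reads past the end append bogus chars
                sb.append((char) reader.read());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    /** Stops as soon as read() reports end of stream. */
    static String readChecked(Reader reader, int count) {
        StringBuilder sb = new StringBuilder();
        try {
            for (int i = 0; i < count; i++) {
                int c = reader.read();
                if (c == -1) {
                    break; // end of stream: nothing more to inspect
                }
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }
}
```

For a two-character input and a request for four characters, the unchecked version pads the result with two U+FFFF characters, while the checked version returns only the two real characters.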
Comment 3 Nick Sandonato CLA 2011-01-12 15:20:03 EST
Created attachment 186668: patch with end-of-stream check
Comment 4 Nick Sandonato CLA 2011-01-12 18:01:56 EST
Code checked in. Thanks.