Bug 161010

Summary: Unicode BOMs not detected in ordinary text files
Product: [Eclipse Project] Platform
Reporter: David Williams <david_williams>
Component: Resources
Assignee: Platform-Resources-Inbox <platform-resources-inbox>
Status: RESOLVED DUPLICATE
QA Contact:
Severity: major
Priority: P3
CC: thatnitind
Version: 3.2
Target Milestone: ---
Hardware: PC
OS: Windows XP
Whiteboard:
Attachments:
Description: this zip contains the presentations.data file.
Flags: none

Description David Williams CLA 2006-10-16 02:37:40 EDT
I'll attach a sample called presentations.data, in a zip file to avoid corruption, that has an initial BOM indicating UTF-16. 

Even if the contents were not XML, the file is still supposed to be read in as UTF-16, according to the Unicode spec. 

And, further, since after the BOM there is an XML Declaration, the file should be detected as "xml content type". 
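To make the expectation concrete, here is a minimal sketch of the kind of check I have in mind, in plain Java. It is not the Platform's actual content type code; the class name BomSniffer and both method names are made up for illustration. It only covers the UTF-8 and UTF-16 BOMs (a UTF-32LE BOM also starts with FF FE and is not distinguished here).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not an Eclipse API.
final class BomSniffer {

    /** Returns the charset implied by a leading BOM, or null if no known BOM is present. */
    static Charset detectBom(byte[] head) {
        if (head.length >= 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            return StandardCharsets.UTF_16BE;
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            return StandardCharsets.UTF_16LE;
        }
        return null;
    }

    /** True if the bytes start with a known BOM that is followed directly by an XML declaration. */
    static boolean startsWithXmlDeclaration(byte[] head) {
        Charset charset = detectBom(head);
        if (charset == null) {
            return false; // files without a BOM are out of scope for this sketch
        }
        // The UTF-8 BOM is three bytes long, the UTF-16 BOMs are two.
        int bomLength = charset.equals(StandardCharsets.UTF_8) ? 3 : 2;
        String text = new String(head, bomLength, head.length - bomLength, charset);
        return text.startsWith("<?xml");
    }
}
```

With something along these lines, the first bytes of the file would be enough to pick UTF-16 and to recognize the XML declaration, regardless of the "data" extension.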

If I explicitly use preferences to say that "data" extensions are xml, everything works as expected, but that should not be required for a file like this. 

I fear I'm missing something, since this is a pretty big bug and I am surprised it has existed for so long.
Comment 1 David Williams CLA 2006-10-16 02:38:43 EDT
Created attachment 52016
this zip contains the presentations.data file.
Comment 2 John Arthorne CLA 2006-10-16 10:27:02 EDT

*** This bug has been marked as a duplicate of 160614 ***