This will likely have to be just a readme item, as it appears to be a problem with the XML parser. The XMLRootHandler fails to parse an XML file if that file has a byte order mark (BOM): a fatal SAXParseException reports that the document root element is missing. This only occurs with the Crimson parser; with the Xerces parser the file is parsed successfully. For a test file with a BOM, see bug 61564. As a result of this problem, valid buildfiles do not have the Run Ant menu entries in the Run context menu.
That is right, some parsers seem not to be able to take BOMs in XML files.
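One way to check whether a given parser accepts a BOM is to feed it a small in-memory document with the UTF-8 BOM bytes (EF BB BF) prepended. The following is only a sketch: the class name `BomParseCheck` and its helper methods are made up for illustration, and it exercises whatever parser `SAXParserFactory` returns on the running VM, so the outcome depends on that VM.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Eclipse or Ant.
public class BomParseCheck {

    /** Returns the given XML text encoded as UTF-8 with a leading BOM. */
    public static byte[] withUtf8Bom(String xml) {
        byte[] body = xml.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[body.length + 3];
        out[0] = (byte) 0xEF;
        out[1] = (byte) 0xBB;
        out[2] = (byte) 0xBF;
        System.arraycopy(body, 0, out, 3, body.length);
        return out;
    }

    /**
     * Parses the bytes with the VM's default SAX parser and returns the
     * qualified name of the root element. A parser that cannot handle the
     * BOM is expected to throw a SAXParseException here.
     */
    public static String rootElementOf(byte[] xml) throws Exception {
        final String[] root = new String[1];
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new ByteArrayInputStream(xml), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if (root[0] == null) {
                    root[0] = qName; // remember only the first element seen
                }
            }
        });
        return root[0];
    }

    public static void main(String[] args) {
        byte[] doc = withUtf8Bom("<project name=\"test\"/>");
        try {
            System.out.println("Root element: " + rootElementOf(doc));
        } catch (Exception e) {
            System.out.println("Parser rejected the BOM: " + e);
        }
    }
}
```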
Darin, actually I am seeing the problem while using IBM's VM (which uses Xerces). Using Sun's VM does not cause any problem. Can you confirm this?
This is what I am seeing with IBM 1.4.1 (if you enable tracing for org.eclipse.core.runtime and check the contenttypes/debug debug option, you should be able to see errors thrown by content describers in the log):
sun.io.MalformedInputException
    at sun.io.ByteToCharUnicode.flush(ByteToCharUnicode.java:214)
    at sun.nio.cs.StreamDecoder$ConverterSD.flushInto(StreamDecoder.java:305)
    at sun.nio.cs.StreamDecoder$ConverterSD.implRead(StreamDecoder.java:329)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:222)
    at java.io.InputStreamReader.read(InputStreamReader.java:207)
    at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
    at org.apache.xerces.impl.XMLEntityScanner.skipString(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentScannerImpl$XMLDeclDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
    at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at javax.xml.parsers.SAXParser.parse(Unknown Source)
    at org.eclipse.core.internal.content.XMLRootHandler.parseContents(XMLRootHandler.java:176)
    at org.eclipse.core.runtime.content.XMLRootElementContentDescriber.checkCriteria(XMLRootElementContentDescriber.java:75)
    at org.eclipse.core.runtime.content.XMLRootElementContentDescriber.describe(XMLRootElementContentDescriber.java:105)
    at org.eclipse.core.internal.content.ContentType.describe(ContentType.java:172)
    at org.eclipse.core.internal.content.ContentTypeManager.internalFindContentTypesFor(ContentTypeManager.java:278)
    at org.eclipse.core.internal.content.ContentTypeManager.getDescriptionFor(ContentTypeManager.java:244)
    at org.eclipse.core.internal.resources.ContentDescriptionManager.readDescription(ContentDescriptionManager.java:93)
    at org.eclipse.core.internal.resources.ContentDescriptionManager.getDescriptionFor(ContentDescriptionManager.java:58)
    at org.eclipse.core.internal.resources.File.getCharset(File.java:220)
The content type fails to be set for me using a Sun 1.4.2 VM (Crimson parser) with the test case of bug 61564. While debugging, the following fatal exception occurs:
Thread [main] (Suspended (exception SAXParseException))
    XMLRootHandler(DefaultHandler).fatalError(SAXParseException) line: 447
    Parser2.fatal(String, Object[], Exception) line: 3342
    Parser2.fatal(String) line: 3327
    Parser2.parseInternal(InputSource) line: 635
    Parser2.parse(InputSource) line: 333
    XMLReaderImpl.parse(InputSource) line: 448
    SAXParserImpl(SAXParser).parse(InputSource, DefaultHandler) line: 345
    XMLRootHandler.parseContents(InputSource) line: 176
    XMLRootElementContentDescriber.checkCriteria(InputSource) line: 75
    XMLRootElementContentDescriber.describe(InputStream, IContentDescription) line: 105
    ContentType.describe(IContentDescriber, InputStream, ContentDescription) line: 172
    ContentTypeManager.internalFindContentTypesFor(InputStream, IContentType[]) line: 278
    ContentTypeManager.getDescriptionFor(InputStream, String, QualifiedName[]) line: 244
    ContentDescriptionManager.readDescription(File) line: 92
    ContentDescriptionManager.getDescriptionFor(File, ResourceInfo) line: 57
    File.getContentDescription() line: 239
    EncodingActionGroup.getEncodingFromContent(IFile) line: 209
    EncodingActionGroup.getDefaultEncodingText(ITextEditor, String) line: 196
    EncodingActionGroup.access$0(ITextEditor, String) line: 186
    EncodingActionGroup$PredefinedEncodingAction.update() line: 166
    EncodingActionGroup.update() line: 437
    DefaultEncodingSupport.reset() line: 93
    AntEditor(TextEditor).updatePropertyDependentActions() line: 312
    AntEditor(AbstractTextEditor).firePropertyChange(int) line: 4525
    AbstractTextEditor$3.run() line: 301
    AbstractTextEditor$ElementStateListener.execute(Runnable) line: 424
    AbstractTextEditor$ElementStateListener.elementDirtyStateChanged(Object, boolean) line: 304
    TextFileDocumentProvider$FileBufferListener.dirtyStateChanged(IFileBuffer, boolean) line: 249
    TextFileBufferManager.fireDirtyStateChanged(IFileBuffer, boolean) line: 240
    ResourceTextFileBuffer(ResourceFileBuffer).commit(IProgressMonitor, boolean) line: 304
    AntEditorDocumentProvider(TextFileDocumentProvider).commitFileBuffer(IProgressMonitor, TextFileDocumentProvider$FileInfo, boolean) line: 680
    TextFileDocumentProvider$2.execute(IProgressMonitor) line: 642
    TextFileDocumentProvider$2(TextFileDocumentProvider$DocumentProviderOperation).run(IProgressMonitor) line: 105
    WorkspaceModifyDelegatingOperation.execute(IProgressMonitor) line: 67
    WorkspaceModifyOperation$1.run(IProgressMonitor) line: 91
    Workspace.run(IWorkspaceRunnable, ISchedulingRule, int, IProgressMonitor) line: 1673
    WorkspaceModifyDelegatingOperation(WorkspaceModifyOperation).run(IProgressMonitor) line: 105
    WorkspaceOperationRunner.run(boolean, boolean, IRunnableWithProgress, ISchedulingRule) line: 73
    WorkspaceOperationRunner.run(boolean, boolean, IRunnableWithProgress) line: 63
    AntEditorDocumentProvider(TextFileDocumentProvider).executeOperation(TextFileDocumentProvider$DocumentProviderOperation, IProgressMonitor) line: 403
    AntEditorDocumentProvider(TextFileDocumentProvider).saveDocument(IProgressMonitor, Object, IDocument, boolean) line: 623
    AntEditor(AbstractTextEditor).performSave(boolean, IProgressMonitor) line: 3444
    AntEditor(AbstractTextEditor).doSave(IProgressMonitor) line: 3233
    AntEditor.doSave(IProgressMonitor) line: 683
    EditorManager$12.run(IProgressMonitor) line: 1160
    EditorManager$10.run(IProgressMonitor) line: 1015
    ModalContext.runInCurrentThread(IRunnableWithProgress, IProgressMonitor) line: 303
    ModalContext.run(IRunnableWithProgress, boolean, IProgressMonitor, Display) line: 253
    ApplicationWindow$1.run() line: 588
    BusyIndicator.showWhile(Display, Runnable) line: 69
    WorkbenchWindow(ApplicationWindow).run(boolean, boolean, IRunnableWithProgress) line: 585
    WorkbenchWindow.run(boolean, boolean, IRunnableWithProgress) line: 1653
    EditorManager.runProgressMonitorOperation(String, IRunnableWithProgress, IWorkbenchWindow) line: 1021
    EditorManager.savePart(ISaveablePart, IWorkbenchPart, boolean) line: 1165
    WorkbenchPage.savePart(ISaveablePart, IWorkbenchPart, boolean) line: 2528
    WorkbenchPage.saveEditor(IEditorPart, boolean) line: 2540
    SaveAction.run() line: 69
    SaveAction(Action).runWithEvent(Event) line: 881
    ActionHandler.execute(Map) line: 141
    Command.execute(Map) line: 132
    WorkbenchKeyboard.executeCommand(String) line: 469
    WorkbenchKeyboard.press(List, Event) line: 887
    WorkbenchKeyboard.processKeyEvent(List, Event) line: 928
    WorkbenchKeyboard.filterKeySequenceBindings(Event) line: 546
    WorkbenchKeyboard.access$2(WorkbenchKeyboard, Event) line: 494
    WorkbenchKeyboard$1.handleEvent(Event) line: 259
    EventTable.sendEvent(Event) line: 82
    Display.filterEvent(Event) line: 714
    Tree(Widget).sendEvent(Event) line: 795
    Tree(Widget).sendEvent(int, Event, boolean) line: 820
    Tree(Widget).sendEvent(int, Event) line: 805
    Tree(Control).sendKeyEvent(int, int, int, int, Event) line: 1734
    Tree(Control).sendKeyEvent(int, int, int, int) line: 1730
    Tree(Control).WM_CHAR(int, int) line: 3067
    Tree.WM_CHAR(int, int) line: 1372
    Tree(Control).windowProc(int, int, int, int) line: 2970
    Display.windowProc(int, int, int, int) line: 3298
    OS.DispatchMessageW(MSG) line: not available [native method]
    OS.DispatchMessage(MSG) line: 1467
    Display.readAndDispatch() line: 2396
    Workbench.runEventLoop(Window$IExceptionHandler, Display) line: 1375
    Workbench.runUI() line: 1346
    Workbench.createAndRunWorkbench(Display, WorkbenchAdvisor) line: 252
    PlatformUI.createAndRunWorkbench(Display, WorkbenchAdvisor) line: 141
    IDEApplication.run(Object) line: 96
    PlatformActivator$1.run(Object) line: 335
    EclipseStarter.run(Object) line: 272
    EclipseStarter.run(String[], Runnable) line: 128
    NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]
    NativeMethodAccessorImpl.invoke(Object, Object[]) line: 39
    DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 25
    Method.invoke(Object, Object[]) line: 324
    Main.basicRun(String[]) line: 186
    Main.run(String[]) line: 647
    Main.main(String[]) line: 631
Thanks, Darin, the problem I was seeing was actually bug 67975 (UTF-16 BOM on Windows IBM VM). This bug is caused by http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4508058 and affects not only content type determination but also any other Crimson client, such as Ant. When running manually or as an external builder, I get this error:
    Buildfile: d:\temp\tests\runtime-workbench\AntTest\build.xml
    BUILD FAILED: D:\temp\tests\runtime-workbench\AntTest\build.xml:1: Document root element is missing.
    Total time: 125 milliseconds
There is one thing I do not understand: when running with IBM 1.4.1, content type detection works fine, but running Ant manually (using the "Run Ant" action) fails. What is worse, running the same script as an external tool builder works fine. Darin, is there any difference between the two modes of running Ant w.r.t. XML parsing?
All kinds of differences :-) It all depends on the VM you are running the Ant build within and on what is on the Ant runtime classpath. You can put Xerces on the Ant runtime classpath, and then Xerces is used as the parser (just like Ant at the command line). External tool builders by default run in the same VM (IBM 1.4.1). Your Run As test case: is that running in IBM 1.4.1 or in a Sun VM? What does its runtime classpath look like?
OK, got it. Since I am running (alternately) with Sun and IBM VMs on the same workspace, I guess I ran Ant for the first time using Sun's VM, and then when running with IBM's VM the original settings (Sun's) were remembered. By using the "Run->Ant build..." action, it seems I caused the settings to be recomputed for the current default JRE, because then Ant worked.
So you are going to add a readme section about this. I should probably add one in Ant land to specify how to run Ant builds for buildfiles that contain a BOM. Logged bug 68132.
So is there a workaround (other than running with Xerces)? Also, the problem is only with UTF-8 BOMs (UTF-16 BOMs are fine).
Not that I know of.
Added to README for 3.0.
*** Bug 70177 has been marked as a duplicate of this bug. ***
Would it be possible to read in the entire file, strip out the BOM characters, and process as normal? I know that this is probably overkill for every parse of every file, but if we encounter this particular exception in parsing, couldn't we rework the file contents to strip the BOM in memory before parsing?
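As a rough sketch of that idea, the BOM can be skipped without reading the whole file into memory: peek at the first three bytes and push them back if they are not the UTF-8 BOM (EF BB BF). The class `BomSkipper` and the method `skipUtf8Bom` are hypothetical names used only for illustration; this is not how XMLRootHandler or Ant actually handle the problem.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Eclipse or Ant.
public class BomSkipper {

    /**
     * Wraps the stream so that a leading UTF-8 BOM (EF BB BF), if present,
     * is consumed. Any other leading bytes are pushed back unchanged.
     */
    public static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int read = 0;
        // A single read() may return fewer than 3 bytes, so loop until
        // 3 bytes are available or the stream ends.
        while (read < 3) {
            int n = pushback.read(head, read, 3 - read);
            if (n < 0) {
                break;
            }
            read += n;
        }
        boolean hasBom = read == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        if (!hasBom && read > 0) {
            pushback.unread(head, 0, read); // not a BOM: restore the bytes
        }
        return pushback;
    }

    public static void main(String[] args) throws IOException {
        byte[] xml = "<project/>".getBytes(StandardCharsets.UTF_8);
        byte[] withBom = new byte[xml.length + 3];
        withBom[0] = (byte) 0xEF;
        withBom[1] = (byte) 0xBB;
        withBom[2] = (byte) 0xBF;
        System.arraycopy(xml, 0, withBom, 3, xml.length);

        InputStream cleaned = skipUtf8Bom(new ByteArrayInputStream(withBom));
        System.out.println((char) cleaned.read()); // first byte after the BOM
    }
}
```

The returned stream could then be handed to the parser in place of the original one. Because the check costs only three bytes of lookahead, it could run on every parse rather than only after a SAXParseException, though whether that is acceptable for the content describers is a separate question.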