Bug 337479

Summary: URIUtil#toURI(URL) mangles fragments in file URLs
Product: [Eclipse Project] Equinox
Reporter: Markus Keller <markus.kell.r>
Component: Components
Assignee: equinox.components-inbox <equinox.components-inbox>
Status: CLOSED DUPLICATE
QA Contact:
Severity: normal
Priority: P3
CC: daniel_megert, john.arthorne, tjwatson
Version: 3.7   
Target Milestone: ---   
Hardware: PC   
OS: Windows 7   
Whiteboard:
Bug Depends on: 339422    
Bug Blocks:    
Attachments:
Description Flags
Fix for one of the URIUtils none

Description Markus Keller CLA 2011-02-17 14:29:13 EST
Created attachment 189218
Fix for one of the URIUtils

I20110215-0800

URIUtil#toURI(URL) mangles fragments in file URLs (it wrongly encodes the # as %23 and also wrongly encodes other characters in the fragment).

The method probably causes similar problems for the query part of a URL.
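
The symptom can be reproduced with the JDK alone. The following is a minimal sketch, not the actual URIUtil code: it assumes the conversion hands everything after "file:" to the multi-argument java.net.URI constructor as the path, which quotes '#' as %23. The class and method names (FragmentManglingSketch, viaPathConstructor, viaParsing) are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

public class FragmentManglingSketch {

    // Assumed conversion strategy (hypothetical, for illustration only):
    // treat everything after "file:" as the path component. The
    // multi-argument URI constructor quotes characters that are illegal
    // in a path, so '#' becomes %23 and the fragment is lost.
    static URI viaPathConstructor(URL url) throws URISyntaxException {
        String afterScheme = url.toExternalForm().substring("file:".length());
        return new URI("file", null, afterScheme, null);
    }

    // For comparison: URL#toURI() re-parses the external form, so '#'
    // is still treated as the fragment separator.
    static URI viaParsing(URL url) throws URISyntaxException {
        return url.toURI();
    }

    public static void main(String[] args) throws Exception {
        URL url = new URL("file:/C:/temp/test/test.html#anchor");
        System.out.println(viaPathConstructor(url)); // file:/C:/temp/test/test.html%23anchor
        System.out.println(viaParsing(url));         // file:/C:/temp/test/test.html#anchor
    }
}
```

A browser given the first form looks for a file literally named "test.html#anchor", which would explain why Open Attached Javadoc fails with an external browser.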


I found the problem when trying to open generated Javadoc for a method in an external browser like this:

- Preferences > General > Web Browser: Select "Use external web browser"

- Generate Javadoc for a method, e.g.
package p;
public class C {
	public static void main(String[] args) {
		int i= 2;
		System.out.println(i);
	}
}

- Project > Generate Javadoc... (accept all defaults)
- select "main" in the Editor and invoke Navigate > Open Attached Javadoc
Comment 1 Markus Keller CLA 2011-03-09 15:45:19 EST
I'm not so sure any more what URIUtil#toURI(URL) is actually supposed to do, see bug 339422. Depending on that bug, the current implementation may actually be considered correct.
Comment 2 John Arthorne CLA 2011-03-28 13:43:16 EDT
Because java.net.URL and java.io.File#toURL are so broken, I'm not sure we can do much here. Take for example these two snippets:

new java.io.File("C:\\temp\\test\\a#b\\test.html").toURL().toString()

>> "file:/C:/temp/test/a#b/test.html"

new java.io.File("C:\\temp\\test\\test.html#anchor").toURL().toString()

>> "file:/C:/temp/test/test.html#anchor"

The first one is a directory containing the character #, the second is an anchor within a file. The resulting output is ambiguous. The URL class itself even parses them the same way:

new java.io.File("C:\\temp\\test\\a#b\\test.html").toURL().getRef()

>> "b/test.html"

new java.io.File("C:\\temp\\test\\test.html#anchor").toURL().getRef()

>> "anchor"

I.e., it always interprets the hash as the fragment separator, even though it might just be a path character. The fix from Markus will fix the second case, but break the first case.
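
For contrast, a minimal sketch of the same ambiguity using only JDK classes. It parses the two URL strings shown above directly (so it does not depend on a Windows file system) and then shows that java.io.File#toURI, unlike File#toURL, quotes a '#' in a directory name so that it stays part of the path. The class and helper names (HashAmbiguitySketch, refOf, fileToUri) and the Unix-style path are hypothetical; on Windows the resulting path would additionally carry a drive prefix.

```java
import java.io.File;
import java.net.URI;
import java.net.URL;

public class HashAmbiguitySketch {

    // java.net.URL always splits at the first '#', whatever the intent.
    static String refOf(String spec) throws Exception {
        return new URL(spec).getRef();
    }

    // File#toURI quotes characters that are illegal in a URI path,
    // so a '#' in a file or directory name is emitted as %23.
    static URI fileToUri(String path) {
        return new File(path).toURI();
    }

    public static void main(String[] args) throws Exception {
        // Directory named "a#b": the URL class reports a fragment anyway.
        System.out.println(refOf("file:/C:/temp/test/a#b/test.html"));    // b/test.html
        // Real anchor: indistinguishable from the case above.
        System.out.println(refOf("file:/C:/temp/test/test.html#anchor")); // anchor

        URI uri = fileToUri("/temp/test/a#b/test.html");
        System.out.println(uri.getRawPath());  // ends with /a%23b/test.html
        System.out.println(uri.getFragment()); // null
    }
}
```

Starting from a File, the first case is therefore unambiguous. Once only a URL is available, the information about whether '#' was a path character has already been lost.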
Comment 3 John Arthorne CLA 2011-03-28 15:10:49 EDT
I'm going to mark this as a duplicate of the more general bug 339422.

*** This bug has been marked as a duplicate of bug 339422 ***