| Summary: | URIUtil#toURI(URL) encodes properly encoded URLs again | | |
|---|---|---|---|
| Product: | [Eclipse Project] Equinox | Reporter: | Markus Keller <markus.kell.r> |
| Component: | Components | Assignee: | equinox.components-inbox <equinox.components-inbox> |
| Status: | CLOSED WONTFIX | QA Contact: | |
| Severity: | major | | |
| Priority: | P3 | CC: | daniel_megert, john.arthorne, mober.at+eclipse |
| Version: | 3.7 | | |
| Target Milestone: | --- | | |
| Hardware: | PC | | |
| OS: | Windows 7 | | |
| Whiteboard: | stalebug | | |
| Bug Depends on: | | | |
| Bug Blocks: | 337479 | | |

Description
Markus Keller
I will work on the javadoc.

We can't handle both properly encoded and unencoded URLs with a single method. I.e., if the URL contains "%20", this is either a properly encoded space character or an unencoded string containing the three chars (%, 2, 0). We can't reliably determine which it is. There is no perfect answer for all cases, but it bothers me that this method treats file: URLs differently from any other URL. For non-file URLs, it starts with the assumption that the URL is a well-formed URI, and if that fails, it assumes the URL is not encoded. For file: URLs, it always assumes the URL is not properly encoded.

However, the more I think about it, the more I think we should leave it alone. Since java.io.File#toURL always returns unencoded URLs, it is most likely that if someone has a file: URL, then it needs to be encoded. For non-file URLs, I like the fact that we prefer well-formed URIs, since it encourages clients to do the right thing and generally transition their code to well-formed encoded URIs. I would be happy to add javadoc here saying java.net.URL should be avoided at all costs because it is inherently ambiguous and doesn't conform to RFC-specified behaviour for URLs and URIs.

*** Bug 337479 has been marked as a duplicate of this bug. ***

(In reply to bug 337479 comment #2)
java.io.File is not supposed to know about fragments, so all # in a file path should be considered file name characters. java.io.File#toURL() is indeed broken and has been deprecated in 1.6. Since it loses information, nobody should use that method any more (also pre 1.6).

We have to decide what URIUtil#toURI(URL) is meant for. If it is meant for "fixing" unencoded URLs, then the Javadoc should state explicitly that this is not a recommended method, and that clients should instead avoid File#toURL() in the first place and use File#toURI(). Or, if they need a URL, they should use file.toURI().toURL().

But I also see some special handling for UNC paths. I'm not sure exactly which bugs the implementation should fix there. But if the UNC special cases are also required for properly encoded URLs, then we need a separate API that only fixes the UNC problems but does not encode the URL again.

UNC handling in the wiki also needs a closer look: http://wiki.eclipse.org/Eclipse/UNC_Paths needs to be updated, since it recommends the broken File#toURL() and unconditionally recommends URIUtil.

(In reply to comment #2)
> I would be happy to add javadoc here saying java.net.URL should be avoided at all costs ...

I don't think URL is the problem. File#toURL() is broken.

This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. -- The automated Eclipse Genie.

This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're closing this bug. If you have further information on the current state of the bug, please add it and reopen this bug. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. -- The automated Eclipse Genie.
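For readers following the "%20" discussion in the comments above, here is a minimal sketch of the ambiguity using only java.io.File and java.net.URI. It does not call URIUtil and is not the Equinox implementation; the class name UrlEncodingDemo, its method names, and the sample paths are made up for illustration. It relies on two documented JDK behaviours: the single-argument URI constructor parses its input as already encoded, while the multi-argument constructors always quote '%'.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class; none of these names come from Equinox.
class UrlEncodingDemo {

    /** Path of the URL from the deprecated File#toURL(): a space stays a literal space. */
    @SuppressWarnings("deprecation")
    static String legacyUrlPath(File file) throws MalformedURLException {
        return file.toURL().getPath();
    }

    /** Path of the URL from file.toURI().toURL(): a space is encoded as %20. */
    static String encodedUrlPath(File file) throws MalformedURLException {
        return file.toURI().toURL().getPath();
    }

    /** Reading A: treat the text as already encoded, so "%20" decodes to a space. */
    static String decodeAsEncoded(String path) throws URISyntaxException {
        // The single-argument constructor parses escapes; getPath() decodes them.
        return new URI(path).getPath();
    }

    /** Reading B: treat the text as unencoded, so "%20" is three literal characters. */
    static String decodeAsUnencoded(String path) throws URISyntaxException {
        // The multi-argument constructors always quote '%', so the raw path
        // holds "%2520" and getPath() returns the original text unchanged.
        return new URI(null, null, path, null).getPath();
    }

    public static void main(String[] args) throws Exception {
        File file = new File("a b.txt"); // hypothetical file; it need not exist
        System.out.println(legacyUrlPath(file));
        System.out.println(encodedUrlPath(file));

        String ambiguous = "/tmp/a%20b.txt"; // hypothetical path text
        System.out.println(decodeAsEncoded(ambiguous));
        System.out.println(decodeAsUnencoded(ambiguous));
    }
}
```

Both readings of the same path text are valid, which is why a single method cannot pick the right one for every caller. Starting from file.toURI() sidesteps the question, because the result is always encoded.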