| Summary: | Inconsistency with "multipart/form-data" and "name" field | ||
|---|---|---|---|
| Product: | [RT] Jetty | Reporter: | Bill Mair <bill.mair> |
| Component: | server | Assignee: | Jan Bartel <janb> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | normal | ||
| Priority: | P3 | CC: | david_williams, janb, jetty-inbox, tjwatson |
| Version: | 8.1.4 | ||
| Target Milestone: | 9.2.x | ||
| Hardware: | All | ||
| OS: | All | ||
| Whiteboard: | |||
| Bug Depends on: | |||
| Bug Blocks: | 458126 | ||
|
Description
Bill Mair
Hi Bill, Can you post an analysis using a more recent version of jetty, eg 9.2.6? thanks Jan Hi Jan, I've just looked at the 9.2.6 code and it is essentially the same. In o.e.j.u.MultiPartInputStreamParser, if the expected "name" isn't set, then the code "continue"s to look for a name (585). This takes it back to the start of the "outer:" while loop which resets the already parsed variables (514-516). The next line to be read is the empty line before the content leading to the code assuming that the data is now headers and reads until it hits the next empty line where it "break"s (529) then throws an IOException at 555 because 514-516 had taken care of the values. This exception is still consumed by o.e.j.server.Request in extractMultipartParameters at line 387 where it is simply logged. The client still becomes an empty result list and has no idea that an error has occurred and incorrectly assumes that it was an empty request that was sent without any parts. This is a major problem, maybe more so than the cause. Regarding the options, I would do the following: 1) Add a warning at 585 instead of a continue. 2) Add all the parts into a List<Part> than can be accessed with getParts() and that getPart(Sting name) iterates through the list looking for the header "name". 3) Make sure that any exceptions either produced or encountered are "thrown all the way" to the calling servlet. The same is true if the remote client closes the connection. You continue processing although the connection is gone because an EOFException never gets through. Bill, Hhhmmm, bit tricky. If there is an error parsing the multipart data, how to communicate that to a servlet? Consider the following scenarios: 1. If the parsing error occurred before we successfully read the first part, then potentially we could throw an error when the application calls getParts() and/or getPart(String). 2. However, if some parts were read ok before the parsing error occurred, how do we communicate that? Do we just return the ones that were parsed ok for getParts() ... then return null for getPart(String) for any other names? Alternatively, we could always throw an exception for getParts() and getPart(String), but then the app doesn't get access to the parts successfully received. Thinking about it, perhaps in this circumstance we could return the successful parts in getParts(), but any call to getPart(String) for a non-existing part could throw an exception if an exception happened during parsing. A return value of null to getPart(String) would mean that the part was never sent. 3. If an error occurred during the parsing of the content of a part (after we got the name, filename etc) then we could throw an exception on getPart(String), and any of the methods on the Part that seek to retrieve the part's content. Does this part get returned in a call to getParts() or not? Unfortunately, the spec is rather silent on these complexities. Your thoughts? Jan The request either complete or not, so in my opinion all of the cases you've identified should result in a ServletException being thrown when getPart(s) is called. If someone wants to process "partial" requests then that would be a case for a custom handler but for most applications I would expect an all or nothing kind of approach. When a ServletException exception is thrown you should indicate the cause in the message, e.g. formatting/parser error, EOF, etc. . Bill, Can you post the dump of a multipart request that triggers this problem? I'm having a little difficulty crafting a multipart request without a name that causes an IOException. thanks Jan (In reply to Jan Bartel from comment #1) > Hi Bill, > > Can you post an analysis using a more recent version of jetty, eg 9.2.6? > > thanks > Jan For example, the following multipart request input (where the first field does not contain a name filed) does NOT cause a parse exception: String multipart = "--AaB03x\r\n"+ "content-disposition: form-data; \r\n"+ "\r\n"+ "Joe Blow\r\n"+ "--AaB03x\r\n"+ "content-disposition: form-data; name=\"stuff\"; filename=\"foo.upload\"\r\n"+ "Content-Type: text/plain;charset=ISO-8859-1\r\n"+ "\r\n"+ "000000000000000000000000000000000000000000000000000\r\n"+ "--AaB03x--\r\n"; String request="GET /foo/x.html HTTP/1.1\r\n"+ "Host: whatever\r\n"+ "Content-Type: multipart/form-data; boundary=\"AaB03x\"\r\n"+ "Content-Length: "+multipart.getBytes().length+"\r\n"+ "Connection: close\r\n"+ "\r\n"+ multipart; BTW, the solution we're pursing to this problem is: + if the application does NOT call one of the getParameter() methods, an exception is thrown during the parsing invoked by Request.getPart/Parts() + if the application DOES call one of the getParameter() methods, the parsing of the multipart stream will cause a RuntimeException to be thrown, that contains the IOException. I think this satisfies your needs re exception handling, yes? Jan Hi Jan, seeing as getPart(s) are allowed to throw either an IOException, a ServletException or an IllegalStateException, I don't see the need for an unexpected RunTimeException. If the Servlet accesses any request headers then the "content" of the request doesn't have to be read. If it identifies the request as being "multipart/form-data" and then calls getPart(s), then the appropriate Exception (IO or Servlet) can/should be thrown directly. If the request is wrong then a ServletException (i.e. request is not multipart/form-data). If the parts can not be read because the connection is closed then an EOFException can be thrown directly. And if the part with the requested "name" is not there BUT all the content was read then getPart(String) should return null. I really think that preemptively parsing the entire message while getting the request headers is happening at the wrong time. I think the request body should only be processed if and when the Servlet calls either getPart(String) or getParts(), thus allowing the parser to react by throwing the appropriate Exception directly. If you left it to being called only when the Servlet wants the Server to process the request for you by calling getPart(s), then I think you have the correct processing behaviour. This leaves it up to the Servlet to respond to a "multipart/form-data" itself by parsing the contents itself after getting the InputStream, if so desired. (In reply to Bill Mair from comment #8) > Hi Jan, > > seeing as getPart(s) are allowed to throw either an IOException, a > ServletException or an IllegalStateException, I don't see the need for an > unexpected RunTimeException. There isn't. The operation of getPart/Parts() conforms to the servlet api and throws only the exceptions defined by it. The RuntimeException will only be thrown by getParameterXXXX() methods, as it is not permitted to throw checked exceptions. See below for why. > If the Servlet accesses any request headers then the "content" of the > request doesn't have to be read. Not true unfortunately, see below. > If it identifies the request as being "multipart/form-data" and then calls > getPart(s), then the appropriate Exception (IO or Servlet) can/should be > thrown directly. Absolutely, we're in strenuous, 100% agreement here :) > > If the request is wrong then a ServletException (i.e. request is not > multipart/form-data). > > If the parts can not be read because the connection is closed then an > EOFException can be thrown directly. > > And if the part with the requested "name" is not there BUT all the content > was read then getPart(String) should return null. > > I really think that preemptively parsing the entire message while getting > the request headers is happening at the wrong time. Unfortunately, the servlet spec says the following, section 3.2 File Upload: "For parts with form-data as the Content-Disposition , but without a filename, the string value of the part will also be available through the getParameter and getParameterValues methods on HttpServletRequest , using the name of the part." You have to parse the form-data in order to be able to fulfil this contract, and as the data has to be available when the user calls getParameter, it has to be parsed either lazily at that time, or sooner, before the request gets to the servlet. FWIW we are discussing using the latter approach above in jetty 9.3, so that the container can direct straight to an error page if there is a problem processing the multipart. However, this is a bit too big of a change for a stable release like 9.2, hence the approach of throwing RuntimeException for the getParamterXXX() methods. Note that if your code doesn't call getParameterXXXX() methods, then you should be fine to rely on calling getPart/Parts() to throw you the exception. > > I think the request body should only be processed if and when the Servlet > calls either getPart(String) or getParts(), thus allowing the parser to > react by throwing the appropriate Exception directly. > > If you left it to being called only when the Servlet wants the Server to > process the request for you by calling getPart(s), then I think you have the > correct processing behaviour. > > This leaves it up to the Servlet to respond to a "multipart/form-data" > itself by parsing the contents itself after getting the InputStream, if so > desired. I think you should open an issue with the Servlet Spec JSR so they can clarify this behaviour (getPart/Parts() being a relatively new addition in servlet 3.0, bolted onto the existing getParameterXXX() way of serving back multipart content). The issue tracker is here: https://java.net/jira/browse/SERVLET_SPEC cheers Jan Hi Jan, some more input from my customer: They are POSTing the data according to RFC-2045 and RFC-2046. RFC-2183 defines the "optional" Content-Disposition field (that means per definition, the specification in RFC-2388 is optional too). So in a nutshell, I think that jetty should only verify the minimal RFC specifications to maintain backwards compatibility (i.e. 2045 & 2046). A note to jetty-dev about 9.2.7 lists this as one of the "fixed' bugs. https://dev.eclipse.org/mhonarc/lists/jetty-dev/msg02316.html Was it? (and status simply needs to be changed?) ... or, was it "partially fixed" and that's why its still open? Or, was the list of fixed bugs incorrect? Thanks, David, I did not make it clear in comment #9 (just alluded to it) but the situation is thus: + a change was made in 9.2 as per comment #7 thus the release notes are correct + the issue is left open as we have not yet made the further changes that were contemplated for 9.3. I will close this issue and open a fresh one for the further changes to 9.3 and link the issues together if possible. Jan (In reply to David Williams from comment #11) > A note to jetty-dev about 9.2.7 lists this as one of the "fixed' bugs. > > https://dev.eclipse.org/mhonarc/lists/jetty-dev/msg02316.html > > Was it? (and status simply needs to be changed?) ... or, was it "partially > fixed" and that's why its still open? > > Or, was the list of fixed bugs incorrect? > > Thanks, Hi Bill, Well, if you want a servlet to intercept the multipart POST, it has to be multipart/form-data, and the relevant RFC here is https://www.ietf.org/rfc/rfc1867.txt. That RFC does say that a multipart/form-data part is expected to contain a name field. RFC 2388 is an update to that specification, explicitly refers back to it in the introduction (see https://www.ietf.org/rfc/rfc2388.txt section 2), and makes it clear that a name field is required. None of the specs you mention below refer to multipart/form-data, and the servlet specification (v3.1 pg 3-24) explicitly refers to multipart/form-data processing. As per my comment #9, we've changed jetty-9.2 to throw some exceptions to indicate a problem processing a multipart/form-data part at a point in the code where it makes sense to a servlet retrieving the parts. We also intend to change the point at which the input is processed in jetty-9.3 (for which I will open a different bug and post the bug number on this thread). However, we have no plans to change jetty to permit missing name fields for the reasons I've given previously. I do encourage you though to take this to the Servlet Spec JSR as I mentioned previously, as we would need some clarification from the servlet spec as to how to handle 1 or more parts that have no name. I'll close this issue now and open a fresh one for planned changes to 9.3. regards Jan (In reply to Bill Mair from comment #10) > Hi Jan, > > some more input from my customer: > > They are POSTing the data according to RFC-2045 and RFC-2046. > > RFC-2183 defines the "optional" Content-Disposition field (that means per > definition, the specification in RFC-2388 is optional too). > > So in a nutshell, I think that jetty should only verify the minimal RFC > specifications to maintain backwards compatibility (i.e. 2045 & 2046). Opened separate issue for changes to jetty-9.3: https://bugs.eclipse.org/bugs/show_bug.cgi?id=458126 Good point Jan. RFC 1867 covers posting multipart/form-data. 2045 and 2046 that are quoted by the customer cover MIME. :-/ I will add to other report for my proposed resolution. |