(See bug https://bugs.eclipse.org/bugs/show_bug.cgi?id=422210)

When the client sends a JSON request without specifying a charset, OrionServlet decodes the body using the system character set. For example, a client sending this request:

> PUT /file/mamacdon-OrionContent/whatever/some_file.txt
> Content-Type: application/json
> Orion-Version: 1
>
> { "Name": "你好" }

where 你好 is sent as 0xE4 0xBD 0xA0 0xE5 0xA5 0xBD (the UTF-8 bytes for 你好), ends up creating a file named ä½ å¥½ because (on my machine, at least) the server decodes the JSON request as Windows-1252.

This is incorrect. JSON specifies UTF-8 as the default encoding, so that's what should get used unless the client explicitly provides one using "Content-Type: application/json;charset=Shift_JIS" or whatever.
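The corruption can be reproduced outside Orion. The following is a minimal Python sketch, not Orion's code: it builds the same JSON body as UTF-8 bytes and then decodes those bytes as Windows-1252 (Python's `cp1252` codec), which is what the server appears to do on the reporter's machine. The variable names are made up for this illustration.

```python
# Illustration only: mimic a server that decodes a UTF-8 JSON body
# with the wrong charset (Windows-1252 instead of UTF-8).
import json

# The request body as it travels on the wire: UTF-8 encoded JSON.
body = json.dumps({"Name": "你好"}, ensure_ascii=False).encode("utf-8")

# The two characters occupy six bytes: e4 bd a0 e5 a5 bd
name_bytes = "你好".encode("utf-8")
print(name_bytes.hex(" "))

# Wrong: decode with the platform charset (assumed here to be cp1252).
wrong = json.loads(body.decode("cp1252"))["Name"]

# Right: decode with JSON's default encoding.
right = json.loads(body.decode("utf-8"))["Name"]

print(repr(wrong))  # six Latin-1 range characters instead of two CJK ones
print(right)
```

Byte 0xA0 decodes to a non-breaking space in Windows-1252, which is why the garbled name appears to contain a space.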
Created attachment 254586
Test case - JSON body encoded as UTF-8

To reproduce the problem using `curl`:

1) Save body.json somewhere on your computer.
2) Log in to Orion at localhost, grab your JSESSIONID cookie and copy it.
3) Run the following command, replacing the values as needed:

> curl -v "http://localhost:8080/file/mamacdon-OrionContent/MyProject/" \
>   -H "Cookie: JSESSIONID=your_session_cookie" \
>   -H "Orion-Version: 1" \
>   -H "Content-Type: application/json" \
>   --data '@/path/to/body.json'

This will create a file inside MyProject with a messed up name depending on your system's default charset.

4) Now run this command, which sets the charset explicitly in the Content-Type header:

> curl -v "http://localhost:8080/file/mamacdon-OrionContent/MyProject/" \
>   -H "Cookie: JSESSIONID=your_session_cookie" \
>   -H "Orion-Version: 1" \
>   -H "Content-Type: application/json;charset=UTF-8" \
>   --data '@/path/to/body.json'

This time the file will be named 你好, as expected.
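The only difference between steps 3 and 4 is the charset parameter, so the expected behavior amounts to: use the charset from Content-Type when present, otherwise fall back to UTF-8. Here is a rough Python sketch of that rule. It is not the OrionServlet implementation; `charset_from_content_type` and `decode_json_body` are hypothetical helpers, and the header parsing borrows the standard library's `email.message.Message`.

```python
# Sketch of the charset selection rule: explicit charset wins,
# otherwise default to UTF-8 (never the platform charset).
import json
from email.message import Message


def charset_from_content_type(content_type, default="utf-8"):
    """Return the lower-cased charset parameter, or `default` if absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset(default)


def decode_json_body(raw, content_type):
    """Decode raw request bytes using the charset chosen above."""
    return json.loads(raw.decode(charset_from_content_type(content_type)))


# Stand-in for the attached body.json (assumed to hold the same JSON as comment 0).
raw = '{ "Name": "你好" }'.encode("utf-8")

step3 = decode_json_body(raw, "application/json")                # no charset
step4 = decode_json_body(raw, "application/json;charset=UTF-8")  # explicit
print(step3["Name"], step4["Name"])
```

With this rule both requests yield the same name, and a client that really does send another encoding can still say so in the header.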
I thought the system character set is UTF-8 by default on the servers (hub.jazz.net, orion.eclipse.org).
(In reply to Silenio Quarti from comment #2)
> I thought the system character is UTF-8 by default on the servers
> (hub.jazz.net, orion.eclipse.org).

I just tried step 3 against both servers:

orion.eclipse.org: ä½ å¥½
hub.jazz.net: 你好

So whatever the outcome of this bug, orion.eclipse.org should probably be configured to use UTF-8.
What's worse, Orion encodes JSON responses using the server's default encoding. In this case the server actually lies, claiming the response is UTF-8 when it's not.

For example: running the server on Windows, I created a project with some Hebrew characters in the name. When the client lists my projects (GET /file/user-OrionContent) the server sends this:

> HTTP/1.1 200 OK
> Cache-Control: no-store
> Content-Type: application/json; charset=UTF-8   // not true
> ...
>
> {
>   "Children": [{
>     "ChildrenLocation": "/file/abc-OrionContent/????%20??/?depth=1",   // corruption
>     "Directory": true,
>     "Id": "???? ??",
>     "ImportLocation": "/xfer/import/abc-OrionContent/????%20??",   // corruption
>     "LocalTimeStamp": 1438549964073,
>     "Location": "/file/abc-OrionContent/????%20??/",
>     "Name": "???? ??"
>   }],
>   ...
> }

See http://dev.eclipse.org/mhonarc/lists/orion-dev/msg03580.html
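The question marks are what you would expect from an encoder that cannot represent Hebrew in the platform charset and substitutes "?" for each unmappable character. A small Python sketch of that effect follows. It is an illustration, not Orion's code: the project name is a hypothetical stand-in (the real one is not given above), and Windows-1252 is assumed as the platform charset.

```python
# Illustration: encode a JSON response with a charset that cannot
# represent Hebrew, while the header still claims UTF-8.
import json

# Hypothetical Hebrew project name: four letters, a space, two letters.
project_name = "\u05e9\u05dc\u05d5\u05dd \u05dc\u05da"
payload = json.dumps({"Name": project_name}, ensure_ascii=False)

# Header sent in both cases.
headers = {"Content-Type": "application/json; charset=UTF-8"}

# Buggy: platform charset (assumed cp1252); unmappable characters become "?".
corrupted = payload.encode("cp1252", errors="replace")

# Fixed: encode with the charset the header advertises.
correct = payload.encode("utf-8")

# What a client sees after trusting the header and decoding as UTF-8.
seen_buggy = json.loads(corrupted.decode("utf-8"))["Name"]
seen_fixed = json.loads(correct.decode("utf-8"))["Name"]
print(seen_buggy)
print(seen_fixed == project_name)
```

Unlike the request-side problem, this loss is irreversible: once the characters have been replaced, the client cannot recover the original name.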
Opened bug 474366 for the issue in comment 4, since it can be fixed separately.
Closing as part of a mass clean up of inactive bugs. Please reopen if this problem still occurs or is relevant to you. For more details see: https://dev.eclipse.org/mhonarc/lists/orion-dev/msg04002.html