Bug 470615 - JSON requests should be treated as UTF-8 unless otherwise indicated
Summary: JSON requests should be treated as UTF-8 unless otherwise indicated
Status: CLOSED WONTFIX
Alias: None
Product: Orion
Classification: ECD
Component: Server
Version: 9.0
Hardware: PC Windows 7
Importance: P3 minor
Target Milestone: ---
Assignee: Anthony Hunter CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-06-19 12:06 EDT by Mark Macdonald CLA
Modified: 2017-01-10 15:46 EST
CC List: 2 users

See Also:


Attachments
Test case - JSON body encoded as UTF-8 (20 bytes, application/octet-stream)
2015-06-19 12:17 EDT, Mark Macdonald CLA

Description Mark Macdonald CLA 2015-06-19 12:06:56 EDT
(See bug https://bugs.eclipse.org/bugs/show_bug.cgi?id=422210)

When the client sends a JSON request without specifying a charset, OrionServlet decodes the body using the system character set.

For example, a client sending this request:

> PUT /file/mamacdon-OrionContent/whatever/some_file.txt
> Content-Type: application/json
> Orion-Version: 1
> 
> { "Name": "你好" }
             ^^^ 0xE4 0xBD 0xA0 0xE5 0xA5 0xBD (the UTF-8 bytes for 你好)


Ends up creating a file named ä½ å¥½ instead of 你好 because (on my machine, at least) the server decodes the JSON request as Windows-1252.

This is incorrect. JSON specifies UTF-8 as the default encoding, so that's what should get used unless the client explicitly provides one using "Content-Type: application/json;charset=Shift_JIS" or whatever.
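
The mismatch can be reproduced without a server. The following Python sketch is illustrative only: it is not Orion's code, and `charset_of` and `decode_json_body` are hypothetical names. It takes the charset from the Content-Type header, falls back to UTF-8 when none is given, and then shows what the same bytes look like when read as Windows-1252.

```python
import json


def charset_of(content_type, default="utf-8"):
    """Return the charset parameter of a Content-Type value, or the default."""
    for part in content_type.split(";")[1:]:
        name, _, value = part.partition("=")
        if name.strip().lower() == "charset" and value.strip():
            return value.strip().strip('"')
    return default


def decode_json_body(body, content_type):
    """Decode a JSON request body, assuming UTF-8 unless a charset is declared."""
    return json.loads(body.decode(charset_of(content_type)))


# The request body from the example above, as UTF-8 bytes.
body = '{ "Name": "你好" }'.encode("utf-8")

# No charset declared: UTF-8 is assumed and the name survives.
right = decode_json_body(body, "application/json")["Name"]

# The same bytes read as Windows-1252 (cp1252) come out garbled.
wrong = json.loads(body.decode("cp1252"))["Name"]

print(right)
print(wrong)
```

A client that really does send another encoding still works, because a declared charset such as "application/json;charset=Shift_JIS" takes precedence over the default.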
Comment 1 Mark Macdonald CLA 2015-06-19 12:17:54 EDT
Created attachment 254586
Test case - JSON body encoded as UTF-8

To reproduce the problem using `curl`:

1) Save body.json somewhere on your computer
2) Log in to Orion at localhost, grab your JSESSIONID cookie and copy it.
3) Run the following command, replacing the values as needed:

> curl -v "http://localhost:8080/file/mamacdon-OrionContent/MyProject/" \
>  -H "Cookie: JSESSIONID=your_session_cookie" \
>  -H "Orion-Version: 1" \
>  -H "Content-Type: application/json" \
>  --data '@/path/to/body.json'

This will create a file inside MyProject with a messed up name depending on your system's default charset. 

4) Now run this command, which sets the charset explicitly:

> curl -v "http://localhost:8080/file/mamacdon-OrionContent/MyProject/" \
>  -H "Cookie: JSESSIONID=your_session_cookie" \
>  -H "Orion-Version: 1" \
>  -H "Content-Type: application/json;charset=UTF-8" \
>  --data '@/path/to/body.json'

This time the file will be named 你好, as expected.
Comment 2 Silenio Quarti CLA 2015-06-19 14:06:21 EDT
I thought the system character set is UTF-8 by default on the servers (hub.jazz.net, orion.eclipse.org).
Comment 3 Mark Macdonald CLA 2015-06-19 19:04:20 EDT
(In reply to Silenio Quarti from comment #2)
> I thought the system character set is UTF-8 by default on the servers
> (hub.jazz.net, orion.eclipse.org).

I just tried step 3 against both servers:

 orion.eclipse.org: 你好 
 hub.jazz.net: 你好   

So whatever the outcome of this bug, orion.eclipse.org should probably be configured to use utf-8.
Comment 4 Mark Macdonald CLA 2015-08-02 17:20:58 EDT
What's worse, Orion encodes JSON responses using the server's default encoding. In this case the server actually lies, claiming the response is UTF-8 when it's not.

For example: running the server on Windows, I created a project with some Hebrew characters in the name. When the client lists my projects (GET /file/user-OrionContent) the server sends this:

> HTTP/1.1 200 OK
> Cache-Control: no-store
> Content-Type: application/json; charset=UTF-8    // not true
> ...
>
> {
>   "Children": [{
>     "ChildrenLocation": "/file/abc-OrionContent/????%20??/?depth=1", // corruption
>     "Directory": true,
>     "Id": "???? ??",
>     "ImportLocation": "/xfer/import/abc-OrionContent/????%20??",    // corruption
>     "LocalTimeStamp": 1438549964073,
>     "Location": "/file/abc-OrionContent/????%20??/",
>     "Name": "???? ??"
>   }],
>   ...
> }

See http://dev.eclipse.org/mhonarc/lists/orion-dev/msg03580.html
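
The "????" pattern is what a lossy encoder produces when the target charset has no mapping for the characters. Here is a small Python sketch of that effect, assuming for illustration that the server's default encoding is Windows-1252. The project name below is made up, since the actual name is not given in this report, and the code is not taken from Orion.

```python
import json

# Hypothetical Hebrew project name, written with escapes to keep the source ASCII.
name = "\u05e9\u05dc\u05d5\u05dd"

payload = json.dumps({"Name": name}, ensure_ascii=False)

# Encoding with a charset that cannot represent Hebrew: each character
# is replaced by "?" and the original text is lost for good.
lossy = payload.encode("cp1252", errors="replace")

# Encoding as UTF-8 matches a "charset=UTF-8" response header and round-trips.
correct = payload.encode("utf-8")

print(lossy)
print(json.loads(correct.decode("utf-8"))["Name"] == name)
```

Once the bytes have been written this way the client cannot recover the name, whatever charset the Content-Type header claims.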
Comment 5 Mark Macdonald CLA 2015-08-05 18:41:31 EDT
Opened bug 474366 for the issue in comment 4, since it can be fixed separately
Comment 6 Michael Rennie CLA 2017-01-10 15:46:34 EST
Closing as part of a mass clean up of inactive bugs. Please reopen if this problem still occurs or is relevant to you. For more details see:

https://dev.eclipse.org/mhonarc/lists/orion-dev/msg04002.html