Bug 474366

Summary: Orion corrupts JSON responses when platform charset is not UTF-8
Product: [ECD] Orion
Reporter: Mark Macdonald <mamacdon>
Component: Server
Assignee: Mark Macdonald <mamacdon>
Status: RESOLVED FIXED
QA Contact:
Severity: normal    
Priority: P3    
Version: 10.0   
Target Milestone: 10.0   
Hardware: PC   
OS: Windows 7   
Whiteboard:

Description Mark Macdonald CLA 2015-08-05 18:41:18 EDT
1. Launch the Orion server with the default character encoding set to something other than UTF-8. You can do this by adding -Dfile.encoding=Cp1252 to your launch args or orion.ini.

2. Create a project named 你好世界

3. The project name is replaced by "????".

In the network log, when the workspace content is listed (GET /file/user-OrionContent) the server is sending this:

> HTTP/1.1 200 OK
> Cache-Control: no-store
> Content-Type: application/json; charset=UTF-8    // not true
> ...
>
> {
>   "Children": [{
>     "ChildrenLocation": "/file/abc-OrionContent/????/?depth=1", // no
>     "Directory": true,
>     "Id": "???? ??",                                           // no
>     "ImportLocation": "/xfer/import/abc-OrionContent/????",    // no
>     "LocalTimeStamp": 1438549964073,
>     "Location": "/file/abc-OrionContent/????/",                // nope
>     "Name": "???? ??"                                          // no
>   }],
>   ...
> }

The server's claiming the body is UTF-8 but it's really the default character encoding (cp1252 in this case).
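
The mismatch can be reproduced outside Orion. The following is a minimal sketch, not Orion code: the class and method names are hypothetical, and it assumes the JDK ships the windows-1252 charset (the usual Java name for Cp1252). It encodes the project name with one charset and decodes the bytes with the charset declared in the Content-Type header, as a client would.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the Orion server.
class CharsetMismatchDemo {
    /**
     * Encodes text with one charset, then decodes the bytes with another,
     * the way a client does when it trusts the Content-Type header.
     */
    static String roundTrip(String text, Charset writtenWith, Charset declared) {
        // String.getBytes replaces characters the charset cannot represent
        // with the charset's replacement byte, which is '?' for windows-1252.
        byte[] body = text.getBytes(writtenWith);
        return new String(body, declared);
    }

    public static void main(String[] args) {
        // Unicode escapes for 你好世界, so the demo does not depend on the source file's encoding.
        String name = "\u4f60\u597d\u4e16\u754c";
        Charset cp1252 = Charset.forName("windows-1252");

        // Body written as Cp1252 but declared as UTF-8: prints ????
        System.out.println(roundTrip(name, cp1252, StandardCharsets.UTF_8));

        // Body written and declared as UTF-8: the name survives, prints true
        System.out.println(roundTrip(name, StandardCharsets.UTF_8, StandardCharsets.UTF_8).equals(name));
    }
}
```

The information is lost when the characters are encoded, so the client cannot recover the name by guessing a different charset.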

See http://dev.eclipse.org/mhonarc/lists/orion-dev/msg03580.html

Comment 1 Mark Macdonald CLA 2015-08-07 13:16:58 EDT
The Gzip filter is the culprit: it returns a PrintWriter that always uses the default character encoding, when it's supposed to use the response's encoding.
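
For illustration only, here is a sketch of the general pattern, not the actual filter code or the committed fix. The class and method names are hypothetical, and a ByteArrayOutputStream stands in for the servlet response stream. The point it shows: new PrintWriter(OutputStream) encodes with the platform default charset, whereas wrapping the stream in an OutputStreamWriter lets the writer use the response's declared encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch; the real filter works on servlet response objects.
class GzipWriterSketch {
    /** Gzips text through a PrintWriter that is bound explicitly to the response charset. */
    static byte[] gzip(String text, Charset responseCharset) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            // Problem pattern: new PrintWriter(new GZIPOutputStream(sink)) would
            // encode with the platform default charset, whatever the response declares.
            try (PrintWriter writer = new PrintWriter(
                    new OutputStreamWriter(new GZIPOutputStream(sink), responseCharset))) {
                writer.print(text);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    /** Decompresses and decodes the body with the charset the client was told to expect. */
    static String gunzip(byte[] compressed, Charset declared) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(in.readAllBytes(), declared);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String name = "\u4f60\u597d\u4e16\u754c"; // 你好世界
        byte[] body = gzip(name, StandardCharsets.UTF_8);
        // The writer and the declared charset agree, so the name round-trips.
        System.out.println(gunzip(body, StandardCharsets.UTF_8).equals(name));
    }
}
```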

Fixed & added regression test
http://git.eclipse.org/c/orion/org.eclipse.orion.server.git/commit/?id=d76d3fc

Unfortunately the test depends on the default character encoding, so it is only meaningful on platforms whose default encoding is not UTF-8.
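
One way to make that limitation explicit is to have the test check the platform default before running. This is a sketch of such a guard, not the regression test from the commit; the class and method names are hypothetical. In a JUnit test the same condition could feed an assumption so the test is reported as skipped rather than passing vacuously.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not taken from the Orion test suite.
class DefaultCharsetGuard {
    /**
     * The bug only shows up when the platform default differs from UTF-8,
     * so a regression test can only detect it in that configuration.
     */
    static boolean regressionTestIsMeaningful(Charset platformDefault) {
        return !StandardCharsets.UTF_8.equals(platformDefault);
    }

    public static void main(String[] args) {
        Charset platformDefault = Charset.defaultCharset();
        if (regressionTestIsMeaningful(platformDefault)) {
            System.out.println("Default charset is " + platformDefault + "; the test can catch the bug.");
        } else {
            System.out.println("Default charset is UTF-8; rerun with -Dfile.encoding=Cp1252 to exercise the bug.");
        }
    }
}
```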