Bug 427003

Summary: Upload of binary files with invalid UTF-8 byte sequences destroys content
Product: [ECD] Orion
Reporter: Michael Ochmann <michael.ochmann>
Component: Server
Assignee: Project Inbox <orion.server-inbox>
Status: RESOLVED FIXED
QA Contact:
Severity: normal
Priority: P3
CC: matthias.sohn
Version: 5.0
Target Milestone: 5.0 RC1
Hardware: All
OS: All
Whiteboard:
Attachments:
Description: Simple Java program demonstrating the behavior of InputStreamReader for invalid UTF-8 byte sequences
Flags: none

Description Michael Ochmann CLA 2014-01-30 09:38:05 EST
Uploading files via the file REST API may corrupt their binary content when they contain invalid UTF-8 byte sequences; see, for example, http://en.wikipedia.org/wiki/UTF-8#Invalid_byte_sequences.

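The problem is in FileHandlerV1, which uses an intermediate UTF-8 InputStreamReader to copy request entities into their corresponding workspace files. This is fine for the metadata part of multipart requests, but may corrupt binary body parts: the decoder substitutes the replacement character U+FFFD for every byte sequence it cannot decode, so the bytes written to the workspace file no longer match the bytes that were uploaded.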
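For illustration, here is a minimal, self-contained sketch of the effect. It is not the FileHandlerV1 code and not the attached program; the class name Utf8CopyDemo and the methods copyViaReader and copyViaStream are hypothetical names chosen for this example. It assumes the copy goes through a UTF-8 InputStreamReader and a UTF-8 writer, and contrasts that with a plain byte-for-byte copy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of Orion.
final class Utf8CopyDemo {

    // Copies bytes through a UTF-8 Reader/Writer pair (the problematic pattern).
    // Byte sequences that are not valid UTF-8 are decoded as U+FFFD and then
    // re-encoded, so the output differs from the input.
    static byte[] copyViaReader(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Reader reader = new InputStreamReader(
                 new ByteArrayInputStream(input), StandardCharsets.UTF_8);
             Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Copies bytes directly, without any character decoding.
    static byte[] copyViaStream(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = new ByteArrayInputStream(input)) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Formats a byte array as space-separated hex for printing.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // 0xFF and 0xFE never occur in well-formed UTF-8.
        byte[] binary = {(byte) 0x50, (byte) 0xFF, (byte) 0xFE, (byte) 0x00};

        byte[] viaReader = copyViaReader(binary);
        byte[] viaStream = copyViaStream(binary);

        System.out.println("original:   " + hex(binary));
        System.out.println("via reader: " + hex(viaReader));
        System.out.println("via stream: " + hex(viaStream));
        System.out.println("reader copy intact: " + Arrays.equals(binary, viaReader));
        System.out.println("stream copy intact: " + Arrays.equals(binary, viaStream));
    }
}
```

In this sketch only the byte-stream copy returns the original content unchanged, which suggests keeping character decoding to the metadata part and copying binary body parts as raw bytes.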

I'll provide a patch on Gerrit with a test case that demonstrates the problem.
Comment 1 Michael Ochmann CLA 2014-01-30 09:40:24 EST
https://git.eclipse.org/r/#/c/21334/
Comment 2 Michael Ochmann CLA 2014-01-30 09:48:44 EST
Created attachment 239475
Simple Java program demonstrating the behavior of InputStreamReader for invalid UTF-8 byte sequences
Comment 3 Matthias Sohn CLA 2014-02-04 09:04:01 EST
merged as 4dc26de2049f0e65dcaa21b0f257a2aa5f2b3fcd