| Summary: | [server] Metadata fails on project name with emoji | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | [ECD] Orion | Reporter: | John Arthorne <john.arthorne> | ||||||
| Component: | Server | Assignee: | Anthony Hunter <ahunter.eclipse> | ||||||
| Status: | RESOLVED FIXED | QA Contact: | |||||||
| Severity: | normal | ||||||||
| Priority: | P3 | CC: | ahunter.eclipse, mamacdon | ||||||
| Version: | unspecified | ||||||||
| Target Milestone: | 5.0 M1 | ||||||||
| Hardware: | PC | ||||||||
| OS: | Windows 7 | ||||||||
| Whiteboard: | |||||||||
| Attachments: |
|
||||||||
I can successfully create a project programmatically with emoji characters.
The client can successfully get the list of projects and display it on the editor page and in the navigator. However, you cannot access anything under the project, nor create files or folders in the project:
{"HttpCode":404,"Message":"File not found: /anthony-OrionContent/Project ὃ6ὃ5/","Severity":"Error","Code":0}
So if create succeeds but then I can't access anything in the folder afterwards, it sounds like a bug. If we can't represent it on disk then we should fail to create, and if we can represent it then I should be able to access the contents.
Created attachment 237537 [details]
Screen shot of the browser working on Linux
This is not a problem running against Linux. The screen shot shows the successful creation and edit of a project and file with emoji characters.
On Windows, however, we return an error 500 because we cannot create files with these characters. We need to return a proper error that is displayed to the user.
Created attachment 237538 [details]
screenshot of legacy vs simple metastore on Windows
(In reply to Anthony Hunter from comment #3)
> On Windows however, we return an error 500 because we cannot create files
> with these characters, we need to return a proper error that is displayed to
> the user.
I don't think this is a Windows problem. On Windows I can use emojis everywhere when I'm running Orion with the legacy metastore. But when I use the simple metastore, emojis only work in subfolder and file names. Using an emoji as a top-level folder breaks -- the emoji characters seem to be corrupted by `workspace.json`.
Attaching a pic showing
(In reply to Mark Macdonald from comment #4)
> Attaching a pic showing
So bugzilla doesn't like emojis either. The character I used in the screenshot was U+1F424 and I tried it in top-level folders, subfolders, and filenames.
I did my testing on a local server, and noticed that the VM had been running with
> -Dfile.encoding=Cp1252
When I change that to UTF-8, everything just works.
So I think the problem is that the simple metastore relies on the default JVM encoding when it writes your metadata files. It needs to either always write them as UTF-8, or perhaps just encode all non-ASCII characters as escapes (as the legacy metastore did).
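The two options mentioned above (always write UTF-8, or escape every non-ASCII character) can be sketched roughly as follows. This is only an illustration, not the actual simple metastore code: the class name `Utf8MetadataSketch`, its method names, and the temporary file are made up for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch; none of these names come from the Orion code base.
class Utf8MetadataSketch {

    // Option 1: name the charset explicitly instead of relying on -Dfile.encoding.
    public static void writeUtf8(Path file, String content) throws IOException {
        try (Writer writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            writer.write(content);
        }
    }

    public static String readUtf8(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    // Option 2: keep the file pure ASCII by escaping each non-ASCII UTF-16 code unit.
    public static String escapeNonAscii(String text) {
        StringBuilder escaped = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                escaped.append(c);
            } else {
                escaped.append(String.format("\\u%04x", (int) c));
            }
        }
        return escaped.toString();
    }

    // Writes the content to a temporary file and reads it back, both as UTF-8.
    public static String roundTripThroughFile(String content) {
        try {
            Path file = Files.createTempFile("workspace", ".json");
            try {
                writeUtf8(file, content);
                return readUtf8(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // U+1F424 written as a surrogate pair so the source file stays ASCII.
        String projectName = "Project \uD83D\uDC24";
        System.out.println(projectName.equals(roundTripThroughFile(projectName)));
        System.out.println(escapeNonAscii(projectName));
    }
}
```

With an explicit charset on both the write and the read side, the result no longer depends on how the VM was launched. The escaping variant trades readability of the file for independence from any encoding that is a superset of ASCII.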
(In reply to Mark Macdonald from comment #7)
> I did my testing on a local server, and noticed that the VM had been running
> with
> > -Dfile.encoding=Cp1252
>
> When I change that to UTF-8, everything just works.
>
> So I think the problem is that the simple metastore relies on the default
> JVM encoding when it writes your metadata files. It needs to either always
> write them as UTF-8, or perhaps just encode all non-ASCII characters as
> escapes (as the legacy metastore did).
I committed a test:
http://git.eclipse.org/c/orion/org.eclipse.orion.server.git/commit/?id=477197eabaf25b0ca8ff859b343be152dbf54c2b
This test is successful on Linux but fails on Windows. However, I can fail the test on Linux by changing the encoding on Linux to ISO-8859-1.
I did some quick reading and the FileReader/FileWriter I am using does not explicitly set the character encoding as you have shown. This needs to be fixed.
This problem has been fixed with commit:
http://git.eclipse.org/c/orion/org.eclipse.orion.server.git/commit/?id=75f8128ebb004cb0185e6ee4f9973239cad4b022
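For anyone wanting to reproduce the platform difference without changing the VM arguments, the effect can be simulated by encoding and decoding the project name with a fixed charset. This is a small sketch, not the committed test; `DefaultEncodingRoundTrip` and `roundTrip` are names invented here.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical reproduction helper; not part of the Orion test suite.
class DefaultEncodingRoundTrip {

    // Simulates writing text to disk and reading it back with the same charset.
    public static String roundTrip(String text, Charset charset) {
        byte[] onDisk = text.getBytes(charset);
        return new String(onDisk, charset);
    }

    public static void main(String[] args) {
        // U+1F424 written as a surrogate pair so the source file stays ASCII.
        String projectName = "Project \uD83D\uDC24";
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("UTF-8 preserved: "
                + projectName.equals(roundTrip(projectName, StandardCharsets.UTF_8)));
        System.out.println("ISO-8859-1 preserved: "
                + projectName.equals(roundTrip(projectName, StandardCharsets.ISO_8859_1)));
    }
}
```

ISO-8859-1 has no mapping for characters outside Latin-1, so `String.getBytes` substitutes a replacement byte and the name read back no longer matches the name stored in the workspace. That would be consistent with a lookup by project name coming back empty.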
If you have a project name with unprintable characters (e.g., emoji), the simple metadata store can't seem to handle it. There is a workspace with a project called "ð©". The search indexer has code like this:
    for (String projectName : workspace.getProjectNames()) {
        ProjectInfo project = store.readProject(workspace.getUniqueId(), projectName);
readProject is returning null, even though the workspace claims to contain that project.