Created attachment 249612: CJK characters cannot be displayed. I installed the nightly build (version 2.5.0.N20141217-1956) of the Gerrit connector and found that it has a CJK problem. See the pictures for more detail.
Created attachment 249613: compare view. The code compare view has the problem too.
Since you say "the CJK problem", hopefully you have some more background on this? A test case (including public server changes, maybe just a bogus Eclipse review) would be really helpful.
I used the word "测试" ("test") as the test string. In a Python shell:

    >>> aaaa = '测试'
    >>> print aaaa
    测试
    >>> aaaa
    '\xe6\xb5\x8b\xe8\xaf\x95'
    >>>
    >>> aaaa = u'测试'
    >>> aaaa
    u'\u6d4b\u8bd5'

I published a comment containing the word 测试 and captured the POST request.

In text format:

    POST /a/changes/28/revisions/4/review HTTP/1.1
    Accept: application/json
    X-Gerrit-Auth: aSceprrIwDs0TIrY5jNefjXWZj-7e8hQoW
    User-Agent: Jakarta Commons-HttpClient/3.1
    Host: gerrit.xxxxxxx.cn
    Cookie: $Version=0; GerritAccount=aSceprqEp7Sf1WT-NkwMrp-Fi6Hd0-Hv7a; $Path=/
    Content-Length: 62
    Content-Type: application/json

    {"message":"娴�璇�","labels":{"Verified":-1,"Code-Review":-2}}

In hex format:

    00000000  7b 22 6d 65 73 73 61 67 65 22 3a 22 e6 b5 8b e8  {"message":"
    00000010  af 95 22 2c 22 6c 61 62 65 6c 73 22 3a 7b 22 56  ","labels":{"V
    00000020  65 72 69 66 69 65 64 22 3a 2d 31 2c 22 43 6f 64  erified":-1,"Cod
    00000030  65 2d 52 65 76 69 65 77 22 3a 2d 32 7d 7d        e-Review":-2}}

The resulting comment reads "²âÊÔ".

I then faked a request like this:

    POST /a/changes/28/revisions/4/review HTTP/1.1
    Accept: application/json
    X-Gerrit-Auth: aSceprrIwDs0TIrY5jNefjXWZj-7e8hQoW
    User-Agent: Jakarta Commons-HttpClient/3.1
    Host: gerrit.xxxxxxx.cn
    Cookie: $Version=0; GerritAccount=aSceprqEp7Sf1WT-NkwMrp-Fi6Hd0-Hv7a; $Path=/
    Content-Type: application/json
    Content-Length: 68

    {"message":"\u6d4b\u8bd5","labels":{"Verified":-1,"Code-Review":-2}}

and this time I get a correct comment reading "测试".

So the problem is that the Gerrit server does not decode a UTF-8 encoded JSON string. There are two ways to fix this:

1. Use an ASCII-escaped JSON string when posting.
2. Make Gerrit decode UTF-8 encoded JSON strings correctly.

Link: http://stackoverflow.com/questions/583562/json-character-encoding-is-utf-8-well-supported-by-browsers-or-should-i-use-nu
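For reference, the two request bodies above can be reproduced with a short Python 3 sketch. It is only an illustration of the two encodings, not the connector's code (which is Java); the variable names are made up for this example, and the byte counts it produces match the two Content-Length values in the captures.

```python
import json

# Same payload as in the captured requests.
payload = {"message": "测试", "labels": {"Verified": -1, "Code-Review": -2}}

# Body as sent by the connector: non-ASCII characters emitted as raw UTF-8 bytes.
raw_body = json.dumps(
    payload, ensure_ascii=False, separators=(",", ":")
).encode("utf-8")

# Body as in the faked request: non-ASCII characters emitted as \uXXXX escapes.
escaped_body = json.dumps(
    payload, ensure_ascii=True, separators=(",", ":")
).encode("ascii")

print(len(raw_body), raw_body)
print(len(escaped_body), escaped_body)
```

Both bodies decode to the same JSON value; they differ only in how the two characters travel on the wire, which is why the escaped form is immune to any charset guess made on the receiving side.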
Bad news: I found this link: https://code.google.com/p/google-gson/issues/detail?id=388
Also this link: http://stackoverflow.com/questions/18300018/how-to-ensure-that-gsons-output-in-tojson-is-ascii
It seems that we'd want to use the ASCII escaping approach (1) because otherwise (2) you'd get garbage in the web UI when you tried to read something posted from Gerrit Reviews, right? We need to act just like the web client unless there is simply no way to do that. However, the bad thing is that we would need to encode every comment by walking through it char by char. Considering this is all in memory, it might not be a terrible thing to do, but I wish we could think of a way to avoid it.
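To give a sense of what walking through the text char by char involves, here is a minimal Python 3 sketch. The function name `escape_non_ascii` is hypothetical, and the sketch assumes it is applied to JSON text that has already been serialized, so that existing quotes and backslashes are left alone. Characters outside the Basic Multilingual Plane are written as a UTF-16 surrogate pair, as JSON requires.

```python
def escape_non_ascii(json_text: str) -> str:
    """Replace every non-ASCII character in serialized JSON with \\uXXXX escapes."""
    out = []
    for ch in json_text:
        cp = ord(ch)
        if cp < 0x80:
            # ASCII passes through untouched (including existing JSON escapes).
            out.append(ch)
        elif cp <= 0xFFFF:
            # BMP character: a single four-digit escape.
            out.append("\\u%04x" % cp)
        else:
            # Supplementary character: split into a UTF-16 surrogate pair.
            cp -= 0x10000
            high = 0xD800 + (cp >> 10)
            low = 0xDC00 + (cp & 0x3FF)
            out.append("\\u%04x\\u%04x" % (high, low))
    return "".join(out)


print(escape_non_ascii('{"message":"测试"}'))
```

The pass is linear in the length of the comment and allocates one output string, so for in-memory review comments the cost should be small.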
(Huh, this is revealing an issue with the Bugzilla comment handling itself! It seems that JSON encoding is breaking the message.)
Li, the entry point is pretty straightforward, but I'm having a bit of trouble getting the Unicode encoding working as needed. Everything I've tried ends up giving me "\\u.." in the JSON message body, which of course won't work, and the Writer approach in the link isn't well suited to this usage either. I'll take another look next week, but in the meantime, if you can find a clean way to do the Escaped Unicode String -> JSON String conversion, that would be really helpful.
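The "\\u.." symptom is what one would expect when the text is escaped before it reaches the JSON serializer, because the serializer then escapes the backslash itself. The Python 3 sketch below only illustrates that general pitfall with the standard `json` module as a stand-in; it is an assumption that the Java code in the review fails for the same reason.

```python
import json

comment = "测试"

# Pitfall: escape first, serialize second. The string now contains literal
# backslashes, and the serializer escapes each of them again.
pre_escaped = "\\u6d4b\\u8bd5"  # 12 characters: backslash, 'u', four hex digits, twice
wrong = json.dumps({"message": pre_escaped})

# Working order: hand the real characters to the serializer and let it
# produce the escapes while writing the JSON text.
right = json.dumps({"message": comment}, ensure_ascii=True)

print(wrong)
print(right)
print(json.loads(wrong)["message"] == comment)
print(json.loads(right)["message"] == comment)
```

In other words, the escaping has to happen either inside the serializer or on its output, never on its input.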
Thanks for the quick reply. I'm trying to do some tests too, but I'm not a Java programmer, and because of the complex design patterns I don't know which file should be modified. Could you tell me which interface or method is the best place to do such a conversion, so I can submit a patch?
See this review for the correct entry point. Unfortunately, it doesn't actually work because of tricky Java escape sequence handling. https://git.eclipse.org/r/38856
Is this related to bug 438139?
Yes, it appears so.
Korean comment messages have the same problem. Should I fix this myself, or do you plan to fix the bug?
It would be great if you could take a look at the work in progress at https://git.eclipse.org/r/#/c/38856/ (which doesn't work) and either push a better fix to Gerrit or suggest improvements.
Mylyn has been restructured, and our issue tracking has moved to GitHub [1]. We are closing ~14K Bugzilla issues to give the new team a fresh start. If you feel that this issue is still relevant, please create a new one on GitHub. [1] https://github.com/orgs/eclipse-mylyn