| Summary: | Korean, Chinese and Japan: missing apostrophe [']. Instead value of parameter, its name is displayed. | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | [Technology] Babel | Reporter: | Zbigniew Kosiński <z.kosinski> | ||||||||
| Component: | translations | Assignee: | Kit Lo <kitlo> | ||||||||
| Status: | RESOLVED FIXED | QA Contact: | |||||||||
| Severity: | normal | ||||||||||
| Priority: | P3 | CC: | denis.roy, kitlo, ramat | ||||||||
| Version: | unspecified | ||||||||||
| Target Milestone: | --- | ||||||||||
| Hardware: | PC | ||||||||||
| OS: | All | ||||||||||
| Whiteboard: | stalebug | ||||||||||
| Attachments: |
|
||||||||||
Created attachment 187834 [details]
git diff
Created attachment 187835 [details]
Example 1
The translations for the popup are located in the org.eclipse.ui.workbench.nl_ko Eclipse fragment; the key is WorkbenchStatusDialog_ProblemOccurredInJob.
Created attachment 187836 [details]
Example 2
Zbigniew, thanks for catching this! I traced the history. It seems these strings came from a large contribution back in 2008. The problem seems to affect only the double-byte languages, as you mentioned. I spot-checked a few languages such as French, German, and Italian, and I don't see the problem there. It looks like you identified the problem strings and fixed them locally. Would you like to contribute the fixed translations to Babel and help the community? With the help of Denis, who is an expert on the database, we can probably check those corrections in. Thanks! Take a look at http://www.eclipse.org/babel/development/large_contributions.php if you are willing to help.
Today I fixed the Japanese strings. Sorry, I cannot read or write Chinese or Korean.
Thanks Yoshida-san, may I know how you fixed the Japanese? How did you find the complete list of strings having the problem? Did you fix them manually one by one, or did you run some scripts to update the strings?
Hi, Kit Lo-san.
I fixed them manually, starting from an older version.
First, we can pick out the affected rows from the mysqldump with the following query:
SELECT f.`project_id`, f.`version`, f.`name`, s.value, t.`value`
FROM (strings s inner join files f
ON s.file_id = f.file_id
AND f.is_active = 1
AND s.is_active = 1 )
inner join translation t
ON s.string_id = t.string_id
AND t.value like '\'{0}\'\'%'
AND t.language_id = 8
AND t.is_active = 1
order by 1,2,3;
Then we can fix them one by one, starting from an older version (NOT the oldest).
For example, you will find that fixing only 3.6 fixes all Eclipse versions.
Thanks.
Yoshida-san, do you know if this problem only happens to strings with '{0}'' at the beginning of the strings? About how many problem strings in Japanese did you find? 10's? 100's? Or thousands?
Hi, Kit Lo-san, that is a good question.
There are three points to keep in mind.
1)
The mysqldump was taken last September,
so I do not know whether newer projects/versions have data to fix.
For webtools 3.4, though, I can guess from the other webtools.x 3.3 projects.
2)
I ignored older versions, such as dltk 1.0, webtools.sourceediting 3.0.4 and 3.1,
tools.cdt 6.0, and earlier versions.
3)
I have not yet checked all enclosing patterns,
for example '%\'\'{0}\'%'.
I checked with the following LIKE conditions.
condition:
'\'{0}\'\'%'
result:
255 rows
other conditions at the left end of the string:
'\'{1}\'\'%'
'{0}\'%'
'{1}\'%'
'\"{0)\"\"%'
'\"{1)\"\"%'
'{0)\"%'
'{1)\"%'
results: all 0 rows
conditions at the right end of the string:
'%\'\'{0)\''
'%\'\'{1)\''
'%\'{0)'
'%\'{1)'
'%\"\"{0)\"'
'%\"\"{1)\"'
'%\"{0)'
'%\"{1)'
results: all 0 rows
I hope this helps you and everyone else.
Thanks.
There are several typos in my previous comment:
{0) and {1) should be {0} and {1}.
Thanks
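For checking exported translation values outside the database, the LIKE conditions listed above (with the braces corrected to {0} and {1}) can be mirrored as plain prefix and suffix tests. This is only a sketch: the class name EdgeQuoteCheck and the sample strings are invented for illustration, and it covers only the conditions listed in the comment, nothing more.

```java
import java.util.List;

final class EdgeQuoteCheck {
    // Prefixes equivalent to the LIKE 'xxx%' conditions (braces corrected).
    static final List<String> PREFIXES = List.of(
        "'{0}''", "'{1}''", "{0}'", "{1}'",
        "\"{0}\"\"", "\"{1}\"\"", "{0}\"", "{1}\"");

    // Suffixes equivalent to the LIKE '%xxx' conditions (braces corrected).
    static final List<String> SUFFIXES = List.of(
        "''{0}'", "''{1}'", "'{0}", "'{1}",
        "\"\"{0}\"", "\"\"{1}\"", "\"{0}", "\"{1}");

    // True when the value starts or ends with one of the listed patterns.
    static boolean isSuspect(String value) {
        return PREFIXES.stream().anyMatch(value::startsWith)
            || SUFFIXES.stream().anyMatch(value::endsWith);
    }

    public static void main(String[] args) {
        // Hypothetical sample values, not taken from the Babel database.
        System.out.println(isSuspect("'{0}'' has a problem."));  // true
        System.out.println(isSuspect("''{0}'' has a problem.")); // false
    }
}
```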
Yoshida-san, thanks for the very useful information. I will investigate more and fix the Korean and Chinese strings.
I have found several patterns in the translation strings (using a recent mysqldump file).
They are not restricted to Korean, but most of them are in the Korean translations.
List ("WRONG" => "CORRECT", shown without the surrounding double quotes; regex patterns, ordered from complex to simple):
# (a): invalid ' escape (should be '' or ")
"'\{0\}'에서 '\{" => ""{0}"에서 "{"
# (a) + (b): unmatched quote
"파일 '\{0\} "" => "파일 "{0}""
# same as (a)
"\\'\{(\d)\}\\'" => ""{$1}""
"^'\{(\d)\}" => "''{$1}"
" '\{0\}'" => " "{0}""
# (c) missing closing brace
"\{1 " => "{1}"
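The language-independent rules from the list above could be applied in order with java.util.regex, roughly as follows. This is a sketch under assumptions: the class name QuoteRules and the sample strings are invented, the two Korean-specific rules are left out, and the remaining rules are transcribed as listed rather than taken from the program actually used for the fix.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class QuoteRules {
    // Regex -> replacement, kept in the listed order (complex patterns first).
    static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put("\\\\'\\{(\\d)\\}\\\\'", "\"{$1}\""); // \'{n}\'      -> "{n}"
        RULES.put("^'\\{(\\d)\\}", "''{$1}");            // leading '{n} -> ''{n}
        RULES.put(" '\\{0\\}'", " \"{0}\"");             //  '{0}'       ->  "{0}"
        RULES.put("\\{1 ", "{1}");                        // "{1 " -> "{1}" (as listed, the space is consumed)
    }

    // Applies every rule once, in insertion order.
    static String apply(String value) {
        String result = value;
        for (Map.Entry<String, String> rule : RULES.entrySet()) {
            result = result.replaceAll(rule.getKey(), rule.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample value, not taken from the Babel database.
        System.out.println(apply("'{0}'' has a problem.")); // ''{0}'' has a problem.
    }
}
```

A LinkedHashMap is used so that the rules run in the order they were listed, since the list goes from complex patterns to simpler ones.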
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. -- The automated Eclipse Genie.
All of the problems below have been corrected (manually and by program). Checked with the latest (2015/03/29) SQL dump.
- unpaired quotes (e.g. opened with ' or " and never closed)
- mismatched number of single quotes (e.g. '{0}''; MS Excel seems to be the cause?)
- single quotes that escape unintentionally (e.g. '{0}' instead of ''{0}'')
- unnecessary escape characters before quotes (e.g. \' or \")
For languages:
- Chinese (Simplified)
- Chinese (Traditional)
- Japanese
- Korean
However, some of these strings may reappear later, because the original string contains the error and it gets copied when translating.
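Since such strings may come back, the first two problem classes above (unpaired quotes and a mismatched number of single quotes) can be flagged with a simple parity check. This is a heuristic sketch, not the program used for the fix: the class name QuoteParity and the sample strings are invented, and it assumes that every ' or " in a translation value is meant to be paired.

```java
final class QuoteParity {
    // Counts how often the given quote character occurs in the value.
    static long count(String value, char quote) {
        return value.chars().filter(c -> c == quote).count();
    }

    // True when ' or " occurs an odd number of times, so some quote is unpaired.
    static boolean hasUnpairedQuote(String value) {
        return count(value, '\'') % 2 != 0 || count(value, '"') % 2 != 0;
    }

    public static void main(String[] args) {
        // Hypothetical sample values, not taken from the Babel database.
        System.out.println(hasUnpairedQuote("'{0}'' has a problem."));  // true
        System.out.println(hasUnpairedQuote("''{0}'' has a problem.")); // false
    }
}
```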
Thank you for the improvements and your great effort! I think most of these unpaired quotes were imported from MS Excel, as you suggested, probably long ago, in the Eclipse 3.3 or 3.4 era. I am changing the status, but you can reopen the bug if you find more mis-quoted strings. Thanks. |
Build Identifier: M20100211-1343
In some cases where parameters are used in translations, an apostrophe is not closed. This causes a problem: the parameter's name is displayed instead of its value.
Example located in org.eclipse.ui.workbench.nl_ko:
WorkbenchStatusDialog_ProblemOccurredInJob='{0}''\uc5d0 \ubb38\uc81c\uc810\uc774 \ubc1c\uc0dd\ud588\uc2b5\ub2c8\ub2e4.
To fix it, add the missing ' character:
WorkbenchStatusDialog_ProblemOccurredInJob=''{0}''\uc5d0 \ubb38\uc81c\uc810\uc774 \ubc1c\uc0dd\ud588\uc2b5\ub2c8\ub2e4.
Reproducible: Always
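The reported effect can be reproduced with java.text.MessageFormat, where a lone ' opens a quoted section and '' stands for a literal apostrophe. This is a minimal sketch: it assumes the workbench formats this message with MessageFormat-compatible quoting rules, and the class name QuoteDemo, the English message text, and the argument value are invented for illustration.

```java
import java.text.MessageFormat;

final class QuoteDemo {
    // Formats a pattern with one argument using the standard MessageFormat rules.
    static String render(String pattern, String arg) {
        return MessageFormat.format(pattern, arg);
    }

    public static void main(String[] args) {
        // Broken: the lone leading ' opens a quoted section, so {0} is literal text.
        System.out.println(render("'{0}'' has a problem.", "Build"));
        // prints: {0}' has a problem.

        // Fixed: '' is an escaped apostrophe, so {0} is a real placeholder.
        System.out.println(render("''{0}'' has a problem.", "Build"));
        // prints: 'Build' has a problem.
    }
}
```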