Created attachment 100861 [details] report design
Description: The result is not consistent when previewing a report containing Arabic words in PPT 2003 and 2007.
Build number: 2.3.0.v20080519-0630
Steps to reproduce:
1. Preview the attached report design.
Expected result: The text is reversed.
Actual result: See the screenshot.
Error log: N/A
Created attachment 100864 [details] screenshot
Created attachment 108230 [details] screenshot from PPT 2003
Does the problem occur on PPT 2003 or 2007? Here is what I see in PPT 2003. Please note that PPT 2003 doesn't render Bidi text correctly on a non-Bidi-enabled OS.
The screenshot from PPT 2003 shows Arabic text reordered correctly.
Created attachment 108530 [details] zip file of doc file, ppt file and screenshot
The Arabic words in the doc file generated from the report run in the opposite direction to those in the ppt file generated from the same report.
I was able to reproduce the problem. It seems to be connected with the OS locale. PPT 2003 requires Bidi as a primary locale, not just basic Bidi enablement. It did work for me in the Hebrew locale (as shown in the screenshot - attachment 108230 [details]), but I see the problem reported in the Russian locale.
I will try to figure out if for PPT there can be added some markup to specify character run direction explicitly, like what we are doing for DOC.
Sorry, accidentally ran on an old build. The problem is not reproducible for me even in a non-Bidi locale.
Xiaodan, can you please give some details on your runtime environment?
- Which PPT version is experiencing the problem?
- What is the machine locale?
- Does the OS support Bidi as a primary or supplemental language? (On WinXP, "Control Panel -> Regional and Language Options -> Languages -> Install files for complex script..." enables Bidi languages as supplemental ones.)
Created attachment 108555 [details] Patch
Anyway, we can write the language and directional attributes out. The patch includes the following changes:
- The renderer now derives the 'rtl' text style from the run level of the specific text fragment being processed, rather than from the paragraph level as a whole.
- The PPT writer adds two new attributes: (a) 'dir', with possible values 'ltr' and 'rtl', matching the text style set by PageDeviceRender; (b) 'lang', with currently possible values 'HE' (Hebrew), 'AR' (Arabic) and 'EN-US' (English US). Additional languages can also be addressed if necessary.
Created attachment 108609 [details] zip file of doc file, ppt file
(In reply to comment #7)
> Sorry, accidentally ran on an old build. The problem is not reproducible for me
> even in a non-Bidi locale.
> Xiaodan, can you please give some details on your runtime environment?
> - Which PPT version is experiencing the problem?
> - What is the machine locale?
> - Does the OS support Bidi as a primary or supplemental language? (On WinXP,
> "Control Panel -> Regional and Language Options -> Languages -> Install files
> for complex script..." enables Bidi languages as supplemental ones.)
Lina, here is the information you need:
PPT version: (11.8169.8202) SP3
Machine locale: English (United States)
Supplemental language support: the "Control Panel -> Regional and Language Options -> Languages -> Install files for complex script..." checkbox is ticked
Hello, I have the same configuration (US locale and the same version of PowerPoint), and the Arabic text is displayed correctly on my machine.
The patch is applied.
Created attachment 115215 [details] screenshot in 2007 The direction is wrong in ppt 2007
The direction is correct in ppt 2003.
Created attachment 119817 [details] Another screenshot showing it working
Guys, I am puzzled by this bug... It seems to work for me (including PPT 2007 recognizing the Arabic characters) on both an English and a Russian machine.
Jun, can you please create, in PPT 2007 itself, a file containing a few Arabic words separated by white space and attach it to the bug? Thanks!
Lina, I created a ppt file with PPT 2007 as you said, and BIDI worked fine. I compared that file with the one generated by BIRT and found that the key difference is the "lang" attribute. The file created by PowerPoint 2007 uses "lang=3D'AR-DZ'", while the file generated by BIRT uses "lang=3D'AR'". When I changed the attribute to "lang=3D'AR-DZ'", "lang=3D'AR-IQ'", etc., BIDI was OK in both 2003 and 2007.
Created attachment 126674 [details] File generated by ppt 2007
Created attachment 126675 [details] File generated by BIRT
Jun, this is great news, thanks! I will change the lang value to AR-XX (probably AR-DZ to be on the safe side) in the code. (Weird though, it works for us with "lang=3D'AR'" too... That said, reliable testing of the fix cannot be done on our side ;))
Created attachment 129247 [details] Patch to fix incomplete lang attribute value specification Replaced 'AR' with 'AR-DZ', and also 'HE' with 'HE-IL'.
Deferred to a future release, as we have no resources to resolve the BIDI issues.
Patch applied.
Created attachment 136596 [details] the zip file of the generated PPT and HTML
These were generated with build (2.5.0.v20090521-0630), and the PPT and HTML results are still not the same.
Created attachment 136597 [details] screenshot
Reopen for further investigation.
Hi Lina,
After the patch for PPT, I found there is still a problem in PPTWriter: even if the user doesn't set the "rtl" property on a text element, "dir=3D'rtl'" with the matching "lang" value (for example "dir=3D'rtl' lang=3D'AR-DZ'") should still be output when a character belongs to UCharacter.UnicodeBlock.HEBREW or UCharacter.UnicodeBlock.ARABIC, so that Hebrew or Arabic text is displayed correctly. The code should be corrected as follows:

    private String buildI18nAttributes( String text, boolean rtl )
    {
        if ( text == null )
            return ""; //$NON-NLS-1$
        for ( int i = text.length( ); i-- > 0; )
        {
            UnicodeBlock block = UCharacter.UnicodeBlock.of( text.charAt( i ) );
            // If there is a Hebrew or Arabic content, write the
            // corresponding language attribute
            if ( UCharacter.UnicodeBlock.HEBREW.equals( block ) )
            {
                return " dir=3D'rtl' lang=3D'HE-IL'"; //$NON-NLS-1$
            }
            if ( UCharacter.UnicodeBlock.ARABIC.equals( block )
                    || UCharacter.UnicodeBlock.ARABIC_PRESENTATION_FORMS_A.equals( block )
                    || UCharacter.UnicodeBlock.ARABIC_PRESENTATION_FORMS_B.equals( block )
                    || UCharacter.UnicodeBlock.ARABIC_SUPPLEMENT.equals( block ) )
            {
                return " dir=3D'rtl' lang=3D'AR-DZ'"; //$NON-NLS-1$
            }
        }
        // If no actual RTL content was found (e.g. in case the text
        // consists of sheer neutral characters), indicate Arabic language
        if ( rtl )
            return " dir=3D'rtl' lang=3D'AR-DZ'"; //$NON-NLS-1$
        else
        {
            // XXX Other language attributes can be addressed as needed
            return " dir=3D'ltr' lang=3D'EN-US'"; //$NON-NLS-1$
        }
    }
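For anyone who wants to try the proposed logic outside BIRT, here is a minimal standalone sketch. It is not the actual PPTWriter code: it assumes the JDK's java.lang.Character.UnicodeBlock as a stand-in for ICU4J's UCharacter.UnicodeBlock, and the class name I18nAttributesSketch is hypothetical.

```java
import java.lang.Character.UnicodeBlock;

// Hypothetical standalone variant of the method proposed above.
// Assumption: the JDK's Character.UnicodeBlock replaces ICU4J's
// UCharacter.UnicodeBlock so the sketch compiles without extra libraries.
class I18nAttributesSketch {

    static final String RTL_HEBREW = " dir=3D'rtl' lang=3D'HE-IL'";
    static final String RTL_ARABIC = " dir=3D'rtl' lang=3D'AR-DZ'";
    static final String LTR_ENGLISH = " dir=3D'ltr' lang=3D'EN-US'";

    static boolean isArabic(UnicodeBlock block) {
        return UnicodeBlock.ARABIC.equals(block)
                || UnicodeBlock.ARABIC_PRESENTATION_FORMS_A.equals(block)
                || UnicodeBlock.ARABIC_PRESENTATION_FORMS_B.equals(block)
                || UnicodeBlock.ARABIC_SUPPLEMENT.equals(block);
    }

    static String buildI18nAttributes(String text, boolean rtl) {
        if (text == null) {
            return "";
        }
        // Scan from the end, as in the proposed code; the first Hebrew or
        // Arabic character found decides the attributes for the fragment.
        for (int i = text.length(); i-- > 0;) {
            UnicodeBlock block = UnicodeBlock.of(text.charAt(i));
            if (UnicodeBlock.HEBREW.equals(block)) {
                return RTL_HEBREW;
            }
            if (isArabic(block)) {
                return RTL_ARABIC;
            }
        }
        // No Hebrew/Arabic character found: fall back to the 'rtl' flag.
        return rtl ? RTL_ARABIC : LTR_ENGLISH;
    }

    public static void main(String[] args) {
        System.out.println(buildI18nAttributes("\u0645\u0631\u062D\u0628\u0627", false));
        System.out.println(buildI18nAttributes("Hello", false));
        System.out.println(buildI18nAttributes("123", true));
    }
}
```

Note that with this approach a fragment such as "Hello \u05D0" is tagged 'rtl' as soon as it contains a single Hebrew character, regardless of the 'rtl' flag passed in.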
Hi, I am not sure this is necessary... Does it work after the change? 'rtl' here is not a property set by user, but a character run property resolved by the Bidi engine behind the scenes. The outcome of this Bidi resolution depends on various factors, not only intrinsic character properties. As a rule, 'rtl' will match a run of Hebrew/Arabic and associated characters, and 'ltr' - non-Bidi (e.g. English) literals and associated characters. However, this can be different, e.g. in presence of control characters. So I believe we should respect this 'rtl' property rather than intrinsic character properties (decided here based on belonging to a UnicodeBlock).
Hi Lina,
Yes, it works after that change. The "rtl" flag in the code comes from the text style, and the text style is set by the user. If the user sets "rtl", all Hebrew/Arabic and associated characters need to be displayed reversed, and the report also needs to start from right to left. If the user doesn't set "rtl", all Hebrew/Arabic and associated characters still need to be displayed reversed, but the report does not need to start from right to left. So whether or not the user sets the "rtl" flag, the logic in the loop is still needed.
Hi Jingwen,
Based on the patch from 2008-07-28 https://bugs.eclipse.org/bugs/attachment.cgi?id=108555&action=diff:

    boolean rtl = text instanceof TextArea
            ? ( ( (TextArea) text ).getRunLevel( ) & 1 ) != 0
            : CSSConstants.CSS_RTL_VALUE.equals( style.getProperty( IStyle.STYLE_DIRECTION ) );

-- for TextArea, 'rtl' is obtained from the run level and not from the text style. However, I can't locate this change in the current code base. I will try to figure out why it is not present.
I totally agree with you that Arabic/Hebrew characters should usually be reversed regardless of the paragraph direction; however, I think it would be more correct to apply the reversal to characters with an odd run level [which Arabic/Hebrew characters usually have]...
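To illustrate what "odd run level" means here, the following is a small sketch using java.text.Bidi from the JDK. This is an assumption made for illustration only: it is not the Bidi engine BIRT uses, and the class and method names (RunLevelSketch, isRtlLevel, describeRuns) are hypothetical. It splits a string into directional runs and applies the same "( level & 1 ) != 0" test as the patch.

```java
import java.text.Bidi;

// Hypothetical illustration of run levels; java.text.Bidi is used here as a
// stand-in for the Bidi engine that resolves run levels inside BIRT.
class RunLevelSketch {

    // Same test as in the patch: an odd embedding level means an RTL run.
    static boolean isRtlLevel(int level) {
        return (level & 1) != 0;
    }

    // Lists each directional run as "[start,limit) level N rtl|ltr".
    static String describeRuns(String text) {
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        StringBuilder sb = new StringBuilder();
        for (int run = 0; run < bidi.getRunCount(); run++) {
            int level = bidi.getRunLevel(run);
            sb.append('[').append(bidi.getRunStart(run))
              .append(',').append(bidi.getRunLimit(run))
              .append(") level ").append(level)
              .append(isRtlLevel(level) ? " rtl" : " ltr")
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // English, Hebrew, English: the Hebrew part forms its own run.
        System.out.print(describeRuns("abc \u05D0\u05D1\u05D2 def"));
    }
}
```

Deciding per run keeps the 'dir' value aligned with what the Bidi resolution produced, instead of inferring it from the Unicode block of individual characters.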
fixed.
Verified in build (2.5.1.v20090623-0630), closed.