Bug 222170 - Bidirectional (Bidi) code contribution, Part 2
Summary: Bidirectional (Bidi) code contribution, Part 2
Status: RESOLVED INVALID
Alias: None
Product: z_Archived
Classification: Eclipse Foundation
Component: BIRT
Version: 2.2.1
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: 2.3.1
Assignee: Wenbin He CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-03-11 02:40 EDT by Mohamed El-Kholy CLA
Modified: 2009-06-22 13:29 EDT

See Also:
rafsarifard: iplog+


Attachments
BIDI Design Document (355.71 KB, application/zip)
2008-03-14 19:24 EDT, Mohamed El-Kholy CLA
no flags
Patch for BIDI support code (101.25 KB, patch)
2008-03-14 19:39 EDT, Mohamed El-Kholy CLA
no flags
Images to be added to workspace (7.06 KB, image/gif)
2008-03-14 19:43 EDT, Mohamed El-Kholy CLA
no flags
Patch for BIDI support code by ACGC (18.19 KB, patch)
2008-03-24 15:07 EDT, Mohamed El-Kholy CLA
khouly: review?
design spec with comments from wenbin (848.50 KB, application/octet-stream)
2008-03-28 03:53 EDT, Wenbin He CLA
no flags

Description Mohamed El-Kholy CLA 2008-03-11 02:40:49 EDT
Build ID: M20070921-1145

Steps To Reproduce:
Currently, BIRT lacks support for right-to-left report layout and right-to-left text orientation for Arabic and Hebrew reports.
There is also no support for right-to-left text in charts or Arabic shaping for external data sources.

More information:
Comment 1 Mohamed El-Kholy CLA 2008-03-14 19:24:05 EDT
Created attachment 92606 [details]
BIDI Design Document
Comment 2 Mohamed El-Kholy CLA 2008-03-14 19:39:40 EDT
Created attachment 92607 [details]
Patch for BIDI support code

Steps to create the diff file:

On a Linux box, put the original directory and the new directory in the working directory, then run the following command:

diff -r -uN out in > acgc_bidi.diff

where "out" is the original source directory (code from BIRT release 2.2.1, 20070924) patched with the HCG Bidi code for BIDI support in BIRT (from bug 222072 on Bugzilla), and "in" is the directory containing the code developed by the ACGC Bidi team.

acgc_bidi.diff is the generated diff file.

To patch the original directory with the attached diff file (acgc_bidi.diff) and obtain the ACGC Bidi support code, run the following command on a Linux box:

patch -p0 < acgc_bidi.diff

where the original directory (the same "out" directory described above) is placed in the same working directory as the diff file.
Comment 3 Mohamed El-Kholy CLA 2008-03-14 19:43:06 EDT
Created attachment 92608 [details]
Images to be added to workspace

This attachment contains images to be added to the webcontent directory in the viewer plugin
Comment 4 Mohamed El-Kholy CLA 2008-03-24 15:07:35 EDT
Created attachment 93308 [details]
Patch for BIDI support code by ACGC 

A small change is applied to the original patch
Comment 5 Wenbin He CLA 2008-03-28 03:53:03 EDT
Created attachment 93940 [details]
design spec with comments from wenbin
Comment 6 Wei Yan CLA 2008-04-01 04:15:42 EDT
There are some limitations on the PDF support.

Assume the user has the following text:

<span>(arabic)</span><span>, english, </span><span>french</span>

In the LTR direction, the layout engine will generate the following areas:

area1: (arabic)
area2: , english,
area3: french

and we will get the report as:

(cibara), english, french

If the user sets the direction to RTL, the proposal only mirrors the areas and outputs the text as it is, so we get the following layout areas:

area1: french
area2: , english,
area3: (arabic)

and the displayed result is:

french, english, (cibara)

This is incorrect. The correct result is:

,english, french (cibara)

or, if the comma is treated as an Arabic character:

french,english(,cibara)
Comment 7 Lina Kemmel CLA 2008-06-19 08:09:25 EDT
(In reply to comment #6)

Yes, there are currently some limitations on PDF reordering. However, they are not supported by the proposal (at least it was not our intention).

Here is an even stronger example:
  <span>english,arabic,</span><span>digits</span>

Assuming base direction is left-to-right, Bidi chunks would be:
  (a) english,
  (b) arabic,
  (c) digits
- with the following embedding levels respectively:
  (a) 0
  (b) 1
  (c) 2
- and the following expected display:
  english,digits,arabic
(here reordering of chunks is required even in LTR direction).
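
To make the expected display concrete, here is a minimal Python sketch of reordering whole chunks by embedding level, in the spirit of rule L2 of the Unicode Bidirectional Algorithm. The name reorder_chunks and the (text, level) tuple representation are assumptions made for illustration; this is not BIRT code, and it reorders chunks only, not the characters inside each chunk.

```python
def reorder_chunks(chunks):
    """Return chunk texts in visual order.

    chunks: list of (text, embedding_level) tuples in logical order.
    From the highest level down to the lowest odd level, every maximal
    run of chunks at that level or higher is reversed.
    """
    order = list(chunks)
    odd_levels = [level for _, level in order if level % 2 == 1]
    if not odd_levels:
        # Nothing is right-to-left, so logical order is visual order.
        return [text for text, _ in order]
    highest = max(level for _, level in order)
    for level in range(highest, min(odd_levels) - 1, -1):
        i = 0
        while i < len(order):
            if order[i][1] >= level:
                j = i
                while j < len(order) and order[j][1] >= level:
                    j += 1
                # Reverse the run order[i:j] in place.
                order[i:j] = reversed(order[i:j])
                i = j
            else:
                i += 1
    return [text for text, _ in order]


# Hypothetical input mirroring the chunks and levels listed above.
visual = reorder_chunks([("english,", 0), ("arabic,", 1), ("digits", 2)])
```

With the levels 0, 1, 2 the chunk order becomes english, digits, arabic, so the second and third chunks swap even though the base direction is left-to-right.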

To achieve that I think we need to perform the following steps during layout:

1. Merge the entire block text content (into temporary storage), resulting in |english,arabic,digits|; perform Bidi resolution with respect to this entire content; and split it into homogeneous Bidi chunks as needed. Current limitation: we apply resolution and splitting to each span separately. As a result, the following chunks come up:
  (a) english,
  (b) arabic,
  (c) digits
- which is fine in this particular case - but, since Bidi resolution was limited to each individual span, being unaware of its surrounding content, the levels are wrong:
  (a) 0
  (b) 1
  (c) 0
Consequently, when reordering based on those levels, we get an incorrect display:
  english,arabic,digits

2. The next step could be splitting mixed (from the Bidi perspective) containers (|english,arabic,| in this example) into a number of inline containers (currently, for Bidi purposes, we only split into Text frames). However, this step is unnecessary if the presentation engine can accept overlapping and discontinuous containers (in terms of the X positions of their children). I.e., suppose the nominal characteristics of the containers are:
  (a) x = 0, width = 15
      with 2 text frames:
      (a1) |english,|: x = 0, width = 8
      (a2) |arabic,|: x = 8, width = 7
  (b) x = 15, width = 6
      with 1 text frame:
      (b1) |digits|: x = 0, width = 6

- and we change these to:
  (a) x = 0, width = 21
      with 2 text frames:
      (a1) |english,|: x = 0, width = 8
      (a2) |arabic,|: x = 14, width = 7
  (b) x = 8, width = 6
      with 1 text frame:
      (b1) |digits|: x = 0, width = 6

3. Reorder frames on a per-line basis, as at present, but based on the correct embedding levels.

4. Reposition frames horizontally. To be decided depending on the strategy in step 2.
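
As a rough illustration of step 4, the following Python sketch assigns absolute x positions to text frames once their visual order is known. The function name position_frames and the (name, width) tuples are hypothetical; the widths are taken from the example in step 2, and the sketch ignores how frames are nested inside containers.

```python
def position_frames(visual_frames):
    """Lay out frames left to right in the given visual order.

    visual_frames: list of (name, width) tuples, already reordered.
    Returns a list of (name, x, width) tuples with absolute x offsets.
    """
    x = 0
    placed = []
    for name, width in visual_frames:
        placed.append((name, x, width))
        x += width  # The next frame starts where this one ends.
    return placed


# Visual order from the example: english, digits, arabic.
placed = position_frames([("english,", 8), ("digits", 6), ("arabic,", 7)])
```

The resulting offsets (0, 8 and 14) agree with the adjusted positions in step 2, where the |digits| frame lands between the two frames of the first container.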

I think the most challenging is step 1, because it requires significant changes to the current implementation and also because at this point it may not yet be clear which parts of the content would be associated with the same block-level element in presentation.
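
The effect of the resolution scope in step 1 can be shown with a deliberately simplified Python sketch. It is an assumption-laden toy, not the Unicode Bidirectional Algorithm: it handles only a left-to-right paragraph, only three hypothetical chunk kinds ("L", "R", "EN"), and it treats digits as level 2 when the last strong chunk seen was right-to-left and as level 0 otherwise. A real implementation would rely on a complete Bidi implementation instead.

```python
def resolve_levels(chunks):
    """Toy embedding-level resolution for a left-to-right paragraph.

    chunks: list of (text, kind) with kind in {"L", "R", "EN"}.
    The start of the resolved text behaves like a strong "L".
    """
    levels = []
    last_strong = "L"
    for _, kind in chunks:
        if kind == "L":
            last_strong = "L"
            levels.append(0)
        elif kind == "R":
            last_strong = "R"
            levels.append(1)
        elif kind == "EN":
            # Digits depend on the preceding strong direction.
            levels.append(2 if last_strong == "R" else 0)
        else:
            raise ValueError("unknown chunk kind: " + repr(kind))
    return levels


# Two spans, as in <span>english,arabic,</span><span>digits</span>.
spans = [[("english,", "L"), ("arabic,", "R")], [("digits", "EN")]]

# Current limitation: each span is resolved on its own.
per_span = [resolve_levels(span) for span in spans]

# Proposed step 1: merge the block content, then resolve once.
merged = resolve_levels([chunk for span in spans for chunk in span])
```

Resolved per span, the digits lose sight of the preceding Arabic text and fall back to level 0; resolved over the merged content, they get level 2, which is what the reordering in step 3 needs.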
Comment 8 Lina Kemmel CLA 2008-07-02 08:29:44 EDT
How can report author make inline containers be created? I.e. what's the correct way to express "<span>english,arabic,</span><span>digits</span>" in report design?

When I create a Text item with that content and "html" content type in Designer, the HTML markup doesn't seem to have effect, and the content is treated as plain text either in Designer or being exported to output formats.

Is it only possible by setting the display property of (expected) block sub-elements to "inline"?
Comment 9 Zhiqiang Qian CLA 2008-07-02 23:44:31 EDT
(In reply to comment #8)
> How can report author make inline containers be created? I.e. what's the
> correct way to express "<span>english,arabic,</span><span>digits</span>" in
> report design?
> 
> When I create a Text item with that content and "html" content type in
> Designer, the HTML markup doesn't seem to have effect, and the content is
> treated as plain text either in Designer or being exported to output formats.
> 
> Is it only possible by setting the display property of (expected) block
> sub-elements to "inline"?
> 
I used a Text item with content type "HTML" and input the content as following:

<span dir="rtl">english,arabic,</span><span dir="ltr">digits</span>

and in HTML output it displays:

digits,english,arabic

Is this the expected output?

Comment 10 Lina Kemmel CLA 2008-07-03 04:24:01 EDT
(In reply to comment #9)
> I used a Text item with content type "HTML" and input the content as following:
> 
> <span dir="rtl">english,arabic,</span><span dir="ltr">digits</span>
> 
> and in HTML output it displays:
> 
> digits,english,arabic

This didn't work for me, since the '<' and '>' characters in the HTML tags got escaped. Perhaps I needed to do a CVS update, though.

Should the HTML content type be respected for output formats other than HTML (e.g. PDF)?

> Is this the expected output?

It depends on the paragraph direction and user agent. If the paragraph direction is LTR (i.e. underlying report orientation is LTR), both IE and Mozilla should display "digits,arabic,english". (That's assuming "arabic" and "digits" stand for actual Arabic characters/digits.)
Comment 11 Mohamed El-Kholy CLA 2008-08-20 08:31:47 EDT
This bug was originally introduced to add Bidi support to BIRT, but it was split into three separate parts, so it should be closed.