Bug 436573

Summary: Locator doesn't always return the current offset when using the MarkupParser
Product: z_Archived
Component: Mylyn
Status: CLOSED MOVED
Severity: normal
Priority: P3
Version: unspecified
Target Milestone: ---
Hardware: All
OS: All
Reporter: Stephan Wahlbrink <sw>
Assignee: Project Inbox <mylyn-triaged>
QA Contact: David Green <greensopinion>
Keywords: helpwanted
Whiteboard:

Description Stephan Wahlbrink CLA 2014-06-04 08:39:37 EDT
In org.eclipse.mylyn.wikitext.core.parser.Locator:

/**
 * get the 0-based character offset of the current character from the start of the document. Equivalent to
 * <code>getLineDocumentOffset()+getLineCharacterOffset()</code>
 */
public int getDocumentOffset();

Unfortunately the method doesn't always return the *current* offset when using MarkupParser.parse. E.g. when endHeading, endBlock and endSpan of the DocumentBuilder are invoked, it doesn't point to the expected end offset of the language element in the source document.
Comment 1 David Green CLA 2014-06-26 16:21:15 EDT
Thanks for the bug.  Can you provide some sample markup (e.g. Textile, Markdown, ...) that can be used to exhibit the problem?
Comment 2 Stephan Wahlbrink CLA 2014-06-27 03:14:27 EDT
Meanwhile I found out that in most situations it was only a problem of missing documentation. What is the correct end offset in endHeading, endBlock and endSpan?

It seems:
endHeading -> getLineDocumentOffset() + getLineSegmentEndOffset()
endBlock -> getLineDocumentOffset() + getLineCharacterOffset()
endSpan -> getLineDocumentOffset() + getLineSegmentEndOffset()
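
The mapping above could be captured in a small helper, sketched here. LocatorView and EndEvent are hypothetical names introduced only so the snippet compiles without WikiText on the classpath; LocatorView mirrors the three Locator getters named in this report. The mapping itself is the observed behaviour described in this comment, not documented API.

```java
// Sketch only: encodes the end-offset mapping observed in this comment.
final class EndOffsets {

    /** Hypothetical stand-in for the Locator getters named in this report. */
    interface LocatorView {
        int getLineDocumentOffset();
        int getLineCharacterOffset();
        int getLineSegmentEndOffset();
    }

    /** Hypothetical tags for the three DocumentBuilder end callbacks discussed here. */
    enum EndEvent { END_HEADING, END_BLOCK, END_SPAN }

    /** Returns the document end offset for the given end event, per the observed mapping. */
    static int endOffset(EndEvent event, LocatorView locator) {
        int lineStart = locator.getLineDocumentOffset();
        switch (event) {
            case END_BLOCK:
                // Same sum that getDocumentOffset() is documented to return.
                return lineStart + locator.getLineCharacterOffset();
            case END_HEADING:
            case END_SPAN:
                return lineStart + locator.getLineSegmentEndOffset();
            default:
                throw new IllegalArgumentException("Unknown event: " + event);
        }
    }

    /** Builds a LocatorView with fixed values, for trying the helper out. */
    static LocatorView fixed(int lineStart, int charOffset, int segmentEnd) {
        return new LocatorView() {
            @Override public int getLineDocumentOffset() { return lineStart; }
            @Override public int getLineCharacterOffset() { return charOffset; }
            @Override public int getLineSegmentEndOffset() { return segmentEnd; }
        };
    }

    public static void main(String[] args) {
        LocatorView locator = fixed(100, 5, 12);
        for (EndEvent event : EndEvent.values()) {
            System.out.println(event + " -> " + endOffset(event, locator));
        }
    }
}
```

In a real DocumentBuilder subclass the same arithmetic would be applied to the builder's locator inside endHeading, endBlock and endSpan; whether it holds for every markup language is not confirmed here.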
Comment 3 Stephan Wahlbrink CLA 2014-06-27 03:24:36 EDT
A simple Textile example:

A link to "*Eclipse*":http://www.eclipse.org/ .

All events of the builder with [offset, end offset) of the locator:
==== Document Events (language= Textile, textLength= 47) ====
[-1, -1) beginDocument: <out-of-range>
    [0, 0) beginBlock(PARAGRAPH): A link t ...
        [0, 10) characters: 
        [10, 45) beginSpan(LINK): "*Eclips ...
            [11, 20) beginSpan(STRONG): *Eclipse ...
                [12, 19) characters: 
            [12, 19) endSpan: ... *Eclipse
        [12, 19) endSpan: ... *Eclipse
        [45, 47) characters: 
    [47, 94) endBlock: <out-of-range>
[47, 94) endDocument: <out-of-range>
====

The end offset of the span elements is not available in endSpan.

A workaround seems to be to use getLineSegmentEndOffset() in beginSpan.
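
A minimal sketch of that workaround: remember the end offset when beginSpan fires and hand it back when the matching endSpan fires, using a stack so that nested spans pair up correctly. SpanEndTracker and LocatorView are hypothetical names; LocatorView stands in for the two Locator getters involved so the snippet is self-contained. It assumes the end offset is getLineDocumentOffset() + getLineSegmentEndOffset(), and that beginSpan/endSpan calls are properly nested.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch only: records span end offsets in beginSpan for later use in endSpan.
final class SpanEndTracker {

    /** Hypothetical stand-in for the Locator getters used by the workaround. */
    interface LocatorView {
        int getLineDocumentOffset();
        int getLineSegmentEndOffset();
    }

    // End offsets of spans that have begun but not yet ended, innermost on top.
    private final Deque<Integer> pendingEnds = new ArrayDeque<>();

    /** Call from beginSpan: captures the end offset while the locator still reports it. */
    void beginSpan(LocatorView locator) {
        pendingEnds.push(locator.getLineDocumentOffset() + locator.getLineSegmentEndOffset());
    }

    /** Call from endSpan: returns the end offset captured for the matching beginSpan. */
    int endSpan() {
        if (pendingEnds.isEmpty()) {
            throw new IllegalStateException("endSpan without matching beginSpan");
        }
        return pendingEnds.pop();
    }

    /** Builds a LocatorView with fixed values, for trying the tracker out. */
    static LocatorView at(int lineStart, int segmentEnd) {
        return new LocatorView() {
            @Override public int getLineDocumentOffset() { return lineStart; }
            @Override public int getLineSegmentEndOffset() { return segmentEnd; }
        };
    }

    public static void main(String[] args) {
        // Values taken from the Textile event log above: LINK ends at 45, STRONG at 20.
        SpanEndTracker tracker = new SpanEndTracker();
        tracker.beginSpan(at(0, 45));
        tracker.beginSpan(at(0, 20));
        System.out.println("inner span end: " + tracker.endSpan());
        System.out.println("outer span end: " + tracker.endSpan());
    }
}
```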
Comment 4 Stephan Wahlbrink CLA 2014-06-30 09:17:20 EDT
Another example, Markdown with blocksOnly= true:

# A minimal example

Text


All events of the builder with [offset, end offset) of the locator:
==== Document Events (language= Markdown, textLength= 26) ====
[-1, -1) beginDocument: <out-of-range>
    [0, 0) beginHeading: # A mini ...
        [0, 0) characters: 
    [0, 0) endHeading: ... 
    [21, 21) beginBlock(PARAGRAPH): Text
 ...
        [21, 21) characters: 
    [26, 26) endBlock: ... e<0x0a><0x0a>Text<0x0a>
[26, 26) endDocument: ... e<0x0a><0x0a>Text<0x0a>
====

The offset in endHeading is wrong.
Comment 5 David Green CLA 2014-07-11 15:01:13 EDT
Thanks Stephan.  I won't be able to get to this right away.  Feel free to make a contribution.
Comment 6 Eclipse Webmaster CLA 2022-11-15 11:45:08 EST
Mylyn has been restructured, and our issue tracking has moved to GitHub [1].

We are closing ~14K Bugzilla issues to give the new team a fresh start. If you feel that this issue is still relevant, please create a new one on GitHub.

[1] https://github.com/orgs/eclipse-mylyn