| Summary: | Problem with markdown grammar | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [ECD] Orion | Reporter: | Szymon Brandys <Szymon.Brandys> | ||||
| Component: | Client | Assignee: | Mark Macdonald <mamacdon> | ||||
| Status: | RESOLVED WONTFIX | QA Contact: | |||||
| Severity: | normal | ||||||
| Priority: | P3 | CC: | mamacdon | ||||
| Version: | 0.5 | ||||||
| Target Milestone: | --- | ||||||
| Hardware: | PC | ||||||
| OS: | Windows 7 | ||||||
| Whiteboard: | |||||||
| Attachments: |
|
||||||
|
Description
Szymon Brandys
I have a small fix for this -- will test more tomorrow.

(In reply to comment #1)
> I have a small fix for this -- will test more tomorrow.

So the "fix" didn't fix the problem. I spent some time reviewing the TextMate manual, and I think the code is working as designed. The problem is with the rule given in Comment 0, which defines an infinite loop when applied to empty lines. Consider this input:

-----------------------------------------
foo

bar
-----------------------------------------

The procedure is:
1. First line: no matches. Continue to second line.
2. The "begin" rule matches the empty line. The parser position does not advance, because the match ("^$") captures no characters. At this point, we enter the begin..end rule's context, so the rules in "patterns" become active, along with the "end" rule.
3. The "end" rule matches the empty line. Again, the parser position does not advance. Now we exit the begin..end rule context.
4. Since we're at the top-level context, the "begin" rule matches again. Go to step 2, repeat ad infinitum.

Perhaps these cases can be detected by analyzing the parser state. But recovery would be limited to terminating with an exception. There's no way to correctly apply a grammar like this. (FWIW, TextMate also hangs on this input when given a similar grammar.)

Szymon, we're going to have to find a different technique for recognizing markdown blocks.

This is not an INVALID bug. As I understand it, we are not going to fix it due to TextMate limitations.

A Markdown list may look like this:

<empty line>
* item1
* item2
  item2 second line
  item2 third line
* item3
  item3 second line
  item3 third line
  item3 fourth line
<empty space>
some text

So the list block starts with <empty space> followed by '*' on the next line, and ends with <empty line> followed by a line starting with a non-whitespace character.

I was not able to use two-line rules, so Mark advised to use: "begin": "^$", "end": "^$", and then pattern inside.
"^$" does not work, but this is not the problem. The real problem is how to describe the Markdown list block. As I wrote above the begin and end rule I would like to use would have to consider two lines instead of just one. (In reply to comment #4) > I was not able to use two-line rules, so Mark advised to use: Yeah, I made that suggestion without working through the parsing implications, sorry :( I think we can do this by breaking it into 2 rules: an outer one that identifies the start of the list, and an inner one that finds the paragraph breaks. We can use two nested begin..end rules and lookaheads. For a bulleted list, it would look like this: // outer rule { begin: '^ {0,3}([*-+)(?=\\s)', end: '^(?=\\S)', name: 'markdown.list', patterns: [ // inner rule { begin: '\\s+(?=\\S)', end: '^\\s*$' name: 'markdown.list.paragraph' } ] } - The outer rule matches the list bullet. It ends by seeing a line starting with non-whitespace characters (which terminates the list, like "some text" in your example). - The inner rule matches starting from a space (which can either be on the same line as the bullet like " item2"; or at the beginning of a new paragraph like " item3 third line"). It ends by matching a blank line. Whenever the inner rule ends, it's because a blank line was encountered. At this point either the inner rule's begin will match (indicating the list continues), or the outer rule's end will match (indicating the list terminates due to a blank line followed by non-whitespace characters). I was able to do an OK job highlighting with a grammar based on this approach. I got the idea from a Markdown grammar that I found on Github. Created attachment 212696 [details] screenshot Here is a screenshot showing how far I got. (The colors are for demonstration -- I edited the theme to show the different markdown structures that are recognized. You'll want to change the rule names, so that it works with the default theme.) 
The grammar I used is here: https://github.com/mamacdon/szbra.github.com/commit/6ad43d749d431a4663d57e0cd379f7d3175bbc20#diff-0

It's only a starting point -- I didn't handle HTML tags and a bunch of other MD features. Hope this helps.

Closing as part of a mass clean up of inactive bugs. Please reopen if this problem still occurs or is relevant to you. For more details see: https://dev.eclipse.org/mhonarc/lists/orion-dev/msg03444.html |
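For anyone revisiting this report, the zero-width loop described in the analysis of the "^$" rule can be reproduced with a toy model. This is a minimal sketch under stated assumptions, not Orion's or TextMate's engine: `scanLine` is a hypothetical function that applies a single begin..end rule to one line, records each (position, context) state it visits, and throws when a state repeats, which is the "detect and terminate with an exception" option mentioned in that analysis.

```javascript
// Toy model of one begin..end rule applied to a single line.
// Hypothetical names; real engines track a stack of rule contexts.
function scanLine(line, rule) {
  let pos = 0;          // parser position within the line
  let inside = false;   // are we inside the begin..end context?
  const seen = new Set();
  const trace = [];
  while (true) {
    const state = `${pos}:${inside}`;
    if (seen.has(state)) {
      // Same position and same context as before: matching can never advance.
      throw new Error(`no progress at position ${pos} after ${trace.join(' -> ')}`);
    }
    seen.add(state);
    // Sticky flag so the pattern must match exactly at `pos`.
    const re = new RegExp(inside ? rule.end : rule.begin, 'y');
    re.lastIndex = pos;
    const m = re.exec(line);
    if (!m) return trace;
    trace.push(inside ? 'end' : 'begin');
    inside = !inside;
    pos += m[0].length; // a zero-width match leaves pos unchanged
  }
}

const emptyLineRule = { begin: '^$', end: '^$' };

// A non-empty line never matches "^$", so nothing happens.
console.log(scanLine('foo', emptyLineRule));

// An empty line matches begin, then end, then would match begin again.
try {
  scanLine('', emptyLineRule);
} catch (e) {
  console.log(e.message);
}
```

The state check terminates for any rule because the position never decreases and there are only two contexts per position; it reports the problem but, as noted in the thread, cannot make such a grammar apply correctly.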