Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.
Bug 399695 - [1.8][compiler] migrate parser to other syntax for default methods
Summary: [1.8][compiler] migrate parser to other syntax for default methods
Status: RESOLVED FIXED
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core (show other bugs)
Version: 4.3   Edit
Hardware: PC Linux
: P3 normal (vote)
Target Milestone: BETA J8   Edit
Assignee: Jesper Moller CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 380501 399780
  Show dependency tree
 
Reported: 2013-01-31 18:44 EST by Stephan Herrmann CLA
Modified: 2013-02-19 06:46 EST (History)
3 users (show)

See Also:


Attachments
Test failures (9.51 MB, text/xml)
2013-02-13 22:26 EST, Srikanth Sankaran CLA
no flags Details
Patch so that the resource files can be generated by 'buildFilesFromLPG' without changing CWD (1.52 KB, patch)
2013-02-15 16:02 EST, Jesper Moller CLA
no flags Details | Diff
Java Program to invoke Jikespg and merge the outputs nicely into the Source. (10.29 KB, text/plain)
2013-02-15 16:08 EST, Jesper Moller CLA
no flags Details
Updated the test case (9.59 KB, patch)
2013-02-15 20:06 EST, Jesper Moller CLA
no flags Details | Diff
patch for other affected tests (3.98 KB, patch)
2013-02-15 20:08 EST, Jesper Moller CLA
no flags Details | Diff
Actual patch for java.g and Parser.java (3.28 KB, patch)
2013-02-15 20:19 EST, Jesper Moller CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Stephan Herrmann CLA 2013-01-31 18:44:30 EST
Looking at a 0.6.1 spec I finally can see that the parser work around bug 380194 comment 4 is now obsolete:

We need to move from
  void foo() default { stmts; }
to
  default void foo() { stmts; }

sigh.

At least this might in turn obsolete bug 383973, let's see.
Comment 1 Srikanth Sankaran CLA 2013-02-01 04:36:33 EST
Jay, this is a leaf item - not blocking anything. Could be a good
candidate for Shankha ?
Comment 2 Jay Arthanareeswaran CLA 2013-02-04 01:58:20 EST
Shankha, here's your first eclipse bug to fix. All the best!
Comment 3 Srikanth Sankaran CLA 2013-02-13 00:47:32 EST
Shankha, I'll take this one - for one thing this is coming in the way of
compatibility testing with Oracle compiler as they have already migrated
to the new syntax. For another, the straightforward change results in
shift/reduce conflicts and so this may need some prior experience working with
jikes parser generator.
Comment 4 Srikanth Sankaran CLA 2013-02-13 08:57:24 EST
Stephan/Jesper, is this an area you would like to try your hand ?

Looks like the conflict arises due to the permissiveness in the grammar,
where we don't discriminate what modifiers can legally occur where.

When 'default' is seen in a snippet like below:

    switch (x) {
    case 10: 
             x++;
             break;
    default // <---------------------- ??

Am I declaring something that has default modifier as a part of the case 10
SwitchBlockStatement, in which case the automaton should shift or am I starting
a default case arm, in which case the automaton should reduce the 
    
    x++; 
    break;

into a SwitchBlockStatement seems to be the conflict.

I tried my hand some at refactoring the relevant productions, no luck
yet - the "hammer" is always there, but I would like to see what a 
refactored grammar would looks like before deciding.

Jesper, if you want to have a go at it: we use the jikes parser 
generator.  JDT grammar file can be opened by Ctrl+Shift+R java.g.

Some instructions are here: http://www.eclipse.org/jdt/core/howto/generate%20parser/generateParser.html


I plan to spend some cycles on it anyways and we can share notes if one/more
of you are also going to find time/interest.
Comment 5 Stephan Herrmann CLA 2013-02-13 10:25:04 EST
I can give it a try - say - on the weekend unless s.o. already resolved this by then :)

A quick guess: is the problem due to the generalized rule Modifiers that is used for class/method/field/local alike? Would it already help to split this rule into MethodModifiers and NonMethodModifiers? Since at statement level we don't expect a method, the default could then be detected as a default-branch of the switch, no?
Comment 6 Srikanth Sankaran CLA 2013-02-13 16:51:54 EST
(In reply to comment #5)
> I can give it a try - say - on the weekend unless s.o. already resolved this
> by then :)
> 
> A quick guess: is the problem due to the generalized rule Modifiers that is
> used for class/method/field/local alike? 

That is my initial assessment.

> Would it already help to split this
> rule into MethodModifiers and NonMethodModifiers? Since at statement level
> we don't expect a method, the default could then be detected as a
> default-branch of the switch, no?

Probably.

Thanks, this is an area where I don't mind multiple people simultaneously
investing time - we can settle for what looks like the minimal, readable/maintainable solution.
Comment 7 Srikanth Sankaran CLA 2013-02-13 16:55:26 EST
(In reply to comment #4)

> Some instructions are here:
> http://www.eclipse.org/jdt/core/howto/generate%20parser/generateParser.html

This page does not mention it explicitly - when there are conflicts, the file
java.l is useful to study the states and the LR items that correspond to each
state - it also lists the rules with numbers, so one can see how at a state
we end with a shift/reduce or a reduce/reduce conflict.
Comment 8 Jesper Moller CLA 2013-02-13 17:03:14 EST
(In reply to comment #4)
> Stephan/Jesper, is this an area you would like to try your hand ?

Sure.


> Some instructions are here:
> http://www.eclipse.org/jdt/core/howto/generate%20parser/generateParser.html

You have got to be kidding me -- you do this manually every time? Even the itsy-bitsy parser used in the XPath2 project over in WTP has this part automated. As if dealing with parser generators wasn't tricky enough as it was...

Count me in for fixing that manual process first, OK?
Comment 9 Srikanth Sankaran CLA 2013-02-13 17:11:40 EST
> Count me in for fixing that manual process first, OK?

At IBM, we internally use an automation - after jikespg.exe has successfully
declared the grammar as being LALR(1) and it does everything - the only thing
manually needed is to refresh the work space.

I don't know the provenance of this automation code - I'll investigate what
it would take to get it publicly available: It may be a while though.
Comment 10 Srikanth Sankaran CLA 2013-02-13 22:18:01 EST
(In reply to comment #6)

> > Would it already help to split this
> > rule into MethodModifiers and NonMethodModifiers? Since at statement level
> > we don't expect a method, the default could then be detected as a
> > default-branch of the switch, no?

This is one part of the problem. We can attempt conventional solution for
this via refactoring/transforming the grammar rules. If it looks unwieldly or
difficult to prove to ourselves that the G' == G, then we can resort to
alternate solutions by munging the token - this will be very simple, but
should still be resorted to only at last.

The other conflict arises due to the use of Modifiersopt in recovery rules
for headers:

-- Modifiersopt is used to properly consume a header and exit the rule reduction at the end of the parse() method
Goal ::= '>>>' Header1 Modifiersopt
Goal ::= '!' Header2 Modifiersopt

The comment is interesting. There ought to be extra-constitutional authority
at play here to signal EOF after modifiers or otherwise why would input stream
terminate with Modifiers ??

I removed Modifiersopt from these rules and ran the tests. Will attach the
failing tests. Studying them could shed some light.
Comment 11 Srikanth Sankaran CLA 2013-02-13 22:26:12 EST
Created attachment 227052 [details]
Test failures

xml log of tests that fail if Modifiersopt is removed from the rules for
the recovery of headers.

Since the automaton would terminate and declare success only when unconsumed
input is empty, i.e input string is fully tokenized and parsed, just after
Modifiers are seen in input, EOF should get signalled somehow. If there is
an agent that is doing it, that agent may as well be modified to signal EOF
ahead of the modifiers, unless it is looking for a particular modifier (annotation ?) which could be interspersed deeper among the modifiers.
Comment 12 Srikanth Sankaran CLA 2013-02-13 22:39:38 EST
(In reply to comment #11)

> Since the automaton would terminate and declare success only when unconsumed
> input is empty, i.e input string is fully tokenized and parsed, just after
> Modifiers are seen in input, EOF should get signalled somehow. If there is
> an agent that is doing it, that agent may as well be modified to signal EOF
> ahead of the modifiers, unless it is looking for a particular modifier
> (annotation ?) which could be interspersed deeper among the modifiers.

Looking at the write references to org.eclipse.jdt.internal.compiler.parser.Scanner.eofPosition, I see some
interesting places to research in completion parser, completion scanner
etc. The majority of tests failing are in that area.
Comment 13 Stephan Herrmann CLA 2013-02-14 10:32:04 EST
(In reply to comment #9)
> > Count me in for fixing that manual process first, OK?
> 
> At IBM, we internally use an automation - after jikespg.exe has successfully
> declared the grammar as being LALR(1) and it does everything - the only thing
> manually needed is to refresh the work space.
> 
> I don't know the provenance of this automation code - I'll investigate what
> it would take to get it publicly available: It may be a while though.

As mentioned before, the Object Teams project has a simple script (bash)
for these tasks. Jesper, would that help you for the time being?
The only reason for not committing this to the JDT git was the platform
dependence, not everybody has a bash, unfortunately.
Comment 14 Stephan Herrmann CLA 2013-02-14 10:38:54 EST
(In reply to comment #13)
> As mentioned before, the Object Teams project has a simple script (bash)
> for these tasks. Jesper, would that help you for the time being?
> The only reason for not committing this to the JDT git was the platform
> dependence, not everybody has a bash, unfortunately.

And the links:

https://git.eclipse.org/c/objectteams/org.eclipse.objectteams.git/tree/org.eclipse.jdt.core/scripts/generateOTParser.sh

https://git.eclipse.org/c/objectteams/org.eclipse.objectteams.git/tree/org.eclipse.jdt.core/scripts/generateParser.launch

Watchout: there's one path to an existing jdt.jar hardcoded in the script.
And all this is not rocket science but it works (most of the time :) ).

I assume that Danes understand some words of German, which might help here :)
Comment 15 Jesper Moller CLA 2013-02-15 14:27:23 EST
(In reply to comment #14)
> I assume that Danes understand some words of German, which might help here :)

Nur ein Bisschen. Actually, before I read your comment, I put together a Java tool to invoke Jikespg and merge the output files.
I will submit this as a patch once I get back to a better networked and less snowy location.
I haven't gotten to actually hacking the grammar yet.
Comment 16 Jesper Moller CLA 2013-02-15 16:02:24 EST
Created attachment 227149 [details]
Patch so that the resource files can be generated by 'buildFilesFromLPG' without changing CWD

This little patch is required in order to run the Java main I'm uploading in a bit. It merely adds an overload to buildFilesFromLPG which takes a parent directory prefix to the files created.
Comment 17 Jesper Moller CLA 2013-02-15 16:08:30 EST
Created attachment 227150 [details]
Java Program to invoke Jikespg and merge the outputs nicely into the Source.

So now there's a Java program to to this, which is EPL and doesn't require bash or fiddling with the jdt.jar.

Makes a 'tmp' folder under 'grammar', may need to do away with that.

(Only tested on Mac OS X, running a self-compiled Jikespg 1.3 which even had issues with overflowing temporary char buffers using strcat, until I padded those buffers some more.)
Comment 18 Stephan Herrmann CLA 2013-02-15 16:51:31 EST
(In reply to comment #17)
> (Only tested on Mac OS X, running a self-compiled Jikespg 1.3 which even had
> issues with overflowing temporary char buffers using strcat, until I padded
> those buffers some more.)

Outch, I should probably alert you also of bug 380194 comment 30.
Not sure if that is relevant on Mac OS, too.
Fix is in the next comment.

Which buffer did you need to increase? I see that I also increased IOBUFFER_SIZE in common.h by factor 10.

I wish s.o. would still maintain jikespg, sigh.

Seems we should update the HowTo, once all is stabilized here.
Comment 19 Jesper Moller CLA 2013-02-15 19:51:17 EST
 Got an idea for a fix for this, without majorly changing the way which modifiers work. You may call it a hack, but it works well with the grammar.
 
 The crux of it is here:
 
-InterfaceMemberDeclaration ::= MethodHeader 'default' PushDefault MethodBody
+InterfaceMemberDeclaration ::= DefaultMethodHeader PushDefault MethodBody

where:

DefaultMethodHeader ::= DefaultMethodHeaderName FormalParameterListopt MethodHeaderRightParen MethodHeaderExtendedDims MethodHeaderThrowsClauseopt
/.$putCase consumeMethodHeader(); $break ./
/:$readableName MethodDeclaration:/

DefaultMethodHeaderName ::= Modifiersopt 'default' Modifiersopt PushCombineModifiers TypeParameters Type 'Identifier' '('
/.$putCase consumeMethodHeaderNameWithTypeParameters(false); $break ./
DefaultMethodHeaderName ::= Modifiersopt 'default' Modifiersopt PushCombineModifiers Type 'Identifier' '('
/.$putCase consumeMethodHeaderName(false); $break ./
/:$readableName MethodHeaderName:/

Basically, it maintains that default is not a 'proper' modifier, but is only used in this context. This allows a leading Modifieropt before the 'default' and then another one after, and uses a semantic action to combine those modifiers with the ones from the actual header. That way, there is no conflict between the Modifieropt and other uses of 'default', and parsing can reduce modifiers as usual.
I have NO idea how it will work with e.g. the completion parser, or error recovery, and I'm unsure how to test that, but the rest of the tests work fine, including getting the proper bits into the modifiers fields for testing on 'abstract' and counting the annotations used as a modifier.

Interested, or too hack-ish?

Also, I've reworked the tests (actually move the 'default'), too, and they are all green after the refac.
Comment 20 Srikanth Sankaran CLA 2013-02-15 19:55:48 EST
(In reply to comment #18)

> I wish s.o. would still maintain jikespg, sigh.

On my windows v1.3 (13 Oct 2001) executable, I have seen sporadic crashes.
Reruns usually work OK.

Recently I had a strange issue with the generated parser triggering AIOOB.
Turned out the START_STATE was assigned a huge number - while for our grammar
it is usually a small number closer to thousand. Regeneration fixed this too.
Comment 21 Jesper Moller CLA 2013-02-15 19:56:57 EST
Looking at what I just wrote, I refactored it further into

DefaultMethodHeader ::= DefaultMethodHeaderName FormalParameterListopt MethodHeaderRightParen MethodHeaderExtendedDims MethodHeaderThrowsClauseopt
/.$putCase consumeMethodHeader(); $break ./
/:$readableName MethodDeclaration:/

DefaultMethodHeaderName ::= ModifiersWithDefault TypeParameters Type 'Identifier' '('
/.$putCase consumeMethodHeaderNameWithTypeParameters(false); $break ./
DefaultMethodHeaderName ::= ModifiersWithDefault Type 'Identifier' '('
/.$putCase consumeMethodHeaderName(false); $break ./
/:$readableName MethodHeaderName:/

ModifiersWithDefault ::= Modifiersopt 'default' Modifiersopt
/.$putCase consumePushCombineModifiers(); $break ./
/:$readableName Modifiers:/
/:$compliance 1.8:/

Cleaner, right?

Making the automatic jikespg-tool post-processing tool is paying off now.
Comment 22 Srikanth Sankaran CLA 2013-02-15 20:04:14 EST
(In reply to comment #19)


> -InterfaceMemberDeclaration ::= MethodHeader 'default' PushDefault MethodBody
> +InterfaceMemberDeclaration ::= DefaultMethodHeader PushDefault MethodBody

> Interested, or too hack-ish?

As grammar refactorings go, this one is eminently understandable and
maintainable. Good thinking !

> Also, I've reworked the tests (actually move the 'default'), too, and they
> are all green after the refac.

Cool, I'll wait for the patch.
Comment 23 Srikanth Sankaran CLA 2013-02-15 20:06:15 EST
(In reply to comment #19)

> Basically, it maintains that default is not a 'proper' modifier, but is only
> used in this context. 

This is also nice in that various checkAndSet* methods won't need change
as it is rejected in the grammar itself.
Comment 24 Jesper Moller CLA 2013-02-15 20:06:24 EST
Created attachment 227154 [details]
Updated the test case
Comment 25 Jesper Moller CLA 2013-02-15 20:08:19 EST
Created attachment 227155 [details]
patch for other affected tests
Comment 26 Jesper Moller CLA 2013-02-15 20:19:20 EST
Created attachment 227156 [details]
Actual patch for java.g and Parser.java

This will require a run through jikespg afterwards, generates resources not supplied.
Comment 27 Srikanth Sankaran CLA 2013-02-15 21:59:59 EST
(In reply to comment #26)
> Created attachment 227156 [details]
> Actual patch for java.g and Parser.java
> 
> This will require a run through jikespg afterwards, generates resources not
> supplied.

I'll take it forward. I'll state that while I had not started working on this
being busy with other issues, I had overlooked the simpler solution and was
background processing a much more complex solution. Less is more - Thanks.

If it will "tickle you"/"give you the kicks" you are welcome to explore a 
simpler solution for bug 399773. I had gone with a extra-grammatical solution,
perhaps a simpler one can be found for that too.
Comment 28 Srikanth Sankaran CLA 2013-02-15 22:04:57 EST
(In reply to comment #27)

> If it will "tickle you"/"give you the kicks" you are welcome to explore a 
> simpler solution for bug 399773. I had gone with a extra-grammatical
> solution,
> perhaps a simpler one can be found for that too.

If you choose to do, I would recommend bounding by a reasonable time limits.
What we have is a maintainable solution, so we don't want to invest too much
into an alternate solution at this point in time.
Comment 29 Srikanth Sankaran CLA 2013-02-16 02:31:39 EST
Patch is good. I made some cleanups (reordered productions so they are
easy to compare, got rid of the parameter to consumeInterfaceMethodDeclaration,
got rid of the old production PushDefault) and one bug fix in recovery
(see consumePushCombineModifiers).

Fix released here: http://git.eclipse.org/c/jdt/eclipse.jdt.core.git/commit/?h=BETA_JAVA8&id=f8ad0d5bfc02915772c8bb705662d0256beab7ec
Comment 30 Jesper Moller CLA 2013-02-16 14:31:18 EST
Srikanth or Stephan, would one of you mind committing the ParserGenerator if it works for you? I'm thinking it should reside outside the bundle's actual build, so it was available locally, but not for consumers.
Comment 31 Srikanth Sankaran CLA 2013-02-16 20:13:03 EST
(In reply to comment #30)
> Srikanth or Stephan, would one of you mind committing the ParserGenerator if
> it works for you? I'm thinking it should reside outside the bundle's actual
> build, so it was available locally, but not for consumers.

Jay, can you please investigate what is the best way to take this forward ?
Perhaps we need to create a org.eclipse.jdt.core.internal.tools project ?
Remember to have Jesper reflected as the author when you release it. TIA.
Comment 32 Stephan Herrmann CLA 2013-02-17 04:04:00 EST
(In reply to comment #31)
> (In reply to comment #30)
> > Srikanth or Stephan, would one of you mind committing the ParserGenerator if
> > it works for you? I'm thinking it should reside outside the bundle's actual
> > build, so it was available locally, but not for consumers.
> 
> Jay, can you please investigate what is the best way to take this forward ?
> Perhaps we need to create a org.eclipse.jdt.core.internal.tools project ?
> Remember to have Jesper reflected as the author when you release it. TIA.

Maybe it's enough to place GenerateParser.java into the existing scripts/
directory, and put an Ant file next to it so the tool can be invoked without
messing with the plug-in's buildpath?

BTW: why is scripts/GenerateBuildScript.class and friends stored in git?

Once we're confident about the generator, wouldn't we even want to run it
on every build so we can stop persisting *any* generated stuff?
Ah, there's one problem: the build may not have jikespg available,
maybe that's why.
Comment 33 Jay Arthanareeswaran CLA 2013-02-19 01:46:55 EST
(In reply to comment #32)
> Maybe it's enough to place GenerateParser.java into the existing scripts/
> directory, and put an Ant file next to it so the tool can be invoked without
> messing with the plug-in's buildpath?

Sounds good to me. We already have the precedence in the form of GenerateBuildScript.java.

Srikanth, a separate project would be an overkill unless you have something else to put in the project. I like the out of build path and ant script solution better simply because it's less work.

I quickly wrote an ant script and tested the GenerateParser. Works well, except that I would like to see the tmp folder to be cleaned-up.
Comment 34 Srikanth Sankaran CLA 2013-02-19 06:37:43 EST
(In reply to comment #33)

> Srikanth, a separate project would be an overkill unless you have something
> else to put in the project. I like the out of build path and ant script
> solution better simply because it's less work.

Less is more. Please feel free to settle for simpler solutions.

> I quickly wrote an ant script and tested the GenerateParser. Works well,
> except that I would like to see the tmp folder to be cleaned-up.

Also I liked Stephan's idea about an automatic refresh - is this possible ?
Comment 35 Jesper Moller CLA 2013-02-19 06:46:17 EST
(In reply to comment #34)
> Also I liked Stephan's idea about an automatic refresh - is this possible ?

If run as an Ant task as an External Tools, yes. Normal Launches can't refresh.