Community
Participate
Working Groups
Looking at a 0.6.1 spec I finally can see that the parser work around bug 380194 comment 4 is now obsolete: We need to move from void foo() default { stmts; } to default void foo() { stmts; } sigh. At least this might in turn obsolete bug 383973, let's see.
Jay, this is a leaf item - not blocking anything. Could be a good candidate for Shankha ?
Shankha, here's your first eclipse bug to fix. All the best!
Shankha, I'll take this one - for one thing this is coming in the way of compatibility testing with Oracle compiler as they have already migrated to the new syntax. For another, the straightforward change results in shift/reduce conflicts and so this may need some prior experience working with jikes parser generator.
Stephan/Jesper, is this an area you would like to try your hand ? Looks like the conflict arises due to the permissiveness in the grammar, where we don't discriminate what modifiers can legally occur where. When 'default' is seen in a snippet like below: switch (x) { case 10: x++; break; default // <---------------------- ?? Am I declaring something that has default modifier as a part of the case 10 SwitchBlockStatement, in which case the automaton should shift or am I starting a default case arm, in which case the automaton should reduce the x++; break; into a SwitchBlockStatement seems to be the conflict. I tried my hand some at refactoring the relevant productions, no luck yet - the "hammer" is always there, but I would like to see what a refactored grammar would looks like before deciding. Jesper, if you want to have a go at it: we use the jikes parser generator. JDT grammar file can be opened by Ctrl+Shift+R java.g. Some instructions are here: http://www.eclipse.org/jdt/core/howto/generate%20parser/generateParser.html I plan to spend some cycles on it anyways and we can share notes if one/more of you are also going to find time/interest.
I can give it a try - say - on the weekend unless s.o. already resolved this by then :) A quick guess: is the problem due to the generalized rule Modifiers that is used for class/method/field/local alike? Would it already help to split this rule into MethodModifiers and NonMethodModifiers? Since at statement level we don't expect a method, the default could then be detected as a default-branch of the switch, no?
(In reply to comment #5) > I can give it a try - say - on the weekend unless s.o. already resolved this > by then :) > > A quick guess: is the problem due to the generalized rule Modifiers that is > used for class/method/field/local alike? That is my initial assessment. > Would it already help to split this > rule into MethodModifiers and NonMethodModifiers? Since at statement level > we don't expect a method, the default could then be detected as a > default-branch of the switch, no? Probably. Thanks, this is an area where I don't mind multiple people simultaneously investing time - we can settle for what looks like the minimal, readable/maintainable solution.
(In reply to comment #4) > Some instructions are here: > http://www.eclipse.org/jdt/core/howto/generate%20parser/generateParser.html This page does not mention it explicitly - when there are conflicts, the file java.l is useful to study the states and the LR items that correspond to each state - it also lists the rules with numbers, so one can see how at a state we end with a shift/reduce or a reduce/reduce conflict.
(In reply to comment #4) > Stephan/Jesper, is this an area you would like to try your hand ? Sure. > Some instructions are here: > http://www.eclipse.org/jdt/core/howto/generate%20parser/generateParser.html You have got to be kidding me -- you do this manually every time? Even the itsy-bitsy parser used in the XPath2 project over in WTP has this part automated. As if dealing with parser generators wasn't tricky enough as it was... Count me in for fixing that manual process first, OK?
> Count me in for fixing that manual process first, OK? At IBM, we internally use an automation - after jikespg.exe has successfully declared the grammar as being LALR(1) and it does everything - the only thing manually needed is to refresh the work space. I don't know the provenance of this automation code - I'll investigate what it would take to get it publicly available: It may be a while though.
(In reply to comment #6) > > Would it already help to split this > > rule into MethodModifiers and NonMethodModifiers? Since at statement level > > we don't expect a method, the default could then be detected as a > > default-branch of the switch, no? This is one part of the problem. We can attempt conventional solution for this via refactoring/transforming the grammar rules. If it looks unwieldly or difficult to prove to ourselves that the G' == G, then we can resort to alternate solutions by munging the token - this will be very simple, but should still be resorted to only at last. The other conflict arises due to the use of Modifiersopt in recovery rules for headers: -- Modifiersopt is used to properly consume a header and exit the rule reduction at the end of the parse() method Goal ::= '>>>' Header1 Modifiersopt Goal ::= '!' Header2 Modifiersopt The comment is interesting. There ought to be extra-constitutional authority at play here to signal EOF after modifiers or otherwise why would input stream terminate with Modifiers ?? I removed Modifiersopt from these rules and ran the tests. Will attach the failing tests. Studying them could shed some light.
Created attachment 227052 [details] Test failures xml log of tests that fail if Modifiersopt is removed from the rules for the recovery of headers. Since the automaton would terminate and declare success only when unconsumed input is empty, i.e input string is fully tokenized and parsed, just after Modifiers are seen in input, EOF should get signalled somehow. If there is an agent that is doing it, that agent may as well be modified to signal EOF ahead of the modifiers, unless it is looking for a particular modifier (annotation ?) which could be interspersed deeper among the modifiers.
(In reply to comment #11) > Since the automaton would terminate and declare success only when unconsumed > input is empty, i.e input string is fully tokenized and parsed, just after > Modifiers are seen in input, EOF should get signalled somehow. If there is > an agent that is doing it, that agent may as well be modified to signal EOF > ahead of the modifiers, unless it is looking for a particular modifier > (annotation ?) which could be interspersed deeper among the modifiers. Looking at the write references to org.eclipse.jdt.internal.compiler.parser.Scanner.eofPosition, I see some interesting places to research in completion parser, completion scanner etc. The majority of tests failing are in that area.
(In reply to comment #9) > > Count me in for fixing that manual process first, OK? > > At IBM, we internally use an automation - after jikespg.exe has successfully > declared the grammar as being LALR(1) and it does everything - the only thing > manually needed is to refresh the work space. > > I don't know the provenance of this automation code - I'll investigate what > it would take to get it publicly available: It may be a while though. As mentioned before, the Object Teams project has a simple script (bash) for these tasks. Jesper, would that help you for the time being? The only reason for not committing this to the JDT git was the platform dependence, not everybody has a bash, unfortunately.
(In reply to comment #13) > As mentioned before, the Object Teams project has a simple script (bash) > for these tasks. Jesper, would that help you for the time being? > The only reason for not committing this to the JDT git was the platform > dependence, not everybody has a bash, unfortunately. And the links: https://git.eclipse.org/c/objectteams/org.eclipse.objectteams.git/tree/org.eclipse.jdt.core/scripts/generateOTParser.sh https://git.eclipse.org/c/objectteams/org.eclipse.objectteams.git/tree/org.eclipse.jdt.core/scripts/generateParser.launch Watchout: there's one path to an existing jdt.jar hardcoded in the script. And all this is not rocket science but it works (most of the time :) ). I assume that Danes understand some words of German, which might help here :)
(In reply to comment #14) > I assume that Danes understand some words of German, which might help here :) Nur ein Bisschen. Actually, before I read your comment, I put together a Java tool to invoke Jikespg and merge the output files. I will submit this as a patch once I get back to a better networked and less snowy location. I haven't gotten to actually hacking the grammar yet.
Created attachment 227149 [details] Patch so that the resource files can be generated by 'buildFilesFromLPG' without changing CWD This little patch is required in order to run the Java main I'm uploading in a bit. It merely adds an overload to buildFilesFromLPG which takes a parent directory prefix to the files created.
Created attachment 227150 [details] Java Program to invoke Jikespg and merge the outputs nicely into the Source. So now there's a Java program to to this, which is EPL and doesn't require bash or fiddling with the jdt.jar. Makes a 'tmp' folder under 'grammar', may need to do away with that. (Only tested on Mac OS X, running a self-compiled Jikespg 1.3 which even had issues with overflowing temporary char buffers using strcat, until I padded those buffers some more.)
(In reply to comment #17) > (Only tested on Mac OS X, running a self-compiled Jikespg 1.3 which even had > issues with overflowing temporary char buffers using strcat, until I padded > those buffers some more.) Outch, I should probably alert you also of bug 380194 comment 30. Not sure if that is relevant on Mac OS, too. Fix is in the next comment. Which buffer did you need to increase? I see that I also increased IOBUFFER_SIZE in common.h by factor 10. I wish s.o. would still maintain jikespg, sigh. Seems we should update the HowTo, once all is stabilized here.
Got an idea for a fix for this, without majorly changing the way which modifiers work. You may call it a hack, but it works well with the grammar. The crux of it is here: -InterfaceMemberDeclaration ::= MethodHeader 'default' PushDefault MethodBody +InterfaceMemberDeclaration ::= DefaultMethodHeader PushDefault MethodBody where: DefaultMethodHeader ::= DefaultMethodHeaderName FormalParameterListopt MethodHeaderRightParen MethodHeaderExtendedDims MethodHeaderThrowsClauseopt /.$putCase consumeMethodHeader(); $break ./ /:$readableName MethodDeclaration:/ DefaultMethodHeaderName ::= Modifiersopt 'default' Modifiersopt PushCombineModifiers TypeParameters Type 'Identifier' '(' /.$putCase consumeMethodHeaderNameWithTypeParameters(false); $break ./ DefaultMethodHeaderName ::= Modifiersopt 'default' Modifiersopt PushCombineModifiers Type 'Identifier' '(' /.$putCase consumeMethodHeaderName(false); $break ./ /:$readableName MethodHeaderName:/ Basically, it maintains that default is not a 'proper' modifier, but is only used in this context. This allows a leading Modifieropt before the 'default' and then another one after, and uses a semantic action to combine those modifiers with the ones from the actual header. That way, there is no conflict between the Modifieropt and other uses of 'default', and parsing can reduce modifiers as usual. I have NO idea how it will work with e.g. the completion parser, or error recovery, and I'm unsure how to test that, but the rest of the tests work fine, including getting the proper bits into the modifiers fields for testing on 'abstract' and counting the annotations used as a modifier. Interested, or too hack-ish? Also, I've reworked the tests (actually move the 'default'), too, and they are all green after the refac.
(In reply to comment #18) > I wish s.o. would still maintain jikespg, sigh. On my windows v1.3 (13 Oct 2001) executable, I have seen sporadic crashes. Reruns usually work OK. Recently I had a strange issue with the generated parser triggering AIOOB. Turned out the START_STATE was assigned a huge number - while for our grammar it is usually a small number closer to thousand. Regeneration fixed this too.
Looking at what I just wrote, I refactored it further into DefaultMethodHeader ::= DefaultMethodHeaderName FormalParameterListopt MethodHeaderRightParen MethodHeaderExtendedDims MethodHeaderThrowsClauseopt /.$putCase consumeMethodHeader(); $break ./ /:$readableName MethodDeclaration:/ DefaultMethodHeaderName ::= ModifiersWithDefault TypeParameters Type 'Identifier' '(' /.$putCase consumeMethodHeaderNameWithTypeParameters(false); $break ./ DefaultMethodHeaderName ::= ModifiersWithDefault Type 'Identifier' '(' /.$putCase consumeMethodHeaderName(false); $break ./ /:$readableName MethodHeaderName:/ ModifiersWithDefault ::= Modifiersopt 'default' Modifiersopt /.$putCase consumePushCombineModifiers(); $break ./ /:$readableName Modifiers:/ /:$compliance 1.8:/ Cleaner, right? Making the automatic jikespg-tool post-processing tool is paying off now.
(In reply to comment #19) > -InterfaceMemberDeclaration ::= MethodHeader 'default' PushDefault MethodBody > +InterfaceMemberDeclaration ::= DefaultMethodHeader PushDefault MethodBody > Interested, or too hack-ish? As grammar refactorings go, this one is eminently understandable and maintainable. Good thinking ! > Also, I've reworked the tests (actually move the 'default'), too, and they > are all green after the refac. Cool, I'll wait for the patch.
(In reply to comment #19) > Basically, it maintains that default is not a 'proper' modifier, but is only > used in this context. This is also nice in that various checkAndSet* methods won't need change as it is rejected in the grammar itself.
Created attachment 227154 [details] Updated the test case
Created attachment 227155 [details] patch for other affected tests
Created attachment 227156 [details] Actual patch for java.g and Parser.java This will require a run through jikespg afterwards, generates resources not supplied.
(In reply to comment #26) > Created attachment 227156 [details] > Actual patch for java.g and Parser.java > > This will require a run through jikespg afterwards, generates resources not > supplied. I'll take it forward. I'll state that while I had not started working on this being busy with other issues, I had overlooked the simpler solution and was background processing a much more complex solution. Less is more - Thanks. If it will "tickle you"/"give you the kicks" you are welcome to explore a simpler solution for bug 399773. I had gone with a extra-grammatical solution, perhaps a simpler one can be found for that too.
(In reply to comment #27) > If it will "tickle you"/"give you the kicks" you are welcome to explore a > simpler solution for bug 399773. I had gone with a extra-grammatical > solution, > perhaps a simpler one can be found for that too. If you choose to do, I would recommend bounding by a reasonable time limits. What we have is a maintainable solution, so we don't want to invest too much into an alternate solution at this point in time.
Patch is good. I made some cleanups (reordered productions so they are easy to compare, got rid of the parameter to consumeInterfaceMethodDeclaration, got rid of the old production PushDefault) and one bug fix in recovery (see consumePushCombineModifiers). Fix released here: http://git.eclipse.org/c/jdt/eclipse.jdt.core.git/commit/?h=BETA_JAVA8&id=f8ad0d5bfc02915772c8bb705662d0256beab7ec
Srikanth or Stephan, would one of you mind committing the ParserGenerator if it works for you? I'm thinking it should reside outside the bundle's actual build, so it was available locally, but not for consumers.
(In reply to comment #30) > Srikanth or Stephan, would one of you mind committing the ParserGenerator if > it works for you? I'm thinking it should reside outside the bundle's actual > build, so it was available locally, but not for consumers. Jay, can you please investigate what is the best way to take this forward ? Perhaps we need to create a org.eclipse.jdt.core.internal.tools project ? Remember to have Jesper reflected as the author when you release it. TIA.
(In reply to comment #31) > (In reply to comment #30) > > Srikanth or Stephan, would one of you mind committing the ParserGenerator if > > it works for you? I'm thinking it should reside outside the bundle's actual > > build, so it was available locally, but not for consumers. > > Jay, can you please investigate what is the best way to take this forward ? > Perhaps we need to create a org.eclipse.jdt.core.internal.tools project ? > Remember to have Jesper reflected as the author when you release it. TIA. Maybe it's enough to place GenerateParser.java into the existing scripts/ directory, and put an Ant file next to it so the tool can be invoked without messing with the plug-in's buildpath? BTW: why is scripts/GenerateBuildScript.class and friends stored in git? Once we're confident about the generator, wouldn't we even want to run it on every build so we can stop persisting *any* generated stuff? Ah, there's one problem: the build may not have jikespg available, maybe that's why.
(In reply to comment #32) > Maybe it's enough to place GenerateParser.java into the existing scripts/ > directory, and put an Ant file next to it so the tool can be invoked without > messing with the plug-in's buildpath? Sounds good to me. We already have the precedence in the form of GenerateBuildScript.java. Srikanth, a separate project would be an overkill unless you have something else to put in the project. I like the out of build path and ant script solution better simply because it's less work. I quickly wrote an ant script and tested the GenerateParser. Works well, except that I would like to see the tmp folder to be cleaned-up.
(In reply to comment #33) > Srikanth, a separate project would be an overkill unless you have something > else to put in the project. I like the out of build path and ant script > solution better simply because it's less work. Less is more. Please feel free to settle for simpler solutions. > I quickly wrote an ant script and tested the GenerateParser. Works well, > except that I would like to see the tmp folder to be cleaned-up. Also I liked Stephan's idea about an automatic refresh - is this possible ?
(In reply to comment #34) > Also I liked Stephan's idea about an automatic refresh - is this possible ? If run as an Ant task as an External Tools, yes. Normal Launches can't refresh.