| Summary: | ejc failing to build with StackOverflowError | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [Eclipse Project] JDT | Reporter: | Richard Steiger <rsteiger> | ||||
| Component: | Core | Assignee: | Stephan Herrmann <stephan.herrmann> | ||||
| Status: | CLOSED DUPLICATE | QA Contact: | |||||
| Severity: | blocker | ||||||
| Priority: | P3 | CC: | felix.rotthowe, manoj.palat, rsteiger, stephan.herrmann, ta | ||||
| Version: | 4.7 | ||||||
| Target Milestone: | 4.8 M3 | ||||||
| Hardware: | All | ||||||
| OS: | All | ||||||
| Whiteboard: | |||||||
| Attachments: |
|
||||||
|
Description
Richard Steiger
As I've mentioned elsewhere, I'm an Eclipse development newbie. Following initial suggestions from Ed Merk, I've installed the JDT dev env via Oomph, created a dummy RCP app, and am able to launch the app, load the problematic projects, trigger a compile, and repro the SOE. The overflow stacktrace is also printing to the JDT instance's console.

Where I'm stuck: despite setting breakpoints in ejc (in org.eclipse.jdt.internal.compiler.lookup.ParameterizedTypeBinding, via the source editor), none of them are hit when ejc runs. I checked the org.eclipse.jdt.core project's compilation debugging flags, all are enabled. The editor has no problems manipulating breakpoints in this file. They simply aren't getting set on the dummy app's vm. Help.

Versions all look good.

First to the stack overflow itself: Are you saying the error log doesn't even show where the error occurs? Are you saying the error log indicates a recursion in StackOverflowError.<init> itself?? Please paste or attach any information you have on that error, either from the log or from the console of the outer Eclipse.

On debugging: Great you succeeded to reproduce in a runtime instance. Regarding breakpoints that don't suspend, a few suggestions to try:
- set a breakpoint early in the compiler, e.g.: org.eclipse.jdt.internal.compiler.Compiler.beginToCompile(ICompilationUnit[]) does it stop?
- use a method breakpoint instead of a line breakpoint (caveat: method breakpoints slow down the compiler :( )
- manually suspend the target vm while it's busy building the workspace (I'm sure you know the difference between Run as and Debug as ;p )

Good questions, I overlooked examining the app's Error Log, and there's indeed an entry that's many identical lines like:
java.lang.StackOverflowError at org.eclipse.jdt.internal.compiler.lookup.ParameterizedTypeBinding.collectInferenceVariables(ParameterizedTypeBinding.java:959)
which is the same as originally posted above, and which indeed points to a recursive call-site that matches the symptoms.

I put a static depth-limit in the method and did a Debug as, and hit the limit. The stack looks as follows:
RawTypeBinding(ParameterizedTypeBinding).collectInferenceVariables(Set<InferenceVariable>) line: 963
RawTypeBinding(ParameterizedTypeBinding).collectInferenceVariables(Set<InferenceVariable>) line: 963
ParameterizedTypeBinding.collectInferenceVariables(Set<InferenceVariable>) line: 963
ConstraintExceptionFormula(ConstraintFormula).outputVariables(InferenceContext18) line: 41
InferenceContext18.allOutputVariables(Set<ConstraintFormula>) line: 1539
InferenceContext18.inferInvocationType(TypeBinding, InvocationSite, MethodBinding) line: 431
ParameterizedGenericMethodBinding.computeCompatibleMethod18(MethodBinding, TypeBinding[], Scope, InvocationSite) line: 266
ParameterizedGenericMethodBinding.computeCompatibleMethod(MethodBinding, TypeBinding[], Scope, InvocationSite) line: 88
MethodScope(Scope).computeCompatibleMethod(MethodBinding, TypeBinding[], InvocationSite, boolean) line: 771
MethodScope(Scope).computeCompatibleMethod(MethodBinding, TypeBinding[], InvocationSite) line: 728
MethodScope(Scope).findMethod0(ReferenceBinding, char[], TypeBinding[], InvocationSite, boolean) line: 1686
MethodScope(Scope).findMethod(ReferenceBinding, char[], TypeBinding[], InvocationSite, boolean) line: 1587
MethodScope(Scope).getMethod(TypeBinding, char[], TypeBinding[], InvocationSite) line: 2870
MessageSend.findMethodBinding(BlockScope) line: 934
MessageSend.resolveType(BlockScope) line: 757
ReturnStatement.resolve(BlockScope) line: 342
MethodDeclaration(AbstractMethodDeclaration).resolveStatements() line: 634
MethodDeclaration.resolveStatements() line: 306
MethodDeclaration(AbstractMethodDeclaration).resolve(ClassScope) line: 544
TypeDeclaration.resolve() line: 1195
TypeDeclaration.resolve(CompilationUnitScope) line: 1308
CompilationUnitDeclaration.resolve() line: 605
Compiler.process(CompilationUnitDeclaration, int) line: 867
ProcessTaskManager.run() line: 141
Thread.run() line: 745

The root cause is as I guessed, namely "this.arguments[i].collectInferenceVariables(variables);" is recursively called without any check on whether a given argument has already been passed through the method. At this point, it seems most prudent to have someone like yourself who knows ejc internals determine the best fix. Please feel free to either suggest specific patches, which I'll verify resolve this issue, or point me toward attempting to investigate on my own (with an undoubtedly steep learning curve). BTW, are there any docs on ejc internals that might be relevant? Waiting to hear back.

Thanks for investigating thus far.
A raw type binding referencing itself as a type argument? "Shouldn't happen" :)
We do expect TypeVariableBindings to cause cyclic type structures; for raw types, however, we didn't see how they could cause a cycle. BUT, there's a funny method RawTypeBinding.initializeArguments(), which erases the generic type's type variables and sets them as type arguments of the raw type (don't ask me why :) ). If a type variable has a relevant bound, that might be able to cause a reference cycle.
You should be able to check this interpretation in the debugger: does RawTypeBinding#arguments indirectly contain the raw type itself? (Interestingly, direct cycles are already detected).
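For readers who want to see what "indirectly contains itself" means in code, here is a minimal sketch of such a check. It does not use the JDT classes: `TypeNode`, `CycleCheck`, and `reachesItself` are hypothetical names invented for illustration, with `arguments` standing in for a binding's type arguments. The visited set compares by identity, on the assumption that two distinct binding objects should never be merged by `equals`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a type binding; NOT a JDT class. */
class TypeNode {
    final String name;
    final List<TypeNode> arguments = new ArrayList<>();

    TypeNode(String name) {
        this.name = name;
    }
}

class CycleCheck {
    /** Returns true if root is reachable from its own arguments, directly or indirectly. */
    static boolean reachesItself(TypeNode root) {
        // Identity-based set: each node object is expanded at most once.
        Set<TypeNode> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        return reaches(root, root, visited);
    }

    private static boolean reaches(TypeNode current, TypeNode target, Set<TypeNode> visited) {
        for (TypeNode arg : current.arguments) {
            if (arg == target)
                return true; // found a path back to the starting node
            // visited.add returns false for nodes already expanded, which stops the walk
            if (visited.add(arg) && reaches(arg, target, visited))
                return true;
        }
        return false;
    }

    public static void main(String[] args) {
        TypeNode raw = new TypeNode("Raw");
        TypeNode bound = new TypeNode("Bound");
        raw.arguments.add(bound);
        System.out.println(reachesItself(raw)); // no path back yet
        bound.arguments.add(raw);               // Raw -> Bound -> Raw
        System.out.println(reachesItself(raw));
    }
}
```

In a debugger session the same question can be answered by hand, by expanding the arguments field a few levels and looking for the same object id.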
The canonical solution would be to add the following override to RawTypeBinding:
@Override
void collectInferenceVariables(Set<InferenceVariable> variables) {
    if (this.inRecursiveFunction)
        return; // nothing seen
    this.inRecursiveFunction = true; // mark this binding as currently being visited
    try {
        super.collectInferenceVariables(variables);
    } finally {
        this.inRecursiveFunction = false; // always reset, even if an exception propagates
    }
}
The field "boolean inRecursiveFunction = false;" still needs to be added to that type.
This should fix the SOE, shouldn't it?
In particular, a breakpoint on "return; // nothing seen" should indeed be triggered.
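To illustrate why such a guard terminates on a cyclic structure, here is a small self-contained sketch of the same re-entrancy pattern outside the compiler. `GuardedNode`, `GuardDemo`, `collectNames`, and `collect` are hypothetical names made up for this example; only the `inRecursiveFunction` flag and the try/finally shape mirror the proposal above.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a binding with type arguments; NOT the JDT class. */
class GuardedNode {
    final String name;
    final List<GuardedNode> arguments = new ArrayList<>();
    private boolean inRecursiveFunction = false;

    GuardedNode(String name) {
        this.name = name;
    }

    void collectNames(Set<String> names) {
        if (this.inRecursiveFunction)
            return; // this node is already on the call stack: nothing new to see
        this.inRecursiveFunction = true;
        try {
            names.add(this.name);
            for (GuardedNode arg : this.arguments)
                arg.collectNames(names);
        } finally {
            this.inRecursiveFunction = false; // reset so later traversals start clean
        }
    }
}

class GuardDemo {
    static Set<String> collect(GuardedNode root) {
        Set<String> names = new LinkedHashSet<>();
        root.collectNames(names);
        return names;
    }

    public static void main(String[] args) {
        GuardedNode a = new GuardedNode("A");
        GuardedNode b = new GuardedNode("B");
        a.arguments.add(b);
        b.arguments.add(a); // indirect cycle: A -> B -> A
        System.out.println(collect(a)); // terminates instead of overflowing the stack
    }
}
```

Note that the flag only tracks nodes currently on the call stack. Unlike a visited set, it does not prevent a node from being traversed twice when it is reachable along two non-cyclic paths; it only breaks cycles.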
Your suggested fix works, not seeing any more SOEs! At this point, having not submitted any Eclipse changes, I have a learning curve ahead of me, which I'm inclined to take on so I can become an actual contributor (not sure where the learning trail picks up, but I can likely find it). If, on the other hand, you feel like just making the mod yourself, that would be fine with me as well (I have a lot of lost ground to cover in the now-unblocked project). Let me know what you advise. Thanks for the help!

(BTW, found the pointers into the JLS in comments very helpful for grokking context, not a replacement for good internals docs, but way better than nothing :-)

Created attachment 268861
Block repeated recursive calls to RawTypeBinding::collectInferenceVariables with same variables

Fix Bug 518095 - ejc failing to build with StackOverflowError

Attached the patch.

Comment on attachment 268861
Block repeated recursive calls to RawTypeBinding::collectInferenceVariables with same variables

It would help if you attach an actual patch (= diff), not the full file (or ideally create a gerrit change).
*** Bug 521423 has been marked as a duplicate of this bug. ***

@Richard, as duplicates are starting to come in, can you provide any hints about the code pattern that triggers the bug?

Moving here the discussion in a duplicate:

(In reply to Christian Bulitta from bug 521423 comment #3)
> Hello Stephan,
>
> I'd love to give you a reproducible example. Unfortunately I have none and
> due to IP reasons I cannot give away the full sourcecode of my company (34
> maven modules, 2500+ .java Files). Is there any type of debugging I can do
> on my side to isolate the problem?
>
> If I knew what to look for I would be really happy. I'm open for options to
> avoid certain patterns and change our code.

I'm still hoping Richard had seen enough during his experiments to describe the pattern that triggered the bug in his case?

Meanwhile this has been fixed via bug 525576 (released for 4.8M3 and pending backport to 4.7.2).

*** This bug has been marked as a duplicate of bug 525576 *** |