
Bug 518095

Summary: ejc failing to build with StackOverflowError
Product: [Eclipse Project] JDT Reporter: Richard Steiger <rsteiger>
Component: Core Assignee: Stephan Herrmann <stephan.herrmann>
Status: CLOSED DUPLICATE QA Contact:
Severity: blocker    
Priority: P3 CC: felix.rotthowe, manoj.palat, rsteiger, stephan.herrmann, ta
Version: 4.7   
Target Milestone: 4.8 M3   
Hardware: All   
OS: All   
Whiteboard:
Attachments:
Description Flags
Block repeated recursive calls to RawTypeBinding::collectInferenceVariables with same variables none

Description Richard Steiger CLA 2017-06-10 20:43:42 EDT
host: Dell XPS 8920, Windows 10 pro
IDE version: eclipse-java-oxygen-M7-win32-x86_64
JDK version: jdk1.8.0_152
jdt version: org.eclipse.jdt.core_3.13.0.v20170516-1929

I originally described this bug through comments to "Bug 432541 Stack Overflow in Java Search - type inference issue?". After this bug had been verified fixed, I began seeing the identical symptoms, and added comment 44 to this effect on 2017-3-1. Stephan Herrmann suggested not simply reopening 432541, but to document the issue separately, so a decision to reopen could be made on its own merits.

When viewed in the Error Log, the backtrace leaves no evidence (initial calls are clobbered by the copious StackOverflowError.<init> calls), but occasionally the error notifier popup in the editor contains the following call: 

org.eclipse.jdt.internal.compiler.lookup.ParameterizedTypeBinding.collectInferenceVariables(ParameterizedTypeBinding.java:959)

I've been able to narrow the occurrence of this bug to a single java project (having either a Java or Maven nature), where it fails 100% reproducibly, thereby blocking the project's development. 

The project contains ~1500 source files, and makes very heavy use of generics, and in numerous places, complex, mutually-recursive type signatures. It seems to be happening in a set of types representing digraphs, where some signatures (e.g. Node) have 2 parameters for edges (source and target), and can't be further simplified. Unfortunately, this set of types is woven with many other types, and all attempts to isolate a small repro set have failed. I know of no workaround.

I've seen this failure occur with both java 7 and 8, and in neon-1, neon-2, and oxygen-m5, m6, and m7. 

My current root-cause theory is that the type inferencer fails to distinguish multiple synonymous type parameters by their positions in member or type declarations. It then gets confused when traversing a type declaration's type graph: it misses the fact that it has already processed a binding in a previous position, and repeatedly reprocesses that binding.
Comment 1 Richard Steiger CLA 2017-06-10 20:52:23 EDT
As I've mentioned elsewhere, I'm an eclipse development newbie. Following initial suggestions from Ed Merk, I've installed the JDT dev env via Oomph, created a dummy RCP app, am able to launch the app, load the problematic projects, trigger a compile, and repro the SOE.  The overflow stacktrace is also printing to the JDT instance's console.  

Where I'm stuck: despite setting breakpoints in ejc (in org.eclipse.jdt.internal.compiler.lookup.ParameterizedTypeBinding, via the source editor), none of them are hit when ejc runs.  I checked the org.eclipse.jdt.core project's compilation debugging flags, all are enabled.  The editor has no problems manipulating breakpoints in this file.  They simply aren't getting set on the dummy app's vm.  

Help.
Comment 2 Stephan Herrmann CLA 2017-06-11 04:52:21 EDT
Versions all look good.

First to the stack overflow itself: Are you saying the error log doesn't even show, where the error occurs? Are you saying the error log indicates a recursion in StackOverflowError.<init> itself??
Please paste or attach any information you have on that error, either from the log or from the console of the outer Eclipse.

On debugging: Great that you succeeded in reproducing this in a runtime instance. Regarding breakpoints that don't suspend, a few suggestions to try:
- set a breakpoint early in the compiler, e.g.:
  org.eclipse.jdt.internal.compiler.Compiler.beginToCompile(ICompilationUnit[])
  does it stop?
- use a method breakpoint instead of a line breakpoint
  (caveat: method breakpoints slow down the compiler :( )
- manually suspend the target vm while it's busy building the workspace
(I'm sure you know the difference between Run as and Debug as ;p )
Comment 3 Richard Steiger CLA 2017-06-11 14:39:32 EDT
Good questions, I overlooked examining the app's Error Log, and there's indeed an entry that's many identical lines like:

java.lang.StackOverflowError
	at org.eclipse.jdt.internal.compiler.lookup.ParameterizedTypeBinding.collectInferenceVariables(ParameterizedTypeBinding.java:959)

which is the same as originally posted above, and which indeed points to a recursive call-site that matches the symptoms.

I put a static depth-limit in the method and did a Debug as, and hit the limit, stack looks as follows:

RawTypeBinding(ParameterizedTypeBinding).collectInferenceVariables(Set<InferenceVariable>) line: 963	
RawTypeBinding(ParameterizedTypeBinding).collectInferenceVariables(Set<InferenceVariable>) line: 963	
ParameterizedTypeBinding.collectInferenceVariables(Set<InferenceVariable>) line: 963	
ConstraintExceptionFormula(ConstraintFormula).outputVariables(InferenceContext18) line: 41	
InferenceContext18.allOutputVariables(Set<ConstraintFormula>) line: 1539	
InferenceContext18.inferInvocationType(TypeBinding, InvocationSite, MethodBinding) line: 431	
ParameterizedGenericMethodBinding.computeCompatibleMethod18(MethodBinding, TypeBinding[], Scope, InvocationSite) line: 266	
ParameterizedGenericMethodBinding.computeCompatibleMethod(MethodBinding, TypeBinding[], Scope, InvocationSite) line: 88	
MethodScope(Scope).computeCompatibleMethod(MethodBinding, TypeBinding[], InvocationSite, boolean) line: 771	
MethodScope(Scope).computeCompatibleMethod(MethodBinding, TypeBinding[], InvocationSite) line: 728	
MethodScope(Scope).findMethod0(ReferenceBinding, char[], TypeBinding[], InvocationSite, boolean) line: 1686	
MethodScope(Scope).findMethod(ReferenceBinding, char[], TypeBinding[], InvocationSite, boolean) line: 1587	
MethodScope(Scope).getMethod(TypeBinding, char[], TypeBinding[], InvocationSite) line: 2870	
MessageSend.findMethodBinding(BlockScope) line: 934	
MessageSend.resolveType(BlockScope) line: 757	
ReturnStatement.resolve(BlockScope) line: 342	
MethodDeclaration(AbstractMethodDeclaration).resolveStatements() line: 634	
MethodDeclaration.resolveStatements() line: 306	
MethodDeclaration(AbstractMethodDeclaration).resolve(ClassScope) line: 544	
TypeDeclaration.resolve() line: 1195	
TypeDeclaration.resolve(CompilationUnitScope) line: 1308	
CompilationUnitDeclaration.resolve() line: 605	
Compiler.process(CompilationUnitDeclaration, int) line: 867	
ProcessTaskManager.run() line: 141	
Thread.run() line: 745	

The root-cause is as I guessed, namely "this.arguments[i].collectInferenceVariables(variables);" is recursively called without any qualification on whether a given argument has already been passed through the method.  
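
To illustrate the failure mode described above, here is a minimal standalone sketch. It does not use the real JDT classes: TypeNode, CycleDemo, and the "Node"/"Edge"/"alpha" graph are hypothetical stand-ins, assumed only for illustration. It shows an unguarded recursive walk over type arguments overflowing the stack on an indirect cycle, and the same walk terminating once already-visited nodes are tracked.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a type binding; NOT the real JDT class.
final class TypeNode {
    final String name;
    final boolean isInferenceVariable;
    final List<TypeNode> arguments = new ArrayList<>();

    TypeNode(String name, boolean isInferenceVariable) {
        this.name = name;
        this.isInferenceVariable = isInferenceVariable;
    }
}

final class CycleDemo {
    // Unguarded walk: recurses into every argument without checking
    // whether that argument has already been passed through the method.
    static void collectNaive(TypeNode type, Set<String> variables) {
        if (type.isInferenceVariable) {
            variables.add(type.name);
        }
        for (TypeNode argument : type.arguments) {
            collectNaive(argument, variables);
        }
    }

    // Guarded walk: an identity-based visited set stops re-entry.
    static void collectVisited(TypeNode type, Set<String> variables, Set<TypeNode> visited) {
        if (!visited.add(type)) {
            return; // already processed this node
        }
        if (type.isInferenceVariable) {
            variables.add(type.name);
        }
        for (TypeNode argument : type.arguments) {
            collectVisited(argument, variables, visited);
        }
    }

    static Set<TypeNode> newIdentitySet() {
        return Collections.newSetFromMap(new IdentityHashMap<TypeNode, Boolean>());
    }

    // Node -> Edge -> Node forms an indirect cycle (assumed shape).
    static TypeNode buildCyclicGraph() {
        TypeNode node = new TypeNode("Node", false);
        TypeNode edge = new TypeNode("Edge", false);
        TypeNode alpha = new TypeNode("alpha", true);
        node.arguments.add(edge);
        node.arguments.add(alpha);
        edge.arguments.add(node);
        return node;
    }

    public static void main(String[] args) {
        TypeNode root = buildCyclicGraph();
        try {
            collectNaive(root, new LinkedHashSet<String>());
        } catch (StackOverflowError e) {
            System.out.println("naive walk overflowed the stack");
        }
        Set<String> found = new LinkedHashSet<String>();
        collectVisited(root, found, newIdentitySet());
        System.out.println("guarded walk found: " + found);
    }
}
```

Whether a visited set or some other guard is the right fix for ejc itself is a separate question; the sketch only demonstrates why an unqualified recursive call cannot terminate on a cyclic structure.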

At this point, it seems most prudent to have someone like yourself who knows ejc internals to determine the best fix.  Please feel free to either suggest specific patches, which I'll verify resolve this issue, or point me toward attempting to investigate on my own (with an undoubtedly steep learning curve).  BTW, are there any docs on ejc internals that might be relevant?

Waiting to hear back.
Comment 4 Stephan Herrmann CLA 2017-06-11 15:02:25 EDT
Thanks for investigating thus far.

A raw type binding referencing itself as a type argument? "Shouldn't happen" :)

We do expect TypeVariableBindings to cause cyclic type structures, still for raw types we didn't see, how they could cause a cycle. BUT, there's a funny method RawTypeBinding.initializeArguments(), which erases the generic type's type variables and sets them as type arguments of the raw type (don't ask me why :) ). If a type variable has a relevant bound, that might be able to cause a reference cycle.

You should be able to check this interpretation in the debugger: does RawTypeBinding#arguments indirectly contain the raw type itself? (Interestingly, direct circles are already detected).

The canonical solution would be to add the following override to RawTypeBinding:

	@Override
	void collectInferenceVariables(Set<InferenceVariable> variables) {
		if (this.inRecursiveFunction)
			return; // nothing seen
		this.inRecursiveFunction = true; // mark this binding as being traversed
		try {
			super.collectInferenceVariables(variables);
		} finally {
			this.inRecursiveFunction = false; // always reset, even on exceptions
		}
	}

The field "boolean inRecursiveFunction = false;" still needs to be added to that type.

This should fix the SOE, shouldn't it?
In particular, a breakpoint on "return; // nothing seen" should indeed be triggered.
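
For readers who want to try the guard pattern outside of ejc, here is a minimal standalone sketch of the same idea. GuardedType, GuardDemo, and the skippedReentries counter are hypothetical names introduced only for this illustration; they are not JDT classes or fields. The counter plays the role of the breakpoint on "return; // nothing seen".

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a binding with type arguments; NOT a JDT class.
final class GuardedType {
    final String name;
    final boolean isInferenceVariable;
    final List<GuardedType> arguments = new ArrayList<>();

    private boolean inRecursiveFunction = false;
    int skippedReentries = 0; // illustration only: counts guarded returns

    GuardedType(String name, boolean isInferenceVariable) {
        this.name = name;
        this.isInferenceVariable = isInferenceVariable;
    }

    void collectInferenceVariables(Set<String> variables) {
        if (this.inRecursiveFunction) {
            this.skippedReentries++;
            return; // nothing seen: this object is already on the call stack
        }
        this.inRecursiveFunction = true;
        try {
            if (this.isInferenceVariable) {
                variables.add(this.name);
            }
            for (GuardedType argument : this.arguments) {
                argument.collectInferenceVariables(variables);
            }
        } finally {
            this.inRecursiveFunction = false; // reset so later calls work again
        }
    }
}

final class GuardDemo {
    // raw -> bound -> raw is an indirect cycle (assumed shape).
    static GuardedType buildCycle() {
        GuardedType raw = new GuardedType("Raw", false);
        GuardedType bound = new GuardedType("Bound", false);
        GuardedType beta = new GuardedType("beta", true);
        raw.arguments.add(bound);
        bound.arguments.add(raw);
        bound.arguments.add(beta);
        return raw;
    }

    public static void main(String[] args) {
        GuardedType raw = buildCycle();
        Set<String> variables = new LinkedHashSet<String>();
        raw.collectInferenceVariables(variables);
        System.out.println("variables: " + variables);
        System.out.println("guarded returns: " + raw.skippedReentries);
    }
}
```

Because the flag is reset in the finally block, the guard only suppresses re-entry while the object is already being traversed, and a later, independent call traverses the structure again.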
Comment 5 Richard Steiger CLA 2017-06-11 18:08:47 EDT
Your suggested fix works, not seeing any more SOEs!

At this point, having not submitted any eclipse changes, I have a learning curve ahead of me, which I'm inclined to take on so I can become an actual contributor (not sure where the learning trail picks up, but I can likely find it).  If, on the other hand, you feel like just making the mod yourself, that would be fine with me as well (I have a lot of lost ground to cover in the now-unblocked project).  Let me know what you advise.

Thanks for the help!

(BTW, found the pointers into the JLS in comments very helpful for grokking context, not a replacement for good internals docs, but way better than nothing :-)
Comment 6 Richard Steiger CLA 2017-06-11 19:36:40 EDT
Created attachment 268861 [details]
Block repeated recursive calls to RawTypeBinding::collectInferenceVariables with same variables

Fix Bug 518095 - ejc failing to build with StackOverflowError
Comment 7 Richard Steiger CLA 2017-06-11 19:39:01 EDT
Attached the patch.
Comment 8 Richard Steiger CLA 2017-06-11 19:39:16 EDT
Attached the patch.
Comment 9 Richard Steiger CLA 2017-06-11 19:39:22 EDT
Attached the patch.
Comment 10 Stephan Herrmann CLA 2017-07-30 17:51:12 EDT
Comment on attachment 268861 [details]
Block repeated recursive calls to RawTypeBinding::collectInferenceVariables with same variables

It would help if you attach an actual patch (= diff), not the full file (or ideally create a gerrit change).
Comment 11 Stephan Herrmann CLA 2017-08-26 20:54:26 EDT
*** Bug 521423 has been marked as a duplicate of this bug. ***
Comment 12 Stephan Herrmann CLA 2017-08-26 20:57:39 EDT
@Richard, as duplicates are starting to come in, can you provide any hints about the code pattern that triggers the bug?
Comment 13 Stephan Herrmann CLA 2017-08-29 03:55:20 EDT
moving here the discussion in a duplicate:

(In reply to Christian Bulitta from bug 521423 comment #3)
> Hello Stephan,
> 
> I'd love to give you a reproducible example. Unfortunately I have none and
> due to IP reasons I cannot give away the full source code of my company (34
> maven modules, 2500+ .java Files). Is there any type of debugging I can do
> on my side to isolate the problem?
> 
> If I knew what to look for I would be really happy. I'm open for options to
> avoid certain patterns and change our code.

I'm still hoping, Richard had seen enough during his experiments to describe the pattern that triggered the bug in his case?
Comment 14 Stephan Herrmann CLA 2017-10-31 09:46:55 EDT
Meanwhile this has been fixed via bug 525576 (released for 4.8M3 and pending backport to 4.7.2).

*** This bug has been marked as a duplicate of bug 525576 ***