Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.
Bug 312929 - exception during terminate & remove of Ant debug test
Summary: exception during terminate & remove of Ant debug test
Status: RESOLVED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Ant (show other bugs)
Version: 3.6   Edit
Hardware: PC Windows XP
: P3 normal (vote)
Target Milestone: 3.8 M5   Edit
Assignee: Satyam Kandula CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-05-14 11:38 EDT by Darin Wright CLA
Modified: 2012-01-10 10:35 EST (History)
2 users (show)

See Also:
Michael_Rennie: review+


Attachments
Proposed patch (842 bytes, patch)
2011-08-29 11:47 EDT, Satyam Kandula CLA
no flags Details | Diff
Proposed patch (2.00 KB, patch)
2011-09-05 02:27 EDT, Satyam Kandula CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Darin Wright CLA 2010-05-14 11:38:54 EDT
Ant had a similar failure in two consecutive builds. The problem ocurrs after the test is complete and the debugger is being termianted. Looks like a possible synch problem in the Ant debug model (where a field is being nulled out as the target is shutdown, but still being used by another thread).

java.lang.NullPointerException
at org.eclipse.ant.internal.launching.debug.model.RemoteAntDebugBuildListener.sendRequest(RemoteAntDebugBuildListener.java:175)
at org.eclipse.ant.internal.launching.debug.model.RemoteAntDebugBuildListener.resume(RemoteAntDebugBuildListener.java:219)
at org.eclipse.ant.internal.launching.debug.model.AntDebugTarget.resume(AntDebugTarget.java:197)
at org.eclipse.ant.internal.launching.debug.model.AntDebugTarget.terminated(AntDebugTarget.java:344)
at org.eclipse.ant.internal.launching.debug.model.AntDebugTarget.terminate(AntDebugTarget.java:168)
at org.eclipse.ant.tests.ui.debug.AbstractAntDebugTest.terminateAndRemove(AbstractAntDebugTest.java:526)
at org.eclipse.ant.tests.ui.debug.AbstractAntDebugTest.terminateAndRemove(AbstractAntDebugTest.java:500)
at org.eclipse.ant.tests.ui.debug.SteppingTests.debugStack(SteppingTests.java:127)
at org.eclipse.ant.tests.ui.debug.SteppingTests.antCallStack(SteppingTests.java:111)
at org.eclipse.ant.tests.ui.debug.SteppingTests.antCallStack(SteppingTests.java:106)
at org.eclipse.ant.tests.ui.debug.SteppingTests.testStepBackFromAntCallSepVM(SteppingTests.java:39)
at org.eclipse.ant.tests.ui.AbstractAntUIBuildTest.access$0(AbstractAntUIBuildTest.java:1)
at org.eclipse.ant.tests.ui.AbstractAntUIBuildTest$1.run(AbstractAntUIBuildTest.java:44)
at java.lang.Thread.run(Thread.java:595)
Comment 1 Darin Wright CLA 2010-05-14 11:39:13 EDT
build: I20100513-1500
Comment 2 Michael Rennie CLA 2010-09-08 09:48:21 EDT
failed again on Windows in I20100907-0800
Comment 3 Satyam Kandula CLA 2011-08-29 07:12:11 EDT
SteppingTests#testStepBackFromAntCallSepVM failed on N20110827-2000 not with the exact callstack but probably because of the same reason. 
Here is the call stack. 
######
java.lang.NullPointerException at org.eclipse.ant.internal.launching.debug.model.AntDebugTarget.terminated(AntDebugTarget.java:345) at org.eclipse.ant.internal.launching.debug.model.AntDebugTarget.terminate(AntDebugTarget.java:168) at org.eclipse.ant.tests.ui.debug.AbstractAntDebugTest.terminateAndRemove(AbstractAntDebugTest.java:526) at org.eclipse.ant.tests.ui.debug.AbstractAntDebugTest.terminateAndRemove(AbstractAntDebugTest.java:500) at org.eclipse.ant.tests.ui.debug.SteppingTests.debugStack(SteppingTests.java:127) at org.eclipse.ant.tests.ui.debug.SteppingTests.antCallStack(SteppingTests.java:111) at org.eclipse.ant.tests.ui.debug.SteppingTests.antCallStack(SteppingTests.java:106) at org.eclipse.ant.tests.ui.debug.SteppingTests.testStepBackFromAntCallSepVM(SteppingTests.java:39) at org.eclipse.ant.tests.ui.AbstractAntUIBuildTest.access$0(AbstractAntUIBuildTest.java:1) at org.eclipse.ant.tests.ui.AbstractAntUIBuildTest$1.run(AbstractAntUIBuildTest.java:44) at java.lang.Thread.run(Thread.java:619)
########
AntDebugTarget.terminated is not synchronized and I believe that is the reason that there is another thread that would have got in.  I will investigate more.
Comment 4 Satyam Kandula CLA 2011-08-29 11:42:10 EDT
Yes, terminated gets called from two other places.
########
Daemon Thread [Ant Build Server Connection] (Suspended (breakpoint at line 341 in AntDebugTarget))	
	AntDebugTarget.terminated() line: 341	
	AntDebugTarget.terminate() line: 168	
	RemoteAntDebugBuildListener.shutDown() line: 191	
	RemoteAntBuildListener$ServerConnection.run() line: 109	
######### 
Thread [Worker-0] (Suspended (breakpoint at line 341 in AntDebugTarget))	
	AntDebugTarget.terminated() line: 341	
	AntDebugTarget.handleDebugEvents(DebugEvent[]) line: 454	
	DebugPlugin$EventNotifier.run() line: 1117	
	SafeRunner.run(ISafeRunnable) line: 42	
	DebugPlugin$EventNotifier.dispatch(DebugEvent[]) line: 1151	
	DebugPlugin$EventDispatchJob.run(IProgressMonitor) line: 415	
	Worker.run() line: 54
#########
Though the call stacks in comment 0 and comment 3 are from the tests, as they are two more stacks just from the code, we should better synchronize the terminated function.
Comment 5 Satyam Kandula CLA 2011-08-29 11:47:11 EDT
Created attachment 202340 [details]
Proposed patch

Made terminated() synchronized
Comment 6 Satyam Kandula CLA 2011-08-29 11:47:43 EDT
Mike, 
Can you please review the patch?
Comment 7 Michael Rennie CLA 2011-08-29 16:31:11 EDT
> Made terminated() synchronized

I think there is more to it than that. The problem arises because we null out the fController object in terminated(), so it is possible that any one of the public methods using it could be called asynchronously and cause the NPE. For example comment #0 shows it coming from a call to resume().

A better solution might be to just not null out fController in terminated(). AntDebugTargets are not reused and to create a debug target you have to pass it a new controller.
Comment 8 Satyam Kandula CLA 2011-09-05 02:26:29 EDT
Actually, the call stack in comment 0, shows resume from terminated(). Actually, the source had subsequentally changed, but both the calls are ineffect coming from terminated. 

However, you are right that there is more to this and there could be some sync problems when terminated is called at the time of other actions. One example I can think of is the target ending at the same time a suspend is called. I think just not nulling fController may just push the NPE to another point, something like the first two methods of the callstack in comment 0. So, I think RemoteAntDebugListener.sendRequest() should also check properly.
Comment 9 Satyam Kandula CLA 2011-09-05 02:27:27 EDT
Created attachment 202733 [details]
Proposed patch

Patch as for my last comment.
Comment 10 Satyam Kandula CLA 2012-01-09 08:15:52 EST
Mike, Can you review this?
Comment 11 Michael Rennie CLA 2012-01-10 10:35:42 EST
+1, the patch makes sense. I can't say definitively if it fixes the tests since they don't fail for me locally.

Pushed fix to master: http://git.eclipse.org/c/platform/eclipse.platform.git/commit/?id=b88619d3d013f09e240bb10a2d0e03291ed29015