Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 204475

Summary: Build failure causes all builds to be canceled
Product: [Eclipse Project] Platform Reporter: John Arthorne <john.arthorne>
Component: DebugAssignee: Darin Wright <darin.eclipse>
Status: VERIFIED FIXED QA Contact:
Severity: major    
Priority: P3 CC: darin.eclipse, Michael_Rennie, philippe_mulet
Version: 3.3Flags: philippe_mulet: pmc_approved+
Target Milestone: 3.3.2   
Hardware: PC   
OS: Windows XP   
Whiteboard:
Attachments:
Description Flags
fix
none
better fix none

Description John Arthorne CLA 2007-09-24 12:21:58 EDT
Build: 3.4 M2

 - Create a Java project, add a class or two.
 - Add an external tool builder to the project, so that it runs before the Java builder. Make the builder just call a simple "test.bat" file.
 - Delete the "test.bat" file in the file system

-> From now on, the Java builder will never run. If you have other projects in the workspace that build after the problematic project, they will never be built either. The problem is that the external tool builder cancels the build when the exception occurs.

LaunchConfiguration.launch, line 770:

} catch (CoreException e) {
	...
	monitor.setCanceled(true);

This cancels the entire build process. It should not cancel on failure, so that other builders can continue to run.
Comment 1 Philipe Mulet CLA 2007-09-24 12:33:24 EDT
Smells like a good candidate for backporting. I wonder how many products out there are hitting this without knowing...
Comment 2 Darin Wright CLA 2007-09-24 12:39:22 EDT
Looks like the "setCanceled(true)" was added in 3.3. We should fix for 3.3.2 and 3.4.
Comment 3 Michael Rennie CLA 2007-09-24 17:00:56 EDT
Created attachment 79092 [details]
fix

There were a few more occurances in the launch method where we canceled the monitor if there is a problem, those have been removed as well.
Comment 4 Michael Rennie CLA 2007-09-25 12:59:57 EDT
maybe there is a batter solution to this issue than simply removing the offending calls. We probably should be creating a new SubProgressMonitor for the scope of the launch method in LaunchConfiguration. Doing so would resolve the following:

1. beginTask will only be called once for the monitor
2. done will only be called once for the monitor
3. any downstream cancellations using the monitor from LaunchConfiguration will not harm the parent builder monitor
Comment 5 Michael Rennie CLA 2007-09-26 11:24:04 EDT
Created attachment 79213 [details]
better fix

this patch is bit better in that it uses a sub progress monitor to shield any parent monitors of the launch method for builders, etc. from them being canceled by any downstream delegates.
Comment 6 John Arthorne CLA 2007-09-26 13:21:35 EDT
Note that a *real* cancellation should still be propagated to cause the entire build to stop. For example, if the user clicks the "Cancel" button during a workspace build, the entire build should be canceled. If you are trapping and discarding the cancel state it may interfere with this. To me, the root of the problem seems to be converting an error into a cancellation - these are quite different things and should behave differently (cancellation should be propagated right up to the UI, errors may be handled lower down).
Comment 7 Philipe Mulet CLA 2007-09-27 04:26:52 EDT
I agree that pressing 'cancel' button should cancel overall build loop, but this could be handled by having the overall progress monitor itself be connected to the cancel button. Now, for individual subprogress monitors, it feels a bit counter intuitive that its own cancel can affect the entire build. Now reading the spec, I don't see anything concrete on this.

Feels like this specific situation could be documented ?
Comment 8 Michael Rennie CLA 2007-09-27 10:03:12 EDT
I agree that more documentation is needed, we should probably document this in all downstream APIs (ILaunchConfigurationDeleate for example) so that consumers know not to set canceled states for errors. 

From testing the patch I found that the overall job loop can still be canceled and more testing confirmed that you can still set the job canceled in downstream delegates (it just didn't appear that way until I added in a bunch of busy waits to make the jobs take longer).

So really I guess we could just remove the offending calls and doc the situation where appropriate.
Comment 9 John Arthorne CLA 2007-09-28 15:04:47 EDT
One specific thing I have done is add more debugging information when a build is canceled (assuming you have build tracing options turned on). That will make it a little easier to track down problems like this.  I'm not sure how/where we would document this though - it's a general property of SubProgressMonitor/SubMonitor that cancelation is always propagated up to the root monitor.
Comment 10 Michael Rennie CLA 2007-10-02 11:01:38 EDT
fixed in HEAD and 3.3.2 maintenance.

since the canceled flag was propagated anyway, the committed solution was to simply not cancel the monitor on errors (essentially the first patch). All of the launch methods and any delegates directly called have been documented to state that the supplied progress monitor must not be canceled this way.
Comment 11 Michael Rennie CLA 2007-10-02 11:02:03 EDT
please verify Darin W
Comment 12 Darin Wright CLA 2007-10-02 14:59:05 EDT
Verified.
Comment 13 Darin Wright CLA 2007-11-20 22:24:50 EST
Needs PMC approval
Comment 14 Philipe Mulet CLA 2007-11-21 12:27:45 EST
+1 for 3.3.2 (I had already mentionned need to backport it)