Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 315839

Summary: [CommonNavigator] NPE in WorkingSetsContentProvider
Product: [Eclipse Project] Platform Reporter: Duc Luu <dluu>
Component: UIAssignee: Platform UI Triaged <platform-ui-triaged>
Status: RESOLVED FIXED QA Contact: Francis Upton IV <francisu>
Severity: major    
Priority: P3 CC: bokowski, daniel_megert, dmisic, francisu, john.arthorne, mbarkhou, Mike_Wilson, pwebster, remy.suen
Version: 3.6Flags: john.arthorne: pmc_approved-
Target Milestone: 3.6.1   
Hardware: PC   
OS: Windows XP   
Whiteboard:
Bug Depends on:    
Bug Blocks: 317803    
Attachments:
Description Flags
Eclipse 3.6 debug session
none
Eclipse 3.4 debug session
none
proposed fix
none
proposed fix (2) none

Description Duc Luu CLA 2010-06-04 16:32:28 EDT
Build Identifier: 36I20100521-windows-sdk

Please note that this is the eclipse 3.6 regression bug.

While creating a special file type via the File > New > Other ..., the method updateRootMode() in org.eclipse.ui.internal.navigator.workingsets. WorkingSetsContentProvider class is invoked.

In the eclipse version 3.4, it works properly. Here is the implementation.
Also see the attached debug session eclipse34_workingSetContentProvider.jpg

private void updateRootMode() {
  if( extensionStateModel.getBooleanProperty(SHOW_TOP_LEVEL_WORKING_SETS) )
    rootMode = WORKING_SETS;
  else 
    rootMode = PROJECTS;
}

But in the eclipse version 3.6, the WorkingSetsContentProvider has a new additional class member called projectExplorer. This variable is not initialized properly and it is null when the method updateRootMode() is called. Thus, eclipse throws Null pointer exception and causes our creation dialog failed to function properly.

Here is the implementation in eclipse 3.6 for the method updateRootMode() in the WorkingSetsContentProvider class.
Also see the attached debug session eclipse36_workingSetContentProvider.jpg

private void updateRootMode() {
  if( extensionStateModel.getBooleanProperty(SHOW_TOP_LEVEL_WORKING_SETS) )
    projectExplorer.setRootMode(ProjectExplorer.WORKING_SETS);
  else
    projectExplorer.setRootMode(ProjectExplorer.PROJECTS);
}



Reproducible: Always

Steps to Reproduce:
Plese see debug session images
eclipse36_workingSetContentProvider.jpg
eclipse34_workingSetContentProvider.jpg
Comment 1 Duc Luu CLA 2010-06-04 16:35:32 EDT
Created attachment 171173 [details]
Eclipse 3.6 debug session
Comment 2 Duc Luu CLA 2010-06-04 16:36:18 EDT
Created attachment 171174 [details]
Eclipse 3.4 debug session
Comment 3 Francis Upton IV CLA 2010-06-04 17:54:46 EDT
Sounds like we need to fix this in 3.6 right? We are pretty much at the end, so if it's really important we can do it, let us know.
Comment 4 Boris Bokowski CLA 2010-06-04 21:46:18 EDT
Yes, sounds like it should be fixed for 3.6.
Comment 5 Francis Upton IV CLA 2010-06-04 22:18:16 EDT
(In reply to comment #4)
> Yes, sounds like it should be fixed for 3.6.
I agree, I will put together a patch, so this has to be in RC4, right?
Comment 6 Michael Barkhouse CLA 2010-06-07 09:59:01 EDT
I'm seeing this in my product too. I'm using the common viewer in a wizard and an editor. The new code in the working set content provider assumes the viewer is in a view. It asks for the navigator (the view part) and it is null, so the NPE occurs.

I believe it is ok to use the common navigator outside of a view right?
Comment 7 Dusko CLA 2010-06-07 10:00:13 EDT
We are seeing this problem in multiple places so yes, I think it should be fixed in RC4.

Thanks.
Comment 8 Francis Upton IV CLA 2010-06-07 10:41:23 EDT
(In reply to comment #6)

> I believe it is ok to use the common navigator outside of a view right?
It's certainly OK to use the Common Navigator outside of a view, but the code in question is not API nor part of the common navigator. It's part of the internal working set support which is tied to the Project Explorer. 

I was unaware (until after RC2 or so when these bugs started to show up) that there were products using this internal code in different ways, and so have no mechanism to test these cases or prevent breaking them, as you have seen.

The code that is "breaking" was introduced in M5 (in January), so there has been plenty of time to test this with downstream products. Reporting the bug after the final release candidate for 3.6 is gone is very inconvenient for everyone.

Furthermore, I'm not sure of the intention of your use of the code. There are other parts of the WorkingSetContentProvider (see line 110 in getChildren()) that use the projectExplorer variable.

I can certainly prepare a patch that will remove the NPE if it's not set in those two cases, but I cannot assure you that other things will not break (in fact quite the opposite, other things are likely to be broken).

Please help me out here so that I can respond to this.
Comment 9 Francis Upton IV CLA 2010-06-07 11:01:11 EDT
Remy, you are late, I was expecting this much earlier :)

https://bugs.eclipse.org/bugs/show_bug.cgi?id=315839
Product/Component: Platform / UI

Remy Suen <remysuen@ca.ibm.com> changed:

          What    |Removed                     |Added
----------------------------------------------------------------------------
                CC|                            |remysuen@ca.ibm.com
Comment 10 Remy Suen CLA 2010-06-07 11:05:04 EDT
(In reply to comment #9)
> Remy, you are late, I was expecting this much earlier :)

I was out watching a movie on Friday night, sorry. :(

Then on Saturday my 24-hour rotation got messed up somehow as I think I modified my query to include only Saturday's bugs without reviewing what I missed on Friday prior to the date change.
Comment 11 Michael Barkhouse CLA 2010-06-07 11:08:10 EDT
In my scenario (and probably Duc's too) I am not accessing the content provider
directly (the internal code). I'm placing the common viewer in a wizard and I
want the content that is seen there to resemble what the user sees in the
project explorer. So in a generic way I'm getting all the content providers
bound to the project explorer and using them with the common viewer in my
wizard. That way I don't have to hard code specific ids and I pickup new
content providers without having to modify my code.

I suppose if a particular content provider is only bound to the project
explorer that perhaps it's not fair for other clients to hijack it to use it
some place else. However, in the past few years in which my code has been
deployed there is typically about 30+ content providers being used and this is
the first one that makes these types of assumptions.

Am I doing something wrong or should the content provide not assume the PE?
Comment 12 Boris Bokowski CLA 2010-06-07 11:31:53 EDT
I am investigating and should be able to tell you soon if there is anything we can do about this in 3.6.
Comment 13 Francis Upton IV CLA 2010-06-07 11:48:50 EDT
(In reply to comment #11)

> 
> Am I doing something wrong or should the content provide not assume the PE?
The content provider interface is pretty simple and beyond that a content provider can make whatever assumptions it wants including understanding and using the viewer to help manage it's state (as the PE does). As far as I know, assuming that any content provider will work with any viewer is making an assumption beyond the interface contracts and will likely get you into trouble in other cases as well. Boris can correct me if I'm wrong here.
Comment 14 Boris Bokowski CLA 2010-06-07 15:55:59 EDT
(In reply to comment #8)
> The code that is "breaking" was introduced in M5 (in January), so there has
> been plenty of time to test this with downstream products.

To be precise, the code changes went in during 3.5 M5, i.e. well over a year ago.

Duc, Michael, would you be able to work around this issue by copying the internal code you rely on, and either using the old version of this code that works outside of the Project Explorer view, or modifying the current code to tolerate the absence of the view?
Comment 15 Francis Upton IV CLA 2010-06-07 16:00:35 EDT
> To be precise, the code changes went in during 3.5 M5, i.e. well over a year
> ago.
Um, yeah I just realized that it was 2009.
> 
> Duc, Michael, would you be able to work around this issue by copying the
> internal code you rely on, and either using the old version of this code that
> works outside of the Project Explorer view, or modifying the current code to
> tolerate the absence of the view?
I think the simplest solution is to skip any content/label provider in the org.eclipse.ui.navigator.resources package. You won't get the working set stuff, but that should be fine for your purposes.
Comment 16 Dusko CLA 2010-06-07 16:45:40 EDT
We cannot skip content/label providers in the "org.eclipse.ui.navigator.resources" the package because these are the base content providers actually required by all of our stuff. Our content providers add their nodes to the projects contributed by the base resource content provider. I suspect this is a common pattern. We use it in a few dialogs where people need to pick an object from embedded explorer whose content has to at least resemble Project Explorer's content. I am aware of a few other product teams using similar pattern.

I agree that content providers in general are free to assume view against which they are registered but I do think the base providers should be held against higher standard because they are base and everybody wants/needs them.

Sorry for late report, we just did not encounter it until recently and then it took some time debugging to discover the root cause.
Comment 17 Boris Bokowski CLA 2010-06-07 16:47:36 EDT
I am hearing that this is going to affect multiple teams in IBM that have implemented dialogs with essentially a project explorer inside, to select resources. We've seen two cases on this bug already and there is at least a third one.

At this late stage, we may be able to get a change into 3.6 that gets rid of the NPE and assumes "no working sets" if there is no associated Project Explorer.  No guarantees - I will have to talk to the PMC about this.
Comment 18 Boris Bokowski CLA 2010-06-07 16:53:11 EDT
Created attachment 171336 [details]
proposed fix

Duc, Michael, could you please run with this patch and confirm that it gets rid of the NPE?  After hearing back from you I can try to get approval for including the fix in 3.6.
Comment 19 Francis Upton IV CLA 2010-06-07 16:55:12 EDT
(In reply to comment #18)
> Created an attachment (id=171336) [details]
> proposed fix
The 2nd part of that won't work (updateRootMode), the else branch will NPE, right? or am I missing something?
Comment 20 Boris Bokowski CLA 2010-06-07 16:57:41 EDT
Thanks Francis! New patch coming in a minute.
Comment 21 Boris Bokowski CLA 2010-06-07 16:59:55 EDT
Created attachment 171339 [details]
proposed fix (2)

That's why you need so many sets of eyes during the end game.
Comment 22 Boris Bokowski CLA 2010-06-07 17:01:01 EDT
(In reply to comment #18)
> Duc, Michael, could you please run with this patch

Obviously, you'd want to run with the patch called "proposed fix (2)".
Comment 23 John Arthorne CLA 2010-06-07 17:26:28 EDT
My initial impression is that this would be a fine fix for 3.6.1 (or a patch for anyone who wants it earlier). I'm having trouble seeing why a bug undetected for 18 months is important enough to risk fixing after RC4. I'm particularly concerned about Francis's comment #8, which suggests the code is being used in completely unexpected ways from the Common Navigator maintainer's perspective and there are quite possibly other bugs with reusing the content provider in this way, including other unguarded references to this field which may be null.
Comment 24 Duc Luu CLA 2010-06-07 17:57:08 EDT
Yes. This “proposed fix (2)” patch fixes the problem of NPE. Our creation wizard dialog is now can continue to the last page and complete successfully.
Comment 25 Boris Bokowski CLA 2010-06-07 20:17:19 EDT
(In reply to comment #23)
> My initial impression is that this would be a fine fix for 3.6.1 (or a patch
> for anyone who wants it earlier). I'm having trouble seeing why a bug
> undetected for 18 months is important enough to risk fixing after RC4.

The bugs has not been detected earlier because testing of scenarios involving working sets was not performed until very recently, and the last released version of the affected products is based on 3.4.

Supplying a patch will just add overhead, for the Platform UI team and those who want to consume the patch. I don't see how the patch may, in any way, cause problems, because it will only add null checks in places where otherwise a NPE would have occurred. 

> I'm
> particularly concerned about Francis's comment #8, which suggests the code is
> being used in completely unexpected ways from the Common Navigator maintainer's
> perspective and there are quite possibly other bugs with reusing the content
> provider in this way, including other unguarded references to this field which
> may be null.

After applying the patch, all accesses to the field will be guarded by null checks.

That being said, I am also concerned about the comment about usage in unexpected ways, but that's a different topic. :-)
Comment 26 Michael Barkhouse CLA 2010-06-08 07:53:53 EDT
The patch resolves the issue for me too.
Comment 27 John Arthorne CLA 2010-06-09 12:16:14 EDT
This missed the boat for 3.6.0, so let's aim to get it into 3.6.1 and 3.7. That will give us some time to sort out any other unexpected issues from reusing the content provider in this way.
Comment 28 Dusko CLA 2010-06-09 13:37:36 EDT
My understanding was that the patch was going to be applied to 3.6?
Comment 29 Boris Bokowski CLA 2010-06-09 15:08:59 EDT
(In reply to comment #28)
> My understanding was that the patch was going to be applied to 3.6?

I talked to the PMC and this problem was not deemed serious enough to warrant a post-RC4 rebuild.
Comment 30 Boris Bokowski CLA 2010-06-24 06:59:47 EDT
Fix released to R3_6_maintenance.