Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 312491

Summary: ReferenceResolverUtil isn't thread safe
Product: [WebTools] WTP Common Tools Reporter: Larry Isaacs <larryisaacs>
Component: wst.commonAssignee: Jason Sholl <jsholl>
Status: VERIFIED FIXED QA Contact: Carl Anderson <ccc>
Severity: major    
Priority: P3 CC: cbridgha, neil.hauge
Version: 3.2Flags: jsholl: pmc_approved? (david_williams)
jsholl: pmc_approved? (raghunathan.srinivasan)
jsholl: pmc_approved? (naci.dai)
jsholl: pmc_approved? (deboer)
neil.hauge: pmc_approved+
jsholl: pmc_approved? (kaloyan)
cbridgha: review+
Target Milestone: 3.2 RC1   
Hardware: PC   
OS: Windows XP   
Whiteboard: PMC_approved
Attachments:
Description Flags
Partial Eclipse .log showing NPE due to thread safety issue in ReferenceResolverUtil
none
patch
none
updated patch none

Description Larry Isaacs CLA 2010-05-11 15:30:40 EDT
Created attachment 168010 [details]
Partial Eclipse .log showing NPE due to thread safety issue in ReferenceResolverUtil

The stack trace in the attached log shows that some other thread is at least part of the way through ReferenceResolverUtil.loadResolvers(), but hasn't yet initialized the "sorted" field.  The thread running the "Initializing Java Tooling" job caught the ReferenceResolverUtil object in its partially initialized state and suffered an NPE.  I'm not certain, but I think the error 'An internal error occurred during: "Initializing Java Tooling".' may mean that the job was blown out of the water and the workspace may be in a partially initialized state.  Setting the severity to "major" for that reason.
Comment 1 Carl Anderson CLA 2010-05-11 16:50:59 EDT
Assigning to Jason for initial investigation.  Please retarget as appropriate.
Comment 2 Jason Sholl CLA 2010-05-11 17:46:36 EDT
Created attachment 168051 [details]
patch

This is a very straight forward problem and fix.  The assignment to resolvers should be the last thing done; this will ensure everything is properly initialized.  If the exact same threading scenario arises again which produced the original stack trace, the load will simply occur twice but all the data will be intact and consistent.
Comment 3 Larry Isaacs CLA 2010-05-12 07:59:24 EDT
I never know how much to worry about Double-Checked-Locking (http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html) type issues for practical purposes, which unfortunately means I usually end up fretting about it.  With respect to the patch, I believe DCL issues would suggest that unless you use "volatile", a thread running in a different processor core could theoretically see the memory updates to "resources" and "sorted" out of order (i.e. due to cache coherency issues, a fresh non-null "resources" could be "seen" along with a stale null for "sorted").  The odds would seem to be miniscule, so it's hard to say if it is worth worrying about.  If it matters, a fix that might be a tiny bit better could be to null check "sorted" in getResolvers() so order doesn't matter.  Since the "resolvers"  field is only used outside of loadResolvers() for that null check, it could be removed.
Comment 4 Jason Sholl CLA 2010-05-12 15:51:35 EDT
Created attachment 168244 [details]
updated patch

Updated the patch to include the additional null check.
Comment 5 Chuck Bridgham CLA 2010-05-12 15:53:02 EDT
approved
Comment 6 Jason Sholl CLA 2010-05-12 16:01:06 EDT
    * Explain why you believe this is a stop-ship defect. Or, if it is a "hotbug" (requested by an adopter) please document it as such. 

See initial description.

    * Is there a work-around? If so, why do you believe the work-around is insufficient? 

No.

    * How has the fix been tested? Is there a test case attached to the bugzilla record? Has a JUnit Test been added? 

This has been tested in UI in both adopter product and in WTP

    * Give a brief technical overview. Who has reviewed this fix? 

Simple fix to ensure both variables are null checked; worse case everything will be initialized twice, but that is OK because everything will still be consistent.

    * What is the risk associated with this fix? 

none
Comment 7 Jason Sholl CLA 2010-05-12 18:34:16 EDT
code checked into head for wtp 3.2 rc1
Comment 8 Larry Isaacs CLA 2010-05-13 10:09:11 EDT
Using the source from the WTP SDK I-3.2.0-20100513045407 build, I've verified via breakpoints that the NPE can no longer occur, though three threads try to access getResolvers() concurrently at startup.