Bug 319792 - Performance regression in DependencyGraphImpl.waitForAllUpdates()
Summary: Performance regression in DependencyGraphImpl.waitForAllUpdates()
Status: RESOLVED FIXED
Alias: None
Product: WTP Java EE Tools
Classification: WebTools
Component: jst.j2ee
Version: 3.2
Hardware: PC Windows Server 2003
Importance: P3 normal
Target Milestone: 3.2.1
Assignee: Jason Sholl CLA
QA Contact: Chuck Bridgham CLA
URL:
Whiteboard: PMC_approved
Keywords:
Depends on:
Blocks:
 
Reported: 2010-07-13 17:07 EDT by Jason Sholl CLA
Modified: 2010-07-20 11:22 EDT

See Also:
david_williams: pmc_approved+
jsholl: pmc_approved? (raghunathan.srinivasan)
jsholl: pmc_approved? (naci.dai)
deboer: pmc_approved+
jsholl: pmc_approved? (neil.hauge)
jsholl: pmc_approved? (kaloyan)
cbridgha: review+


Attachments
patch for 3.2.1 (3.43 KB, patch)
2010-07-13 17:09 EDT, Jason Sholl CLA

Description Jason Sholl CLA 2010-07-13 17:07:17 EDT
An adopter product has found a performance regression in DependencyGraphImpl.waitForAllUpdates() in performance tests that involve opening existing workspaces.  This regression was introduced by bug 317120.  In general the new code is much faster, but in this one particular scenario it is not.  The bug only affects single-core machines and is very difficult to reproduce.

The while loop inside waitForAllUpdates() was designed with the assumption that the call to graphUpdateJob.schedule() would immediately schedule the job (and thus the loop exit condition of isUpdateNecessary() would then return false and the loop would exit), but this does not appear to be the case.  Instead, the loop spins extremely tightly and hogs the processor with millions of wasted calls to jobILock.acquire() and jobILock.release().  These calls are expensive and account for minutes of wasted time.
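
For illustration, a minimal sketch of the busy-wait pattern described above (only isUpdateNecessary(), jobILock, and graphUpdateJob are named in this report; the loop structure shown here is assumed and the actual source may differ):

    public void waitForAllUpdates() {
        while (isUpdateNecessary()) {
            jobILock.acquire();             // expensive lock acquisition on every pass
            try {
                graphUpdateJob.schedule();  // expected to let the job run right away...
            } finally {
                jobILock.release();         // ...but on a single core the loop never yields,
            }                               // so the job never runs and the loop never exits
        }
    }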

The fix is to force a wait after the job is scheduled (to ensure it has time to actually be scheduled) with a call to graphUpdateJob.waitForRun(500).  The implementation polls in 50ms increments for up to 500ms total.  There is some local synchronization code to ensure updates are not missed.  While a polling mechanism is not ideal, no cleaner solution presents itself.
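
A rough sketch of what a polling-based waitForRun(long) could look like, assuming the job records when it has run (the didRun flag and class skeleton are illustrative; the attached patch is the authoritative implementation):

    import org.eclipse.core.runtime.IProgressMonitor;
    import org.eclipse.core.runtime.IStatus;
    import org.eclipse.core.runtime.Status;
    import org.eclipse.core.runtime.jobs.Job;

    public class GraphUpdateJob extends Job {

        private volatile boolean didRun = false;

        public GraphUpdateJob() {
            super("Dependency Graph Update"); // job name is illustrative
        }

        protected IStatus run(IProgressMonitor monitor) {
            try {
                // ... update the dependency graph ...
                return Status.OK_STATUS;
            } finally {
                didRun = true; // record that the scheduled job actually got to run
            }
        }

        /**
         * Polls in 50ms increments, for up to maxWaitMs total, until the job has run.
         * Returns true if the job ran within the timeout.
         */
        public boolean waitForRun(long maxWaitMs) {
            long waited = 0;
            while (!didRun && waited < maxWaitMs) {
                try {
                    Thread.sleep(50); // yield the processor so the job can be dispatched
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
                waited += 50;
            }
            return didRun;
        }
    }

With something like this in place, waitForAllUpdates() would presumably call graphUpdateJob.schedule() followed by graphUpdateJob.waitForRun(500), giving the job up to 500ms to be dispatched instead of spinning on the lock.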

This fix has been extensively tested by our performance team for over a week, has run through all JUnits, and has been tested locally for over a week.
Comment 1 Jason Sholl CLA 2010-07-13 17:09:38 EDT
Created attachment 174235
patch for 3.2.1
Comment 2 Chuck Bridgham CLA 2010-07-14 10:10:22 EDT
approved
Comment 3 Jason Sholl CLA 2010-07-14 10:34:09 EDT
    * Explain why you believe this is a stop-ship defect. Or, if it is a "hotbug" (requested by an adopter) please document it as such. 

Severe performance regression in some scenarios when starting an existing workspace.

    * Is there a work-around? If so, why do you believe the work-around is insufficient? 

No

    * How has the fix been tested? Is there a test case attached to the bugzilla record? Has a JUnit Test been added? 

JUnits, ad hoc testing, and the specific performance tests that found this.

    * Give a brief technical overview. Who has reviewed this fix? 

Force a wait to avoid a super-tight loop that prevents the job from ever being scheduled.

    * What is the risk associated with this fix? 

Minimal
Comment 4 David Williams CLA 2010-07-14 11:47:30 EDT
I'll go ahead and mark this as 'approved', but can you still describe how this fix was reviewed and by whom? It seems a bit of an odd fix to me ... I can't quite put my finger on it ... and I just want to be sure some performance experts have reviewed it. Has someone like Gary or Min reviewed and approved? Thanks.
Comment 5 Jason Sholl CLA 2010-07-14 12:11:14 EDT
Chuck and Min have both reviewed the fix.

Code checked into HEAD for WTP 3.2.1.