Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.
Bug 328352 - CommitNotifications overtaking each other
Summary: CommitNotifications overtaking each other
Status: CLOSED FIXED
Alias: None
Product: EMF
Classification: Modeling
Component: cdo.core (show other bugs)
Version: 4.0   Edit
Hardware: All All
: P3 normal (vote)
Target Milestone: ---   Edit
Assignee: Pascal Lehmann CLA
QA Contact: Eike Stepper CLA
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-10-21 10:03 EDT by Pascal Lehmann CLA
Modified: 2011-06-23 03:42 EDT (History)
2 users (show)

See Also:
stepper: review+


Attachments
Debug Screenshot (317.18 KB, image/png)
2010-10-21 10:07 EDT, Pascal Lehmann CLA
no flags Details
Testcase (6.00 KB, patch)
2010-10-22 04:54 EDT, Pascal Lehmann CLA
no flags Details | Diff
Patch v1 - for future reference (66.38 KB, patch)
2010-10-24 09:23 EDT, Eike Stepper CLA
no flags Details | Diff
Fix v1 (2.33 KB, patch)
2010-10-25 12:09 EDT, Eike Stepper CLA
no flags Details | Diff
Testcase v2 (7.45 KB, patch)
2010-10-25 12:18 EDT, Eike Stepper CLA
no flags Details | Diff
Fix v2 (5.32 KB, patch)
2010-10-26 09:48 EDT, Pascal Lehmann CLA
no flags Details | Diff
Testcase v3 (7.50 KB, patch)
2010-10-26 09:50 EDT, Pascal Lehmann CLA
no flags Details | Diff
Combined patch - ready to be committed (12.88 KB, patch)
2010-10-27 03:18 EDT, Eike Stepper CLA
no flags Details | Diff
Combined patch v2 (13.16 KB, patch)
2010-10-27 04:06 EDT, Eike Stepper CLA
no flags Details | Diff
Additional Fix (1.31 KB, patch)
2010-11-04 11:23 EDT, Pascal Lehmann CLA
no flags Details | Diff
Additional Fix v2 - ready to be committed (1.95 KB, patch)
2010-11-06 05:21 EDT, Eike Stepper CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Pascal Lehmann CLA 2010-10-21 10:03:17 EDT
I noticed that the CommitNotifications seem to be able to overtake each other when there are many commits (for the same object & feature) done quickly after each other. This is also the cause for bug #312404 (for which a workaround has been added). The problem is not only present in the CommitRunnables on the RepositorySynchronizer, but also in normal InvalidationRunnables (see screenshot), which might lead to IndexOutOfBoundsExceptions on the view or changes applied in wrong order.

The problem is probably not limited to CommitNotifications and might not a bug but a feature of Net4J, so I'm unsure if this should be fixed at Net4J level.
Comment 1 Pascal Lehmann CLA 2010-10-21 10:07:39 EDT
Created attachment 181394 [details]
Debug Screenshot

On the screenshot you can see the InvalidationRunnable currently being processed which wants to modify object 259v559 and the InvalidationRunnable in the queue at [7] which wants to modify object 259v588.
Comment 2 Pascal Lehmann CLA 2010-10-21 10:17:06 EDT
Sorry, I meant InvalidationRunnable [7] modifies object 259v558.

I noticed this behavior on both JVM and TCP connections.
Comment 3 Pascal Lehmann CLA 2010-10-22 04:54:28 EDT
Created attachment 181477 [details]
Testcase
Comment 4 Eike Stepper CLA 2010-10-24 03:40:04 EDT
You're right in that I would see this as a feature of Net4j rather than a bug.

That said, I'm inclined to see this as a bug in CDO. And it can probably only be fixed by changing the "id" of a commit from its time stamp to a sequential number (but keep the time stamp as a normal field, of course).
Comment 5 Eike Stepper CLA 2010-10-24 09:19:41 EDT
I've added CDOCommitInfo.getPreviousTimeStamp() to enable waiting for outstanding data...
Comment 6 Eike Stepper CLA 2010-10-24 09:21:39 EDT
NOTE: This fix changes the DB schema. Existing DBs have to be recreated!!!
Comment 7 Eike Stepper CLA 2010-10-24 09:23:57 EDT
Created attachment 181595 [details]
Patch v1 - for future reference

This patch adds the "previous time" to CDOCommitInfo.
Comment 8 Eike Stepper CLA 2010-10-25 12:09:47 EDT
Created attachment 181655 [details]
Fix v1

Not yet tested.
Comment 9 Eike Stepper CLA 2010-10-25 12:18:57 EDT
Created attachment 181656 [details]
Testcase v2
Comment 10 Eike Stepper CLA 2010-10-25 12:29:02 EDT
Pascal, my fix va does not break anything else, but the test case you've attached passes also without that one. Can we discuss this via Skype tomorrow?
Comment 11 Pascal Lehmann CLA 2010-10-26 09:48:59 EDT
Created attachment 181727 [details]
Fix v2

The testcase fails fine here (with or without fix), so I updated your fix. Testcase works fine now.
Comment 12 Pascal Lehmann CLA 2010-10-26 09:50:32 EDT
Created attachment 181729 [details]
Testcase v3

Updated Testcase (checks if also the last commit arrived).
Comment 13 Eike Stepper CLA 2010-10-27 03:18:26 EDT
Created attachment 181801 [details]
Combined patch - ready to be committed

Congratulations Pascal, your first CVS commit ;-)
Comment 14 Eike Stepper CLA 2010-10-27 04:06:28 EDT
Created attachment 181805 [details]
Combined patch v2

This entirely different patch moves the *complete* invalidation processing into the InvalidationRunnable, including reviseOldRevisions, setLastUpdateTime and fireSessionInvalidationEvent!

This not only makes the separate "sequence ordering" in the RepositorySynchronizer unnecessary, but rather solves potential issues with other IListeners on the session.
Comment 15 Eike Stepper CLA 2010-10-27 04:07:37 EDT
Committed combined patch v2 to HEAD.

Pascal, can you please test the raw replication again?
Comment 16 Pascal Lehmann CLA 2010-10-28 02:32:51 EDT
The Testcase also works fine with the optimized fix :)
Comment 17 Eike Stepper CLA 2010-10-29 06:45:30 EDT
Thank you for testing, Pascal ;-)
Comment 18 Pascal Lehmann CLA 2010-11-04 11:18:47 EDT
Encountered deadlocks.
Comment 19 Pascal Lehmann CLA 2010-11-04 11:23:52 EDT
Created attachment 182382 [details]
Additional Fix

I ran into deadlocks so I had to use more restrictive locking.

The deadlock occurred because the invalidation lock of the commiting transaction was released but not all views yet invalidated, while another transaction (which was not yet invalidated) already grabbed the invalidation lock.
Comment 20 Eike Stepper CLA 2010-11-06 05:21:18 EDT
Created attachment 182548 [details]
Additional Fix v2 - ready to be committed

I removed the latch!=null checks. Please go ahead...
Comment 21 Pascal Lehmann CLA 2010-11-08 03:10:22 EST
Committed additional fix v2 to HEAD.
Comment 22 Eike Stepper CLA 2011-06-23 03:42:23 EDT
Available in R20110608-1407