Community
Participate
Working Groups
It would be useful to somehow support partial commits; that is, to somehow allow for only some of the dirty data in a tx (i.e. updates, deletions and new objects) to be committed, instead of all. The use case for this is when, from the user's conceptual perspective, the repository contains distinct groups of data (e.g. "modules"). The user will likely edit multiple groups at the same time, but might want to save them separately. (Viewing the groups in different tx's is not generally a solution, as there will likely be relations across group boundaries.) One difficulty with this idea is that we can't allow inconsistencies to arise. So if, for example, an object A is moved from parent P1 to parent P2, then a partial commit including A, must check that P1 and P2 are also included, as otherwise the resulting persisted state would be inconsistent. Solution ideas that Eike and I discussed elsewhere: 1) Add sth like: CDOTransaction.commit(Set<CDOObject> committables) IMO, this approach is clean from the caller's perspective, but admittedly, the drawback is that this undermines the fundamental idea that a TX is the mechanism for separating one set of changes from another. 2) Implement sth like SomeUtil.copyChanges(sourceTx, targetTx, cdoObjects) This would allow partial commits to be realized by copying the subset of changes over to another tx, and committing only that 2nd tx. The difficulty with this approach, is that we need the committed object's state to transition to CLEAN not only in the 2nd tx, but also in the original sourceTx, and retain their updated state in the sourceTx. Another question is how to avoid/handle the "phantom conflicts" that will arise as a consequence of the sourceTx receiving a CommitNotification from the server, concerning the targetTx's commit, which includes the sourceTx's own changes.
As discussed on Skype between Eike and myself, another possible approach is to introduce something like CDOTransaction.setCommittables(Set<CDOObject> committables) This set would be null by default, and when left null, the commit behavior would revert to what it currently is. This way, the current commit API would not be impacted. When the set is not null, it acts as a filter on the commit. That is, only the dirty/new/detached objects contained in the set, are actually committed. Yet another alternative is to use the inverse concept: CDOTransaction.setExcludeFromCommit(Set<CDOObject> excludes) But the first approach corresponds more closely to what is used in DCVS systems such as Git, and is probably also more straightforward from a user point of view. Will explore further.
The major challenge that I'm encountering, is this: during preCommit, we need to check the integrity of the commit. That is, we need to check that the set of committables includes enough to constitute a coherent set of updates. For example, if an object was moved from parent1 to parent2, then both parent1 and parent2 must be included in the commit. But this is not easy... The difficulty is this: once an object is dirty, the revision that was originally loaded for it, is no longer guaranteed to be available from the revisionManager's cache. It may have been GC'ed, in which case fetching it from the revMgr will actually cause it to be retrieved from the server, and that retrieved revision may be different from what was originally loaded in the client's session. (On a side note: this also gives rise to bug 315026.) So, knowing this, how does one check the consistency of a partial commit? How to check what the old parent was, for example? This info is contained only in the revision that was originally loaded, and that may have disappeared from the session. It may be necessary to introduce a map that for every dirty object, holds on to the revisions that were originally loaded. I'm not sure how complicated this will get. There are quite a few places in the code where setInternalCDORevision(..) is called, and I suspect that each of those cases will have to be thought through. Another possibility is to keep the old values in the featureDeltas. But I'm not sure what currenlty happens in the case of multiple deltas for the same feature, so this too could be less easy than it sounds.
Rebasing all outstanding enhancements requests to version 4.0
Created attachment 178373 [details] Patch Here's a working implementation of the 'partial commit' concept, including a mechanism that thoroughly checks if the collection of objects being committed, are a complete and consistent whole (see CommitIntegrityCheck.java), i.e. will not cause corruption of the graph on the server side.
Forgot to mention: unit tests are also included in the patch: PartialCommitTest.java.
Created attachment 178678 [details] Patch v2 - ready to be committed
I wonder if we really need the get/setCommittables() method in XA transaction. Shouldn't we move them down from CDOUserTransaction to CDOTransaction?
Committed to HEAD
Created attachment 178715 [details] Patch (incremental) (In reply to comment #7) > I wonder if we really need the get/setCommittables() method in XA transaction. > Shouldn't we move them down from CDOUserTransaction to CDOTransaction? Yeah I see your point. I guess it's unnecessary to have get/setCommittables on the XA tx, because the caller can (should?) call these methods on the underlying normal tx's. And the caller probably wants to set different committables for each tx...right? I'm not sure about this; I don't know the use cases that XA tx's are meant to support. So.. up to you really. Here's a patch for it anyway.
Go ahead, please ;-)