Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 334995

Summary: CDOTransaction corrupted by persisted + new resource with same URI
Product: [Modeling] EMF Reporter: Caspar D. <caspar_d>
Component: cdo.coreAssignee: Caspar D. <caspar_d>
Status: CLOSED FIXED QA Contact: Eike Stepper <stepper>
Severity: normal    
Priority: P3 CC: martin.fluegge, saulius.tvarijonas
Version: 4.0Flags: stepper: review+
stepper: review+
Target Milestone: ---   
Hardware: All   
OS: All   
Whiteboard:
Attachments:
Description Flags
Testcase (as patch)
none
Patch (including testcase)
none
Patch v2 - ready to be committed
none
Patch v3
none
Patch v4 none

Description Caspar D. CLA 2011-01-21 06:27:31 EST
Create and commmit a resource, say "/res1". Keep it's persisted CDOID.
In a fresh session/tx on the same repo, do the same thing again, using
the same resource name, but don't commit yet. Then, in the same tx, 
fetch the resource that was already committed, through getObject(CDOID).

At this stage, inspection of the map returned by tx.getNewObjects() reveals
that the object now held in the map, is the persisted resource with state
CLEAN, not the NEW (as yet uncommitted) one.

A subsequent tx.commit() fails, because the state machine refuses to
perform the COMMIT transition on a CLEAN object.
Comment 1 Caspar D. CLA 2011-01-21 06:33:54 EST
Created attachment 187272 [details]
Testcase (as patch)
Comment 2 Caspar D. CLA 2011-01-25 02:25:06 EST
What is the desired behavior here?

This situation could be considered a 'conflict', because
the URI that the new, uncommitted resource intends to use,
is the same as the URI already used by the existing resource.

But up till now, a 'conflict' is a situation where local 
changed to a *persisted* object, conflict with server-side
changes to the same, persisted object. But here we are dealing
with a *new* object that conflicts with an unchanged persisted
object. There are no 'changes' involved here. So I'm not sure
 if our current conflict resolution infrastructure is suitable
for dealing with this.

And there's the question of *when* to deal with it. During
createObject? During getRevision? I'm not sure what makes sense
here. And what do we do when the problem is detected? Clearly the
NEW resource is the problem, not the existing resource. So should we
'fix' its URI? In what way? Or if we throw an exception, is there a
way for the user app to deal with this? And in that case, what do we
do with the loaded revision? Discard it?

I would welcome some input.
Comment 3 Caspar D. CLA 2011-01-25 02:50:29 EST
I think the best place to intercept the problem is
AbstractCDOView.newResourceInstance, as in the mini patch
below. But how to deal with the problem, I'm still not sure..

### Eclipse Workspace Patch 1.0
#P org.eclipse.emf.cdo
Index: src/org/eclipse/emf/internal/cdo/view/AbstractCDOView.java
===================================================================
--- src/org/eclipse/emf/internal/cdo/view/AbstractCDOView.java	(revision 6921)
+++ src/org/eclipse/emf/internal/cdo/view/AbstractCDOView.java	(working copy)
@@ -772,6 +772,15 @@
   private CDOResource newResourceInstance(InternalCDORevision revision)
   {
     String path = getResourcePath(revision);
+    URI uri = CDOURIUtil.createResourceURI(this, path);
+
+    ResourceSet resourceSet = getResourceSet();
+    CDOResource resource = (CDOResource)resourceSet.getResource(uri, false);
+    if (resource != null)
+    {
+      System.out.println("---> Oops! URI of resource being instantiated conflicts with local resource URI");
+    }
+
     return getResource(path, true);
   }
Comment 4 Eike Stepper CLA 2011-01-25 13:55:40 EST
Probably the situation may also be entered if the second attempt to create a resource uses a different URI and only later changing the URI to the one of the existing resource and then trying to commit the tx.

First I thought this situation should be detected on the server, through the normal DuplicateResource detection mechanisms in the stores. But i guess already the local ResourceSet gets confused. Should we generally intercept CDOResourceImpl.setPath()?
Comment 5 Caspar D. CLA 2011-01-31 06:08:29 EST
(In reply to comment #4)
> Should we generally intercept
> CDOResourceImpl.setPath()?

Wherever we intercept the conflict, for me the fundamental question
is what do we do then? Reject the load? Fix the local URI? Sth else?
Comment 6 Caspar D. CLA 2011-02-09 23:54:36 EST
The fundamental problem is that in the case of resources,
the client app is free to choose its path+name, and thus
choose its URI, which could conflict with the URI of a
resource that was already persisted.

When such a conflict is detected, we have no choice but
to change the URI of the local (new) resource. This is 
slightly awkward, but rejecting the load of a valid object
makes even less sense. And in defense of this approach:
the resource being renamed could not be persisted with
its initial name anyway.
Comment 7 Caspar D. CLA 2011-02-09 23:55:13 EST
Created attachment 188645 [details]
Patch (including testcase)
Comment 8 Eike Stepper CLA 2011-02-10 02:38:08 EST
Created attachment 188655 [details]
Patch v2 - ready to be committed
Comment 9 Caspar D. CLA 2011-02-10 03:12:03 EST
Committed to trunk, rev. 7045
Comment 10 Martin Fluegge CLA 2011-02-12 10:12:12 EST
Created attachment 188834 [details]
Patch v3

As I mentioned on Bug 336817 I would like to discuss whether we need to handle the resource reniming mechnisms more carefully.Since the scenario Casper describes only occurs if one create a new resource and then calls getObject I am not sure whether we will have problems at all. Could anyone image a situation where Caspar's mechanism could be active on a CDOView?
Comment 11 Caspar D. CLA 2011-02-13 22:32:27 EST
(In reply to comment #10)

Sorry for the legacy breakage, I should have checked. And thanks for
the fix.

> I would like to discuss whether we need to handle
> the resource reniming mechnisms more carefully. Since the scenario
> Casper describes only occurs if one create a new resource and then
> calls getObject I am not sure whether we will have problems at all.

I think you will have the same problem as what was fixed here. If you
rename a resource so that it's URI ends up being the same as the URI
of a resource that was already persisted, then when you load that
other resource, some conflict arises, and I suppose there's no logic
in place to detect that and deal with it.

> Could anyone image a situation where Caspar's mechanism could be
> active on a CDOView?

You mean a non-writeable view?
Comment 12 Caspar D. CLA 2011-02-14 00:54:53 EST
Created attachment 188869 [details]
Patch v4

Turns out the check I added also guards against renames. I
added a test to verify this.
Comment 13 Caspar D. CLA 2011-02-14 00:58:21 EST
Reopening for additional review
Comment 14 Martin Fluegge CLA 2011-02-14 02:42:37 EST
>Sorry for the legacy breakage, I should have checked. 
No problem. I am here to observe the Legacy outlands ;)
And it discovered a bug an legacy everything is fine ;)

I had a short look to the test case and as far as I understand the problem it occurs because the second transaction create a resource "/res1" with an tempId (oid1) that cannot be mapped to correctId (OD2). If so we should mark this bug somehow because it turn out that the changes are not needed as soon as Eike finished the UUID integration.

I am also wondering why it is  allowed to create second resource with the same id if such resource already exists in the repository. Shouldn't this throw an exception? Otherwise the old resource would be simple overwritten, right?
Comment 15 Caspar D. CLA 2011-02-14 03:47:21 EST
(In reply to comment #14)
> >Sorry for the legacy breakage, I should have checked. 
> No problem. I am here to observe the Legacy outlands ;)
> And it discovered a bug an legacy everything is fine ;)
> 
> I had a short look to the test case and as far as I understand the problem it
> occurs because the second transaction create a resource "/res1" with an tempId
> (oid1) that cannot be mapped to correctId (OD2). If so we should mark this bug
> somehow because it turn out that the changes are not needed as soon as Eike
> finished the UUID integration.
> 
> I am also wondering why it is  allowed to create second resource with the same
> id if such resource already exists in the repository. Shouldn't this throw an
> exception? Otherwise the old resource would be simple overwritten, right?

I'm losing the plot here... as far as my understanding goes, this is
about the resources' paths conflicting. I don't see what the OID's 
have to do with it, or how there is a problem with the OIDs.
Comment 16 Martin Fluegge CLA 2011-02-14 04:09:35 EST
>I'm losing the plot here... as far as my understanding goes, this is
>about the resources' paths conflicting. I don't see what the OID's 
>have to do with it, or how there is a problem with the OIDs.

My idea was not to change the implementation of newResource() but simply to avoid it's call from createObject(CDOID id) if the resource already exists. The problem is that we hand in the tempId of the newly create resource so mapping it to the existing id would not work. But looking twice at the idea it seems to be more rubbish than a brilliant thought. Sorry for the confusion ;)

But what about the second creation of the resource? Don't you think that we should avoid this be throwing an exception?
Comment 17 Caspar D. CLA 2011-02-14 04:26:59 EST
(In reply to comment #16)
> But what about the second creation of the resource? Don't you think that we
> should avoid this be throwing an exception?

Either you're confused, or I'm confused, or both.... :S

Consider the first testcase. In this case, the "second creation" is 
the instantation of a resource that is already persistent. To prevent
this instantiation would be really strange, as it means we give
the client's local state (which we know for sure to be uncommittable) 
precedence over the state on the server. To me, it is the local state
that is the problem; which is why I added logic to fix it.

In the second testcase, both resources are already persistent, but
the client makes a local change (a rename) on one of them, such that
it conflicts with the name of the other persistent resource. Again,
this creates local state that is uncommittable; and so in this case too, 
I believe the right thing is to fix the local state when the client 
becomes aware of the server state that is at odds with it (i.e. when
the other resource gets loaded.)
Comment 18 Caspar D. CLA 2011-02-14 05:39:46 EST
Committed to trunk, rev. 7073

I'm resolving this again; IMO the code as it stands now deals
correctly with both new resources and renamed ones. Martin,
if you disagree, please let me know and we can discuss further.
Comment 19 Eike Stepper CLA 2011-02-14 05:41:53 EST
Resolving
Comment 20 Eike Stepper CLA 2011-06-23 03:40:40 EDT
Available in R20110608-1407