Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 366518

Summary: Invalidate leads to trying to read a old revision (partial collection loading)
Product: [Modeling] EMF Reporter: Ronald Krijgsheld <rkrijgsheld>
Component: cdo.coreAssignee: Eike Stepper <stepper>
Status: ASSIGNED --- QA Contact: Eike Stepper <stepper>
Severity: normal    
Priority: P3 CC: stepper, vaisegid
Version: 4.13   
Target Milestone: ---   
Hardware: PC   
OS: Windows 7   
Whiteboard:
Attachments:
Description Flags
Test none

Description Ronald Krijgsheld CLA 2011-12-13 07:22:09 EST
Build Identifier:  20110301-1815

See, stacktrace below.

The invalidation leads to resolving an old revision. 
I traced this and found that a new revision of the id requested is already in the RefreshSessionResult. 

We are using an non auditing CDO repository, retrieving older revisions is not possible. So, the CDO server gives the following error:

   Can only retrieve current version 54 for OID4610587 - version requested was BranchVersion[Branch[id=0, name=MAIN], v48]

Now, in our case the refresh keeps failing.

Thread [main] (Suspended (breakpoint at line 468 in CDOConnector$8))	
	CDOConnector$8.handleException(CDOSession, int, Exception) line: 468	
	DelegatingSessionProtocol.handleException(int, Exception) line: 795	
	DelegatingSessionProtocol.loadChunk(InternalCDORevision, EStructuralFeature, int, int, int, int) line: 509	
	CDOCollectionLoadingPolicyImpl.resolveProxy(CDOSession, CDORevision, EStructuralFeature, int, int) line: 115	
	CDOConnector$7$1(CDOSessionImpl).resolveElementProxy(CDORevision, EStructuralFeature, int, int) line: 894	
	CDOElementProxyImpl.resolve(CDOSession, CDORevision, EStructuralFeature, int) line: 47	
	PatchedCDOStoreImpl.resolveProxy(InternalCDORevision, EStructuralFeature, int, Object) line: 488	
	PatchedCDOStoreImpl.convertToEMF(EObject, InternalCDORevision, EStructuralFeature, int, Object) line: 532	
	PatchedCDOStoreImpl.get(InternalEObject, EStructuralFeature, int) line: 157	
	EStoreEObjectImpl$BasicEStoreEList<E>.delegateGet(int) line: 247	
	EStoreEObjectImpl$BasicEStoreEList<E>(DelegatingEList<E>).get(int) line: 236	
	AbstractEList$EListIterator<E1>(AbstractEList$EIterator<E1>).doNext() line: 709	
	AbstractEList$EListIterator<E1>(AbstractEList$EIterator<E1>).next() line: 696	
	ECrossReferenceEList$ResolvingFeatureIteratorImpl<E>(EContentsEList$FeatureIteratorImpl<E>).hasNext() line: 486	
	ECrossReferenceEList$ResolvingFeatureIteratorImpl<E>(EContentsEList$FeatureIteratorImpl<E>).next() line: 565	
	PatchedCDOTransactionImpl.removeCrossReferences(Collection<CDOObject>, Set<CDOID>) line: 1757	
	PatchedCDOTransactionImpl.invalidate(long, List<CDORevisionKey>, List<CDOIDAndVersion>, List<CDORevisionDelta>, Map<CDOObject,CDORevisionDelta>, Set<CDOObject>) line: 1737	
	PatchedCDOTransactionImpl(CDOViewImpl).doInvalidate(CDOBranch, long, List<CDORevisionKey>, List<CDOIDAndVersion>, Map<CDOID,InternalCDORevision>) line: 391	
	PatchedCDOTransactionImpl(CDOViewImpl).invalidate(CDOBranch, long, List<CDORevisionKey>, List<CDOIDAndVersion>, Map<CDOID,InternalCDORevision>, boolean) line: 374	
	CDOConnector$7$1(CDOSessionImpl).processRefreshSessionResult(RefreshSessionResult, CDOBranch, List<InternalCDOView>, Map<CDOBranch,Map<CDOID,InternalCDORevision>>) line: 769	
	CDOConnector$7$1(CDOSessionImpl).refresh(boolean) line: 731	
	CDOConnector$7$1(CDOSessionImpl).refresh() line: 705	

Reproducible: Sometimes
Comment 1 Eike Stepper CLA 2011-12-13 07:33:25 EST
That almost certainly sounds like the problem we've just fixed via bug 365204. Please try that and reopen if you think it's different or not properly solved.

*** This bug has been marked as a duplicate of bug 365204 ***
Comment 2 Eike Stepper CLA 2011-12-16 01:21:38 EST
In your stack trace I see classes like PatchedCDOStoreImpl or PatchedCDOTransactionImpl. How are they different from the shipped classes? Could that cause the problem?

And I see DelegatingSessionProtocol.loadChunk(). Have yu eneabled partial collection loading?
Comment 3 Eike Stepper CLA 2011-12-16 01:23:58 EST
[Ronald]

I also had the same problem on rollback.

All scenarios are like this: On the server there exists a new version of a revision of an object. On a client there is an older revision partially loaded. Now, on an invalidate or an rollback, the older revision is forcefully loaded by the method getNewRevision() or by the iterator in removeCrossReferences(). But, that is not possible anymore because we do not have auditing enabled.
Comment 4 Ronald Krijgsheld CLA 2011-12-16 03:11:42 EST
We indeed enabled collection loading a few weeks ago, to see if that improved performance. Is that not allowed in combination with a non auditing repository? 
I will test with that option turned off.

I think we have a DelegatingSessionProtocol because we added an exception handler.
Comment 5 Eike Stepper CLA 2011-12-16 03:14:40 EST
(In reply to comment #4)
> We indeed enabled collection loading a few weeks ago, to see if that improved
> performance. Is that not allowed in combination with a non auditing repository?

And with passiveUpdates disabled!
 
> I will test with that option turned off.

Ok.

> I think we have a DelegatingSessionProtocol because we added an exception
> handler.

That's ok. I was surprised about your "PatchedXYZ" classes, but they may be ok, too. Who knows...
Comment 6 Eike Stepper CLA 2011-12-16 03:50:43 EST
Okay, I could reproduce your problem. BTW. my stack trace is (incl. the error message!):

java.lang.IllegalStateException: List contains proxy elements
	at org.eclipse.emf.cdo.internal.common.revision.delta.CDORevisionDeltaImpl$1.checkNoProxies(CDORevisionDeltaImpl.java:370)
	at org.eclipse.emf.cdo.internal.common.revision.delta.CDORevisionDeltaImpl$1.analyzeLists(CDORevisionDeltaImpl.java:328)
	at org.eclipse.emf.cdo.internal.common.revision.delta.CDORevisionDeltaImpl.compare(CDORevisionDeltaImpl.java:379)
	at org.eclipse.emf.cdo.internal.common.revision.delta.CDORevisionDeltaImpl.<init>(CDORevisionDeltaImpl.java:109)
	at org.eclipse.emf.cdo.spi.common.revision.BaseCDORevision.compare(BaseCDORevision.java:377)
	at org.eclipse.emf.internal.cdo.session.CDOSessionImpl.processRefreshSessionResult(CDOSessionImpl.java:627)
	at org.eclipse.emf.internal.cdo.session.CDOSessionImpl.refresh(CDOSessionImpl.java:609)
	at org.eclipse.emf.internal.cdo.session.CDOSessionImpl.refresh(CDOSessionImpl.java:585)
	at org.eclipse.emf.cdo.tests.bugzilla.Bugzilla_366518_Test.testInvalidatePartiallyLoadedCollection(Bugzilla_366518_Test.java:57)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at junit.framework.TestCase.runTest(TestCase.java:168)
	at org.eclipse.net4j.util.tests.AbstractOMTest.runBare(AbstractOMTest.java:218)
	at org.eclipse.emf.cdo.tests.config.impl.ConfigTest.runBare(ConfigTest.java:526)
	at junit.framework.TestResult$1.protect(TestResult.java:110)
	at junit.framework.TestResult.runProtected(TestResult.java:128)
	at junit.framework.TestResult.run(TestResult.java:113)
	at junit.framework.TestCase.run(TestCase.java:124)
	at org.eclipse.net4j.util.tests.AbstractOMTest.run(AbstractOMTest.java:264)
	at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestReference.run(JUnit3TestReference.java:130)
	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
	
The root cause seems to be the combination of passiveUpdateEnabled==false *and* partial collection loading:

- We're having a local object with a list that contains element proxies (i.e., partially loaded).
- Passive updates and auditing are disabled.
- Other user commits a change to our object.
- We call session.refresh()

Now the new revision will be loaded, again with element proxies in the list. Of course the refresh() is supposed to lead to EMF notifications from all changed EObjects. But how to compute these deltas if the lists (in particular the old one) do not carry enough information? Since auditing is disabled there's no way to load the historical revision from the server.

I really don't know how to solve this problem ;-(

Perhaps we could argue that change infos are only needed for those list elements that are not proxied...
Comment 7 Ronald Krijgsheld CLA 2011-12-16 07:23:10 EST
The PatchedCDOTransactionImpl contains some fixes backported to our cdo version. But, this problem is, as you noticed, also reproducable with the latest CDO version. 

We generally have passive update enabled. But after some long calculations where we lock the transaction to prevent the invalidator from running, we toggle the passive update flag to force a refresh.

Running without partial collection loading reduces the problem, but was I still able to reproduce the problem. I'll try again and add my stacktrace.
Comment 8 Eike Stepper CLA 2011-12-16 07:32:39 EST
(In reply to comment #7)
> The PatchedCDOTransactionImpl contains some fixes backported to our cdo
> version. But, this problem is, as you noticed, also reproducable with the
> latest CDO version. 

Ok.
 
> We generally have passive update enabled. But after some long calculations
> where we lock the transaction to prevent the invalidator from running, we
> toggle the passive update flag to force a refresh.

The main problem is partial collection loading, so let's see if you can reproduce without that.

> Running without partial collection loading reduces the problem, but was I still
> able to reproduce the problem. I'll try again and add my stacktrace.

That can't be the exact same problem because without partial collection loading you'll never end up with element proxies in your lists.
Comment 9 Eike Stepper CLA 2011-12-16 07:42:03 EST
Created attachment 208489 [details]
Test
Comment 10 Ronald Krijgsheld CLA 2011-12-16 09:33:43 EST
That is correct. 

I have either the stacktrace added in my first comment. or I see the stacktrace I mentioned in the EMF forum.


DelegatingSessionProtocol.handleException(int, Exception)line:830   
DelegatingSessionProtocol.loadChunk(InternalCDORevision,EStructuralFeature,int,int,int,int)line:544
CDOCollectionLoadingPolicyImpl.doResolveProxy(CDORevision,EStructuralFeature,int,int,int)line:137
CDOCollectionLoadingPolicyImpl.resolveAllProxies(CDORevision,EStructuralFeature)line:66
CDOConnector$7$1(CDOSessionImpl).resolveAllElementProxies(CDORevision)line:948
CDOTransactionImpl$ResolvingRevisionMap.get(Object) line:2245   
CDOTransactionImpl$ResolvingRevisionMap.get(Object) line:1   
CDOTransactionImpl.getViewedRevision(InternalCDOObject) line:2221 CDOTransactionImpl(AbstractCDOView).collectViewedRevisions(Map<CDOID,InternalCDORevision>)line:1366
CDOConnector$7$1(CDOSessionImpl).collectViewedRevisions(Map<CDOBranch,List<InternalCDOView>>,Map<CDOBranch,Map<CDOID,InternalCDORevision>>)line:835 CDOConnector$7$1(CDOSessionImpl).refresh(boolean)line:768   
CDOSessionImpl.access$21(CDOSessionImpl, boolean)line:762   
CDONet4jSessionImpl$OptionsImpl(CDOSessionImpl$OptionsImpl).setPassiveUpdateEnabled(boolean)line:1587
Comment 11 Eike Stepper CLA 2011-12-18 02:51:18 EST
Both are executing loadChunk(), which should only happen for partially loaded collections. Are you really really sure you've switched that off?
Comment 12 Ronald Krijgsheld CLA 2011-12-19 03:52:18 EST
really really sure? eh.
I commented out setting the collectionLoadingPolicy. I expected that to be enough, should I change other settings? prefetchingPolicy?
Comment 13 Eike Stepper CLA 2011-12-19 06:07:58 EST
> I commented out setting the collectionLoadingPolicy. I expected that to be
> enough, should I change other settings? prefetchingPolicy?

No, prefetchingPolicy is different und unrelated.
Comment 14 Ronald Krijgsheld CLA 2011-12-19 16:36:54 EST
With collection loading policy set to unchunked, I can not reproduce the problem anymore I do get historic revision stacktraces on the server, but those are part of a commit, which is correct. No, such stacktrace as part of the invalidate process. I probably had an old client running.
Comment 15 Ronald Krijgsheld CLA 2011-12-23 17:45:33 EST
We found another problem with non auditing repository, where one client requests a revision that is just deleted by another client. Maybe not exactly the scenario described by the title, but all problems are related to a non auditing repository, where a client request data that is either changed or deleted by another client. If you want a separate issue for this scenario let me know.

We have the following stacktrace.
java.lang.NullPointerException
	at org.eclipse.emf.cdo.internal.server.Session.collectContainedRevisions(Session.java:307)
	at org.eclipse.emf.cdo.server.internal.net4j.protocol.LoadRevisionsIndication.collectRevisions(LoadRevisionsIndication.java:211)
	at org.eclipse.emf.cdo.server.internal.net4j.protocol.LoadRevisionsIndication.responding(LoadRevisionsIndication.java:173)
	at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerIndication.responding(CDOServerIndication.java:133)
	at org.eclipse.net4j.signal.IndicationWithResponse.doExtendedOutput(IndicationWithResponse.java:82)
	at org.eclipse.net4j.signal.Signal.doOutput(Signal.java:247)
	at org.eclipse.net4j.signal.IndicationWithResponse.execute(IndicationWithResponse.java:58)
	at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOReadIndication.execute(CDOReadIndication.java:36)
	at org.eclipse.net4j.signal.Signal.runSync(Signal.java:213)
	at org.eclipse.net4j.signal.Signal.run(Signal.java:132)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
Comment 16 Eike Stepper CLA 2012-01-05 04:26:22 EST
(In reply to comment #15)

Is this the problem caused by using the CDOFetchRuleManager you've mentioned in your private email? If so, please submit a separate bugzilla and don't forget to mention the CDOFetchRuleManager.
Comment 17 Eike Stepper CLA 2012-08-14 22:56:00 EDT
Moving all open issues to 4.2. Open bugs can be ported to 4.1 maintenance after they've been fixed in master.
Comment 18 Eike Stepper CLA 2013-06-29 12:17:26 EDT
We'll try to address open problems in 4.3 (master) first and then port fixes back to 4.2.
Comment 19 Eike Stepper CLA 2015-07-14 02:18:59 EDT
Moving all open bugzillas to 4.5.
Comment 20 Eike Stepper CLA 2016-07-31 01:01:43 EDT
Moving all unaddressed bugzillas to 4.6.
Comment 21 Eike Stepper CLA 2017-12-28 01:19:00 EST
Moving all open bugs to 4.7
Comment 22 Eike Stepper CLA 2019-11-08 02:08:10 EST
Moving all unresolved issues to version 4.8-
Comment 23 Eike Stepper CLA 2019-12-13 12:41:49 EST
Moving all unresolved issues to version 4.9
Comment 24 Eike Stepper CLA 2020-12-11 10:43:44 EST
Moving to 4.13.