Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 368223

Summary: LoadRevisionsRequest with CDOFetchRuleManager enabled leads sometimes to NPE
Product: [Modeling] EMF Reporter: Ronald Krijgsheld <rkrijgsheld>
Component: cdo.coreAssignee: Eike Stepper <stepper>
Status: CLOSED FIXED QA Contact: Eike Stepper <stepper>
Severity: normal    
Priority: P3    
Version: 4.0   
Target Milestone: ---   
Hardware: PC   
OS: Windows 7   
Whiteboard:
Attachments:
Description Flags
testcase
none
New test case
none
adjustment of Eike's testcase to propagate exception to main thread.
none
test case v3
none
test case v4 none

Description Ronald Krijgsheld CLA 2012-01-10 03:42:23 EST
Build Identifier: 

Below is the NPE. 

It looks like one of the revisions requested is not available anymore and that is not handled correctly.
We currently disabled the fetchrule manager. However, we find that this has a negative performance impact.


java.lang.NullPointerException
    at
org.eclipse.emf.cdo.internal.server.Session.collectContainedRevisions(Session.java:307)
    at
org.eclipse.emf.cdo.server.internal.net4j.protocol.LoadRevisionsIndication.collectRevisions(LoadRevisionsIndication.java:211)
    at
org.eclipse.emf.cdo.server.internal.net4j.protocol.LoadRevisionsIndication.responding(LoadRevisionsIndication.java:173)
    at
org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerIndication.responding(CDOServerIndication.java:133)
    at
org.eclipse.net4j.signal.IndicationWithResponse.doExtendedOutput(IndicationWithResponse.java:82)
    at org.eclipse.net4j.signal.Signal.doOutput(Signal.java:247)
    at
org.eclipse.net4j.signal.IndicationWithResponse.execute(IndicationWithResponse.java:58)
    at
org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOReadIndication.execute(CDOReadIndication.java:36)
    at org.eclipse.net4j.signal.Signal.runSync(Signal.java:213)
    at org.eclipse.net4j.signal.Signal.run(Signal.java:132)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)

Reproducible: Sometimes
Comment 1 Ronald Krijgsheld CLA 2012-01-10 16:58:26 EST
Created attachment 209288 [details]
testcase

Not a typical testcase. If all goes well(i.e. when the bug does not occur) the testcase does not end. However all the times I tried, the problem, manifests itself within the minute.
Comment 2 Ronald Krijgsheld CLA 2012-01-10 17:00:00 EST
The stacktrace again. But now run from the cdo sources in git.

Caused by: java.lang.NullPointerException
	at org.eclipse.emf.cdo.internal.server.Session.collectContainedRevisions(Session.java:323)
	at org.eclipse.emf.cdo.server.internal.net4j.protocol.LoadRevisionsIndication.collectRevisions(LoadRevisionsIndication.java:212)
	at org.eclipse.emf.cdo.server.internal.net4j.protocol.LoadRevisionsIndication.responding(LoadRevisionsIndication.java:173)
	at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerIndication.responding(CDOServerIndication.java:133)
	at org.eclipse.net4j.signal.IndicationWithResponse.doExtendedOutput(IndicationWithResponse.java:96)
	at org.eclipse.net4j.signal.Signal.doOutput(Signal.java:296)
	at org.eclipse.net4j.signal.IndicationWithResponse.execute(IndicationWithResponse.java:65)
	at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerReadIndication.execute(CDOServerReadIndication.java:36)
Comment 3 Ronald Krijgsheld CLA 2012-01-11 14:06:56 EST
Just to give you my latest update. I changed the LoadRevisionsIndication responding method slightly. 

instead of 
 if (loadRevisionCollectionChunkSize > 0)
      {
        collectRevisions(revisions[i], revisionIDs, additionalRevisions, visitedFetchRules);
      }

I coded:

      if (revisions[i] == null)
      {
        infos[i] = new RevisionInfo.Missing(infos[i].getID(), infos[i].getRequestedBranchPoint());
      }
      else if (loadRevisionCollectionChunkSize > 0)
      {
        collectRevisions(revisions[i], revisionIDs, additionalRevisions, visitedFetchRules);
      }

This will still cause an exception in the testcase I implemented. But that is a stale reference problem.

The reasoning for this was to make the indication always returns with as much information as is available. filling with Missing revisions if necessary. If the client still accesses the Missing revision, than it will fail.  Otherwise the transaction on the client can succeed.
Comment 4 Eike Stepper CLA 2012-01-12 11:02:06 EST
Decreasing severity to normal because functionally the entire fetch rule mechansim is optional, just an optimization.
Comment 5 Eike Stepper CLA 2012-01-12 11:09:44 EST
Analyzing in 4.0
Comment 6 Eike Stepper CLA 2012-01-12 12:27:00 EST
Created attachment 209395 [details]
New test case

(In reply to comment #1)
> Not a typical testcase. If all goes well(i.e. when the bug does not occur) the
> testcase does not end. However all the times I tried, the problem, manifests
> itself within the minute.

I can't get it to fail, not even within 10 minutes. Please have a look at my slightly changed test.
Comment 7 Eike Stepper CLA 2012-01-12 12:30:49 EST
(In reply to comment #2)
> The stacktrace again. But now run from the cdo sources in git.
> 
> Caused by: java.lang.NullPointerException
> at
> org.eclipse.emf.cdo.internal.server.Session.collectContainedRevisions(Session.java:323)

Hm, with my latest 4.0 sources that line can hardly result in an NPE.
Comment 8 Eike Stepper CLA 2012-01-12 12:34:27 EST
(In reply to comment #3)

I would prefer to add this to the top of LoadRevisionsIndication.collectRevisions():

    if (revision == null)
    {
      return;
    }

BUT: Since the test does not fail without this check I'm having problems to add it to a maintenance stream ;-(

Thoughts? Why does my test NOT fail? Or better: why does your test fail?
Comment 9 Ronald Krijgsheld CLA 2012-01-12 12:42:36 EST
I am testing at the trunk. 

the line it fails at is in Session.java at line 323:
    EClass eClass = revision.getEClass();

I try to look at your testcase as soon as i can.
Comment 10 Eike Stepper CLA 2012-01-12 12:45:12 EST
(In reply to comment #9)
> I am testing at the trunk. 

"trunk"? You're not using a ´n SVN checkout anymore, now that we're on Git for a while?!

> the line it fails at is in Session.java at line 323:
>     EClass eClass = revision.getEClass();

No, that's not the code at the tip of our streams/4.0-maintenance Git branch ;-(
Comment 11 Ronald Krijgsheld CLA 2012-01-12 14:24:38 EST
The testcase I wrote and tested on the default "master" branch.
Comment 12 Ronald Krijgsheld CLA 2012-01-12 15:08:41 EST
Created attachment 209407 [details]
adjustment of Eike's testcase to propagate exception to main thread.

The did still fail. But the assertion errors on the threads are not catched by JUnit. So, the other thread will continue.

I tweaked the testcase a little. Forcing a stop when the problem occurs and make sure the assertion error is thrown by the main thread.
Comment 13 Eike Stepper CLA 2012-01-13 03:08:57 EST
Created attachment 209434 [details]
test case v3

Okay, with this new version the test stops and reports the problem when it happens.
Comment 14 Eike Stepper CLA 2012-01-13 03:25:19 EST
Created attachment 209435 [details]
test case v4

Even better...
Comment 15 Eike Stepper CLA 2012-01-13 03:29:07 EST
Ronald, I don't see you on Skype. Please contact me when you're online...
Comment 16 Eike Stepper CLA 2012-01-13 07:35:24 EST
commit 3da641f5a0c8e617c68ba56f376851cab30ecd14
Comment 17 Eike Stepper CLA 2012-01-13 07:36:09 EST
Port to 4.1 via bug 368539.
Comment 18 Eike Stepper CLA 2012-09-21 06:51:11 EDT
Closing.