
Bug 202064

Summary: More client-side intelligence (SmartReadAhead Thread)
Product: [Modeling] EMF
Component: cdo.core
Reporter: Simon Mc Duff <smcduff>
Assignee: Project Inbox <emf.cdo-inbox>
Status: CLOSED WONTFIX
QA Contact:
Severity: enhancement
Priority: P4
CC: andy.rytina, Ed.Merks, slavescu, stepper
Version: 4.2
Keywords: helpwanted
Target Milestone: Past
Flags: stepper: galileo+
Hardware: PC
OS: Windows XP
Whiteboard: Lighter, Faster and Better
Bug Depends on: 202063
Bug Blocks:
Attachments:
  Analyzer-Beta1
  Oops.. BETA 2
  New version - Detect different rules for different graph of instances at the same time
  Remove CDOReadAhead
  Update
  FIX toDO
  TestCase
  Session Comments
  TestCase - EContentAdapter

Description Simon Mc Duff CLA 2007-09-03 07:46:36 EDT
You know... what you did allowed me to do the following test.

I created a LoadChunkRevisionRequest.
It takes a list of CDOIDs as a parameter.
It returns a list of CDORevisionImpls.

After we receive the list of CDOIDs from LoadChunkRequest, I call
LoadChunkRevisionRequest with the list returned by LoadChunkRequest.

I can go up to 5800 objects/sec with that!! I was amazed!!!

Test with referenceChunk at:

200  - 5861 objects/sec
300  - 5827 objects/sec
500  - 5727 objects/sec
1000 - 5147 objects/sec
2000 - 4231 objects/sec

Performance should improve as I increase referenceChunk... so why don't we
see that? Easy: the first chunk of the collection we fetch is a list of
CDOIDs, so it does not call LoadChunkRevision. The bigger the referenceChunk,
the more objects we fetch sequentially for that first part.
If I change that (hardcoded in the code so the first part does not fetch
CDOIDs), I can go up to 7500 objects/sec. We should go even faster if,
instead of CDOIDs, we fetch the objects directly (didn't try it).

Anyway, we should take advantage of that, don't you think? We could even have
a thread that fetches objects by chunk to keep the main thread supplied with
objects (so it is never blocked). We could call it... the SmartReadAhead
Thread!


In brief, if we can detect when we should load objects by chunk (like when we
iterate through a collection), we can win big time!!!

Let me know what you think !
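
For illustration, a minimal read-ahead thread could look roughly like the sketch below. The ID/revision types, the chunk source and the fetchChunk() call are placeholders for illustration only, not the actual CDO API:

  import java.util.List;
  import java.util.concurrent.BlockingQueue;
  import java.util.concurrent.LinkedBlockingQueue;

  // Hypothetical sketch: a background thread fetches revisions chunk by chunk
  // and hands them to the main thread through a queue, so the main thread is
  // never blocked on per-object round trips.
  public class SmartReadAheadThread extends Thread
  {
    private final BlockingQueue<List<Object>> prefetched = new LinkedBlockingQueue<List<Object>>();

    private final List<List<Long>> pendingChunks; // chunks of IDs still to be loaded

    public SmartReadAheadThread(List<List<Long>> pendingChunks)
    {
      this.pendingChunks = pendingChunks;
      setDaemon(true);
    }

    @Override
    public void run()
    {
      for (List<Long> chunk : pendingChunks)
      {
        prefetched.add(fetchChunk(chunk)); // one request per chunk, not per object
      }
    }

    // Placeholder for a request like LoadChunkRevisionRequest: a list of IDs in,
    // the corresponding revisions out, in a single round trip.
    private List<Object> fetchChunk(List<Long> ids)
    {
      throw new UnsupportedOperationException("server call not shown");
    }

    // The main thread consumes prefetched chunks as it needs them.
    public List<Object> nextChunk() throws InterruptedException
    {
      return prefetched.take();
    }
  }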
Comment 1 Simon Mc Duff CLA 2007-09-05 21:08:37 EDT
As a small prototype (and to see if it could be beneficial), I created the SmartReadAhead.

Basically, I have one SmartReadAhead attribute per CDOViewImpl.
Each time CDOStore.get is called, I notify the SmartReadAhead before and after the call.

The goal here is to tell the difference between an end-user trigger and a process trigger. Normally, a process will trigger more events per second than a user.

If I show you this... you will probably know where the user interacted with the UI.

X = time 

READBEFORE X+12
READAFTER X+12

READBEFORE X+12 
READAFTER X+15  

READBEFORE X+15
READAFTER X+16

<< USER INTERACT HERE

READBEFORE X+25
READAFTER X+26


In the Before notification I calculate the time elapsed since the last READAFTER.

In this case... I will add a CDOFetchRule to the SmartReadAhead.
The CDOFetchRule will be sent to the server the next time I read something.
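
Roughly, the timing heuristic looks like this (the threshold value and the names below are made up for illustration; they are not the names used in the patch):

  // Hypothetical illustration: a long pause between the last "read after" and
  // the next "read before" suggests the user is interacting with the UI;
  // short gaps suggest an automated traversal that is worth prefetching for.
  public class AccessTimingHeuristic
  {
    private static final long USER_PAUSE_MILLIS = 200; // made-up threshold

    private long lastReadAfter = -1;

    public void readBefore()
    {
      long now = System.currentTimeMillis();
      if (lastReadAfter >= 0 && now - lastReadAfter < USER_PAUSE_MILLIS)
      {
        // Accesses are coming fast: likely a process traversal.
        // This is where a fetch rule for the accessed feature would be recorded.
      }
    }

    public void readAfter()
    {
      lastReadAfter = System.currentTimeMillis();
    }
  }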


My test case is the following :

I have a collection of PurchaseOrders. Each of them has a Supplier and an Order attribute.
My process accesses the Supplier all the time and never accesses the Order.

In this case, my access from PurchaseOrder to Supplier will be detected... and will be added to the proxy rule.

Without the SmartReadAhead I was doing 1450 objects/sec.
With it... 8080 objects/sec. I will try it at work tomorrow... but this simple rule seems to be very efficient and makes sense as well.



I only listen to events for features that have this property:

isReference and !isMany

because we already have the mechanism to load in chunks for them...

Where do you think would be the best place to hook this up?
The legacy wrapper doesn't use CDOStore, right? But I think my feature will only be usable for them... since the hook is easier to do.

Comments?

Comment 2 Eike Stepper CLA 2007-09-06 05:50:38 EDT
Generally we don't need to look at CDOLegacyImpl for new optimizations first.
If at all there are optimizations possible I'll investigate later.

I'd like to look at your patch before I can see if it's good.
At least the resulting numbers look good ;-)

Originally I had a completely asynchronous solution in mind, but let's see...
Comment 3 Simon Mc Duff CLA 2007-09-06 08:05:53 EDT
(In reply to comment #2)
> Generally we don't need to look at CDOLegacyImpl for new optimizations first.
> If at all there are optimizations possible I'll investigate later.
> I'd like to look at your patch before I can see if it's good.
> At least the resulting numbers look good ;-)
> Originally I had a completely asynchronous solution in mind, but let's see...

I also had an asynchronous approach in mind... but since a single fetch is costly, I presume it will be more efficient to fetch objects in chunks... if possible.


This feature does not exclude the asynchronous one...
Comment 4 Simon Mc Duff CLA 2007-09-06 22:31:31 EDT
Created attachment 77860 [details]
Analyzer-Beta1


Here is the first version of my analyzer.

This analyzer detects the access paths you use, based on timing. At runtime it creates clusters of features to be fetched at the same time.

This analyzer runs in the same thread.

CDOFeatureAnalyzer will find patterns. If it detects one, it adds it to make it available to the rule manager.

Right now, it is working well for our model.

We have one analyzer per CDOView.
CDORevisionManagerImpl has an IFetchRuleManager.

Each time you do a LoadRevisionRequest, it asks the IFetchRuleManager for the rules for the given IDs. In my case, I delegate that call to my current analyzer.

The next thing to do is the following: we can detect the sequence of feature clusters that we access... and fetch them in another thread (asynchronously).
I don't believe it is worth doing that strategy (read-ahead) object by object... it already goes so fast that it is not worth it... but by group!!! Oh yes.

To use it, for now I create a
CDORevisionManagerImpl.setFetchRuleManager(new CDOFeatureRuleManager());
and for each view I set up
CDOViewImpl.setFetchFeatureAnalyzer(new CDOFeatureAnalyzer());

Now I will work on the other thread, to detect the sequence of fetch rule cluster accesses... and load objects in advance based on statistics...

That patch also contains 2 fixes.

CDOStore.lookupAhead
BEFORE
      int toIndex = Math.min(index + chunkSize - 1, list.size());
      Set<CDOID> notRegistered = new HashSet<CDOID>();
      for (int i = fromIndex; i <= toIndex; i++)

AFTER (I changed 2 things)
>>      int toIndex = Math.min(index + chunkSize, list.size());
        Set<CDOID> notRegistered = new HashSet<CDOID>();
>>      for (int i = fromIndex; i < toIndex; i++)


Another problem was on the client side, in CDORevisionResolver.TimeLine:

   @Override
    public synchronized boolean add(CDORevisionImpl revision)
    {
      int version = revision.getVersion();
      ListIterator<CDORevisionImpl> it = super.listIterator(0);
      while (it.hasNext())
      {
        CDORevisionImpl r = it.next();
        int v = r.getVersion();
        if (v == version)
        {
          // Could have more than one revision of the same object...
          return false;   // <<<< changed: previously we threw an exception here

Before, we were throwing an exception. The LoadChunk that we do can introduce this situation (only on the client side).

On the server side it maybe makes sense, but not on the client side.

These 2 problems are not fixes for something I introduced with this patch... they fix things that were introduced with the load-in-chunks feature.

Let me know what you think.
Comment 5 Simon Mc Duff CLA 2007-09-06 22:32:36 EDT
Created attachment 77861 [details]
Oops.. BETA 2

Forgot to recreate my patch before sending it.
Comment 6 Eike Stepper CLA 2007-09-07 04:10:45 EDT
Refactored a bit and committed to CVS.
Here are some questions/comments:

Is NOOPFetchRuleManager.getFetchRules(CDOID) a relic?
It's not part of the interface and it's not being called from anywhere.
---

CDOStore.get() calls preTraverseFeature() and postTraverseFeature().
Wouldn't it be better to pass "revision" instead of "cdoObject.cdoRevision()"?
I'm not very sure, but especially in the first case cdoObject.cdoRevision() could return null.
This is why getRevisionForReading(cdoObject) is called. But that can already lead to loading...
---

I have changed your fix to loadAhead once more. I want to express that toIndex is a reachable index,
in contrast to a size. The same idiom as in LoadChunkRequest/Indication. Is that ok?
---

Why is the default for loadRevisionCollectionChunkSize = 1?
It should be turned off by default.
---

In LoadRevisionRequest.requesting() you forgot to check "fetchSize > 0", which could lead
to a signalling sequence error if fetchSize == 0.
---

I believe there's something wrong with the following code but I wasn't able to see what it should be:
        CDOFetchRule fetchRule = new CDOFetchRule(in, getPackageManager());
        for (CDOFeature feature : fetchRule.getFeatures())
        {
          fetchRules.put(fetchRule.getCDOClass(), fetchRule);
        }
"feature" is never used (warning) and the put() seems to multiply put the same thing...
---

In LoadRevisionIndication.collectRevisionsByRule() a check seems to be missing in the isMany case:
	if (!id.isNull() && !containedRevisions.containsKey(id))
---
	
What  is this good for:
	int newReferenceChunk = Math.max(referenceChunk, 1000);
---
	
Didn't you say there's a Thread somewhere? Where is it?
Comment 7 Simon Mc Duff CLA 2007-09-07 05:27:46 EDT
(In reply to comment #6)
> Refactored a bit an committed to CVS.
> Here some questions/comments:
> Is NOOPFetchRuleManager.getFetchRules(CDOID) a relict?
> It's not part of the interface and it's not being called from anywhere.
> ---
> CDOStore.get() calls preTraverseFeature() and postTraverseFeature().
> Wouldn't it be better to pass "revision" instead of "cdoObject.cdoRevision()"?
> I'm not very sure, but especially in the first case cdoObject.cdoRevision()
> could return null.
> This is why getRevisionForReading(cdoObject) is called. But that can already
> lead to loading...

I thought that at this point it was certain that we have a revision. I don't want to cause any fetching...

> ---
> Have changed your fix to loadAhead once more. Want to express that toIndex is a
> reachable index
> in contrast to a size. The same idiom like in LoadChunkRequest/Indication. Is
> it ok?
Sure!! 

> ---
> Why is the default for loadRevisionCollectionChunkSize = 1?
> It should be turned off by default.

loadRevisionCollectionChunkSize <= 1 means deactivated


> ---
> In LoadRevisionRequest.requesting() you forgot to ask "fetchSize > 0" which
> could lead
> to signalling sequence error if fetchSize == 0.

You are right !!

> ---
> I believe there's something wrong with the following code but I wasn't able to
> see what it should be:
>         CDOFetchRule fetchRule = new CDOFetchRule(in, getPackageManager());
>         for (CDOFeature feature : fetchRule.getFeatures())
>         {
>           fetchRules.put(fetchRule.getCDOClass(), fetchRule);
>         }
> "feature" is never used (warning) and the put() seems to multiply put the same
> thing...

I changed my code at the last minute... it should only be
fetchRules.put(fetchRule.getCDOClass(), fetchRule);


> ---
> In LoadRevisionIndication.collectRevisionsByRule() there seems to be missing a
> check in isMany:
>         if (!id.isNull() && !containedRevisions.containsKey(id))
yeah.
> ---
> What  is this good for:
>         int newReferenceChunk = Math.max(referenceChunk, 1000);

I wasn't sure if I wanted to make that configurable as well.
But when I detect that I need to fetch a collection... I want to be sure that the user has set referenceChunk, so it can benefit from the chunked read.

> ---
> Didn't you say there's a Thread somewhere? Where is it?
This will come later. It is coming. What I was saying is that the next step will be the asynchronous call. The API will not change for that, since it will use IFeatureAnalyzer, IFetchRuleManager and CDORevisionManagerImpl.

The version we have right now only detects clusters of objects based on time... I still need to work on detecting clusters based on time and on how they are connected...

For that, only one thing should change (for now):

IFeatureAnalyzer.preTraverseFeature(CDORevision cdoRevision, CDOFeature cdoFeature, int index, Object result);

Comment 8 Simon Mc Duff CLA 2007-09-07 05:57:27 EDT
(In reply to comment #7)

I'm refactoring and moving everything except CDOFetchRule into the
org.eclipse.emf.cdo project, since only the client should have these classes.
Comment 9 Eike Stepper CLA 2007-09-07 06:36:39 EDT
 (In reply to comment #7)
> > CDOStore.get() calls preTraverseFeature() and postTraverseFeature().
> > Wouldn't it be better to pass "revision" instead of "cdoObject.cdoRevision()"?
> > I'm not very sure, but especially in the first case cdoObject.cdoRevision()
> > could return null.
> > This is why getRevisionForReading(cdoObject) is called. But that can already
> > lead to loading...
> 
> I thought at this point it was sure that we have a revision. I don`t want to
> cause any fetching....

I will have to investigate this later.
Maybe we need a getRevisionForReading(cdoObject, loadOnDemand)...
Is it possible to pass the InternalCDOObject into preTraverseFeature()
instead of its CDORevision? That would be much simpler and maybe even
more correct, since the revision per object can/will change over time.
I'm not sure about what you need in CDOFeatureAnalyzer.

> > Why is the default for loadRevisionCollectionChunkSize = 1?
> > It should be turned off by default.
> 
> loadRevisionCollectionChunkSize  <= 1 .. means desactivate
That seems a bit strange. I'd expect deactivated to be either
loadRevisionCollectionChunkSize  == 0 or 
loadRevisionCollectionChunkSize  == 1 but not both.
What are the exact semantics of a "loadRevisionCollectionChunk"? 

> > What  is this good for:
> >         int newReferenceChunk = Math.max(referenceChunk, 1000);
> 
> I wasn`t sure If i want to configure that as well.
> But detecting that I need to fetch a collection... i want to be sure that the
> user set referenceChunk.. so it could benefit from the chunking read

I think we need to provide something more obvious/configurable.
Can you please post a proposal?
Comment 10 Simon Mc Duff CLA 2007-09-07 07:04:20 EDT
(In reply to comment #9)
>  (In reply to comment #7)
> > > CDOStore.get() calls preTraverseFeature() and postTraverseFeature().
> > > Wouldn't it be better to pass "revision" instead of "cdoObject.cdoRevision()"?
> > > I'm not very sure, but especially in the first case cdoObject.cdoRevision()
> > > could return null.
> > > This is why getRevisionForReading(cdoObject) is called. But that can already
> > > lead to loading...
> > 
> > I thought at this point it was sure that we have a revision. I don`t want to
> > cause any fetching....
> I will have to investigate this later.
> Maybe we need a getRevisionForReading(cdoObject, loadOnDemand)...
> Is it possible to pass the InternalCDOObject into preTraverseFeature()
> instead of its CDORevision? That would be much simpler and maybe even
> more correct, since the revision per object can/will change over time.
> I'm not sure about what you need in CDOFeatureAnalyzer.

I can use the InternalCDOObject. I will modify it.

> > > Why is the default for loadRevisionCollectionChunkSize = 1?
> > > It should be turned off by default.
> > 
> > loadRevisionCollectionChunkSize  <= 1 .. means desactivate
> That seems a bit strange. I'd expect deactivated to be either
> loadRevisionCollectionChunkSize  == 0 or 
> loadRevisionCollectionChunkSize  == 1 but not both.
> What are the exact semantics of a "loadRevisionCollectionChunk"? 

Because usually, to be activated, you need to load at least 2 objects at a time... but we can make zero the deactivated value, that is clearer.

loadRevisionCollectionChunkSize: for a collection it will load the next X objects at the same time.
If it is 1, it is the same thing as loading 1 object at a time... but maybe not in all cases.

> > > What  is this good for:
> > >         int newReferenceChunk = Math.max(referenceChunk, 1000);
> > 
> > I wasn`t sure If i want to configure that as well.
> > But detecting that I need to fetch a collection... i want to be sure that the
> > user set referenceChunk.. so it could benefit from the chunking read
> I think we need to provide something more obvious/configurable.
> Can you please post a proposal?
Ok, I will use the referenceChunk; if the user didn't set it, he will not benefit from it.
For now I will use referenceChunk and do some benchmarks...



Comment 11 Simon Mc Duff CLA 2007-09-07 07:17:48 EDT
(In reply to comment #9)

I was a bit surprised that you checked in my code. I just wanted to know if the design was good... and to let you understand my design and my strategy to detect proxies...

With the GraphConnectedSpace that I'm working on... it will be more accurate for business processing.

Simon
Comment 12 Eike Stepper CLA 2007-09-07 09:35:02 EDT
Oh, sorry. I thought it was meant as a contribution. Is it a problem?
The main reason was that I often need hours to review/understand your patches.
As you might have noticed, I spent notable effort converting it to my own coding style so that the whole code base forms a homogeneous piece of technology.
I hope you're not angry about that. You know each programmer has his own principles and idioms and that makes it easier for me to maintain it in the future.

I guess one of your next patches will show what GraphConnectedSpace is ;-)

BTW. I'm going to visit some friends over the weekend. I won't be able to respond before Sunday. Enjoy...
Comment 13 Simon Mc Duff CLA 2007-09-07 09:50:37 EDT
(In reply to comment #12)
> Oh, sorry. I thought it was meant as a contribution. Is it a problem?
No no... I do want to contribute... it is just not bulletproof yet.

> The main reason was that I often need hours to review/understand your patches.
> As you might have noticed I spend notable effort to convert it to my own coding
> style so that the whole code base forms a homogeneous piece of technology.

Can you send me your coding style (the Eclipse settings)... so that at least we have the same-looking code?

> I hope you're not angry about that. You know each programmer has his own
> principles and idioms and that makes it easier for me to maintain it in the
> future.
 No!! It takes more than that to make me angry!!

> I guess one of your next patches will show what GraphConnectedSpace is ;-)
> BTW. I'm going to visit some friends over the weekend. I won't be able to
> respond before Sunday. Enjoy...

Have a nice week-end !!

Comment 14 Simon Mc Duff CLA 2007-09-08 13:46:37 EDT
I did some tests with the new analyzer (loading a model that contains 40000 objects).

Here are some results:

Without DynamicFetchRule
1250 requests to the server
3590 objects/sec

With DynamicFetchRule
63 requests to the server (20 times fewer requests)
4769 objects/sec (33% faster)

(Don't forget my computer is slow... usually the throughput is higher, and I was running my CD player at the same time!!)

Both of them are using LoadChunkCollection(100);

In a network with high latency, the difference between them will be bigger.
1250 requests in a high-latency network will be affected much more than 63 requests.

To simulate the latency, I added a Thread.sleep(20) at the beginning of LoadRevisionIndication.responding.

1090 objects/sec (without DynamicFetchRule), degradation = 329%
3750 objects/sec (with DynamicFetchRule), degradation = 127%

Now <With DynamicFetchRule> is 244% faster than <Without DynamicFetchRule>, instead of 33%.

The performance degradation is much less severe with DynamicFetchRule.

I will try to build a test case where it is not good to use DynamicFetchRule... to see the impact (accessing objects without any pattern).
I don't think the performance hit will be that important... but we will see.



Comment 15 Simon Mc Duff CLA 2007-09-09 10:05:33 EDT
Created attachment 77950 [details]
New version - Detect different rules for different graph of instances at the same time

This algorithm is able to detect groups of instances that are used together.
In this case we can have different rules for different parts of the model being calculated at the same time.

You can review it and check in the code.

For the models I tested... it was really fantastic.
I have a graph with different structures. The depth is 4... and I used different paths to access my data.
It detects my pattern and fetches in one shot the 4 levels I use all the time.

Fetching 120 000 objects:
(I removed Session.provideCDOID... otherwise I cannot go above 6000 objects/sec)

Without DynamicRule
3715 requests to the server
8302 objects/sec

With DynamicRule
141 requests to the server
11890 objects/sec (I went over 12000 objects/sec... but then it dropped a little)

Add some latency in there!!! and the <Without DynamicRule> numbers will go down big time.

Thanks
Comment 16 Eike Stepper CLA 2007-09-10 03:35:38 EDT
The results are very impressive!!! Good work ;-)
I must admit that I did not test it, yet, since I'm fighting other fires at the moment (provide proper features and automatic builds with Nick).

Some comments to CDOReadAhead:
1. Have the feeling that QueueWorker would be a good superclass.
2. If not, IWorkSerializer.dispose() should be called somewhere.
3. Have renamed "session" to "view".
4. Is nested class Test still needed?
5. Could you please provide a test case? Also important as an illustration...

Committed to CVS.
Comment 17 Eike Stepper CLA 2007-09-10 05:48:09 EDT
Should we add a feature to control the maximum recursion depth of Session.collectContainedRevisions()?
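
For illustration only, such a depth limit could look roughly like the following sketch; the class, the ID/revision types and the lookup methods are placeholders, not the actual Session code:

  import java.util.Collections;
  import java.util.List;
  import java.util.Map;

  // Hypothetical sketch of a depth limit: recursion stops once maxDepth is
  // reached, so a deeply nested model cannot pull in its whole containment
  // tree through to-one containment references in a single request.
  public class ContainedRevisionCollector
  {
    public void collect(Object revision, Map<Long, Object> collected, int depth, int maxDepth)
    {
      if (depth >= maxDepth)
      {
        return;
      }

      for (Long childID : containedIDs(revision))
      {
        if (childID != null && !collected.containsKey(childID))
        {
          Object child = loadRevision(childID);
          collected.put(childID, child);
          collect(child, collected, depth + 1, maxDepth);
        }
      }
    }

    // Placeholders for the real lookups, which are not shown here.
    private List<Long> containedIDs(Object revision)
    {
      return Collections.emptyList();
    }

    private Object loadRevision(Long id)
    {
      return new Object();
    }
  }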
Comment 18 Simon Mc Duff CLA 2007-09-10 06:48:26 EDT
(In reply to comment #16)
> The results are very impressive!!! Good work ;-)
> I must admit that I did not test it, yet, since I'm fighting other fires at the
> moment (provide proper features and automatic builds with Nick).
> Some comments to CDOReadAhead:
> 1. Have the feeling that QueueWorker would be a good superclass.
> 2. If not, IWorkSerializer.dispose() should be called somewhere.
> 3. Have renamed "session" to "view".
> 4. Is nested class Test still needed?
> 5. Could you please provide a test case? Also important as an illustration...

Oops... CDOReadAhead was not meant to be sent with the patch. It is the part where another thread fetches objects ahead of time... I did a quick test... without good results.

You can remove it for now. I will send you the patch to remove it.

> Committed to CVS.

Comment 19 Simon Mc Duff CLA 2007-09-10 06:50:05 EDT
Created attachment 77969 [details]
Remove CDOReadAhead

Remove CDOReadAhead from the repository (Test file)
Comment 20 Eike Stepper CLA 2007-09-10 09:05:01 EDT
 (In reply to comment #19)
> Remove CDOReadAhead from the repository (Test file)

Done.
Comment 21 Simon Mc Duff CLA 2007-09-11 22:09:32 EDT
Created attachment 78135 [details]
Update 

I have 2 kinds of analyzers... one for intensive processing, the other for UI interaction.


I was starting to do my test cases when I had a problem.

Company company = Model1Factory.eINSTANCE.createCompany();
PurchaseOrder purchaseOrder = Model1Factory.eINSTANCE.createPurchaseOrder();
company.getPurchaseOrders().add(purchaseOrder);
purchaseOrder.setSupplier(supplier);


I get the following

org.eclipse.net4j.util.ImplementationError: Unable to provideCDOID: org.eclipse.emf.cdo.tests.model1.impl.SupplierImpl
	at org.eclipse.emf.internal.cdo.CDOViewImpl.provideCDOID(CDOViewImpl.java:402)
	at org.eclipse.emf.cdo.internal.protocol.revision.CDORevisionImpl.writeValues(CDORevisionImpl.java:691)
	at org.eclipse.emf.cdo.internal.protocol.revision.CDORevisionImpl.write(CDORevisionImpl.java:137)
	at org.eclipse.emf.internal.cdo.protocol.CommitTransactionRequest.writeRevisions(CommitTransactionRequest.java:123)
	at org.eclipse.emf.internal.cdo.protocol.CommitTransactionRequest.writeNewObjects(CommitTransactionRequest.java:102)
	at org.eclipse.emf.internal.cdo.protocol.CommitTransactionRequest.requesting(CommitTransactionRequest.java:64)
	at org.eclipse.net4j.signal.RequestWithConfirmation.execute(RequestWithConfirmation.java:48)
	at org.eclipse.net4j.signal.Signal.runSync(Signal.java:130)

I don't think I will look at it tonight... but I believe it should persist the dangling reference, right?
Comment 22 Eike Stepper CLA 2007-09-12 03:43:49 EDT
Could you provide the whole test case code?
PurchaseOrder.supplier is a non-containment reference and it seems that the supplier instance is not properly attached to the view.
In this case an exception must occur. Maybe I should turn it into a DanglingReferenceException extends IllegalStateException
(instead of an ImplementationError).
Comment 23 Simon Mc Duff CLA 2007-09-12 06:59:53 EDT
(In reply to comment #22)
> Could you provide the whole test case code?
> PurchaseOrder.supplier is a non-containment reference and it seems that the
> supplier instance is not properly attached to the view.
> In this case an exception must occur. Maybe I should turn it into a
> DanglingReferenceException exends IllegalStateException
> (instead of an ImplementationError).

You had the whole test case code... that was all I had when I got started.

So we need to persist each object explicitly... I thought a transient object was supposed to be made persistent automatically in that case; otherwise we end up with a dangling reference.

I know Hibernate does this persistence automatically... and it is much, much easier.
The Objectivity database does it that way as well.

Do we not do it by choice... or are we following a spec?
Comment 24 Eike Stepper CLA 2007-09-12 07:07:19 EDT
CDO tries to follow the EMF spec/semantics as neat as possible/appropriate.
In this case it is the spec of containment within a ResourceSet (CDOView).
I don't think that any existing persistent backend, including RDBs, have this concept.

Your EMF test code would not be legal with an XMLResourceImpl either.
You can't persist Ecore model instances with cross references to objects that are not contained in the same ResourceSet.
Comment 25 Simon Mc Duff CLA 2007-09-12 07:28:14 EDT
(In reply to comment #24)
> CDO tries to follow the EMF spec/semantics as neat as possible/appropriate.
> In this case it is the spec of containment within a ResourceSet (CDOView).
> I don't think that any existing persistent backend, including RDBs, have this
> concept.
> Your EMF test code would not be legal with an XMLResourceImpl, too.
> You can't persist Ecore model instances with cross references to objects that
> are not contained in the same ResourceSet.


I don't know why you think that existing persistence backends don't have this concept... but they do.

Here is the link:
http://docs.huihoo.com/framework/hibernate/hibernate-reference-2.1.7/manipulatingdata.html

In section 9.9 you have the flag
cascade="save-update" (to delete you need to do it manually)

In Teneo you use the same thing!!


Do you mean in EMF or in general? Because I know Objectivity well (it doesn't use EMF) and they do it as well.

This makes sense as well. You want to keep the integrity of your graph. You have a dangling reference... what do you do? The only possibility is to persist it.
What else would the consumer want to do?


Comment 26 Eike Stepper CLA 2007-09-12 08:08:51 EDT
I mean especially the ResourceSet and the Resource.
In any case I withdraw my statement about persistence backends because I don't know many of them and it seems irrelevant for this discussion.

What seems important is that EMF can't serialize a Resource with dangling references. Or am I wrong?
Comment 27 Simon Mc Duff CLA 2007-09-12 08:21:16 EDT
(In reply to comment #26)
> I mean especially the ResourceSet and the Resource.
> In any case I withdraw my statement about persistence backends because I don't
> know many of them and it seems irrelevant for this discussion.
> What seems important is that EMF can't serialize a Resource with dangling
> references. Or am I wrong?

You are totally right. 

In that case, to keep the integrity of the graph, shouldn't we persist the dangling reference in the same resource as the object that refers to it?

Case 1:
Create A
Create B
Link A to B
Save A to ResourceX


Case 2:
Create A
Save A to ResourceX
Create B
Link A to B

ResourceX should have A and B to keep its integrity. Right ?

The other solution is to throw an exception.

As a consumer... I don't want to bother with persisting my objects all the time. I play with objects; the framework should know when it is time to persist them or not.

Another case is the following:

Process A receives an objectX and an objectY.
Process A doesn't know whether objectX or objectY is persistent or transient...
Process A creates an objectZ.
Process A needs to link objectX to objectZ and to link objectZ to objectY.

Because the save isn't automatic...
Process A needs to know if objectX is persistent by looking at which resource it belongs to.
Depending on the state of objectX, it will save objectZ or not.
Afterwards it needs to save objectZ and all its transient children, only if objectX was persistent.


This makes the code very ugly...

Comment 28 Simon Mc Duff CLA 2007-09-12 08:51:02 EDT
Can you update my last patch without the Test ?
Comment 29 Simon Mc Duff CLA 2007-09-12 08:51:38 EDT
(In reply to comment #28)
> Can you update my last patch without the Test ?

I meant 

Can you check in my last patch?

Thank you

Comment 30 Eike Stepper CLA 2007-09-12 11:34:02 EDT
Committed patch #5 to CVS.

Some comments/questions:
- Can CDOAbstractFeatureRuleAnalyzer.didFetch be derived from fetchCount?
- See TODO in CDOClusterOfFetchRule.hashCode()
- See TODO in CDOFetchFeatureInfo.hashCode()

Comment 31 Eike Stepper CLA 2007-09-12 11:38:30 EDT
 (In reply to comment #27)
> In this matter, to keep the integrity of the graph should we not persist the
> dangling reference in the same resource where the object refer to him.
> 
> Case 1:
> Create A
> Create B
> Link A to B
> Save A to ResourceX
> 
> 
> Case 2:
> Create A
> Save A to ResourceX
> Create B
> Link A to B
> 
> ResourceX should have A and B to keep its integrity. Right ?
> 
> The other solution is to throw an exception.

Yes, this is what EMF users expect and what CDO already does.
Comment 32 Simon Mc Duff CLA 2007-09-12 11:44:55 EDT
(In reply to comment #31)
>  (In reply to comment #27)
> > In this matter, to keep the integrity of the graph should we not persist the
> > dangling reference in the same resource where the object refer to him.
> > 
> > Case 1:
> > Create A
> > Create B
> > Link A to B
> > Save A to ResourceX
> > 
> > 
> > Case 2:
> > Create A
> > Save A to ResourceX
> > Create B
> > Link A to B
> > 
> > ResourceX should have A and B to keep its integrity. Right ?
> > 
> > The other solution is to throw an exception.
> Yes, this is what EMF users expect and what CDO already does.

I do not agree 100% with you. I prefer to say it depends on which implementation you are using.

Do you think we could add an option on CDOTransaction to do the following:

- When we link 2 objects together and the destination isn't attached to any resource, attach it to the same resource as the source object.

Simon

Comment 33 Eike Stepper CLA 2007-09-12 11:54:23 EDT
> > Yes, this is what EMF users expect and what CDO already does.
> 
> I do not agree 100% with you. I prefer to say it depends of which
> implementation you are using.
> 
> Do you think we could add an option on CDOTransaction to :
> 
> - When we link 2 objects together.. and the destination isn't attach to any
> resource. Attach it to the same resource as the from object.
> 
> Simon
> 

I'm not completely reluctant to have an additional option on CDOTransaction which is off by default. But I'd like to discuss some implications first.

The most obvious one, what happens if several objects that are contained in multiple resources point to the same non-attached object? Which resource shall become the container of the dangling object?

Ed, I've CC'ed you because you probably had good reasons in mind when you chose to throw an exception if an attempt is being made to serialize a resource with dangling references. What do you think? (Should be enough for you to read 3-4 of the most recent comments)
Comment 34 Ed Merks CLA 2007-09-12 12:04:10 EDT
I think the client themselves should always ensure the EObject.eResource != null for all objects they intend to serialize.  Until you've explicitly put an object in a resource, you've not determined where it should live and having some magic that makes that determination for you seems error prone at best.  The client can quite easily use Resource.getAllContents and iterate over EObject.eCrossReferences to find all dangling references and assign them to an appropriate resource of their choice.
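
A minimal sketch of that approach, collecting dangling cross references and attaching them to a resource of the client's choice (the helper class name is made up; the EMF calls are the ones mentioned above):

  import java.util.ArrayList;
  import java.util.List;

  import org.eclipse.emf.common.util.TreeIterator;
  import org.eclipse.emf.ecore.EObject;
  import org.eclipse.emf.ecore.resource.Resource;

  // Walks the containment tree of a resource, collects cross-referenced
  // objects that are not yet in any resource, and attaches them as roots of
  // that same resource after the traversal.
  public final class DanglingReferenceFixer
  {
    private DanglingReferenceFixer()
    {
    }

    public static List<EObject> attachDanglingReferences(Resource resource)
    {
      List<EObject> dangling = new ArrayList<EObject>();
      for (TreeIterator<EObject> it = resource.getAllContents(); it.hasNext();)
      {
        EObject object = it.next();
        for (EObject referenced : object.eCrossReferences())
        {
          if (referenced.eResource() == null && !dangling.contains(referenced))
          {
            dangling.add(referenced);
          }
        }
      }

      // Attach after the traversal to avoid modifying the tree while iterating it.
      resource.getContents().addAll(dangling);
      return dangling;
    }
  }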
Comment 35 Eike Stepper CLA 2007-09-12 12:10:09 EDT
(In reply to comment #34)

This is what I think.

I've got another proposal for you, Simon:
What, if I add vetoable PreCommitEvents?
You could listen to them to apply the logic Ed outlined.
They could also be used to apply things like constraint validation, ...
Comment 36 Simon Mc Duff CLA 2007-09-12 12:36:27 EDT
(In reply to comment #35)
> (In reply to comment #34)
> This is what I think.
> I've got another proposal for you, Simon:
> What, if I add vetoable PreCommitEvents?
> You could listen to them to apply the logic Ed outlined.
> They could also be used to apply things like constraint validation, ...

I agree that we shouldn't have any magic going on.

That is why I would like an option on CDOTransaction, or some other mechanism, to allow the user to work in that mode.

That way... it is not magic anymore. This is a pattern that always comes back:
you want to play with objects in a resource... and save them. This concept exists in some systems... and it is not magic! :-)

PreCommitEvents could be really nice.

But then I will not be able to add objects to the resource at the moment references are set.

Maybe if we could also add a listener for tracking modifications (each time something happens to any object in a resource, the listener gets notified)... it would be perfect!!!

In that case I would also like to know when we commit, to do some validation.

What do you think?
Comment 37 Eike Stepper CLA 2007-09-12 12:41:05 EDT
(In reply to comment #36)
> PreCommitEvents could be really nice.

Please open a feature request for it if you decide it can be helpful for you.

> But, I will not be able to add objects in the resource when we set things.

I didn't think you meant to intervene at the time objects are modified. I don't see how an option on CDOTransaction could help here. For this purpose you can use "ordinary" EMF Adapters, can't you?
Comment 38 Ed Merks CLA 2007-09-12 12:41:30 EDT
It seems to me that an object referenced by two different resources is going to be arbitrarily assigned to one of them, depending on which is saved first, and that doesn't seem good.  It also seems you should use a containment reference if you want something in the resource by virtue of it having been referenced by something already in a resource...
Comment 39 Simon Mc Duff CLA 2007-09-12 13:03:00 EDT
(In reply to comment #38)
> It seems to me that an object referenced by two different resources is going to
> be arbitrarily assigned to one of them, depending on which is saved first, and
> that doesn't seem good.  It also seems you should use a containment reference
> if you want something in the resource by virtue of it having been referenced by
> something already in a resource...

You could assign it to the first one that links to it. In that case you need to be aware of when an object is set on another object.

Usually (I cannot speak for outside my business), each department has its own resource and its own model. When we link objects between two different resources, the objects are 99.5% of the time already persistent. So this is perfect, no problem here.

So in the processing where I deal with a single resource, I don't want to bother anymore with saving my objects. I just want to play with my objects and do my business.

The thing I don't want to deal with is persistence (in this case the Resource).
When we play with a transient object (e.g. transientA)... in most cases (I'm sure you can confirm) if I link it to another object (e.g. objectB) it should go into the same resource. Otherwise I should attach it to another resource explicitly.

objectB -> transientA

It makes more sense to persist the object (transientA) than to throw an exception.

Comment 40 Simon Mc Duff CLA 2007-09-12 13:07:30 EDT
(In reply to comment #37)
> (In reply to comment #36)
> > PreCommitEvents could be really nice.
> Please open a feature request for it if you decide it can be helpful for you.
> > But, I will not be able to add objects in the resource when we set things.
> I didn't think you meant to intervene at the time objects are modified. I don't
> see how an option on CDOTransaction could help here. For this purpose you can
> use "ordinary" EMF Adapters, can't you?

The option on CDOTransaction would do whatever it takes to enable this feature (maybe by using EMF Adapters... I don't care).

Maybe I just need to add my adapter to the resource... but it would be good if CDO gave me this feature.

I think it is feasible.

Otherwise I will need to add an adapter to each object... and for that I would need to extend CDOResourceImpl.

Comment 41 Ed Merks CLA 2007-09-12 13:10:55 EDT
There are these options for how dangling references should be handled:

  /**
   * This can be one of "THROW", "DISCARD", "RECORD", where "THROW" is the default.
   */
  String OPTION_PROCESS_DANGLING_HREF          = "PROCESS_DANGLING_HREF";
  String OPTION_PROCESS_DANGLING_HREF_THROW    = "THROW";
  String OPTION_PROCESS_DANGLING_HREF_DISCARD  = "DISCARD";
  String OPTION_PROCESS_DANGLING_HREF_RECORD   = "RECORD";

So it's possible to just discard them as transient.
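
For reference, these are save options; a typical way to pass them when saving an XML/XMI resource looks like this (the file name is just an example):

  import java.util.HashMap;
  import java.util.Map;

  import org.eclipse.emf.common.util.URI;
  import org.eclipse.emf.ecore.resource.Resource;
  import org.eclipse.emf.ecore.resource.impl.ResourceSetImpl;
  import org.eclipse.emf.ecore.xmi.XMLResource;
  import org.eclipse.emf.ecore.xmi.impl.XMIResourceFactoryImpl;

  public class DanglingHrefOptionExample
  {
    public static void main(String[] args) throws Exception
    {
      ResourceSetImpl resourceSet = new ResourceSetImpl();
      resourceSet.getResourceFactoryRegistry().getExtensionToFactoryMap().put("xmi", new XMIResourceFactoryImpl());

      Resource resource = resourceSet.createResource(URI.createFileURI("example.xmi"));

      Map<Object, Object> options = new HashMap<Object, Object>();
      // Discard dangling cross references on save instead of throwing.
      options.put(XMLResource.OPTION_PROCESS_DANGLING_HREF, XMLResource.OPTION_PROCESS_DANGLING_HREF_DISCARD);

      resource.save(options);
    }
  }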
Comment 42 Simon Mc Duff CLA 2007-09-12 13:57:38 EDT
(In reply to comment #41)
> There are these options for how dangling references should be handled:
>   /**
>    * This can be one of "THROW", "DISCARD", "RECORD", where "THROW" is the
> default.
>    */
>   String OPTION_PROCESS_DANGLING_HREF          = "PROCESS_DANGLING_HREF";
>   String OPTION_PROCESS_DANGLING_HREF_THROW    = "THROW";
>   String OPTION_PROCESS_DANGLING_HREF_DISCARD  = "DISCARD";
>   String OPTION_PROCESS_DANGLING_HREF_RECORD   = "RECORD";
> So it's possible to just discard them as transient.

I don't want to discard them. I want to persist them.

Do they have an option to persist the dangling reference in the same resource as the parent? Probably not, otherwise you would have mentioned it to me :-)
Comment 43 Eike Stepper CLA 2007-09-12 13:59:36 EDT
(In reply to comment #39)
> You could assign it to the first one link them. In this case you need to be
> aware of when an object is set to another objects.

It's quite usual (never heard of different behavior) that the client who adds a
cross reference actively attaches the referenced object to a resource of his
choice. Doing this automatically at the time the reference is set would break
this expectation.


> Usually, (I cannot speak for outside my business) but each department have
> their resource and their model. When we link objects between two differents
> resources.. objects are 99.5 already persistent. So this is perfect no problem
> here.

I find this quite unusual ;-)
Want to say I never saw this behavior, although I don't want to judge whether
it's good or not in your special environment.

> [...]
> Things I don't want to play with .. is persistence (In this case Resource).

I think it's ok to ignore persistence as long as you stay transient. When it
comes to persisting, EMF requires you to think about it. I think this is ok.

> When we play with transient objects  (e.g. : transientA)... in most cases (I'm
> sure you could confirm )  if I link it with another object (e.g.: ObjectB) it
> will go in the same resource. Otherwise I should attach it in another resource.
> 
> objectB -> transientA
> 
> It makes more sense to persist the object (transientA) instead of throwing an
> exception.

As I said I don't agree here and I can't find any new arguments that support
your thesis.

Especially I'm not happy with developing something that can be achieved with
ordinary/unmodified EMF as well. You don't need to override CDOResourceImpl to
have an EMF adapter attached to all contents. You can simply use an
EContentAdapter for this purpose.
Comment 44 Simon Mc Duff CLA 2007-09-12 17:51:14 EDT
(In reply to comment #43)
> (In reply to comment #39)
> > You could assign it to the first one link them. In this case you need to be
> > aware of when an object is set to another objects.
> It's quite usual (never heard of different behavior) that the client who adds a
> cross reference actively attaches the referenced object to a resource of his
> choice. Doing this automatically at the time the reference is set would break
> this expectation.

I wasn't clear enough, because you didn't understand me correctly.
What I was trying to say is that usually, when we create a cross-reference, the objects are already persistent:
E.g.:
objectA belongs to ResourceA
objectB belongs to ResourceB

If we add a link between objectA and objectB... they stay in their resources and everybody is happy.

Usually a service that creates objects in one resource shouldn't have to bother saving every object, because they all belong to that one resource. Right? For that service this is really useful.

A service that deals with 2 resources will have to manage its resources anyway. So it can choose either rule (persist every object explicitly, or, when you link an object with a transient one, persist that object in the same resource as the other object).
It will take whichever rule it prefers.

Finally, this feature helps a lot for complex graphs... and we have very complex graphs here, with terabytes of data. In our EMF-like object model created in 2000... it is one of the first things we implemented.

I don't say it needs to act that way... but we should give people the opportunity to reuse this pattern... like Hibernate, Objectivity... any respectable database has it.


> > Usually, (I cannot speak for outside my business) but each department have
> > their resource and their model. When we link objects between two differents
> > resources.. objects are 99.5 already persistent. So this is perfect no problem
> > here.
> I find this quite unusual ;-)
> Want to say I never saw this behavior, although I don't want to judge whether
> it's good or not in your special environment.
> > [...]
> > Things I don't want to play with .. is persistence (In this case Resource).
> I think it's ok to ignore persistence as long as you stay transient. When it
> comes to persisting, EMF requires you to think about it. I think this is ok.
> > When we play with transient objects  (e.g. : transientA)... in most cases (I'm
> > sure you could confirm )  if I link it with another object (e.g.: ObjectB) it
> > will go in the same resource. Otherwise I should attach it in another resource.
> > 
> > objectB -> transientA
> > 
> > It makes more sense to persist the object (transientA) instead of throwing an
> > exception.
> As I said I don't agree here and I can't find any new arguments that support
> your thesis.

Does this feature take anything away from people who would like to work the old way? One solution doesn't restrict the other. Same here... I don't understand your point of view.

> Especially I'm not happy with developing something that can be achieved with
> ordinary/unmodified EMF as well. You don't need to override CDOResourceImpl to
> have an EMF adapter attached to all contents. You can simply use an
> EContentAdapter for this purpose.

Adding an EMF adapter on the resource is fine. It works for objects that are attached to or detached from the Resource.

But I need to listen to every object that the resource contains, and I would prefer not to add an adapter to every object.

I tried to attach it to the resource. It doesn't seem to work:
EContentAdapter adapterA =...;
Resource resource = cdoView.getResourcePath("/test1");
resource.eAdapters().add( adapterA );
EObject object1 = cdoView.getObject(<ID>);
object1.setSupplier( supplier );

adapterA doesn't receive any messages. :-( I would like to receive messages for objects that I modify in that resource.
Is it supposed to work?
Is it a problem in CDO?

Do you have another suggestion for that problem?

Comment 45 Simon Mc Duff CLA 2007-09-12 18:20:51 EDT
Created attachment 78236 [details]
FIX toDO

(In reply to comment #30)
> Committed patch #5 to CVS.
> Some comments/questions:
> - Can CDOAbstractFeatureRuleAnalyzer.didFetch be derived from fetchCount?

No, it cannot, since didFetch is for a specific call (it is always reset to false).
fetchCount is the total number of fetches for the life of the analyzer.

> - See TODO in CDOClusterOfFetchRule.hashCode()
> - See TODO in CDOFetchFeatureInfo.hashCode()

Fixed
Comment 46 Simon Mc Duff CLA 2007-09-12 20:53:50 EDT
Created attachment 78242 [details]
TestCase

I updated the code tonight... and it seems that the merge didn't work correctly, because in my patch these were there.


This was missing; without initializing them correctly it causes NullPointerExceptions all over the place:

class LoadRevisionIndication
{
  protected CDOID contextID = CDOIDNull.NULL;
  protected int loadRevisionCollectionChunkSize = 1;

These 2 variables weren't initialized.

Also I made my FeatureAnalyzer test pass... this is good!!

I will work on the others.

If you can check-in that code it will be fantastic

Thank you!
Comment 47 Simon Mc Duff CLA 2007-09-12 21:06:19 EDT
Created attachment 78243 [details]
Session Comments

Session Comments
/**
   * TODO I can't see how recursion is controlled/limited
   * TODO Should it collect revisions for isMany relationship as well ? 
   *      In this case we will have to serialize CDOViewImpl.loadRevisionCollectionChunkSize 
   *      or CDOFetchRuleManager.getLoadRevisionCollectionChunkSize ?
   */

public void collectContainedRevisions

About the recursion limit:
I think it cannot do an infinite loop... so I believe it will terminate somewhere. It cannot continue indefinitely... can it?
Comment 48 Eike Stepper CLA 2007-09-13 04:00:27 EDT
(In reply to comment #44)
> [...]
> I tried to attach it on the resource. Doesn't seems to works.
> EContentAdapter adapterA =...;
> Resource resource = cdoView.getResourcePath("/test1");
> resource.eAdapters().add( adapterA );
> EObject object1 = cdoView.getObject(<ID>);
> object1.setSupplier( supplier );
> 
> adapterA doesn't received any messages. :-( I would like to receive message for
> objects that I modify in that resource. 
> Is is suppose to work ?

Yes ;-)

> Is it a problem in CDO ?

It shouldn't. Since the EMF UI (editor, outline, properties, ...) seem to work properly I didn't suspect issues in this area.

> Do you have another suggestion for that problem ?

No, if it doesn't work (and it's not the fault of your client code), please file a bug so that I don't forget about it.
Comment 49 Eike Stepper CLA 2007-09-13 04:08:36 EDT
 (In reply to comment #46)
> These 2 variables wasn`t initialize
Sorry about that, I should be more careful when I change your code!

> If you can check-in that code it will be fantastic
Committed to CVS.
Comment 50 Eike Stepper CLA 2007-09-13 04:12:46 EDT
 (In reply to comment #47)
> About the controlled limitation.
> I think it cannot do an infinite loop.. so I believe it will terminate
> somewhere. It cannot continue indefinitely.. isn`t ?

No, I don't think it can recurse indefinitely.
But it's not deterministic in some way.
It only depends on the actual model, which seems to be loaded
down to its very ends along the to-one containment references.
Since the whole feature is optional and off by default, it's not
that important, but I have the feeling that people who use it
will ask sooner or later how this feature scales with very deeply
nested models...
Comment 51 Simon Mc Duff CLA 2007-09-13 07:18:06 EDT
(In reply to comment #50)
>  (In reply to comment #47)
> > About the controlled limitation.
> > I think it cannot do an infinite loop.. so I believe it will terminate
> > somewhere. It cannot continue indefinitely.. isn`t ?
> No, I don't think it can recurse indefinitely.
> But it's not deterministic in sime way.
> It only depends on the actual model which seems to be loaded 
> down to its very ends along the to-one containment references.
> Since the whole feature is ooptional and off by default, it's not
> that important but I have the feeling that people who use it
> will ask sooner or later how this feature scales with very deeply 
> nested models...
The collect-by-containment behavior is always on... should it be something we can turn on and off?

Yes, it could be.

But rather than adding a depth option... maybe it would be nice to add annotations on the EMF model to support these things? We could even use the same annotations as Hibernate!! :-)

What do you think?



The CollectionByRule feature is something that is on and off.
Comment 52 Ed Merks CLA 2007-09-13 07:18:38 EDT
With regard to adding an adapter: adding an adapter to the resource will only notify you of things that directly change the resource itself.  Each EObject is a separate notifier that's observed separately.  You can use an EContentAdapter to automatically add an adapter to all objects in the containment tree; be sure to call super in the notifyChanged method, since handling of notifications is how the adapter maintains itself on the children of the containment tree as they are added.
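
A minimal sketch of such an adapter (the class name and the logging are illustrative only):

  import org.eclipse.emf.common.notify.Notification;
  import org.eclipse.emf.ecore.util.EContentAdapter;

  // EContentAdapter attaches itself to every object in the containment tree;
  // calling super.notifyChanged() is what keeps it attached as children are
  // added and removed.
  public class ChangeLoggingAdapter extends EContentAdapter
  {
    @Override
    public void notifyChanged(Notification notification)
    {
      super.notifyChanged(notification); // keeps the adapter on new children

      if (!notification.isTouch())
      {
        System.out.println("Changed: " + notification.getNotifier());
      }
    }
  }

It would be attached once, e.g. resource.eAdapters().add(new ChangeLoggingAdapter());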
Comment 53 Simon Mc Duff CLA 2007-09-13 07:28:57 EDT
(In reply to comment #52)
> With regard to adding an adapter, adding an adapter to the resource will only
> notify you object things that directly change the resource.  Each EObject is a
> separate notifier that's observed separately.  You can use an EContentAdapter
> to automatically add an adapter to all objects in the containment tree; be sure
> to call super in the notifyChanged method, since handling of notifications is
> how the adapter maintains itself on the children of the containment tree as
> they are added.

In my case I have millions of objects... and I don't necessarily have them all on the client, right?

If I do that... will it fetch all my objects from the database and bring them into the client space?

Comment 54 Ed Merks CLA 2007-09-13 07:40:04 EDT
I suppose it might well do that, yes.  Perhaps setTarget(EObject) could be specialized in some way to deal with that...
Comment 55 Eike Stepper CLA 2007-09-13 08:06:23 EDT
(In reply to comment #53)
> If I do that... it will fetch all my objects from the database and bring them
> to the client space ? 

I don't believe so.
Ed, is the adapter that is attached for the various generated/reflective UIs an EContentAdapter? I guess it is and it's implemented singleton-like.

The above mentioned adapter doesn't lead to loading whole resources in CDO and you don't necessarily need an adapter instance per notifier.
Comment 56 Ed Merks CLA 2007-09-13 08:09:51 EDT
No, in the UI we use a factory to create an adapter for an object as it's viewed, so if you haven't viewed it, it won't be adapted.  An EContentAdapter automatically traverses the whole containment tree to attach itself to all existing nodes.
Comment 57 Eike Stepper CLA 2007-09-13 08:13:28 EDT
Ok, that's not ideal for Simon's purpose but I guess there are other alternatives to ignore proxies on traversal...
Comment 58 Ed Merks CLA 2007-09-13 09:17:21 EDT
That's a very good point, which reminds me that EContentAdapter has a resolve method you can override to return false.  In that case it will attach to the proxies themselves and then update itself when the proxies are resolved.  This sounds like it should exactly meet the requirements of attaching only to in-memory instances.
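
A sketch of that override (the class name is illustrative):

  import org.eclipse.emf.ecore.util.EContentAdapter;

  // Overriding resolve() to return false makes the adapter attach to the
  // unresolved proxies themselves instead of resolving (and thus loading) them;
  // the adapter then updates itself once a proxy is actually resolved.
  public class NonResolvingContentAdapter extends EContentAdapter
  {
    @Override
    protected boolean resolve()
    {
      return false;
    }
  }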
Comment 59 Simon Mc Duff CLA 2007-09-13 09:41:33 EDT
(In reply to comment #58)
> That's a very good point, which reminds me that EContentAdapter has a resolve
> method you can override to return false.  In that case it will attach to the
> proxies themselves and then update itself when the proxies are resolved.  This
> sounds like it should exactly meet the requirements of attaching only to
> in-memory instances.

Will the new objects that come afterwards have an adapter as well?


So the following code should really work ?

EContentAdapter adapterA =...;

Resource resource = cdoView.getResourcePath("/test1");

resource.eAdapters().add( adapterA );

EObject object1 = cdoView.getObject(<ID>);

// My adapter should receive a message.
object1.setSupplier( supplier );

I will create a test case to reproduce it and open a new bug in CDO.




Comment 60 Simon Mc Duff CLA 2007-09-13 21:00:49 EDT
Created attachment 78395 [details]
TestCase - EContentAdapter

My test case shows that when we call 

resource.eAdapters().add( contentAdapter );

It will load the objects contained by the resource.

But I've been notified!!

The expected behavior would be not to load any objects.

After calling 

Supplier supplier = (Supplier)transaction.getObject(supplierID);

the adapter should be added at this point automatically.

So I imagine that the only way to fix that problem using an adapter would be to override some method... 
Even if I put an adapter on each of them.... I find that I will have some overhead with this approach.. knowing I play with millions of objects... :-(
Comment 61 Eike Stepper CLA 2007-09-14 05:00:00 EDT
(In reply to comment #60)

I've committed your test case and I can see the point.
This is certainly not what one would expect.

There's a line (427) in EContentsEList:

		if (isIncluded(feature) && (!useIsSet() || eObject.eIsSet(feature)))

where eIsSet() leads to a LoadRevisionRequest in the end.
I'm currently not sure what to do here.
I'm a bit surprised that this code is called although you have overridden resolve() in your adapter.
Can you please file a separate bug so that I don't forget about it?
Comment 62 Ed Merks CLA 2007-09-14 07:08:19 EDT
There's not an actual problem with EMF though, right?  I would expect that the eIsSet guard is used, but of course if the list isn't empty, then we will traverse the list.  But it should traverse the basic list, which should avoid resolving the containment proxies.

  public List<E> basicList()
  {
    return
      new EContentsEList<E>(eObject, eStructuralFeatures)
      {
        @Override
        protected boolean resolve()
        {
          return false;
        }
      };
  }
Comment 63 Eike Stepper CLA 2007-09-14 07:49:49 EDT
Ed, I'm not completely sure but my guess is that it relates to CDO proxies, which are completely different from EMF proxies. Simon has filed bug# 203426 to track this issue. If you're interested in it, I suggest that you CC yourself. From my point of view you can remove your CC from this bug because it's unlikely that the further discussion here is of interest to you. Of course you're nonetheless invited to stay here ;-)
Comment 64 Marius Slavescu CLA 2007-09-14 11:39:55 EDT
(In reply to comment #15 and comment #53)
Simon, 

> In my case I have millions of objects.. and I do not have them on the client
> necessarily, right?
> 

Are these objects stored in different resources? if yes what is the ration of objects per resource?
Do you have multiple root objects in your resources, or do you have one root object that holds a large list of children (as you mentioned that the depth is 4, which is pretty shallow)?

> Fetch 120 000 objects :
> ...
> With DynamicRule
> 141 request to the server
> 11890 objects/sec (I went over  12000 objects/sec ... but after it went down a
> little bit)

12000 objects/sec is a pretty good number. Can you please provide some information about the size of the objects (resources, equivalent XML/XMI size compressed/uncompressed with ZIP) and whether their structure/hierarchy actually allows easy optimization of the load operation?


Comment 65 Simon Mc Duff CLA 2007-09-14 12:29:44 EDT
(In reply to comment #64)
> (In reply to comment #15 and comment #53)
> Simon, 
> > In my case I have millions of objects.. and I do not have them on the client
> > necessarily, right?
> > 
> Are these objects stored in different resources?
Yes, the same resource. 
>if yes what is the ration of objects per resource?

I don't understand the question... you mean ratio?
It really depends on the project... we have resources with a few thousand objects... and others with millions...


> Do you have multiple root objects in your resources, or do you have one root object
> that holds a large list of children (as you mentioned that the depth is 4,
> which is pretty shallow)?

Yes, every accessible object (except through containment relationships) is at the root.

> > Fetch 120 000 objects :
> > ...
> > With DynamicRule
> > 141 request to the server
> > 11890 objects/sec (I went over  12000 objects/sec ... but after it went down a
> > little bit)
> 12000 objects/sec is a pretty good number. Can you please provide some
> information about the size of the objects (resources, equivalent XML/XMI size
> compressed/uncompressed with ZIP) and whether their structure/hierarchy actually
> allows easy optimization of the load operation?

The model used is Model1.. so it is PurchaseOrder, Company, Supplier, SalesOrder... etc.
The benchmarks are done using CDO with a MemoryStore as my persistence layer. In my case, I wanted to be independent of the persistence layer. That being said, I've obtained the same speed using a database as my store... In that case the objects need to be in the CDO cache on the server side... otherwise it drops a little bit, to 9000 objects/sec. But you still need to go through the transport layer...

I will investigate the size of my objects... I will also try to provide a test case in CDO... so we can know when performance goes down!!!!

Let me know if you want to have more details !!

Simon
Comment 66 Eike Stepper CLA 2007-10-16 05:25:03 EDT
Fixed in I200710101638.
Comment 67 Nick Boldt CLA 2008-01-28 16:44:32 EST
Move to verified as per bug 206558.
Comment 68 Eike Stepper CLA 2008-06-10 02:30:35 EDT
Reversioned due to graduation
Comment 69 Eike Stepper CLA 2008-08-15 03:37:08 EDT
Hi Simon, I just noticed that in CDOStore the only method in which we use the view.getFeatureAnalyzer() is the get() method. Shouldn't we use it in all methods that also use getRevisionForReading() or getRevisionForWriting()?

Also not sure if this was the appropriate Bugzilla...
Comment 70 Simon Mc Duff CLA 2008-08-15 20:20:21 EDT
(In reply to comment #69)
> Hi Simon, I just noticed that in CDOStore the only method in which we use the
> view.getFeatureAnalyzer() is the get() method. Shouldn't we use it in all
> methods that also use getRevisionForReading() or getRevisionForWriting()?
> Also not sure if this was the appropriate Bugzilla...

In getRevisionForReading, it would not be appropriate since we need to know which relation it goes through.

view.getFeatureAnalyzer().postTraverseFeature(cdoObject, cdoFeature, index, value);

My implementation of smart read-ahead was only working for getting the data. 
Maybe in my next version I should be able to optimize it for writes as well. :-)

You know that with the changes I'm doing with the CommitManager.. we could send deltas sooner to the server. So once we commit, some data will already have been sent. :-) Anyway, this is another topic! :-)
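
For illustration, a rough sketch of the kind of bracketing Eike and Simon discuss above for a read accessor such as isSet(); only the postTraverseFeature() call quoted above is taken from the actual code, and the other names (getCDOObject(), preTraverseFeature(), revision.data().isSet()) are assumptions, not the real CDOStore implementation:

// Fragment of a hypothetical store class; types and helpers are assumptions.
public boolean isSet(InternalEObject eObject, EStructuralFeature feature)
{
  CDOObject cdoObject = getCDOObject(eObject);                           // hypothetical helper
  view.getFeatureAnalyzer().preTraverseFeature(cdoObject, feature, -1);  // assumed pre-read callback

  InternalCDORevision revision = getRevisionForReading(cdoObject);       // same lookup the other read accessors use
  boolean result = revision.data().isSet(feature);                       // hypothetical revision API

  view.getFeatureAnalyzer().postTraverseFeature(cdoObject, feature, -1, result);
  return result;
}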
Comment 71 Eike Stepper CLA 2008-08-16 02:30:51 EDT
(In reply to comment #70)
> In getRevisionForReading, will not be approrpiate since we need to know which
> relation it goes through.
> 
> view.getFeatureAnalyzer().postTraverseFeature(cdoObject, cdoFeature, index,
> value);

Well, although at first I was also thinking about moving the traverse calls into getRevisionForReading, my main point is not moving the call but simply adding it to the other CDOStore methods that *read* somehow, for example isSet().

Is that a bad idea?


> My implementation of smart read ahead was only working to get the data. 
> Maybe, in my next version I should be able to optimize it for write as well.
> :-)
> 
> You know that with the changes I'm doing with the CommitManager.. we could send
> deltas sooner to the server. So once we commit, some data will be already sent.
> :-)Anyway this is another topic! :-)

Yeah, that sounds like a great optimization, too ;-) 

Comment 72 Eike Stepper CLA 2008-08-17 02:55:48 EDT
Reopening for optimization.
Comment 73 Eike Stepper CLA 2009-11-01 05:58:42 EST
Rebasing all unresolved enhancement requests to 3.0
Comment 74 Eike Stepper CLA 2010-06-29 04:50:13 EDT
Rebasing all outstanding enhancements requests to version 4.0
Comment 75 Eike Stepper CLA 2011-06-23 03:57:15 EDT
Moving all open enhancement requests to 4.1
Comment 76 Eike Stepper CLA 2012-08-14 22:55:14 EDT
Moving all open issues to 4.2. Open bugs can be ported to 4.1 maintenance after they've been fixed in master.
Comment 77 Eike Stepper CLA 2012-12-30 10:51:40 EST
No activity on this bug anymore.