Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.
Bug 352002 - Make logic at getMatchedEObject(EObject) replaceable
Summary: Make logic at getMatchedEObject(EObject) replaceable
Status: CLOSED FIXED
Alias: None
Product: EMFCompare
Classification: Modeling
Component: Core (show other bugs)
Version: 1.3   Edit
Hardware: PC Windows 7
: P3 enhancement (vote)
Target Milestone: ---   Edit
Assignee: EMF Compare CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-07-13 12:48 EDT by Victor Roldan Betancort CLA
Modified: 2011-08-17 11:34 EDT (History)
3 users (show)

See Also:


Attachments
tentative patch (2.25 KB, patch)
2011-07-20 08:20 EDT, Laurent Goubet CLA
no flags Details | Diff
patch v1 (1.20 MB, patch)
2011-07-21 13:41 EDT, Victor Roldan Betancort CLA
no flags Details | Diff
patch v2 (1.20 MB, patch)
2011-07-22 07:49 EDT, Victor Roldan Betancort CLA
no flags Details | Diff
patch v3 (1.20 MB, patch)
2011-07-22 10:22 EDT, Victor Roldan Betancort CLA
no flags Details | Diff
patch v4 (46.62 KB, patch)
2011-07-22 12:23 EDT, Victor Roldan Betancort CLA
no flags Details | Diff
patch v5 (47.39 KB, patch)
2011-07-25 08:33 EDT, Laurent Goubet CLA
no flags Details | Diff
patch v6 (47.54 KB, patch)
2011-07-25 10:24 EDT, Victor Roldan Betancort CLA
no flags Details | Diff
patch v7 (48.73 KB, patch)
2011-07-25 11:25 EDT, Laurent Goubet CLA
laurent.goubet: iplog+
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Victor Roldan Betancort CLA 2011-07-13 12:48:23 EDT
Cross posting from newsgroup:

we've been studying to implement some client use-cases reusing EMF Compare to determine differences between 2 models stored in a CDO Server.

Among several other issues yet to study, I've face scalability issues with GenericDiffEngine.getMatchedEObject(EObject) and AbstractCheck.getMatchedEObject(EObject).

org.eclipse.emf.compare.diff.engine.GenericDiffEngine.getMatchedEObject(EObject)
org.eclipse.emf.compare.diff.engine.check.AbstractCheck.getMatchedEObject(EObject)
org.eclipse.emf.compare.diff.engine.check.AbstractCheck.isUnmatched(EObject)

Both method implementations (which are, by the way, duplicated) do use CrossReferencing to determine which EObjects have been matched during the Match stage. It seems to me that using CrossReferencing to do this may be convenient when objects are in memory, but definitely not when objects are stored remotely. And even when objects are in memory, using CrossRefercing seems like a bit of overhead. While doing some profiling, I found it was called dozens (if not hundred) of times per second.

Hacking a bit the code I found its way faster (and probably would have a negigible effect over memory footprint) to have a "match cache". Something like:

protected Map<EObject, EObject> matches;

and then getMatched would look like this:

protected final EObject getMatchedEObject(EObject from) {
    return matches.get(from);
}

Havent used it, but probably that would need some refinement to work in a Three Way Comparison scenario, I guess.

This seem to work very well to me. I decided to extend and contribute my own GenericMatchEngine and AttributesCheck/ReferencesCheck implementation, overriding those getMatchedEObject methods, but as you already see, those are declared as final.

Using JDT a bit reveals that this getMatchedEObject is used very much in each class (used in 11 methods at GenericDiffEngine), and it also seems to me very much important to leave it unextensible / replaceable.

I would like to ask if it would be feasible to have some intermediate interface, something like "MatchManager" which takes care of this.
The default implementation will use CrossReferencing, but users could contribute different MatchManager implementations. Don't care about the implementation specifics as long as this becomes pluggable / replaceable. Also, locate this concern in a single place, and you'll get rid of duplicate code
Comment 1 Victor Roldan Betancort CLA 2011-07-13 12:52:59 EDT
It should be sufficient to change final methods to non-final. This can be applied to maintenance release 1.1 (the one i'm interested in), as (I believe) it doesn't break binary compatibility.
Comment 2 Laurent Goubet CLA 2011-07-20 08:20:37 EDT
Created attachment 199985 [details]
tentative patch

Hi Victor,

A CrossReferencer is, in the end, a simple map. The only thing that should be costly with them is their initialization. When initializing a Cross Referencer, we browse the whole model in order to populate its map; but when accessing, it's simply a matter of accessing the cached map. The "significant" gain you see in performance is for each of the accesses to the getMatchedEObject method, or only at the first?

But indeed, as Cedric mentionned in the newsgroup, we may have a lot of useless key-value pairs in this cross referencer since we cross reference everything instead of focusing on the MatchElements and UnmatchedElements.

Do you still observe such a performance issue with the attached patch? If yes, we may indeed need to use a map instead of a cross referencer (for us, that would be the same really ... the cross referencer was just a convenient way of populating this cache :)).
Comment 3 Victor Roldan Betancort CLA 2011-07-20 09:06:32 EDT
Hi Laurent,

indeed you are right. The thing is, CDO and CrossReferencer don't go along very well. One of the benefits of CDO is its CDOObject lazy-load and granularity: CDOObjects can be GC'ed when not referenced anymore. A CrossReferencer, as a map, creates Strong references to CDOObjects, and makes eviction impossible. The 
"virtual-memory" effect CDO provides gets disrupted by a mandatory model load. When the model is bigger than the Java Heap, you get an OOME.

So we normally normally request EMF based-framework providers to think that models doesn't necessarily need to be in memory, and also that strong-referencing EObjects makes the load of huge models impossible. This is not always easy.

You may argue: in our case (EMF Compare), there is no other chance, the way it's implemented, model needs to be completely loaded in memory. Such argument is indeed right. But, and due to the great extensibility mechanisms EMF Compare provide (see MetamodelFilter, IMatchScopeProvider, ReferencesCheck.shouldBeIgnored()...) it may be the case that you dont need to consider the whole model during comparison. This is actually our case.

Another argument, that may not be very valid to you, but I'd like to share.
We (OpenCanarias) have CrossReferencer customized. The logic in EcoreUtil has been modified so custom CrossReferencer contributions can be injected. That way we can execute some very optimal CDO Queries that avoids loading the model. We plan to work a contribution on that matter to EMF runtime. Our problem is that this CrossReferencer is optimal when called "a few times per second", but not very optimal when executed "hundred times per second". getMatchedEObject is called with too high frecuency, and that became a bottleneck in our app. Yes, I accept that "thats our problem", but things would be way easier if that mechanism was extensible. Having a FINAL method makes things very difficult (the only way, indeed, is to modify EMF Compare code).

>But indeed, as Cedric mentionned in the newsgroup, we may have a lot of useless
>key-value pairs in this cross referencer since we cross reference everything
>instead of focusing on the MatchElements and UnmatchedElements.
>Do you still observe such a performance issue with the attached patch? If yes,
>we may indeed need to use a map instead of a cross referencer (for us, that
>would be the same really ... the cross referencer was just a convenient way of
>populating this cache :)).

I'll try the patch, but if it keeps executing with high frequency the CrossReferencer, I'm afraid we'll keep having the same problem. We are not targeting optimizing the current implementation (because it is fine the way it was initially designed for), but rather that this mechanism becomes extensible for people like us, that just cannot afford so much CrossReferencing. CrossReferencing is, indeed, a nice way to do it when EObjects are fully loaded in memory, but not in a CDO scenario :(

Thanks for your time and feedback!
Víctor.
Comment 4 Laurent Goubet CLA 2011-07-20 09:28:47 EDT
Victor,

As I mentionned, cross referencing is, in our case, simply a convenient way of populating a cache of our matched objects. I am absolutely not against using this cross referencer _only_ to populate an external map (Map<EObject, EObject>) and getting rid of it after that "initialization" phase.

The only thing to look after is that we don't have "only" matches in this Map : we also need to be able to determine wether an Object is considered "unmatched". Of course, that would also be possible with a "Set<EObject>" populated from the CrossReferencer.

That would make (indeed) a "MatchManager" holding a Map and a Set, initialized from the MatchModel, and that would contain these caches by exposing the "getMatchedEObject" and "isUnmatched" methods. If such a mechanism seem fine by you, I will try and implement something of the sort (or, if you need it in a timely manner, you could put together a patch so that we can apply it).

Such changes, however, won't find their way in the 1.1 stream (we only backport critical fixes to that stream). The target would be 1.2.1 at best. 1.2.1 and 1.1.1 have the same "lowest" compatibility though, and version 1.2 can be used against the same versions of Eclipse (3.4 and above) as does the 1.1 release. Is the "1.2" target a blocker for your?
Comment 5 Victor Roldan Betancort CLA 2011-07-20 09:44:16 EDT
> As I mentionned, cross referencing is, in our case, simply a convenient way of
> populating a cache of our matched objects. I am absolutely not against using
> this cross referencer _only_ to populate an external map (Map<EObject,
> EObject>) and getting rid of it after that "initialization" phase.

It is indeed the CrossReferencer what I'm interested not to be used. I idea is that users could customize that logic.


> The only thing to look after is that we don't have "only" matches in this Map :
> we also need to be able to determine wether an Object is considered
> "unmatched". Of course, that would also be possible with a "Set<EObject>"
> populated from the CrossReferencer.

Yeah, the idea of a MatchManager should easily solve that

> That would make (indeed) a "MatchManager" holding a Map and a Set, initialized
> from the MatchModel, and that would contain these caches by exposing the
> "getMatchedEObject" and "isUnmatched" methods. If such a mechanism seem fine by
> you, I will try and implement something of the sort (or, if you need it in a
> timely manner, you could put together a patch so that we can apply it).

That mechanism is just very much convenient for me :) I'll provide the patch!

> Such changes, however, won't find their way in the 1.1 stream (we only backport
> critical fixes to that stream). The target would be 1.2.1 at best. 1.2.1 and
> 1.1.1 have the same "lowest" compatibility though, and version 1.2 can be used
> against the same versions of Eclipse (3.4 and above) as does the 1.1 release.
> Is the "1.2" target a blocker for your?

I see. We'll, it depends. If EMF Compare explicitly depends on EMF 2.7.0, then its kind of a showstopper for us. However, we *maybe* could consider moving to EMF 2.7.0 and then stay at the Eclipse 3.6 (if that was possible), but that may be unlikely.

I'll have to analyze if we can move to EMF 2.7. I'll study it ASAP.

Right now, lets create a patch. We'll think later about this issue ;)
Comment 6 Laurent Goubet CLA 2011-07-20 09:52:41 EDT
> I see. We'll, it depends. If EMF Compare explicitly depends on EMF 2.7.0, then
> its kind of a showstopper for us. However, we *maybe* could consider moving to
> EMF 2.7.0 and then stay at the Eclipse 3.6 (if that was possible), but that may
> be unlikely.
> 
> I'll have to analyze if we can move to EMF 2.7. I'll study it ASAP.
> 
> Right now, lets create a patch. We'll think later about this issue ;)

EMF 2.7 is for Eclipse 3.7 only, and is not compatible with Eclipse 3.6, 3.5 or 3.4.

The thing is : EMF Compare does not depend explicitely on any version of EMF. We depend on EMF 2.4 and above, and our releases are tested against :

Eclipse 3.4 - EMF 2.4
Eclipse 3.5 - EMF 2.5
Eclipse 3.6 - EMF 2.6
Eclipse 3.7 - EMF 2.7

You can look at the compatibility chart on http://wiki.eclipse.org/EMF_Compare (and I see I haven't updated it for 1.2 ... I'll fix that asap :p.
Comment 7 Victor Roldan Betancort CLA 2011-07-20 09:54:36 EDT
Ah, that's great news, Laurent! Then it shouldn't be a problem to move to any EMF Compare 1.x as long as its compatible with EMF 2.6. Awesome!
Comment 8 Victor Roldan Betancort CLA 2011-07-21 11:51:48 EDT
Hi guys, 

I'm spending some time to provide this patch. One of the issues I'm experimenting is that, to be coherent, I should change the constructor of AbstractCheck to receive an instance of IMatchManager instead of a CrossReferencer. However, AbstractCheck is public, and that implies API breakage. Do you guys have any restriction on that, or should I provide a fix that does not break API? Creating another constructor does not break API, but seems a bit ugly to have 2 constructors that way...

Looking forward your feedback
Comment 9 Cedric Brun CLA 2011-07-21 11:56:39 EDT
We don't want to break API right now, I so please go the API compatible route. On the other hand we might release an EMF Compare 2.0 around the end of the year, we could get rid of this then, please add a deprecation annotation so that we think about it then.
Comment 10 Victor Roldan Betancort CLA 2011-07-21 11:58:20 EDT
(In reply to comment #9)
> We don't want to break API right now, I so please go the API compatible route.
> On the other hand we might release an EMF Compare 2.0 around the end of the
> year, we could get rid of this then, please add a deprecation annotation so
> that we think about it then.

Agreed!
Comment 11 Victor Roldan Betancort CLA 2011-07-21 13:19:05 EDT
Guys, the patch is almost ready. I'm trying to execute your test framework.

I see that Laurent's improvement using a customized CrossReferencer is still valid to enhance performance (probably a lot). Should I merge those changes into my patch, or shall we create another bug to keep track of that new logic?
Comment 12 Victor Roldan Betancort CLA 2011-07-21 13:35:24 EDT
Your test-suite seems to succeed!

I'll attach the patch right away
Comment 13 Victor Roldan Betancort CLA 2011-07-21 13:41:17 EDT
Created attachment 200112 [details]
patch v1

Introduced the new IMatchManager interface, and the default CrossReferencerMatchManager implementation.

Im not an git expert (just learned a bit today), but from what I see in the patch file, there seem to be included some changes that are unrelated with my modifications. I have no clue where they came from, I probably did something from with the common usage workflow for git. I created the changes, committed to my local repository, and then created a patch file through history view.

If the patch doesn't work for you, let me know.
Comment 14 Laurent Goubet CLA 2011-07-22 03:39:39 EDT
That's a lot of messages :p.

Don't worry about my improvement of the cross referencer, I'll add it for now in the "deprecated" implementation, but it should no longer be used anyway with yours.

As for the git patch : IIRC, it creates a patch for your commit, but it also takes into account the delta between your current cloned repository and the distant one. i.e : if you do not "fetch" the changes I pushed on the Eclipse repository, you will have these changes in your patch (but reverted)!

So, please update your local repository (fetch), and create a new patch for your commit.
Comment 15 Victor Roldan Betancort CLA 2011-07-22 05:58:03 EDT
(In reply to comment #14)
> That's a lot of messages :p.
> 
> Don't worry about my improvement of the cross referencer, I'll add it for now
> in the "deprecated" implementation, but it should no longer be used anyway with
> yours.

No, indeed, it should be includeed, IMO. What I did is just refactoring and formalizing an implementation-agnostic interface called IMatchManager. A CrossReferencerMatchManager is still valid, and your optimization should be included, because it seems to be more efficient during initialization.

> As for the git patch : IIRC, it creates a patch for your commit, but it also
> takes into account the delta between your current cloned repository and the
> distant one. i.e : if you do not "fetch" the changes I pushed on the Eclipse
> repository, you will have these changes in your patch (but reverted)!
> 
> So, please update your local repository (fetch), and create a new patch for
> your commit.

I see, it's a bit confusing, this "distributed" model is new to me :P I'll provide a new patch ASAP. I'd like to include your patch in this, too.
Comment 16 Victor Roldan Betancort CLA 2011-07-22 06:01:08 EDT
According to GIT, my repo is up to date so... I assume the current patch is
valid, is that right?
Comment 17 Victor Roldan Betancort CLA 2011-07-22 07:49:52 EDT
Created attachment 200170 [details]
patch v2

I included Laurent's optimized MatchCrossReferencer.
Comment 18 Victor Roldan Betancort CLA 2011-07-22 10:22:15 EDT
Created attachment 200191 [details]
patch v3

While using the new interfaces in our product, I identified some oddities.

IDiffEngine was updated accordingly.
doDiffResourceSet name is now inconsistent, as scope is defined by IMatchManager, so I renamed method to doDiff.
Updated DiffService to use non deprecated methods.

BTW, I don't have the API tools settled up. I focused in not breaking anything, but I appreciate you guys double-check that.

Sorry for so much patching submission.
Comment 19 Victor Roldan Betancort CLA 2011-07-22 10:38:20 EDT
Due to amount of lines of code changed, looks like this may even need some kind of IP process.... :(
Comment 20 Laurent Goubet CLA 2011-07-22 10:55:20 EDT
Victor,

I just took a quick look at the third patch, but it still includes some changes that come from one of my commits... I told you to make a fetch; you will now need to rebase (sorry, in fact what you needed was a "pull" (which does a fetch + merge by default, you should configure it to do a fetch + rebase)).

rebase will properly undo your commits, rebase to the last commit on the repository, then replay your commits on top of that. Then you should have a patch that does not include xml files :).
Comment 21 Victor Roldan Betancort CLA 2011-07-22 10:58:10 EDT
Laurent, I did fetch, and even pull, but no changes came in :( I don't know what was wrong with that. I'll try again, but... couldn't you just apply the patch and ignore all those changes that do not belong to org.eclipse.emf.compare.diff? 

I'll try again anyway...
Comment 22 Laurent Goubet CLA 2011-07-22 11:02:08 EDT
(In reply to comment #21)
> Laurent, I did fetch, and even pull, but no changes came in :( I don't know
> what was wrong with that. I'll try again, but... couldn't you just apply the
> patch and ignore all those changes that do not belong to
> org.eclipse.emf.compare.diff? 
> 
> I'll try again anyway...

I could, and I will :). It's just that I won't apply the patch today (I am on something else right now) and I told you to do the rebase because of your comment on IP process : I opened the patch in order to see the number of changes ^^.

Anyways, you will need the rebase if you plan on contributing others patches as you would otherwise keep increasing the size of the patches if you are not on the latest commit when creating them due to how eGit works.
Comment 23 Victor Roldan Betancort CLA 2011-07-22 11:11:32 EDT
(In reply to comment #22)
> (In reply to comment #21)
> > Laurent, I did fetch, and even pull, but no changes came in :( I don't know
> > what was wrong with that. I'll try again, but... couldn't you just apply the
> > patch and ignore all those changes that do not belong to
> > org.eclipse.emf.compare.diff? 
> > 
> > I'll try again anyway...
> 
> I could, and I will :). It's just that I won't apply the patch today (I am on
> something else right now) and I told you to do the rebase because of your
> comment on IP process : I opened the patch in order to see the number of
> changes ^^.

Yeah, I didn't count your changes, of course!

> Anyways, you will need the rebase if you plan on contributing others patches as
> you would otherwise keep increasing the size of the patches if you are not on
> the latest commit when creating them due to how eGit works.

I know :( I just don't know how to fetch the changes. I do fetch, pull... no changes. Could there be something wrong with the refs configuration?

In fetch from upstream I only have this:
+refs/heads/*:refs/remotes/origin/*
Comment 24 Laurent Goubet CLA 2011-07-22 12:06:26 EDT
> I know :( I just don't know how to fetch the changes. I do fetch, pull... no
> changes. Could there be something wrong with the refs configuration?
> 
> In fetch from upstream I only have this:
> +refs/heads/*:refs/remotes/origin/*

If you have done a pull with the default eGit configuration, then it is most likely that it has done a "default" pull which is equivalent to "fetch" + "merge"... and which isn't what you should have done as you now have your changes + the merges in your commit :S.

Don't worry about this patch, I'll apply it and reject changes that were part of previous commits. You should however change your git configuration to be sure not to get in that state later :

- Go to the "git" perspective
- browse into your repository, then in "Working directory/.git" and open the "config" file

you should have something like this in this file :
	[branch "master"]
		remote = origin
		merge = refs/heads/master

change it to
	[branch "master"]
		remote = origin
		merge = refs/heads/master
		rebase = true

If you plan to work on other branches, you might as well do the same for them (1.2 comes to mind) :
	[branch "1.2"]
		remote = origin
		merge = refs/heads/1.2
		rebase = true

Once that is done, right-click on the repository and hit "Show in History" (beware : the following will reset all your local changes to EMF Compare and revert your local "master" branch to what it is on the Eclipse repo... make sure you have saved everything, or that the patch you attached here contains everything).

- In the history view, locate the last commit before your changes (it should have a gray "origin/master" tag)
- Right click that commit and use "Reset -> Hard".

Your local "master" branch has now been reset to the state it has on the Eclipse repository (i.e : my last push). You should now be able to work, commit and create patches only containing your commit (remember to pull before creating the patch though, you should never reach the state you were in since you now "rebase" on pull ;)).
Comment 25 Victor Roldan Betancort CLA 2011-07-22 12:23:27 EDT
Created attachment 200215 [details]
patch v4

Ok, I think now I got it :P

Laurent, thank you very much. You explanation has been very clear. As you said, I most likely merged yesterday your changes during one of those "pulls" I did. Things make more sense to me now.

Attached a new patch now, doesn't seem to contain any of your changes now ;)
Comment 26 Laurent Goubet CLA 2011-07-25 08:32:38 EDT
Thanks for going through the trouble of re-creating this patch another time. I'm attaching here a modified version of it with the changes to :
1) content checkstyle (mainly "." (dot) additions at the end of javadoc comments)
2) prevent API breakages 
 - We can't remove the "crossReferencer" field from AbstractCheck yet
 - Same issue in GenericDiffEngine
 - We can't add (or remove) a method to (or from) the IDiffEngine interface. Do you really need the doDiff(MatchModel, boolean, IMatchManager) method added, or was that just to mimic the API we had that used a CrossReferencer? For now, and for the sake of API compatibility, I'd rather we still pass the cross referencer to the engine, which in turn will initialize the match manager. Will that still be enough for you (Of course, this is something that must change if we move to 2.0).
3) Use your copyright on the new classes (IMatchManager copyright Obeo? More like Open Canarias and others right? ^^)

Could you verify that this new version of the patch still do the work as you expected it to?

As for the copyright, do you provide this patch under copyright of Open Canarias, or yourself (i.e : "Copyright (c) 2011 Open Canarias and others" or "Copyright (c) 2011 Victor Roldan Betancort and others")? I have assumed the first, please alter it as needed and provide the new patch otherwise.

Other than that if you are not an Eclipse commiter, could you please confirm that :

1. you wrote 100% of the code;

2. you have the right to contribute the code to Eclipse, and

3. you have included the EPL license header in all source files?

(To do this, you can just add a comment to this bug stating, "I confirm
that..." and basically repeat these three points.)

Since this patch indeed modifies a good number of files, we'll indeed need to go through a CQ prior to committing it. Once you are happy with the patch, I'll open a CQ for it.

Sorry for the extra steps, and thanks!
Comment 27 Laurent Goubet CLA 2011-07-25 08:33:30 EDT
Created attachment 200275 [details]
patch v5
Comment 28 Victor Roldan Betancort CLA 2011-07-25 08:42:13 EDT
Hi Laurent,

some inlined comments below:
Do
> you really need the doDiff(MatchModel, boolean, IMatchManager) method added, or
> was that just to mimic the API we had that used a CrossReferencer? For now, and
> for the sake of API compatibility, I'd rather we still pass the cross
> referencer to the engine, which in turn will initialize the match manager. Will
> that still be enough for you (Of course, this is something that must change if
> we move to 2.0).

No, I'm afraid there must some way to pass IMatchManager instances, otherwise, I won't be able to pass my adapted version for CDO. So that method was pretty much necessary for me.


> 3) Use your copyright on the new classes (IMatchManager copyright Obeo? More
> like Open Canarias and others right? ^^)

Yeah, I just wanted to wait for your suggestion, instead of writting something you may not agree with. "Open Canarias and others" is fine :)

> Could you verify that this new version of the patch still do the work as you
> expected it to?

I'll let you know ASAP, but if your removed the new doDiff method, I already can tell you it won't work for me.

> Other than that if you are not an Eclipse commiter, could you please confirm
> that :

I'm committer of the CDO project.


> 1. you wrote 100% of the code;
> 
> 2. you have the right to contribute the code to Eclipse, and
> 
> 3. you have included the EPL license header in all source files?

I confirm that:

1. I wrote the 100% of the code
2. I have rights to contribute the code to Eclipse
3. All the code is licensed under EPL

> Since this patch indeed modifies a good number of files, we'll indeed need to
> go through a CQ prior to committing it. Once you are happy with the patch, I'll
> open a CQ for it.
> 
> Sorry for the extra steps, and thanks!

No worries, thank you for considering the contribution ;)
Comment 29 Victor Roldan Betancort CLA 2011-07-25 08:57:25 EDT
Laurent,

your patch looks good to me. The only issue I found is the missing doDiff method in IDiffEngine. There must be some way to inject customized IMatchManager instances, otherwise my work won't make much sense. I'm not worried *how* is injected (for instance, a mechanism like a Map of options is fine to me, too), as long as there is a way to inject it.

So, everything is fine but the missing injection of IMatchManager instances. There rest seems OK to me :)
Comment 30 Laurent Goubet CLA 2011-07-25 09:05:32 EDT
> > Do you really need the doDiff(MatchModel, boolean, IMatchManager) method added, or
> > was that just to mimic the API we had that used a CrossReferencer? For now, and
> > for the sake of API compatibility, I'd rather we still pass the cross
> > referencer to the engine, which in turn will initialize the match manager. Will
> > that still be enough for you (Of course, this is something that must change if
> > we move to 2.0).
> 
> No, I'm afraid there must some way to pass IMatchManager instances, otherwise,
> I won't be able to pass my adapted version for CDO. So that method was pretty
> much necessary for me.

I feared as much. I know we have some implementers of that interface out there, so modifying it is impossible without switching to a 2.0. I'll try and confirm our plans for Juno with the project lead and what we choose for now between the two possible options I see :
1) break the IDiffEngine interface and switch the version from 1.3.0 to 2.0
2) create an IDiffEngine2 interface with the two necessary new methods (doDiff and doDiffResourceSet, both accepting an IMatchManager
Comment 31 Victor Roldan Betancort CLA 2011-07-25 09:15:05 EDT
(In reply to comment #30)
> > > Do you really need the doDiff(MatchModel, boolean, IMatchManager) method added, or
> > > was that just to mimic the API we had that used a CrossReferencer? For now, and
> > > for the sake of API compatibility, I'd rather we still pass the cross
> > > referencer to the engine, which in turn will initialize the match manager. Will
> > > that still be enough for you (Of course, this is something that must change if
> > > we move to 2.0).
> > 
> > No, I'm afraid there must some way to pass IMatchManager instances, otherwise,
> > I won't be able to pass my adapted version for CDO. So that method was pretty
> > much necessary for me.
> 
> I feared as much. I know we have some implementers of that interface out there,
> so modifying it is impossible without switching to a 2.0. I'll try and confirm
> our plans for Juno with the project lead and what we choose for now between the
> two possible options I see :
> 1) break the IDiffEngine interface and switch the version from 1.3.0 to 2.0
> 2) create an IDiffEngine2 interface with the two necessary new methods (doDiff
> and doDiffResourceSet, both accepting an IMatchManager

I see. For some reason I though breaking extenders was not considered API breakage, but only clients.

solution 2) seems fine to me now. Later, you guys can consider increasing major version and then clean up those interfaces, if it proceeds :)
Comment 32 Victor Roldan Betancort CLA 2011-07-25 09:18:10 EDT
> > two possible options I see :
> > 1) break the IDiffEngine interface and switch the version from 1.3.0 to 2.0
> > 2) create an IDiffEngine2 interface with the two necessary new methods (doDiff
> > and doDiffResourceSet, both accepting an IMatchManager

Regarding those 2 new methods, I would like to ask (and only in regard of API coherence) why there should be a "doDiffResourceSet" method. I appears to me that scope is now defined by IMatchManager implementation, so you shouldn't need such method. What do you think?
Comment 33 Laurent Goubet CLA 2011-07-25 09:25:08 EDT
(In reply to comment #32)
> > > two possible options I see :
> > > 1) break the IDiffEngine interface and switch the version from 1.3.0 to 2.0
> > > 2) create an IDiffEngine2 interface with the two necessary new methods (doDiff
> > > and doDiffResourceSet, both accepting an IMatchManager
> 
> Regarding those 2 new methods, I would like to ask (and only in regard of API
> coherence) why there should be a "doDiffResourceSet" method. I appears to me
> that scope is now defined by IMatchManager implementation, so you shouldn't
> need such method. What do you think?

That's a different bug :p. Long story short, the logic is too duplicated for now for us to consider this change in the scope of an 1.x (yes, once again :D).

At first EMF Compare only considered a single resource when comparing, then we added the possibility to take the whole resource set into account... and later still we added the "scope" mechanism. I do aggree with you : there should only be one "doDiff" method, but that too is one of the "big" changes that need be done in the scope of the rearchitecturing we'll include in the overhaul that will be 2.0.
Comment 34 Victor Roldan Betancort CLA 2011-07-25 09:30:35 EDT
(In reply to comment #33)
> (In reply to comment #32)
> > > > two possible options I see :
> > > > 1) break the IDiffEngine interface and switch the version from 1.3.0 to 2.0
> > > > 2) create an IDiffEngine2 interface with the two necessary new methods (doDiff
> > > > and doDiffResourceSet, both accepting an IMatchManager
> > 
> > Regarding those 2 new methods, I would like to ask (and only in regard of API
> > coherence) why there should be a "doDiffResourceSet" method. I appears to me
> > that scope is now defined by IMatchManager implementation, so you shouldn't
> > need such method. What do you think?
> 
> That's a different bug :p. Long story short, the logic is too duplicated for
> now for us to consider this change in the scope of an 1.x (yes, once again :D).
> 
> At first EMF Compare only considered a single resource when comparing, then we
> added the possibility to take the whole resource set into account... and later
> still we added the "scope" mechanism. I do aggree with you : there should only
> be one "doDiff" method, but that too is one of the "big" changes that need be
> done in the scope of the rearchitecturing we'll include in the overhaul that
> will be 2.0.

Ah, I see. The as long as you guys are aware of that, is fine :)

Do you want me to provide a patch v6 with the IDiffEngine2 interface, or you can take care of that?
Comment 35 Laurent Goubet CLA 2011-07-25 09:38:49 EDT
> Do you want me to provide a patch v6 with the IDiffEngine2 interface, or you
> can take care of that?

Please do, that'll allow us to get the CQ going with that patch. We'll see later whether we keep a "*2" interface or break the API.
Comment 36 Victor Roldan Betancort CLA 2011-07-25 09:40:47 EDT
(In reply to comment #35)
> > Do you want me to provide a patch v6 with the IDiffEngine2 interface, or you
> > can take care of that?
> 
> Please do, that'll allow us to get the CQ going with that patch. We'll see
> later whether we keep a "*2" interface or break the API.

Great, I'll work on that right away :)
Comment 37 Victor Roldan Betancort CLA 2011-07-25 10:24:41 EDT
Created attachment 200284 [details]
patch v6

added IDiffEngine2
GenericDiffEngine implements IDiffEngine2

I finally installed the API Baseline using EMF Compare 1.2.

I didn't update method signatures in IDiffService class because it would break clients (not sure why... its a final class). So in my code I just casted IDiffEngine to IDiffEngine2 when calling DiffService.getBestDiffEngine(MatchModel)

If you think its ok, then we can go to CQ
Comment 38 Laurent Goubet CLA 2011-07-25 10:50:24 EDT
(In reply to comment #37)
> I didn't update method signatures in IDiffService class because it would break
> clients (not sure why... its a final class).

API Compatibility is at both level : binary compatibility (those that call our code don't have to re-build their code to update the version of EMF Compare they use) and source compatibility : clients of the API don't need to change their code to cope with changes in our.

Changing the signature of a public method means that client of our API have to react at source level : if they were assigning the result of the call to diffService to a variable, that variable type needs to change. That, or you changed the parameters : clients must ensure they pass the right new parameters. Even addind a 'throws' directive is an API break : clients must add the necessary try/catch.

Everything that might make client change their code in order to react to our changes means an API break. That is why public methods (and interfaces) aren't so widely used in EMF Compare : they are a pain to maintain in terms of evolution when we go for compatibility :).

> 
> If you think its ok, then we can go to CQ

It should be. I'm finishing my work on another bug, then I'll apply this new patch to make a last check, and I'll open the CQ.
Comment 39 Victor Roldan Betancort CLA 2011-07-25 10:59:12 EDT
(In reply to comment #38)
> (In reply to comment #37)
> > I didn't update method signatures in IDiffService class because it would break
> > clients (not sure why... its a final class).
> 
> API Compatibility is at both level : binary compatibility (those that call our
> code don't have to re-build their code to update the version of EMF Compare
> they use) and source compatibility : clients of the API don't need to change
> their code to cope with changes in our.
> 
> Changing the signature of a public method means that client of our API have to
> react at source level : if they were assigning the result of the call to
> diffService to a variable, that variable type needs to change. That, or you
> changed the parameters : clients must ensure they pass the right new
> parameters. Even addind a 'throws' directive is an API break : clients must add
> the necessary try/catch.

Great explanation! However, I don't quite see how a client would have to react if I specialize the return type of get IDiffService.getBestDiffEngine() from IDiffEngine to IDiffEngine2. The client is receiving a subtype of IDiffEngine, so there is no problem with the call. Also, since IDiffService is a final class, it can not be extended. I guess there must be something at binary compatibility level. I can live with it, anyway ;)


> 
> Everything that might make client change their code in order to react to our
> changes means an API break. That is why public methods (and interfaces) aren't
> so widely used in EMF Compare : they are a pain to maintain in terms of
> evolution when we go for compatibility :).
> 
> > 
> > If you think its ok, then we can go to CQ
> 
> It should be. I'm finishing my work on another bug, then I'll apply this new
> patch to make a last check, and I'll open the CQ.

Thanks again for your time! :D
Comment 40 Laurent Goubet CLA 2011-07-25 11:25:01 EDT
Created attachment 200288 [details]
patch v7

And a v7 ^^. I simply added the doDiffResourceSet(... IMatchManager) to the IDiffEngine2 interface, don't want to have to create and IDiffEngine3 later for that :p.

I'll launch the CQ with this.
Comment 41 Victor Roldan Betancort CLA 2011-07-25 11:27:49 EDT
(In reply to comment #40)
> Created attachment 200288 [details]
> patch v7
> 
> And a v7 ^^. I simply added the doDiffResourceSet(... IMatchManager) to the
> IDiffEngine2 interface, don't want to have to create and IDiffEngine3 later for
> that :p.

Ah, I see :P Thought about it, but wasn't completely sure if it was necessary!

> 
> I'll launch the CQ with this.

Great! If you need anything from my side to accelerate the process, just let me know :)
Comment 42 Cedric Brun CLA 2011-07-29 05:39:23 EDT
The CQ has been created :) Let's wait for the IP team clearance.
Comment 43 Cedric Brun CLA 2011-07-29 06:01:24 EDT
It's CQ #5419 by the way
Comment 44 Laurent Goubet CLA 2011-08-16 11:23:08 EDT
The CQ (https://dev.eclipse.org/ipzilla/show_bug.cgi?id=5419) has been approved. 

However, the patch somehow makes 4 of our new tests fail in NPE, and I could not pinpoint why. I'll have to investigate on that before this can be pushed to master.
Comment 45 Victor Roldan Betancort CLA 2011-08-17 05:11:39 EDT
Hi Laurent,

(In reply to comment #44)
> The CQ (https://dev.eclipse.org/ipzilla/show_bug.cgi?id=5419) has been
> approved. 
> 
> However, the patch somehow makes 4 of our new tests fail in NPE, and I could
> not pinpoint why. I'll have to investigate on that before this can be pushed to
> master.

Good the CQ got resolved! I'll update with master and see whats going on with the new test cases.
Comment 46 Laurent Goubet CLA 2011-08-17 10:09:39 EDT
Victor, I found the culprit (crossReferencer.clear() in the CrossReferencerMatchManager#reset()) ... which would indicate that we have a memory problem since this should only be called after everything has ended and the cross referencer can be safely cleared of all reference.

The patch highlights this potential issue, but is not in itself a problem (that "clear" call is something I added to one of your patch versions IIRC).

It has been pushed on master, I'll now try and determine why the cross referencer cannot be cleared...
Comment 47 Victor Roldan Betancort CLA 2011-08-17 10:32:17 EDT
> Victor, I found the culprit (crossReferencer.clear() in the
> CrossReferencerMatchManager#reset()) ... which would indicate that we have a
> memory problem since this should only be called after everything has ended and
> the cross referencer can be safely cleared of all reference.

May it be related with the CrossReferencer subclassing?

> The patch highlights this potential issue, but is not in itself a problem (that
> "clear" call is something I added to one of your patch versions IIRC).
> It has been pushed on master, I'll now try and determine why the cross
> referencer cannot be cleared...

Thanks, appreciate that!
Comment 48 Laurent Goubet CLA 2011-08-17 11:26:41 EDT
OK so my conclusions are :

1 - the cross referencer might be reused in order not to instantiate a new one (which is what triggered these NPE : clearing the cross referencing and reusing that empty map afterwards give no result :p).
2 - clearing the cross referencer must not be done
3 - the garbage collector knows better than us what should be cleared from memory, "reset"ing the IMatchManager is unnecessary

I have removed the "reset" method from IMatchManager, as well as its implementation and its call. (As this was something I added, I don't think this will impact you too much.)

This has been pushed on master. Closing this bug now :).
Comment 49 Victor Roldan Betancort CLA 2011-08-17 11:34:49 EDT
(In reply to comment #48)
> OK so my conclusions are :
> 
> 1 - the cross referencer might be reused in order not to instantiate a new one
> (which is what triggered these NPE : clearing the cross referencing and reusing
> that empty map afterwards give no result :p).
> 2 - clearing the cross referencer must not be done
> 3 - the garbage collector knows better than us what should be cleared from
> memory, "reset"ing the IMatchManager is unnecessary
> 
> I have removed the "reset" method from IMatchManager, as well as its
> implementation and its call. (As this was something I added, I don't think this
> will impact you too much.)
> 
> This has been pushed on master. Closing this bug now :).

Thank you guys for you feedback and support!