Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.
Bug 331010 - Track Model-Element contributors
Summary: Track Model-Element contributors
Status: RESOLVED FIXED
Alias: None
Product: e4
Classification: Eclipse Project
Component: UI (show other bugs)
Version: 1.0   Edit
Hardware: PC Mac OS X - Carbon (unsup.)
: P3 normal (vote)
Target Milestone: 4.1 M5   Edit
Assignee: Project Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on: 332395
Blocks: 306576 324957
  Show dependency tree
 
Reported: 2010-11-24 08:47 EST by Thomas Schindl CLA
Modified: 2011-01-10 11:05 EST (History)
1 user (show)

See Also:


Attachments
patch (62.71 KB, text/plain)
2010-11-24 08:50 EST, Thomas Schindl CLA
no flags Details
patch (63.46 KB, patch)
2010-11-24 09:33 EST, Thomas Schindl CLA
no flags Details | Diff
patch (77.59 KB, patch)
2010-11-26 15:15 EST, Thomas Schindl CLA
no flags Details | Diff
Patch v2 (74.04 KB, patch)
2011-01-07 13:34 EST, Oleg Besedin CLA
no flags Details | Diff
Patch v3 (74.92 KB, patch)
2011-01-10 10:48 EST, Oleg Besedin CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Thomas Schindl CLA 2010-11-24 08:47:21 EST
We need to track which bundle contributed an model element to e.g. look up translations in the right package
Comment 1 Thomas Schindl CLA 2010-11-24 08:50:01 EST
Created attachment 183750 [details]
patch
Comment 2 Thomas Schindl CLA 2010-11-24 09:33:48 EST
Created attachment 183760 [details]
patch

* Updated for HEAD
* fixed possible CCEs
* fixed MPart creation from MPartDescriptor to copy contributor
Comment 3 Oleg Besedin CLA 2010-11-24 15:52:35 EST
Nicely done! Some things that might not work well in a long run:

- Representation of contributors
The patch identifies contributor by a bundle's symbolic name. Unless bundles are singletons, a version is also needed to backtrack the contributor. (See, for instance, RegistryContributor class.)
Also, there are two places that use URI.segment(1) - expecting that all URIs we'll see are platform URIs. That is true today, but likely going to change, see, for instance, bug 307458.

What if instead we stored the original URI for the model loader and converted RegistryContributor to a URI that would keep the bundle version? 

- Duplication of contributor information
Everything that implements ApplicationElement now gets a contributor field. In many cases that field will be identical for all elements in the model; in most cases there will be only a handful of distinct values in it.

a) what if we store references to contributor [URIs] rather then duplicate URIs? I have a feeling that EMF will eat good chunk of savings in this case, but at least we'll avoid duplication of contributor info and make it easier to optimize in future.

b) we could store contributors only when they differ from container's contributors. Pro: max savings; Con: a bit more complicated code for model management, such as drag & drop.

- Utilities
It would be very useful to be have methods somewhere that would allow:
  + roundtrip contributorURI <-> bundle
  + get contributor for model elements that my bundle is adding  programmatically
  + find a list of elements {full | top level only } that are coming from a contributor

- Expandability
The trick will be to allow non-bundle contributors to be used in future. As long as it does not require too much extra effort today.
Comment 4 Thomas Schindl CLA 2010-11-26 13:59:24 EST
(In reply to comment #3)
> Nicely done! Some things that might not work well in a long run:
> 
> - Representation of contributors
> The patch identifies contributor by a bundle's symbolic name. Unless bundles
> are singletons, a version is also needed to backtrack the contributor. (See,
> for instance, RegistryContributor class.)
> Also, there are two places that use URI.segment(1) - expecting that all URIs
> we'll see are platform URIs. That is true today, but likely going to change,
> see, for instance, bug 307458.
> 
> What if instead we stored the original URI for the model loader and converted
> RegistryContributor to a URI that would keep the bundle version? 
> 
> - Duplication of contributor information
> Everything that implements ApplicationElement now gets a contributor field. In
> many cases that field will be identical for all elements in the model; in most
> cases there will be only a handful of distinct values in it.
> 
> a) what if we store references to contributor [URIs] rather then duplicate
> URIs? I have a feeling that EMF will eat good chunk of savings in this case,
> but at least we'll avoid duplication of contributor info and make it easier to
> optimize in future.

What about we store simply the Bundle-Instance itself?

> 
> b) we could store contributors only when they differ from container's
> contributors. Pro: max savings; Con: a bit more complicated code for model
> management, such as drag & drop.
> 

-1 because I think this going to be very complex task.

> - Utilities
> It would be very useful to be have methods somewhere that would allow:
>   + roundtrip contributorURI <-> bundle

Ok. Might be best suited in EModelService

>   + get contributor for model elements that my bundle is adding 
> programmatically

for processor we could start add a ChangeRecorder and set the contributor. For others we'd probably need to provide them a factory to create model elements similar to the current factories but who accept a parameter which is the bundle.

>   + find a list of elements {full | top level only } that are coming from a
> contributor
> 

ok best suited in the EModelService

> - Expandability
> The trick will be to allow non-bundle contributors to be used in future. As
> long as it does not require too much extra effort today.

Not sure what you are talking about here, can you give me more info so that i can understand it?
Comment 5 Thomas Schindl CLA 2010-11-26 14:27:53 EST
[...]
> a) what if we store references to contributor [URIs] rather then duplicate
> URIs? I have a feeling that EMF will eat good chunk of savings in this case,
> but at least we'll avoid duplication of contributor info and make it easier to
> optimize in future.
> 

Could we simply use String.intern() for those URIs and then we have only very little different String instances?
Comment 6 Oleg Besedin CLA 2010-11-26 15:07:00 EST
(In reply to comment #4)
> > a) what if we store references to contributor [URIs] ...
> What about we store simply the Bundle-Instance itself?

We'll need to be able to serialize this field.

> > - Expandability
> > The trick will be to allow non-bundle contributors to be used in future. As
> > long as it does not require too much extra effort today.
> Not sure what you are talking about here, can you give me more info so that i
> can understand it?

Model elements not tied to an OSGi bundle, as a wild example, model elements created by Javascript code. That's why it would be cool to assume that a bundle contirbutor is defined by something open, such as URI, and not limited to bundles.

(In reply to comment #5)
> Could we simply use String.intern() for those URIs and then we have only very
> little different String instances?

In the past there were issues with String.intern(). For example, at some point all extension registry strings were interned. However, that lead to problems: 1) CPU performance - depending on VM, interned strings were much slower; 2) memory pool usage - depending on VM, special memory pools, such as PermGen, were used to store interned strings resulting in quickly running out of memory. As a result extension registry has been changed to not use #intern().
Comment 7 Thomas Schindl CLA 2010-11-26 15:15:05 EST
Created attachment 183954 [details]
patch

This is now using a Contributor class instead of a simple string
Comment 8 Thomas Schindl CLA 2010-11-26 15:18:39 EST
(In reply to comment #6)
> (In reply to comment #4)
> > > a) what if we store references to contributor [URIs] ...
> > What about we store simply the Bundle-Instance itself?
> 
> We'll need to be able to serialize this field.
> 

Why serialize, I currently defined it as transient in the EMF model. Serialization is only needed if we switch from delta to storeing the comlete model.
Comment 9 Thomas Schindl CLA 2010-11-26 15:40:25 EST
Well we could address this thing all together including our URI stuff for Contribution.

What about we add the following class:

class BundleURI {
  String bundle;
  String version;
}

class ClassURI extends BundleURI {
  String sourceType; // Java, JavaScript, ...
  String className;  // fully qualified Java-Class, ...
}
Comment 10 Thomas Schindl CLA 2010-11-30 10:52:09 EST
(In reply to comment #3)
> Nicely done! Some things that might not work well in a long run:
> 
> - Representation of contributors
> The patch identifies contributor by a bundle's symbolic name. Unless bundles
> are singletons, a version is also needed to backtrack the contributor. (See,
> for instance, RegistryContributor class.)
> Also, there are two places that use URI.segment(1) - expecting that all URIs
> we'll see are platform URIs. That is true today, but likely going to change,
> see, for instance, bug 307458.

Just rethinking this: I'm not sure having none singleton bundles contributing model elements? I think using String.intern() shouldn't be that big of a problem because we don't have very many different bundles do we?
Comment 11 Oleg Besedin CLA 2010-12-15 16:02:10 EST
What if we say that contributors are represented by URIs and leave API side at that?
Then the model keeps URI or String, no need for special model element to keep Contributor.
Add to that two utility methods URI <-> Bundle and we are done (aside from the ugly duplication of strings all over):
- method to get the OSGi bundle from contributor URI (or null, if it is not a bundle)
- method to create contributor URI from an OSGi Bundle.

====

The duplication of strings still prevents me from having warm and fuzzy feeling about this. There are about 3,000 elements currently being created on 4.1SDK startup; we have products based on 3.x that are two orders of magnitude bigger then that. 

The string interning is something I'd avoid using on a large scale for reasons in the comment 6.

The only semi-elegant solution that comes to mind is to keep two tables in a lazy-loading & partially-loaded state:
 <root model elment Id> -> contributor
   + calculated and cached <non-root model element Id> -> contributor
 <contributor> -> list of model element IDs

conceptually similarly to what we do in the extension registry

The re-parenting on model elements would have to have triggers to update information of re-parented elements.
Comment 12 Thomas Schindl CLA 2010-12-15 16:57:14 EST
I agree on the fact that we should use a String (ApplicationElement#contributor: String) but I don't share resevations against String intern. How many bundles are those big products made off? I don't expect to have more than 500 different bundles *contribute to our model*.

If we take the Eclipse SDK where no more than 200 bundles (I guess it is much less who really contribute model elements) make up the ratio is 1/15.

Which Strings did you interned in extension registry?

The reparenting idea though it sounds good is IMHO too complex and error prone we can much easier protect our model if every element keeps its contributor (e.g once set one can't reset the value, ...)
Comment 13 Oleg Besedin CLA 2011-01-07 13:34:20 EST
Created attachment 186297 [details]
Patch v2

How about this one? I use platform URIs in String form and do #intern(). Probably will need to be adjusted in future, but its better to have some XMI translation even if it is not optimal then none.
Comment 14 Thomas Schindl CLA 2011-01-07 16:59:09 EST
Why not doing the intern() in the model, this would even intern things when we save and restore?
Comment 15 Oleg Besedin CLA 2011-01-10 10:48:55 EST
Created attachment 186398 [details]
Patch v3

(In reply to comment #14)
> Why not doing the intern() in the model, this would even intern things when we
> save and restore?

Yes, good point. I updated the patch to do that.
Comment 16 Oleg Besedin CLA 2011-01-10 11:05:55 EST
Patch applied to CVS Head. Thanks Tom!