Community
Participate
Working Groups
We need to track which bundle contributed an model element to e.g. look up translations in the right package
Created attachment 183750 [details] patch
Created attachment 183760 [details] patch * Updated for HEAD * fixed possible CCEs * fixed MPart creation from MPartDescriptor to copy contributor
Nicely done! Some things that might not work well in a long run: - Representation of contributors The patch identifies contributor by a bundle's symbolic name. Unless bundles are singletons, a version is also needed to backtrack the contributor. (See, for instance, RegistryContributor class.) Also, there are two places that use URI.segment(1) - expecting that all URIs we'll see are platform URIs. That is true today, but likely going to change, see, for instance, bug 307458. What if instead we stored the original URI for the model loader and converted RegistryContributor to a URI that would keep the bundle version? - Duplication of contributor information Everything that implements ApplicationElement now gets a contributor field. In many cases that field will be identical for all elements in the model; in most cases there will be only a handful of distinct values in it. a) what if we store references to contributor [URIs] rather then duplicate URIs? I have a feeling that EMF will eat good chunk of savings in this case, but at least we'll avoid duplication of contributor info and make it easier to optimize in future. b) we could store contributors only when they differ from container's contributors. Pro: max savings; Con: a bit more complicated code for model management, such as drag & drop. - Utilities It would be very useful to be have methods somewhere that would allow: + roundtrip contributorURI <-> bundle + get contributor for model elements that my bundle is adding programmatically + find a list of elements {full | top level only } that are coming from a contributor - Expandability The trick will be to allow non-bundle contributors to be used in future. As long as it does not require too much extra effort today.
(In reply to comment #3) > Nicely done! Some things that might not work well in a long run: > > - Representation of contributors > The patch identifies contributor by a bundle's symbolic name. Unless bundles > are singletons, a version is also needed to backtrack the contributor. (See, > for instance, RegistryContributor class.) > Also, there are two places that use URI.segment(1) - expecting that all URIs > we'll see are platform URIs. That is true today, but likely going to change, > see, for instance, bug 307458. > > What if instead we stored the original URI for the model loader and converted > RegistryContributor to a URI that would keep the bundle version? > > - Duplication of contributor information > Everything that implements ApplicationElement now gets a contributor field. In > many cases that field will be identical for all elements in the model; in most > cases there will be only a handful of distinct values in it. > > a) what if we store references to contributor [URIs] rather then duplicate > URIs? I have a feeling that EMF will eat good chunk of savings in this case, > but at least we'll avoid duplication of contributor info and make it easier to > optimize in future. What about we store simply the Bundle-Instance itself? > > b) we could store contributors only when they differ from container's > contributors. Pro: max savings; Con: a bit more complicated code for model > management, such as drag & drop. > -1 because I think this going to be very complex task. > - Utilities > It would be very useful to be have methods somewhere that would allow: > + roundtrip contributorURI <-> bundle Ok. Might be best suited in EModelService > + get contributor for model elements that my bundle is adding > programmatically for processor we could start add a ChangeRecorder and set the contributor. For others we'd probably need to provide them a factory to create model elements similar to the current factories but who accept a parameter which is the bundle. > + find a list of elements {full | top level only } that are coming from a > contributor > ok best suited in the EModelService > - Expandability > The trick will be to allow non-bundle contributors to be used in future. As > long as it does not require too much extra effort today. Not sure what you are talking about here, can you give me more info so that i can understand it?
[...] > a) what if we store references to contributor [URIs] rather then duplicate > URIs? I have a feeling that EMF will eat good chunk of savings in this case, > but at least we'll avoid duplication of contributor info and make it easier to > optimize in future. > Could we simply use String.intern() for those URIs and then we have only very little different String instances?
(In reply to comment #4) > > a) what if we store references to contributor [URIs] ... > What about we store simply the Bundle-Instance itself? We'll need to be able to serialize this field. > > - Expandability > > The trick will be to allow non-bundle contributors to be used in future. As > > long as it does not require too much extra effort today. > Not sure what you are talking about here, can you give me more info so that i > can understand it? Model elements not tied to an OSGi bundle, as a wild example, model elements created by Javascript code. That's why it would be cool to assume that a bundle contirbutor is defined by something open, such as URI, and not limited to bundles. (In reply to comment #5) > Could we simply use String.intern() for those URIs and then we have only very > little different String instances? In the past there were issues with String.intern(). For example, at some point all extension registry strings were interned. However, that lead to problems: 1) CPU performance - depending on VM, interned strings were much slower; 2) memory pool usage - depending on VM, special memory pools, such as PermGen, were used to store interned strings resulting in quickly running out of memory. As a result extension registry has been changed to not use #intern().
Created attachment 183954 [details] patch This is now using a Contributor class instead of a simple string
(In reply to comment #6) > (In reply to comment #4) > > > a) what if we store references to contributor [URIs] ... > > What about we store simply the Bundle-Instance itself? > > We'll need to be able to serialize this field. > Why serialize, I currently defined it as transient in the EMF model. Serialization is only needed if we switch from delta to storeing the comlete model.
Well we could address this thing all together including our URI stuff for Contribution. What about we add the following class: class BundleURI { String bundle; String version; } class ClassURI extends BundleURI { String sourceType; // Java, JavaScript, ... String className; // fully qualified Java-Class, ... }
(In reply to comment #3) > Nicely done! Some things that might not work well in a long run: > > - Representation of contributors > The patch identifies contributor by a bundle's symbolic name. Unless bundles > are singletons, a version is also needed to backtrack the contributor. (See, > for instance, RegistryContributor class.) > Also, there are two places that use URI.segment(1) - expecting that all URIs > we'll see are platform URIs. That is true today, but likely going to change, > see, for instance, bug 307458. Just rethinking this: I'm not sure having none singleton bundles contributing model elements? I think using String.intern() shouldn't be that big of a problem because we don't have very many different bundles do we?
What if we say that contributors are represented by URIs and leave API side at that? Then the model keeps URI or String, no need for special model element to keep Contributor. Add to that two utility methods URI <-> Bundle and we are done (aside from the ugly duplication of strings all over): - method to get the OSGi bundle from contributor URI (or null, if it is not a bundle) - method to create contributor URI from an OSGi Bundle. ==== The duplication of strings still prevents me from having warm and fuzzy feeling about this. There are about 3,000 elements currently being created on 4.1SDK startup; we have products based on 3.x that are two orders of magnitude bigger then that. The string interning is something I'd avoid using on a large scale for reasons in the comment 6. The only semi-elegant solution that comes to mind is to keep two tables in a lazy-loading & partially-loaded state: <root model elment Id> -> contributor + calculated and cached <non-root model element Id> -> contributor <contributor> -> list of model element IDs conceptually similarly to what we do in the extension registry The re-parenting on model elements would have to have triggers to update information of re-parented elements.
I agree on the fact that we should use a String (ApplicationElement#contributor: String) but I don't share resevations against String intern. How many bundles are those big products made off? I don't expect to have more than 500 different bundles *contribute to our model*. If we take the Eclipse SDK where no more than 200 bundles (I guess it is much less who really contribute model elements) make up the ratio is 1/15. Which Strings did you interned in extension registry? The reparenting idea though it sounds good is IMHO too complex and error prone we can much easier protect our model if every element keeps its contributor (e.g once set one can't reset the value, ...)
Created attachment 186297 [details] Patch v2 How about this one? I use platform URIs in String form and do #intern(). Probably will need to be adjusted in future, but its better to have some XMI translation even if it is not optimal then none.
Why not doing the intern() in the model, this would even intern things when we save and restore?
Created attachment 186398 [details] Patch v3 (In reply to comment #14) > Why not doing the intern() in the model, this would even intern things when we > save and restore? Yes, good point. I updated the patch to do that.
Patch applied to CVS Head. Thanks Tom!