While working on Bug 307790 I learned that eager bundles register their services differently from lazy bundles. In particular, services specified in a component.xml (ds services) are registered before the start method is called when the bundle is started in a lazy fashion. However, this is not true if the bundle is explicitly started (bundle.start()). I'm worried that this inconsistency will cause problems for others. Tom, do you know if this is by design?
I believe this is by design but I am not finding an obvious section in the DS specification on this behavior. BJ, Stoyan?
I know about this issue and I was wondering how we can settle it. Keep in mind that the DS spec always talks about SCR processing started bundles. This means the bundle must be started before its components are processed. Lazy bundles are an exception: SCR has to process their components before they are started. This does not guarantee that lazy bundles will always have their DS components processed by SCR before their activator's start method is called. Let's imagine a case where the DS bundle is the last bundle started in a launch configuration, or where the DS bundle was stopped and restarted, or updated. In these cases SCR may process the DS components of lazy bundles after they are started. Perhaps the DS spec should contain a note about this.
I guess this issue is more about educating people than anything else. As Tom mentioned in Bug 307790 comment #5, we should get rid of the activators in bundles that declare DS services. This might be a good long term goal. In the short term, we may want to look for Activator#start methods that try to acquire services. I have seen this pattern in multiple places -- and they all caused problems. To make matters worse, acquiring the services in the activator#start works in some cases (often in the platform), but doesn't appear to work in general. Tom, BJ, Stoyan, do you know of any good reasons to acquire services in an Activator#start method?
(In reply to comment #3)
> Tom, BJ, Stoyan, do you know of any good reasons to acquire services in an Activator#start method?
When you say "acquire services", do you mean assuming that a service is available, always acquiring it, and failing if it is not available? No, in general that would be a bad practice and, as you say, we have seen this kind of assumption cause start-order problems. If you do not use DS then you have to do the work yourself to track whether a service is available. Your bundle should not fail to start just because a service is not available. Instead, the bundle should track the needed service and enable additional functionality if and when the required service becomes available. Of course, this is the kind of work DS is supposed to simplify.
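To make the "track the service, do not assume it" advice concrete, here is a minimal sketch in plain Java. It is a toy model and not the OSGi API: ToyRegistry, ToyBundle, whenAvailable, and the "agent" service name are all made up for illustration. In a real bundle this role would be played by something like org.osgi.util.tracker.ServiceTracker or by DS itself, and the sketch ignores service unregistration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Toy stand-in for a service registry (hypothetical, NOT the OSGi API).
class ToyRegistry {
    private final Map<String, Object> services = new HashMap<>();
    private final Map<String, List<Consumer<Object>>> waiters = new HashMap<>();

    // Registers a service and notifies every callback waiting for that name.
    void register(String name, Object service) {
        services.put(name, service);
        List<Consumer<Object>> pending = waiters.remove(name);
        if (pending != null) {
            for (Consumer<Object> callback : pending) {
                callback.accept(service);
            }
        }
    }

    // Calls back immediately if the service exists, otherwise when it is registered.
    void whenAvailable(String name, Consumer<Object> callback) {
        Object service = services.get(name);
        if (service != null) {
            callback.accept(service);
        } else {
            waiters.computeIfAbsent(name, k -> new ArrayList<>()).add(callback);
        }
    }
}

// Toy bundle whose start() never fails because a service is missing.
class ToyBundle {
    boolean started;
    boolean extraFeatureEnabled;

    void start(ToyRegistry registry) {
        // Track the service instead of assuming it is already there.
        registry.whenAvailable("agent", service -> extraFeatureEnabled = true);
        started = true;
    }
}

class TrackingDemo {
    public static void main(String[] args) {
        ToyRegistry registry = new ToyRegistry();
        ToyBundle bundle = new ToyBundle();
        bundle.start(registry);
        System.out.println(bundle.started + " " + bundle.extraFeatureEnabled); // true false
        registry.register("agent", new Object());
        System.out.println(bundle.extraFeatureEnabled); // true
    }
}
```

The point of the sketch is only the shape: start() completes regardless of registration order, and the optional functionality switches on whenever the service shows up.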
Tracking services bad. Just say no. Declarative services good. Just say yes. (quotes taken out of context from EclipseCon)
(In reply to comment #0)
> While working on Bug 307790 I learned that eager bundles register their services differently from lazy bundles. In particular, services specified in a component.xml (ds services) are registered before the start method is called when the bundle is started in a lazy fashion. However, this is not true if the bundle is explicitly started (bundle.start()).
> I'm worried that this inconsistency will cause problems for others. Tom, do you know if this is by design?
Yes. For a lazy activated bundle, SCR has no way of knowing when the activation will be triggered but it needs to register any declared services when the bundle is started. So SCR must register the services prior to activation. It is then quite possible that some bundle using the service will then trigger activation.
(In reply to comment #2)
> Lazy bundles are an exception: SCR has to process their components before they are started.
This is not technically accurate. SCR must not process components before the bundle is started. However, for lazy activation bundles, SCR must process the components before *activation*, since SCR has no idea when that activation will occur. It is very important to understand the distinction between starting and activation for a bundle. For a non-lazy activation bundle, these occur together. For a lazy activation bundle, activation can occur at some future time after being started.
(In reply to comment #3)
> Tom, BJ, Stoyan, do you know of any good reasons to acquire services in an Activator#start method?
If a bundle has a BundleActivator that attempts to acquire services that are provided by the same bundle using DS components, then this will only work if DS is known to have processed the components before the activator is run. This is only possible with lazy activation, assuming there is no race between the DS processing and the lazy activation trigger. Generally a DS bundle should not use a BundleActivator.
Also, using an immediate component defeats the purpose of making the bundle lazy activated since DS will need to load the component implementation class. Using a lazy activated BundleActivator and DS services only makes sense if there is some expensive initialization that needs to be delayed until some class (like a DS service implementation class) is loaded from the bundle. I don't see any specific bug in the framework or DS here.
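The distinction between starting and activation can be sketched with a small toy model in plain Java. This is only an illustration of the ordering described in this comment, not the OSGi framework or SCR: LazyStartModel, its methods, the scrRunning flag, and the "declared.service" name are all hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of start vs. activation ordering (hypothetical, NOT the OSGi API).
class LazyStartModel {
    final Set<String> registry = new HashSet<>(); // names of registered services
    boolean scrRunning = true;                     // is the toy SCR around to react?
    boolean activatorSawService;                   // what the activator's start() observed

    // Toy SCR: registers the bundle's declared service once it sees the bundle.
    private void scrProcess() {
        if (scrRunning) {
            registry.add("declared.service");
        }
    }

    // Toy activator: looks up the service its own bundle declares via DS.
    private void activatorStart() {
        activatorSawService = registry.contains("declared.service");
    }

    // Eager start: starting and activation occur together.
    void startEagerly() {
        activatorStart(); // runs while the bundle is still starting
        scrProcess();     // SCR only processes the bundle once it is started
    }

    // Lazy start: the bundle is started now and activated at some later time.
    void startLazily() {
        scrProcess();     // SCR registers declared services prior to activation
    }

    // Models the first class load from the bundle, which triggers activation.
    void triggerActivation() {
        activatorStart();
    }

    public static void main(String[] args) {
        LazyStartModel eager = new LazyStartModel();
        eager.startEagerly();
        System.out.println("eager: " + eager.activatorSawService); // eager: false

        LazyStartModel lazy = new LazyStartModel();
        lazy.startLazily();
        lazy.triggerActivation();
        System.out.println("lazy: " + lazy.activatorSawService);   // lazy: true
    }
}
```

Setting scrRunning to false before startLazily() makes the lazy case fail as well, which is one reason an activator should not rely on this ordering.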
(In reply to comment #6)
> If a bundle has a BundleActivator that attempts to acquire services that are provided by the same bundle using DS components, then this will only work if DS is known to have processed the components before the activator is run. This is only possible with lazy activation, assuming there is no race between the DS processing and the lazy activation trigger.
AFAICT there is no way to ensure that there is no race, as the order of DS processing vs. activator running is not guaranteed. Or is it? Tom and I had a discussion about the use of synchronous bundle listeners etc., but I'm not sure if that applies or when the event is broadcast.
> Generally a DS bundle should not use a BundleActivator. Also, using an immediate component defeats the purpose of making the bundle lazy activated since DS will need to load the component implementation class.
> Using a lazy activated BundleActivator and DS services only makes sense if there is some expensive initialization that needs to be delayed until some class (like a DS service implementation class) is loaded from the bundle.
Certainly the expensive stuff is important. Using lazy activation and DS also eliminates the need for a system integrator to manage which bundles to start. At least in Equinox, lazy bundles are start()'d with the activation policy flag automatically so no one has to think about it.
> I don't see any specific bug in the framework or DS here.
Ian was not pointing out a bug so much as an issue. We have cases where people write lazy bundles with DS components and activators. They work if the bundles are not explicitly activated but not if the bundle is explicitly activated. This of course leads to confusion and broken systems. It could be that this is just a bad programming practice or it could be that there is something that could be done at the framework/DS level.
(In reply to comment #7)
> AFAICT there is no way to ensure that there is no race, as the order of DS processing vs. activator running is not guaranteed. Or is it? Tom and I had a discussion about the use of synchronous bundle listeners etc., but I'm not sure if that applies or when the event is broadcast.
We are talking about the BundleEvent.LAZY_ACTIVATION event. This must be fired synchronously as it is ONLY fired to SynchronousBundleListeners. The assumption being made here is that the SynchronousBundleListener for DS processes these events synchronously, processing all the resolvable component definitions and making them available (as services) synchronously.
> Ian was not pointing out a bug so much as an issue. We have cases where people write lazy bundles with DS components and activators. They work if the bundles are not explicitly activated but not if the bundle is explicitly activated. This of course leads to confusion and broken systems. It could be that this is just a bad programming practice or it could be that there is something that could be done at the framework/DS level.
Right, this is the general issue. Calling Bundle.start(Bundle.START_ACTIVATION_POLICY) vs. Bundle.start() causes different behavior in what services are available before entering the BundleActivator.start() method. One way to bridge that inconsistency could be to always fire the BundleEvent.LAZY_ACTIVATION event (for bundles that specify the lazy activation policy) even when the bundle is "eagerly" started. This way DS could process the LAZY_ACTIVATION and keep the behavior consistent in both cases. But this still assumes that DS is running and can process the LAZY_ACTIVATION event before we enter the BundleActivator.start() method. That is still an invalid ordering assumption IMO.
Basically it is never safe for the BundleActivator to assume that SCR has processed the bundle's components. It is however safe for the bundle's components to assume that the BundleActivator has completed.
(In reply to comment #8)
> We are talking about the BundleEvent.LAZY_ACTIVATION event. This must be fired synchronously as it is ONLY fired to SynchronousBundleListeners. The assumption being made here is that the SynchronousBundleListener for DS processes these events synchronously, processing all the resolvable component definitions and making them available (as services) synchronously.
(In reply to comment #9)
> Basically it is never safe for the BundleActivator to assume that SCR has processed the bundle's components. It is however safe for the bundle's components to assume that the BundleActivator has completed.
This clarifies things, then. It is pure chance that these lazy bundles are getting their components registered before the start() method runs. But what does the spec say about when start() is called with respect to the event being fired and processed synchronously? Is the event before, during, or after start()?
> Right, this is the general issue. Calling Bundle.start(Bundle.START_ACTIVATION_POLICY) vs. Bundle.start() causes different behavior in what services are available before entering the BundleActivator.start() method. One way to bridge that inconsistency could be to always fire the BundleEvent.LAZY_ACTIVATION event (for bundles that specify the lazy activation policy) even when the bundle is "eagerly" started. This way DS could process the LAZY_ACTIVATION and keep the behavior consistent in both cases. But this still assumes that DS is running and can process the LAZY_ACTIVATION event before we enter the BundleActivator.start() method. That is still an invalid ordering assumption IMO.
I'm torn here. In essence, anyone writing code that is sensitive to this is doing "a bad thing". On the other hand, there is an inconsistency in the behavior simply based on how the bundle is started. That does not seem like "a good thing".
(In reply to comment #10)
> > Basically it is never safe for the BundleActivator to assume that SCR has processed the bundle's components.
Any BundleActivator that relies upon SCR having processed the bundle's components is wrong. Lazy activation or not. It is possible the SCR bundle is not even started when the bundle's BundleActivator is run.
> But what does the spec say about when start() is called with respect to the event being fired and processed synchronously? Is the event before, during, or after start()?
The spec cannot guarantee anything here. Since the BundleEvent.LAZY_ACTIVATION event is synchronously delivered, the framework cannot define any ordering with respect to other events. It is possible some other thread could trigger the actual activation (via a class load from the bundle) while the framework is correctly delivering the BundleEvent.LAZY_ACTIVATION event.
> I'm torn here. In essence, anyone writing code that is sensitive to this is doing "a bad thing".
Yes. Any BundleActivator which relies upon SCR having processed the bundle's components is wrong.
> On the other hand, there is an inconsistency in the behavior simply based on how the bundle is started. That does not seem like "a good thing".
This is not an inconsistency. It is a race condition that sometimes works and sometimes fails. It works if lazy activation is used *and* no other thread triggers activation before SCR has processed the bundle's components. It fails otherwise.
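The race described in this comment can be laid out as a small deterministic sketch in plain Java. It simply replays the possible interleavings of two steps, "scr" (SCR processes the bundle's components) and "activate" (a class load triggers the BundleActivator), and reports whether the activator would find the service. RaceOutcomes and the step names are hypothetical and exist only for this illustration; nothing here is OSGi API.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy replay of the interleavings behind the race (hypothetical, NOT the OSGi API).
class RaceOutcomes {
    // Replays one ordering of steps; returns true if the activator found the service.
    static boolean activatorFindsService(List<String> order) {
        boolean serviceRegistered = false;
        boolean found = false;
        for (String step : order) {
            if (step.equals("scr")) {
                serviceRegistered = true;   // SCR has processed the components
            } else if (step.equals("activate")) {
                found = serviceRegistered;  // activator looks up its own DS service
            }
        }
        return found;
    }

    // Collects the outcome for each interleaving of interest.
    static Map<String, Boolean> allOutcomes() {
        Map<String, Boolean> outcomes = new LinkedHashMap<>();
        outcomes.put("SCR first, then activation",
                activatorFindsService(Arrays.asList("scr", "activate")));
        outcomes.put("activation first, then SCR",
                activatorFindsService(Arrays.asList("activate", "scr")));
        outcomes.put("activation only (SCR not started)",
                activatorFindsService(Arrays.asList("activate")));
        return outcomes;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Boolean> e : allOutcomes().entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Only the first ordering succeeds, and nothing in the model (or, per the comment, in the spec) forces that ordering, which is why relying on it in an activator is fragile.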
(In reply to comment #11)
> > On the other hand, there is an inconsistency in the behavior simply based on how the bundle is started. That does not seem like "a good thing".
> This is not an inconsistency. It is a race condition that sometimes works and sometimes fails. It works if lazy activation is used *and* no other thread triggers activation before SCR has processed the bundle's components. It fails otherwise.
This is very well put, thank you BJ. I opened this bug to find out if A) this behaviour was expected, and B) this behaviour was common knowledge. From BJ's comment above, it appears that this behaviour is expected; however, I don't think that many people know this. Over the past few weeks I have debugged 3 separate issues (in p2) related to services. Two of them were a variant on this. I wonder if we should start to document some Service Anti-Patterns (or more general OSGi Anti-Patterns). If Equinox committers (who, btw, I consider some of the most talented developers I've ever worked with) are making these mistakes, I imagine that others are too.
We discussed this at the Equinox call today. We agreed to the following:
1) Move this to p2 as an umbrella for evaluating p2's usage of DS and the OSGi service registry.
2) Tom will take a first pass over p2's usage of DS and the OSGi service registry and open bugs as needed.
3) This bug will be used as an umbrella to track the various issues found in the p2 code.
4) General discussions around DS and usage of the OSGi services can be kept to this bug. Individual issues should be tracked/discussed in other (to be opened) bugs.
5) We should avoid any pervasive changes to the code structure in p2 for 3.6 unless the common usage of p2 is broken.
I will take ownership of this bug to do the initial review of DS and OSGi service usage in p2.
For completeness, here are the 3 p2 service-related bugs that I have worked on: bug 305588, bug 306468, and bug 307790. Two of them are already fixed, and the third one (bug 305588) should be "going away" (the service is currently tagged for removal).
(In reply to comment #15)
> For completeness, here are the 3 p2 service-related bugs that I have worked on: bug 305588, bug 306468, and bug 307790. Two of them are already fixed, and the third one (bug 305588) should be "going away" (the service is currently tagged for removal).
Thanks, I opened bug 308409 to investigate improvements to the solution for bug 307790.
I reviewed all the bundles in p2 that declare DS components. I have not found any others that depend on their own service components being registered before entering their BundleActivator.start methods. There is, however, an overall assumption in p2's service usage that the resolvable DS service components from lazy activated bundles are available before any eagerly activated bundles are started within a particular start-level (see bug 306181). This is a far-reaching assumption: fully moving over to DS injection, instead of calling ServiceHelper.getService(BundleContext, String) all over the place and assuming the service is there because we know the system is fully "initialized", would require a large amount of restructuring. I don't think there is much more we can do here for 3.6.
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. -- The automated Eclipse Genie.
I'm closing this as wontfix for now. I have not revisited the p2 code to confirm that this is still an issue (but I suspect it is). But I also don't see us performing any major refactoring to move all of p2 over to DS at this time. For now the behavior introduced in bug 306181 seems to have allowed us to continue with the current structure of the code without errors for many years.