| Summary: | EcorePlugin.getWorkspaceRoot() called before workspace is initialized | ||
|---|---|---|---|
| Product: | [Modeling] EMF | Reporter: | Bryan Hunt <bhunt> |
| Component: | Core | Assignee: | Ed Merks <Ed.Merks> |
| Status: | RESOLVED WONTFIX | QA Contact: | |
| Severity: | critical | ||
| Priority: | P3 | CC: | smccants, tjwatson |
| Version: | 2.5.0 | ||
| Target Milestone: | --- | ||
| Hardware: | All | ||
| OS: | All | ||
| Whiteboard: | |||
|
Description
Bryan Hunt
After some additional investigation, it is not 100% clear that the problem description is the root cause. We will continue to investigate tomorrow and update the bug when we know more. There's definitely a problem here, we just want to be completely sure of the root cause. Given that "start" does the initialization, I don't see now synchronized will help. Also, given that the EcorePlugin.Implementation is the first class loaded when any class within the ecore bundle needs to be loaded, it's hard to imagine how it's possible for clients to be using it fully before start has been called. Doesn't OSGi ensure the bundle properly initialized before clients can start using it? Bryan Hunt and I made some changes to EcorePlugin to try and unwind/prevent this problem. Here are the changed functions:
public static final ReentrantLock lock = new ReentrantLock();
public static IWorkspaceRoot getWorkspaceRoot()
{
lock.lock();
try
{
System.out.println("Getting workspace root which is "+workspaceRoot+" while plugin is "+INSTANCE.hashCode()+".");
return workspaceRoot;
}
finally
{
lock.unlock();
}
}
Also changed EcorePlugin.Implementation.start() at the top of the start function:
public void start(BundleContext context) throws Exception
{
lock.lock();
try
{
System.out.println("ECore bundle is starting.");
super.start(context);
if (IS_RESOURCES_BUNDLE_AVAILABLE && System.getProperty("org.eclipse.emf.ecore.plugin.EcorePlugin.doNotLoadResourcesPlugin") == null)
{
System.out.println("Initializing workspace root. Plugin = "+INSTANCE.hashCode());
workspaceRoot = ResourcesPlugin.getWorkspace().getRoot();
System.out.println("Workspace root has been set to "+workspaceRoot+".");
}
}
finally
{
lock.unlock();
}
In theory, this lock should prevent any threads from grabbing the workspaceRoot before it has been set. In practice, here is the stdout from one of our runs:
ECore bundle is starting.
Initializing workspace root. Plugin = 1187464903
Getting workspace root which is null while plugin is 1187464903.
Getting workspace root which is null while plugin is 1187464903.
Workspace root has been set to R/.
As you can see, someone is getting the workspace root before it is set despite the lock around the setting and acquisition of the workspace root. This is only possible if the setter and the acquirer are the same thread. It turns out that that is the case.
Below is the stack trace that shows both initializing the workspace root and getting the workspace root in the same thread. Before reading the stack trace (which is very long), let me explain what I think is happening:
1) Declaritive Services (DS) is starting a service that Bryan and I wrote named "ObjectRelationalMappingServiceMetaTypeProvider", which accesses EMF causing the class loader to load and start ECorePlugin$Implementation.
ObjectRelationalMappingServiceMetaTypeProvider.<init>() line: 18
2) ECorePlugin$Implementation.start() begins to initialize the workspace root.
EcorePlugin$Implementation.start(BundleContext) line: 527
3) This causes the class loader to load and start the ResourcesPlugin.
ResourcesPlugin.start(BundleContext) line: 379
4) The ResourcesPlugin.start(..) function makes a call to workspace.open(). So far so good. The Workspace.startup(..) function is called and it creates and starts a CharSetManager.
5) The CharSetManager creates a CharsetDeltaJob. This causes the class loader to load and start the org.eclipse.core.internal.content.Activator()
Activator.start(BundleContext) line: 49
6) The Activator registers a service called IContentTypeManager.
Activator.start(BundleContext) line: 49
7) This causes a ServiceEvent to be dispatched announcing the new service.
ServiceRegistry.publishServiceEventPrivileged(ServiceEvent) line: 756
8) The SCRManager gets the event and calls Resolver.getEligible(..)
SCRManager.serviceChanged(ServiceEvent) line: 226
9) The Resolver decides to build all the resolved components. In our case there is one resolved component, which is our DatabaseService (com.ibm.hdwb.persistence.common.jdbc.database). This causes the DatabaseService to be activated. This is were we think things start to go awry.
Resolver.getEligible(ServiceEvent) line: 292
10) The DatabaseService.activate(..) function makes a call to EMF's ResourceSet.getResource() to read in a configuration file.
DatabaseService.activate(ComponentContext) line: 67
11) EMF's ResourceSetImpl.getResource(...) ends up calling EcorePlugin.getWorkspaceRoot(), but the workspace is not yet initialized, so the workspace root is still null.
EcorePlugin.getWorkspaceRoot() line: 637 [local variables unavailable]
12) This null value for the workspace root is cached by the ExtensibleURIConverterImpl, meaning that no EMF classes in the future that use that URI Converter can successfully access the workspace.
ExtensibleURIConverterImpl.<clinit>() line: 334 [local variables unavailable]
We've discussed this some with Tom Watson and at his request are adding him to the cc list.
Our initial statement that this is a multi-threading issue was incorrect. Synchronizing around the workspace root does not fix the problem. We currently do not have a work around for this problem and are broken.
Here is the full stack trace that we just described:
Daemon Thread [Component Resolve Thread (Bundle 251)] (Suspended (breakpoint at line 637 in EcorePlugin))
EcorePlugin.getWorkspaceRoot() line: 637 [local variables unavailable]
ExtensibleURIConverterImpl.<clinit>() line: 334
J9VMInternals.initializeImpl(Class) line: not available [native method]
J9VMInternals.initialize(Class) line: 200
ResourceSetImpl.getURIConverter() line: 490
ResourceSetImpl.getResource(URI, boolean) line: 360
DatabaseConnectionManager.<init>(URI) line: 29
DatabaseService.activate(ComponentContext) line: 67
NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]
NativeMethodAccessorImpl.invoke(Object, Object[]) line: 48
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 37
Method.invoke(Object, Object...) line: 600
ServiceComponent.activate(Object, ComponentContext) line: 206
ServiceComponentProp.activate(Bundle, ComponentInstanceImpl) line: 140
ServiceComponentProp.build(Bundle, Object, boolean) line: 330
InstanceProcess.buildComponent(Bundle, ServiceComponentProp, Object, boolean) line: 560
InstanceProcess.buildComponents(Vector, boolean) line: 182
Resolver.getEligible(ServiceEvent) line: 292
SCRManager.serviceChanged(ServiceEvent) line: 226
FilteredServiceListener.serviceChanged(ServiceEvent) line: 104
BundleContextImpl.dispatchEvent(Object, Object, int, Object) line: 933
EventManager.dispatchEvent(Set, EventDispatcher, int, Object) line: 227
ListenerQueue.dispatchEventSynchronous(int, Object) line: 149
ServiceRegistry.publishServiceEventPrivileged(ServiceEvent) line: 756
ServiceRegistry.publishServiceEvent(ServiceEvent) line: 711
ServiceRegistrationImpl.register(Dictionary) line: 129
ServiceRegistry.registerService(BundleContextImpl, String[], Object, Dictionary) line: 206
BundleContextImpl.registerService(String[], Object, Dictionary) line: 507
BundleContextImpl.registerService(String, Object, Dictionary) line: 525
Activator.start(BundleContext) line: 49
BundleContextImpl$1.run() line: 783
AccessController.doPrivileged(PrivilegedExceptionAction<T>) line: 251
BundleContextImpl.startActivator(BundleActivator) line: 774
BundleContextImpl.start() line: 755
BundleHost.startWorker(int) line: 352
BundleHost(AbstractBundle).start(int) line: 280
SecureAction.start(Bundle, int) line: 408
EclipseLazyStarter.postFindLocalClass(String, Class, ClasspathManager) line: 111
ClasspathManager.findLocalClass(String) line: 449
DefaultClassLoader.findLocalClass(String) line: 211
BundleLoader.findLocalClass(String) line: 381
SingleSourcePackage.loadClass(String) line: 33
BundleLoader.findClassInternal(String, boolean, ClassLoader) line: 454
BundleLoader.findClass(String, boolean) line: 410
BundleLoader.findClass(String) line: 398
DefaultClassLoader.loadClass(String, boolean) line: 105
DefaultClassLoader(ClassLoader).loadClass(String) line: 618
ClassLoader.defineClassImpl(String, byte[], int, int, Object) line: not available [native method]
DefaultClassLoader(ClassLoader).defineClass(String, byte[], int, int, ProtectionDomain) line: 274
DefaultClassLoader.defineClass(String, byte[], ClasspathEntry, BundleEntry) line: 183
ClasspathManager.defineClass(String, byte[], ClasspathEntry, BundleEntry, ClassLoadingStatsHook[]) line: 576
ClasspathManager.findClassImpl(String, ClasspathEntry, ClassLoadingStatsHook[]) line: 546
ClasspathManager.findLocalClassImpl(String, ClassLoadingStatsHook[]) line: 477
ClasspathManager.findLocalClass_LockClassLoader(String, ClassLoadingStatsHook[]) line: 465
ClasspathManager.findLocalClass(String) line: 445
DefaultClassLoader.findLocalClass(String) line: 211
BundleLoader.findLocalClass(String) line: 381
BundleLoader.findClassInternal(String, boolean, ClassLoader) line: 457
BundleLoader.findClass(String, boolean) line: 410
BundleLoader.findClass(String) line: 398
DefaultClassLoader.loadClass(String, boolean) line: 105
DefaultClassLoader(ClassLoader).loadClass(String) line: 618
CharsetManager.startup(IProgressMonitor) line: 301
Workspace.startup(IProgressMonitor) line: 1965
Workspace.open(IProgressMonitor) line: 1716
ResourcesPlugin.start(BundleContext) line: 379
BundleContextImpl$1.run() line: 783
AccessController.doPrivileged(PrivilegedExceptionAction<T>) line: 251
BundleContextImpl.startActivator(BundleActivator) line: 774
BundleContextImpl.start() line: 755
BundleHost.startWorker(int) line: 352
BundleHost(AbstractBundle).start(int) line: 280
SecureAction.start(Bundle, int) line: 408
EclipseLazyStarter.postFindLocalClass(String, Class, ClasspathManager) line: 111
ClasspathManager.findLocalClass(String) line: 449
DefaultClassLoader.findLocalClass(String) line: 211
BundleLoader.findLocalClass(String) line: 381
SingleSourcePackage.loadClass(String) line: 33
BundleLoader.findClassInternal(String, boolean, ClassLoader) line: 454
BundleLoader.findClass(String, boolean) line: 410
BundleLoader.findClass(String) line: 398
DefaultClassLoader.loadClass(String, boolean) line: 105
DefaultClassLoader(ClassLoader).loadClass(String) line: 618
EcorePlugin$Implementation.start(BundleContext) line: 527
BundleContextImpl$1.run() line: 783
AccessController.doPrivileged(PrivilegedExceptionAction<T>) line: 251
BundleContextImpl.startActivator(BundleActivator) line: 774
BundleContextImpl.start() line: 755
BundleHost.startWorker(int) line: 352
BundleHost(AbstractBundle).start(int) line: 280
SecureAction.start(Bundle, int) line: 408
EclipseLazyStarter.postFindLocalClass(String, Class, ClasspathManager) line: 111
ClasspathManager.findLocalClass(String) line: 449
DefaultClassLoader.findLocalClass(String) line: 211
BundleLoader.findLocalClass(String) line: 381
SingleSourcePackage.loadClass(String) line: 33
BundleLoader.findClassInternal(String, boolean, ClassLoader) line: 454
BundleLoader.findClass(String, boolean) line: 410
BundleLoader.findClass(String) line: 398
DefaultClassLoader.loadClass(String, boolean) line: 105
DefaultClassLoader(ClassLoader).loadClass(String) line: 618
ObjectRelationalMappingServiceMetaTypeProvider.<init>() line: 18
J9VMInternals.newInstanceImpl(Class) line: not available [native method]
Class<T>.newInstance() line: 1325
ServiceComponent.createInstance() line: 454
ServiceComponentProp.createInstance() line: 255
ServiceComponentProp.build(Bundle, Object, boolean) line: 316
InstanceProcess.buildComponent(Bundle, ServiceComponentProp, Object, boolean) line: 560
InstanceProcess.buildComponents(Vector, boolean) line: 182
Resolver.buildNewlySatisfied(boolean) line: 393
Resolver.enableComponents(Vector) line: 176
SCRManager.startedBundle(Bundle) line: 634
SCRManager.bundleChanged(BundleEvent) line: 239
BundleContextImpl.dispatchEvent(Object, Object, int, Object) line: 919
EventManager.dispatchEvent(Set, EventDispatcher, int, Object) line: 227
ListenerQueue.dispatchEventSynchronous(int, Object) line: 149
Framework.publishBundleEventPrivileged(BundleEvent) line: 1350
Framework.publishBundleEvent(int, Bundle) line: 1301
BundleHost.startWorker(int) line: 362
BundleHost(AbstractBundle).start(int) line: 280
SecureAction.start(Bundle, int) line: 408
EclipseLazyStarter.postFindLocalClass(String, Class, ClasspathManager) line: 111
ClasspathManager.findLocalClass(String) line: 449
DefaultClassLoader.findLocalClass(String) line: 211
BundleLoader.findLocalClass(String) line: 381
SingleSourcePackage.loadClass(String) line: 33
BundleLoader.findClassInternal(String, boolean, ClassLoader) line: 454
BundleLoader.findClass(String, boolean) line: 410
BundleLoader.findClass(String) line: 398
DefaultClassLoader.loadClass(String, boolean) line: 105
DefaultClassLoader(ClassLoader).loadClass(String) line: 618
Activator.start(BundleContext) line: 39
BundleContextImpl$1.run() line: 783
AccessController.doPrivileged(PrivilegedExceptionAction<T>) line: 251
BundleContextImpl.startActivator(BundleActivator) line: 774
BundleContextImpl.start() line: 755
BundleHost.startWorker(int) line: 352
BundleHost(AbstractBundle).start(int) line: 280
SecureAction.start(Bundle, int) line: 408
EclipseLazyStarter.postFindLocalClass(String, Class, ClasspathManager) line: 111
ClasspathManager.findLocalClass(String) line: 449
DefaultClassLoader.findLocalClass(String) line: 211
BundleLoader.findLocalClass(String) line: 381
BundleLoader.findClassInternal(String, boolean, ClassLoader) line: 457
BundleLoader.findClass(String, boolean) line: 410
BundleLoader.findClass(String) line: 398
DefaultClassLoader.loadClass(String, boolean) line: 105
DefaultClassLoader(ClassLoader).loadClass(String) line: 618
BundleLoader.loadClass(String) line: 326
BundleHost.loadClass(String, boolean) line: 231
BundleHost(AbstractBundle).loadClass(String) line: 1193
ServiceComponent.createInstance() line: 454
ServiceComponentProp.createInstance() line: 255
ServiceComponentProp.build(Bundle, Object, boolean) line: 316
InstanceProcess.buildComponent(Bundle, ServiceComponentProp, Object, boolean) line: 560
InstanceProcess.buildComponents(Vector, boolean) line: 182
Resolver.buildNewlySatisfied(boolean) line: 393
Resolver.enableComponents(Vector) line: 176
SCRManager.performWork(int, Object) line: 791
SCRManager$QueuedJob.dispatch() line: 758
WorkThread.run() line: 90
Executor.run() line: 70
One possible work around is to have an immediate component that depends on nothing but causes the activation of the ecore bundle and then enables your other components only after you ensure the ecore bundle is started. I implemented Tom's suggested workaround and it seems to be working. Thanks Tom. Sounds like the workaround is a good general solution, but not one the framework itself can provide. Ed, is it okay for the Framework to start unrelated services in the middle of starting a service? All on the same thread. That seems like it is asking for trouble. Big, nasty, hard to debug trouble, much like we are seeing. While the work around may work, I think there is a real problem with the framework haphazardly starting things and that shouldn't be swept under the rug. Given there's no test case that I can use to see a problem it's pretty hard to comment and impossible to verify that any change I might make is actually an improvement. EMF doesn't use declarative services, so I'm not sure what you're alluding to when referring to the framework starting an unrelated service in the middle of starting a service. Given we've never had a problem with this before, I'm not sure what particular aspect is haphazard. Do you have a suggestion? Would moving the initialization logic to getWorkspaceRoot help? As I said, I'm not in a position where I can verify that changes are actually fixes. Ed, I suspect this is not actually an EMF problem, although I could be mistaken. My suggestion is to not start additional services while starting a service, unless required. The start of the DatabaseService is not related to it being required or used by ObjectRelationalMappingServiceMetaTypeProvider service. The idea of moving the initialization of the Workspace Root has some merit, but I don't think it addresses the larger issue. There are likely other shared resources that could exhibit similar problems. This problem is very difficult to reproduce as it depends on start orders that are more or less random. We've seen it in the past and seen it go away. This time it, for some reason it hung around, but only with one particular launcher on one particular computer. However, just because it is difficult to reproduce with doesn't mean an attempt to fix it should be made, especially after Bryan and I were able to debug it to a point of reasonable understanding. |