Bug 364048 - Possible deadlock in ServiceRegistry (patch included)
Summary: Possible deadlock in ServiceRegistry (patch included)
Status: RESOLVED WORKSFORME
Alias: None
Product: Equinox
Classification: Eclipse Project
Component: Framework
Version: 3.7.1
Hardware: PC Windows 7
Importance: P3 normal
Target Milestone: ---
Assignee: equinox.framework-inbox
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-11-17 10:37 EST by Michel Krämer
Modified: 2012-01-06 14:56 EST

See Also:


Attachments
Stacktrace for the deadlock (7.57 KB, text/plain)
2011-11-17 10:38 EST, Michel Krämer
Proposed patch to fix the deadlock (1.28 KB, patch)
2011-11-17 10:39 EST, Michel Krämer

Description Michel Krämer 2011-11-17 10:37:50 EST
Build Identifier: R37x_v20110808-1106

We are experiencing the same issue as described in bug #359535, but our stack trace is quite different (see the attached file). The deadlock involves ServiceRegistry.getNextServiceId(), which unnecessarily locks the whole ServiceRegistry object.

We were able to resolve this issue by introducing a separate lock for the "serviceid" member variable. Our patch is attached below.

Reproducible: Sometimes

Steps to Reproduce:
1. Start framework
2. Publish some services from several threads at the same time
Comment 1 Michel Krämer 2011-11-17 10:38:28 EST
Created attachment 207150
Stacktrace for the deadlock
Comment 2 Michel Krämer 2011-11-17 10:39:40 EST
Created attachment 207151
Proposed patch to fix the deadlock
Comment 3 Thomas Watson 2011-11-17 10:49:58 EST
Are you using the osgi.classloader.singleThreadLoads=true option? If so, please try without it. It is a rather dangerous option on modern VMs, and I have removed it from the Juno builds.
Comment 4 Michel Krämer 2011-11-17 10:51:32 EST
We know this option and we already removed it.
Comment 5 BJ Hargrave 2011-11-17 14:59:08 EST
Something else seems to be at work here (such as the single-threaded loads option). Also, the stack trace does not reveal who owns DefaultClassLoader (id=73), so it does not show the closure of the deadly embrace.
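
One way to capture lock ownership the next time the hang occurs is a programmatic thread dump through the standard java.lang.management API. The following is a minimal sketch, not something used in this bug: the class name LockOwnerDump is hypothetical, and it assumes a JVM that supports monitor usage reporting. The JVM only reports an owner for threads blocked on a monitor or a java.util.concurrent lock; for a thread parked in Object.wait(), as in the BundleLoader.lock frame, no owner is reported, so the list of held monitors is the more useful part.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MonitorInfo;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical diagnostic helper; not part of Equinox.
class LockOwnerDump {

    // Describe what one thread is waiting on and which monitors it holds.
    static String describe(ThreadInfo info) {
        StringBuilder sb = new StringBuilder();
        sb.append(info.getThreadName())
          .append(" [").append(info.getThreadState()).append("]");
        if (info.getLockName() != null) {
            sb.append(" waiting on ").append(info.getLockName());
            // The owner is null when the thread is in Object.wait().
            if (info.getLockOwnerName() != null) {
                sb.append(" owned by ").append(info.getLockOwnerName());
            }
        }
        for (MonitorInfo monitor : info.getLockedMonitors()) {
            sb.append("\n    holds ").append(monitor.getClassName())
              .append('@')
              .append(Integer.toHexString(monitor.getIdentityHashCode()));
        }
        return sb.toString();
    }

    // Dump all live threads, including locked monitors and synchronizers.
    static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : threads.dumpAllThreads(true, true)) {
            sb.append(describe(info)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Calling LockOwnerDump.dump() from a watchdog thread while the extender threads are stuck would list, for each thread, the monitors it holds, which is the information missing for DefaultClassLoader (id=73).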
Comment 6 Michel Krämer 2011-11-18 03:11:12 EST
Yes, you're right. The stacktrace does not show who owns DefaultClassLoader (id=73). Bummer. I can't reproduce it anymore now, but the patch definitely solved the problem yesterday.

SpringOsgiExtenderThread-4 waits for ServiceRegistry (id=76), which is owned by SpringOsgiExtenderThread-5, which in turn waits for DefaultClassLoader (id=73), which is locked for some unknown reason. SpringOsgiExtenderThread-4 only has to wait because ServiceRegistry.getNextServiceId() locks the whole ServiceRegistry object. This is unnecessary, because the method just increments a single member variable that is not used in any other method.
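
The attached patch is not reproduced here. As an illustration of the idea only, the following minimal sketch guards the counter with its own lock so that ID allocation never contends with a thread holding the registry monitor. The class ServiceIdSketch and the field names are hypothetical stand-ins, not the Equinox source; an AtomicLong variant is shown as a lock-free alternative that the patch may or may not have used.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the registry; not the Equinox ServiceRegistry.
class ServiceIdSketch {

    // Dedicated lock that guards only the counter below.
    private final Object serviceIdLock = new Object();
    private long serviceId = 1;

    // Allocates the next ID without taking the monitor of "this".
    long getNextServiceId() {
        synchronized (serviceIdLock) {
            return serviceId++;
        }
    }

    // Lock-free alternative (an assumption, not taken from the patch).
    private final AtomicLong atomicServiceId = new AtomicLong(1);

    long getNextServiceIdAtomic() {
        return atomicServiceId.getAndIncrement();
    }
}
```

With either variant, a thread that holds the registry monitor while it is stuck in class loading no longer blocks other threads that only need a new service ID. It does not explain why the class loader was locked in the first place.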

This bug is also related to #359535. We are also using Spring OSGi to publish most of our services.
Comment 7 Thomas Watson 2012-01-06 14:56:04 EST
(In reply to comment #4)
> We know this option and we already removed it.

I want to reconfirm that you were not using that option when you produced the following stacktrace:

Thread [SpringOsgiExtenderThread-1] (Suspended)	
	waiting for: DefaultClassLoader  (id=86)	
	Object.wait(long) line: not available [native method]	
	DefaultClassLoader(Object).wait() line: 485	
	BundleLoader.lock(Object) line: 1284 <<< RED FLAG!! should NOT be called
	BundleLoader.findClass(String, boolean) line: 428	
	BundleLoader.findClass(String) line: 417	
	DefaultClassLoader.loadClass(String, boolean) line: 107	
	DefaultClassLoader(ClassLoader).loadClass(String) line: not available	
	DependencyServiceManager.register() line: 364	
	DependencyWaiterApplicationContextExecutor.stageOne() line: 270	
	DependencyWaiterApplicationContextExecutor.refresh() line: 178	
	OsgiBundleXmlApplicationContext(AbstractDelegatedExecutionApplicationContext).refresh() line: 159	
	LifecycleManager$1.run() line: 223	
	Thread.run() line: not available


The following call should never happen unless osgi.classloader.singleThreadLoads=true is set:

BundleLoader.lock(Object) line: 1284 <<< RED FLAG!! should NOT be called

Without this option, you can no longer reproduce this deadlock. I would like to close this bug as WORKSFORME. Please reopen it if you find another case of deadlock and we can confirm that the option osgi.classloader.singleThreadLoads=true is not being used. Thanks.
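
To confirm at runtime that the option is not in effect, a small check like the following sketch could be logged at startup. It rests on assumptions: the class SingleThreadLoadsCheck is hypothetical, the check treats any case-insensitive "true" as enabled (the framework's own parsing may differ), and it only sees the option if the framework configuration is visible through the properties passed in. Inside a bundle, BundleContext.getProperty with the same key is the OSGi way to read framework properties.

```java
import java.util.Properties;

// Hypothetical startup check; not part of Equinox.
class SingleThreadLoadsCheck {

    static final String OPTION = "osgi.classloader.singleThreadLoads";

    // Returns true if the option is set to "true" in the given properties.
    // Assumption: a missing or non-"true" value means the option is off.
    static boolean isEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(OPTION));
    }

    public static void main(String[] args) {
        // Assumption: the framework option is visible as a system property.
        boolean enabled = isEnabled(System.getProperties());
        System.out.println(OPTION + " enabled: " + enabled);
    }
}
```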