Bug 333385 - Concurrency problem when reading objects
Summary: Concurrency problem when reading objects
Status: RESOLVED NOT_ECLIPSE
Alias: None
Product: JGit
Classification: Technology
Component: JGit
Version: unspecified
Hardware: Macintosh Mac OS X - Carbon (unsup.)
Importance: P3 normal
Target Milestone: ---
Assignee: Project Inbox CLA
Reported: 2011-01-02 14:04 EST by Robin Rosenberg CLA
Modified: 2011-01-02 17:17 EST (History)
Description Robin Rosenberg CLA 2011-01-02 14:04:11 EST
Hi,

I get a strange problem when two threads try to access the same commit. Here are traces from inside the debugger. This is the very first commit that is read, so no complicated setup is needed. I do have alternate repositories (and other corrupt objects, but these are not touched from what I can see). The object being read here is not corrupt, but the inflater produces garbage.

This was not a problem before; it started all of a sudden after some experiments with rebase. The failing commit and its loose object file are a couple of weeks old.

Thread [Worker-4] (Suspended (exception CorruptObjectException))	
	Constants.decodeTypeString(AnyObjectId, byte[], byte, MutableInteger) line: 456	
	UnpackedObject.open(InputStream, File, AnyObjectId, WindowCursor) line: 120	
	ObjectDirectory.openObject2(WindowCursor, String, AnyObjectId) line: 447	
	ObjectDirectory(FileObjectDatabase).openObjectImpl2(WindowCursor, String, AnyObjectId) line: 191	
	ObjectDirectory(FileObjectDatabase).openObject(WindowCursor, AnyObjectId) line: 156	
	WindowCursor.open(AnyObjectId, int) line: 108	
	WindowCursor(ObjectReader).open(AnyObjectId) line: 228	
	RevWalk.parseAny(AnyObjectId) line: 811	
	RevWalk.parseCommit(AnyObjectId) line: 724	
	GitDocument.populate() line: 128	
	GitDocument.create(IResource) line: 59	
	GitQuickDiffProvider.getReference(IProgressMonitor) line: 74	
	DocumentLineDiffer$2.run(IProgressMonitor) line: 515	
	Worker.run() line: 55	
Thread [Worker-6] (Suspended (exception CorruptObjectException))	
	Constants.decodeTypeString(AnyObjectId, byte[], byte, MutableInteger) line: 456	
	UnpackedObject.open(InputStream, File, AnyObjectId, WindowCursor) line: 120	
	ObjectDirectory.openObject2(WindowCursor, String, AnyObjectId) line: 447	
	ObjectDirectory.openObject1(WindowCursor, AnyObjectId) line: 338	
	ObjectDirectory(FileObjectDatabase).openObjectImpl1(WindowCursor, AnyObjectId) line: 167	
	ObjectDirectory(FileObjectDatabase).openObject(WindowCursor, AnyObjectId) line: 152	
	WindowCursor.open(AnyObjectId, int) line: 108	
	WindowCursor(ObjectReader).open(AnyObjectId) line: 228	
	RevWalk.parseAny(AnyObjectId) line: 811	
	RevWalk.parseTree(AnyObjectId) line: 751	
	DecoratableResourceAdapter.createThreeWayTreeWalk() line: 417	
	DecoratableResourceAdapter.<init>(IResource) line: 114	
	GitLightweightDecorator.decorate(Object, IDecoration) line: 226	
	LightweightDecoratorDefinition.decorate(Object, IDecoration) line: 263	
	LightweightDecoratorManager$LightweightRunnable.run() line: 81	
	SafeRunner.run(ISafeRunnable) line: 42	
	LightweightDecoratorManager.decorate(Object, DecorationBuilder, LightweightDecoratorDefinition) line: 365	
	LightweightDecoratorManager.getDecorations(Object, DecorationBuilder) line: 347	
	DecorationScheduler$1.ensureResultCached(Object, boolean, IDecorationContext) line: 371	
	DecorationScheduler$1.run(IProgressMonitor) line: 331	
	Worker.run() line: 55	


For my own reference: commit id is: a3b8c5b7463c72ed68320e2fe094f039ce0f95d4
Comment 1 Shawn Pearce CLA 2011-01-02 16:14:24 EST
Can you reproduce this at will?

Can you look to see if the WindowCursor or the Inflater inside of them are the same object reference between the two threads?  That's the only way I can see that we would have data corruption during decompression of a loose object... the WindowCursor isn't thread-safe, but if EGit managed to reuse the same WindowCursor instance between two threads, you'd have the same Inflater instance in both threads, and the Inflater isn't thread-safe.
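
A minimal sketch of the check described above, outside of JGit: two threads each create their own java.util.zip.Inflater, decompress the same zlib data, and the references are compared afterwards. The class name InflaterPerThread and its helper methods are hypothetical and are not part of JGit or EGit; the sample data is built in memory rather than read from a repository.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not JGit code.
class InflaterPerThread {
    /** Compresses raw bytes with zlib, the format used for loose objects. */
    static byte[] deflate(byte[] raw) {
        Deflater d = new Deflater();
        d.setInput(raw);
        d.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[64];
        while (!d.finished()) {
            out.write(buf, 0, d.deflate(buf));
        }
        d.end();
        return out.toByteArray();
    }

    /** Decompresses z with the given Inflater, which must not be shared. */
    static byte[] inflate(Inflater inf, byte[] z) throws DataFormatException {
        inf.setInput(z);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[64];
        while (!inf.finished()) {
            int n = inf.inflate(buf);
            if (n == 0 && (inf.needsInput() || inf.needsDictionary())) {
                throw new DataFormatException("truncated or unsupported stream");
            }
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    /** True if both threads used distinct Inflaters and both recovered raw. */
    static boolean check(byte[] raw) throws InterruptedException {
        final byte[] z = deflate(raw);
        final Inflater[] used = new Inflater[2];
        final byte[][] results = new byte[2][];
        Thread[] workers = new Thread[2];
        for (int i = 0; i < 2; i++) {
            final int slot = i;
            workers[i] = new Thread(() -> {
                Inflater inf = new Inflater(); // one instance per thread
                used[slot] = inf;
                try {
                    results[slot] = inflate(inf, z);
                } catch (DataFormatException e) {
                    results[slot] = null; // leave as a mismatch
                } finally {
                    inf.end();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join(); // join makes the workers' writes visible here
        }
        return used[0] != used[1]
                && Arrays.equals(raw, results[0])
                && Arrays.equals(raw, results[1]);
    }
}
```

If check returns false even though the two Inflater references differ, the corruption is unlikely to come from a shared instance.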
Comment 2 Robin Rosenberg CLA 2011-01-02 16:37:15 EST
At the moment at least the problem appears every time. 

The WindowCursor and Inflater instances, as well as the streams, are different. The garbage produced is the same in both cases.

I had some printouts for a while, so it seems there are no other threads involved.
Comment 3 Robin Rosenberg CLA 2011-01-02 16:40:36 EST
D***. I was running an OpenJDK for OS X.
Comment 4 Robin Rosenberg CLA 2011-01-02 16:45:19 EST
The available OS X builds of OpenJDK are, I think, nothing we support.
Comment 5 Shawn Pearce CLA 2011-01-02 17:05:41 EST
(In reply to comment #4)
> The available OS X builds of OpenJDK are, I think, nothing we support.

I don't understand why this should be an issue.  Is the Inflater object in OpenJDK just horribly busted?  Similar to the JIT issue with the IBM JRE that caused problems with our unit tests?
Comment 6 Robin Rosenberg CLA 2011-01-02 17:11:41 EST
(In reply to comment #5)

I guess it is. This was a build of the 1.7 version so it is not production quality in any sense. 

Maybe someone could try on another platform and see. I'll hold my bug reports against the OS X build until official builds come out.
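
For anyone trying this on another JRE, a small standalone sketch that inflates a loose object with plain java.util.zip and reads the type from its "<type> <size>\0" header, without going through JGit. The class name LooseObjectCheck and its methods are hypothetical, and the path to the loose object file is an assumption to be supplied on the command line.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper, not JGit code.
class LooseObjectCheck {
    /** Inflates the stream and returns the object type from its header. */
    static String readType(InputStream compressed) throws IOException {
        try (InputStream in = new InflaterInputStream(compressed)) {
            StringBuilder header = new StringBuilder();
            int b;
            // The header ends at the first NUL byte; stop early on garbage.
            while ((b = in.read()) > 0) {
                header.append((char) b);
                if (header.length() > 32) {
                    break;
                }
            }
            if (b != 0) {
                throw new IOException("no NUL-terminated header: " + header);
            }
            int space = header.indexOf(" ");
            if (space < 0) {
                throw new IOException("malformed header: " + header);
            }
            String type = header.substring(0, space);
            if (!type.equals("commit") && !type.equals("tree")
                    && !type.equals("blob") && !type.equals("tag")) {
                throw new IOException("unknown object type: " + type);
            }
            return type;
        }
    }

    /** Builds zlib data in memory so the check can run without a repository. */
    static byte[] deflate(byte[] raw) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(bos)) {
            dos.write(raw);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: LooseObjectCheck <path to loose object file>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(readType(in));
        }
    }
}
```

Running it against the same loose object file on the suspect JRE and on a production JRE would show whether the header decodes differently.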
Comment 7 Shawn Pearce CLA 2011-01-02 17:17:20 EST
(In reply to comment #6)

OK.  I'm inclined to believe it's a JRE bug then, since it's a non-production JRE and you verified that the Inflater and WindowCursor instances were in fact different between the threads.  I'm surprised there is a problem with the JRE, though, since Inflater is an important part of the URLClassLoader code path.  Maybe they tried to optimize something and the optimization is being tripped up by the way we use Inflater.