Hi, I get a strange problem when two threads try to access the same commit. Here are traces from inside the debugger. This is the very first commit that is read, so no complicated setup is needed. I do have alternate repositories (and other corrupt objects, but these are not touched from what I can see). The object being read here is not corrupt, but the inflater produces garbage. This was not a problem before; it started all of a sudden after some experiments with rebase. The failing commit and its loose object file are a couple of weeks old.

Thread [Worker-4] (Suspended (exception CorruptObjectException))
    Constants.decodeTypeString(AnyObjectId, byte[], byte, MutableInteger) line: 456
    UnpackedObject.open(InputStream, File, AnyObjectId, WindowCursor) line: 120
    ObjectDirectory.openObject2(WindowCursor, String, AnyObjectId) line: 447
    ObjectDirectory(FileObjectDatabase).openObjectImpl2(WindowCursor, String, AnyObjectId) line: 191
    ObjectDirectory(FileObjectDatabase).openObject(WindowCursor, AnyObjectId) line: 156
    WindowCursor.open(AnyObjectId, int) line: 108
    WindowCursor(ObjectReader).open(AnyObjectId) line: 228
    RevWalk.parseAny(AnyObjectId) line: 811
    RevWalk.parseCommit(AnyObjectId) line: 724
    GitDocument.populate() line: 128
    GitDocument.create(IResource) line: 59
    GitQuickDiffProvider.getReference(IProgressMonitor) line: 74
    DocumentLineDiffer$2.run(IProgressMonitor) line: 515
    Worker.run() line: 55

Thread [Worker-6] (Suspended (exception CorruptObjectException))
    Constants.decodeTypeString(AnyObjectId, byte[], byte, MutableInteger) line: 456
    UnpackedObject.open(InputStream, File, AnyObjectId, WindowCursor) line: 120
    ObjectDirectory.openObject2(WindowCursor, String, AnyObjectId) line: 447
    ObjectDirectory.openObject1(WindowCursor, AnyObjectId) line: 338
    ObjectDirectory(FileObjectDatabase).openObjectImpl1(WindowCursor, AnyObjectId) line: 167
    ObjectDirectory(FileObjectDatabase).openObject(WindowCursor, AnyObjectId) line: 152
    WindowCursor.open(AnyObjectId, int) line: 108
    WindowCursor(ObjectReader).open(AnyObjectId) line: 228
    RevWalk.parseAny(AnyObjectId) line: 811
    RevWalk.parseTree(AnyObjectId) line: 751
    DecoratableResourceAdapter.createThreeWayTreeWalk() line: 417
    DecoratableResourceAdapter.<init>(IResource) line: 114
    GitLightweightDecorator.decorate(Object, IDecoration) line: 226
    LightweightDecoratorDefinition.decorate(Object, IDecoration) line: 263
    LightweightDecoratorManager$LightweightRunnable.run() line: 81
    SafeRunner.run(ISafeRunnable) line: 42
    LightweightDecoratorManager.decorate(Object, DecorationBuilder, LightweightDecoratorDefinition) line: 365
    LightweightDecoratorManager.getDecorations(Object, DecorationBuilder) line: 347
    DecorationScheduler$1.ensureResultCached(Object, boolean, IDecorationContext) line: 371
    DecorationScheduler$1.run(IProgressMonitor) line: 331
    Worker.run() line: 55

For my own reference, the commit id is: a3b8c5b7463c72ed68320e2fe094f039ce0f95d4
Can you reproduce this at will? Can you look to see if the WindowCursor or the Inflater inside of them are the same object reference between the two threads? That's the only way I can see that we would have data corruption during decompression of a loose object... the WindowCursor isn't thread-safe, but if EGit managed to reuse the same WindowCursor instance between two threads, you'd have the same Inflater instance in both threads, and the Inflater isn't thread-safe.
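To make the sharing concern concrete, here is a minimal sketch (not JGit's actual code) of one way to guarantee that each thread decompresses with its own java.util.zip.Inflater, using a ThreadLocal. The class name PerThreadInflater and its two helper methods are hypothetical and exist only for this illustration; Inflater, Deflater and ThreadLocal are the real JDK classes.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper for illustration only; not part of JGit or EGit.
class PerThreadInflater {
    // Each thread lazily gets its own Inflater, so no instance is ever shared.
    private static final ThreadLocal<Inflater> INFLATER =
            ThreadLocal.withInitial(Inflater::new);

    // Compresses raw bytes with zlib, the same framing Git uses for loose objects.
    static byte[] deflate(byte[] raw) {
        Deflater d = new Deflater();
        d.setInput(raw);
        d.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!d.finished()) {
            int n = d.deflate(buf);
            out.write(buf, 0, n);
        }
        d.end();
        return out.toByteArray();
    }

    // Inflates up to rawLength bytes using the calling thread's private Inflater.
    static byte[] inflate(byte[] compressed, int rawLength) {
        Inflater inf = INFLATER.get();
        inf.reset(); // discard any state left over from a previous call
        inf.setInput(compressed);
        byte[] out = new byte[rawLength];
        int off = 0;
        try {
            while (off < rawLength && !inf.finished()) {
                int n = inf.inflate(out, off, rawLength - off);
                if (n == 0) {
                    throw new IllegalStateException("stream ended early or needs a dictionary");
                }
                off += n;
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt zlib stream", e);
        }
        return out;
    }
}
```

If two threads did end up holding the same Inflater reference, interleaved setInput and inflate calls could mix their state, which is the kind of corruption being asked about here.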
At the moment, at least, the problem appears every time. The WindowCursor and Inflater instances, as well as the streams, are different. The garbage produced is the same in both cases. I had some printouts for a while, and it seems there are no other threads involved.
D***. I was running an OpenJDK for OS X.
The available OS X builds of OpenJDK are, I think, nothing we support.
(In reply to comment #4)
> The available OS X builds of OpenJDK are, I think, nothing we support.

I don't understand why this should be an issue. Is the Inflater object in OpenJDK just horribly busted? Similar to the JIT issue with the IBM JRE that caused problems with our unit tests?
(In reply to comment #5)
> I don't understand why this should be an issue. Is the Inflater object in
> OpenJDK just horribly busted? Similar to the JIT issue with the IBM JRE that
> caused problems with our unit tests?

I guess it is. This was a build of the 1.7 version, so it is not production quality in any sense.

Maybe someone could try on another platform and see. I will hold my bug reports on the OS X build until official builds come out.
(In reply to comment #6)
> I guess it is. This was a build of the 1.7 version, so it is not production
> quality in any sense.
>
> Maybe someone could try on another platform and see. I will hold my bug
> reports on the OS X build until official builds come out.

OK. I'm inclined to believe it's a JRE bug then, since it's a non-production JRE and you verified that the Inflater and WindowCursor instances were in fact different between the threads. I'm surprised there is a problem with the JRE, though, because Inflater is an important part of the URLClassLoader code path. Maybe they tried to optimize something and the optimization is being tripped up by the way we use Inflater.
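One way to probe the JRE theory outside of JGit would be a small standalone round trip through Deflater and Inflater that mimics a loose object header. The sketch below is an assumption-laden illustration, not a reproduction of this bug: the class name LooseHeaderCheck and its methods are hypothetical, and the sample payload is made up. It relies only on the documented loose object layout, a zlib stream that starts with the type name, a space, the size, and a NUL byte.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical standalone check; not taken from JGit.
class LooseHeaderCheck {
    // Compresses raw bytes into a zlib stream.
    static byte[] deflate(byte[] raw) {
        Deflater d = new Deflater();
        d.setInput(raw);
        d.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!d.finished()) {
            out.write(buf, 0, d.deflate(buf));
        }
        d.end();
        return out.toByteArray();
    }

    // Inflates the first bytes of the stream and returns the type name
    // that precedes the first space in the header.
    static String readType(byte[] compressed) {
        Inflater inf = new Inflater();
        try {
            inf.setInput(compressed);
            byte[] hdr = new byte[32];
            int n = 0;
            while (n < hdr.length && !inf.finished()) {
                int r = inf.inflate(hdr, n, hdr.length - n);
                if (r == 0) {
                    break; // no more output available from this input
                }
                n += r;
            }
            for (int i = 0; i < n; i++) {
                if (hdr[i] == ' ') {
                    return new String(hdr, 0, i, StandardCharsets.US_ASCII);
                }
            }
            throw new IllegalStateException("no type name found in header");
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt zlib stream", e);
        } finally {
            inf.end();
        }
    }

    public static void main(String[] args) {
        // Made-up sample object: header "commit 11" + NUL, then an 11-byte body.
        byte[] raw = "commit 11\0hello world".getBytes(StandardCharsets.US_ASCII);
        System.out.println(readType(deflate(raw)));
    }
}
```

Running this on the suspect JRE and on a known-good one, ideally against the bytes of the real loose object file rather than the sample payload, would help separate a broken Inflater from a problem in how it is being driven.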