Community
Participate
Working Groups
It seems that the hung up appears when generated id is not in the testcase in which the em.find() resides. It hung up: 1. At "dt = em.find(DateTime.class, datetimeId);" in testNullifySqlDate() in org.eclipse.persistence.testing.tests.jpa.datetime.NullBindingJUnitTestCase.java. 2. At "OneToOneVirtualAttributeHolder holder = createEntityManager().find(OneToOneVirtualAttributeHolder.class, id);" in testReadVirtualAttribute() in org.eclipse.persistence.testing.tests.jpa.relationships.VirtualAttributeTestSuite.java But it doesn't hung up: 1. At "Customer newOwen = em.find(Customer.class, owenId);" in selfReferencingManyToManyCreateTest() in org.eclipse.persistence.testing.tests.jpa.relationships.UniAndBiDirectionalMappingTestSuite.java Stacktrace in server log: [1/5/09 9:25:53:531 EST] 00000004 TimeoutManage I WTRN0006W: Transaction 0000011EA72DA6AB00000002000000029D3259B34B45596246888242073244077CB75B960000011EA72DA6AB00000002000000029D3259B34B45596246888242073244077CB75B9600000001 has timed out after 120 seconds. [1/5/09 9:25:53:531 EST] 00000004 TimeoutManage I WTRN0124I: When the timeout occurred the thread with which the transaction is, or was most recently, associated was Thread[ORB.thread.pool : 0,5,main]. The stack trace of this thread when the timeout occurred was: java.lang.Object.wait(Native Method) java.lang.Object.wait(Object.java:196) org.eclipse.persistence.internal.sessions.UnitOfWorkIdentityMapAccessor.getAndCloneCacheKeyFromParent(UnitOfWorkIdentityMapAccessor.java:146) org.eclipse.persistence.internal.sessions.UnitOfWorkIdentityMapAccessor.getFromIdentityMap(UnitOfWorkIdentityMapAccessor.java:110) org.eclipse.persistence.internal.queries.ExpressionQueryMechanism.checkCacheForObject(ExpressionQueryMechanism.java:730) org.eclipse.persistence.queries.ReadObjectQuery.checkEarlyReturnImpl(ReadObjectQuery.java:230) org.eclipse.persistence.queries.ObjectLevelReadQuery.checkEarlyReturn(ObjectLevelReadQuery.java:628) org.eclipse.persistence.queries.DatabaseQuery.execute(DatabaseQuery.java:599) org.eclipse.persistence.queries.ObjectLevelReadQuery.execute(ObjectLevelReadQuery.java:892) org.eclipse.persistence.queries.ReadObjectQuery.execute(ReadObjectQuery.java:398) org.eclipse.persistence.queries.ObjectLevelReadQuery.executeInUnitOfWork(ObjectLevelReadQuery.java:956) org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.internalExecuteQuery(UnitOfWorkImpl.java:2651) org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1181) org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1165) org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1125) org.eclipse.persistence.internal.jpa.EntityManagerImpl.executeQuery(EntityManagerImpl.java:562) org.eclipse.persistence.internal.jpa.EntityManagerImpl.findInternal(EntityManagerImpl.java:514) org.eclipse.persistence.internal.jpa.EntityManagerImpl.find(EntityManagerImpl.java:429) org.eclipse.persistence.internal.jpa.EntityManagerImpl.find(EntityManagerImpl.java:337) com.ibm.ws.jpa.management.JPATxEmInvocation.find(JPATxEmInvocation.java:211) com.ibm.ws.jpa.management.JPAEntityManager.find(JPAEntityManager.java:175) org.eclipse.persistence.testing.tests.jpa.datetime.NullBindingJUnitTestCase.testNullifySqlDate(NullBindingJUnitTestCase.java:101) sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:45) sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37) java.lang.reflect.Method.invoke(Method.java:599) junit.framework.TestCase.runTest(TestCase.java:168) junit.framework.TestCase.runBare(TestCase.java:134) org.eclipse.persistence.testing.framework.junit.JUnitTestCase.runBareServer(JUnitTestCase.java:397) org.eclipse.persistence.testing.framework.server.TestRunnerBean.runTest(TestRunnerBean.java:87) org.eclipse.persistence.testing.framework.server.EJSRemote0SLTestRunner_ed643271.runTest(EJSRemote0SLTestRunner_ed643271.java) org.eclipse.persistence.testing.framework.server._EJSRemote0SLTestRunner_ed643271_Tie.runTest(_EJSRemote0SLTestRunner_ed643271_Tie.java) org.eclipse.persistence.testing.framework.server._EJSRemote0SLTestRunner_ed643271_Tie._invoke(_EJSRemote0SLTestRunner_ed643271_Tie.java) com.ibm.CORBA.iiop.ServerDelegate.dispatchInvokeHandler(ServerDelegate.java:622) com.ibm.CORBA.iiop.ServerDelegate.dispatch(ServerDelegate.java:475) com.ibm.rmi.iiop.ORB.process(ORB.java:504) com.ibm.CORBA.iiop.ORB.process(ORB.java:1571) com.ibm.rmi.iiop.Connection.respondTo(Connection.java:2771) com.ibm.rmi.iiop.Connection.doWork(Connection.java:2640) com.ibm.rmi.iiop.WorkUnitImpl.doWork(WorkUnitImpl.java:63) com.ibm.ejs.oa.pool.PooledThread.run(ThreadPool.java:118) com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1527)
Since the bug stops me running all applicable tests on WebSphere, so I change the pripority from 3 to 2.
change the severity and pripority according to Tom's email.
Michael O'Brien starting reproduction and fix on 20090113
1) Reproduced the "hanging" behavior. The test suite is still stuck after 60 min. Instead of figuring out how to run WebSphere in debug mode.... java -Xdebug -Xnoagent -Djava.compiler=none -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=5005 -jar ....so I can attach to a debug port remotely from Eclipse, I will transition my WebLogic EAR to run this behavior from Eclipse so I can debug with/without the WAS 6.1 server plugin. Also, the IBM Jconsole implementation is unable to attach the the WAS process as I normally do with WebLogic. 2) We also need another shared library created for jpa\plugins\javax.persistence_2.0_preview.jar in the same manner as the one created for eclipselink.jar below configWebSphere.py library = AdminConfig.create('Library', n1, [['name', 'eclipselinkjpa'], ['classPath', '%%eclipselinkLibDir%%/eclipselink.jar']]) 3) I entered a separate bug so that we document the process of getting the tests running on WebSphere on Wiki. http://bugs.eclipse.org/261235 websphere.properties test.properties jndi.properties configWebSphere.py The last logs I see in WebSphere is the expected JDBC table creates... [1/15/09 16:23:49:009 EST] 00000021 SystemOut O [EL Warning]: 2009.01.15 16:23:49.009--ServerSession(1754556564)--Exception [EclipseLink-4002] (Eclipse Persistence Services - 1.1.0 (Build 20090113)): org.eclipse.persistence.exceptions.DatabaseException Internal Exception: java.sql.SQLException: ORA-00955: name is already used by an existing object Error Code: 955 Call: CREATE TABLE CMP3_ENTITYD_SEQ (SEQ_NAME VARCHAR2(50) NOT NULL, SEQ_COUNT NUMBER(38) NULL, PRIMARY KEY (SEQ_NAME)) Query: DataModifyQuery(sql="CREATE TABLE CMP3_ENTITYD_SEQ (SEQ_NAME VARCHAR2(50) NOT NULL, SEQ_COUNT NUMBER(38) NULL, PRIMARY KEY (SEQ_NAME))")
I restarted to force finalize() and this is when the timeout message appeared. A little different message than "TimeoutManage" but in the same place in UnitOfWorkIdentityMapAccessor.java:146 It looks like we are experiencing a deadlock in cacheKey.getMutex().wait(5) // milliseconds protected Object getAndCloneCacheKeyFromParent(Vector primaryKey, Class theClass, boolean shouldReturnInvalidatedObjects, ClassDescriptor descriptor) { ... if ((cacheKey != null) && (shouldReturnInvalidatedObjects || !descriptor.getCacheInvalidationPolicy().isInvalidated(cacheKey))) { synchronized (cacheKey.getMutex()) { //if the object in the cachekey is null but the key is acquired then //someone must be rebuilding it or creating a new one. Sleep until // it's finished. A plain wait here would be more efficient but we may not // get notified for quite some time (ie deadlock) if the other thread //is building the object. Must wait and not sleep in order for the monitor to be released objectFromCache = cacheKey.getObject(); try { while (cacheKey.isAcquired() && (objectFromCache == null)) { 146----> cacheKey.getMutex().wait(5); } } catch (InterruptedException ex) { } if (objectFromCache == null) { return null; } } } else { return null; } ------stacktrace----------- Error Code: 955 Call: CREATE TABLE CMP3_ENTITYD_SEQ (SEQ_NAME VARCHAR2(50) NOT NULL, SEQ_COUNT NUMBER(38) NULL, PRIMARY KEY (SEQ_NAME)) Query: DataModifyQuery(sql="CREATE TABLE CMP3_ENTITYD_SEQ (SEQ_NAME VARCHAR2(50) NOT NULL, SEQ_COUNT NUMBER(38) NULL, PRIMARY KEY (SEQ_NAME))") [1/15/09 16:36:15:781 EST] 00000016 ThreadMonitor W WSVR0605W: Thread "ORB.thread.pool : 0" (00000021) has been active for 746725 milliseconds and may be hung. There is/are 1 thread(s) in total in the server that may be hung. at java.lang.Object.wait(Native Method) at java.lang.Object.wait(Object.java:196) at org.eclipse.persistence.internal.sessions.UnitOfWorkIdentityMapAccessor.getAndCloneCacheKeyFromParent(UnitOfWorkIdentityMapAccessor.java:146) at org.eclipse.persistence.internal.sessions.UnitOfWorkIdentityMapAccessor.getFromIdentityMap(UnitOfWorkIdentityMapAccessor.java:110) at org.eclipse.persistence.internal.queries.ExpressionQueryMechanism.checkCacheForObject(ExpressionQueryMechanism.java:730) at org.eclipse.persistence.queries.ReadObjectQuery.checkEarlyReturnImpl(ReadObjectQuery.java:229) at org.eclipse.persistence.queries.ObjectLevelReadQuery.checkEarlyReturn(ObjectLevelReadQuery.java:627) at org.eclipse.persistence.queries.DatabaseQuery.execute(DatabaseQuery.java:613) at org.eclipse.persistence.queries.ObjectLevelReadQuery.execute(ObjectLevelReadQuery.java:891) at org.eclipse.persistence.queries.ReadObjectQuery.execute(ReadObjectQuery.java:397) at org.eclipse.persistence.queries.ObjectLevelReadQuery.executeInUnitOfWork(ObjectLevelReadQuery.java:954) at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.internalExecuteQuery(UnitOfWorkImpl.java:2651) at org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1181) at org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1165) at org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1125) at org.eclipse.persistence.internal.jpa.EntityManagerImpl.executeQuery(EntityManagerImpl.java:571) at org.eclipse.persistence.internal.jpa.EntityManagerImpl.findInternal(EntityManagerImpl.java:516) at org.eclipse.persistence.internal.jpa.EntityManagerImpl.find(EntityManagerImpl.java:431) at org.eclipse.persistence.internal.jpa.EntityManagerImpl.find(EntityManagerImpl.java:339) at com.ibm.ws.jpa.management.JPATxEmInvocation.find(JPATxEmInvocation.java:211) at com.ibm.ws.jpa.management.JPAEntityManager.find(JPAEntityManager.java:175) at org.eclipse.persistence.testing.tests.jpa.relationships.VirtualAttributeTestSuite.testReadVirtualAttribute(VirtualAttributeTestSuite.java:76) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:45) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37) at java.lang.reflect.Method.invoke(Method.java:599) at junit.framework.TestCase.runTest(TestCase.java:154) at junit.framework.TestCase.runBare(TestCase.java:127) at org.eclipse.persistence.testing.framework.junit.JUnitTestCase.runBareServer(JUnitTestCase.java:391) at org.eclipse.persistence.testing.framework.server.TestRunnerBean.runTest(TestRunnerBean.java:87) at org.eclipse.persistence.testing.framework.server.EJSRemote0SLTestRunner_ed643271.runTest(EJSRemote0SLTestRunner_ed643271.java) at org.eclipse.persistence.testing.framework.server._EJSRemote0SLTestRunner_ed643271_Tie.runTest(_EJSRemote0SLTestRunner_ed643271_Tie.java) at org.eclipse.persistence.testing.framework.server._EJSRemote0SLTestRunner_ed643271_Tie._invoke(_EJSRemote0SLTestRunner_ed643271_Tie.java) at com.ibm.CORBA.iiop.ServerDelegate.dispatchInvokeHandler(ServerDelegate.java:622) at com.ibm.CORBA.iiop.ServerDelegate.dispatch(ServerDelegate.java:475) at com.ibm.rmi.iiop.ORB.process(ORB.java:504) at com.ibm.CORBA.iiop.ORB.process(ORB.java:1571) at com.ibm.rmi.iiop.Connection.respondTo(Connection.java:2771) at com.ibm.rmi.iiop.Connection.doWork(Connection.java:2640) at com.ibm.rmi.iiop.WorkUnitImpl.doWork(WorkUnitImpl.java:63) at com.ibm.ejs.oa.pool.PooledThread.run(ThreadPool.java:118) at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1527) [1/15/09 17:36:22:877 EST] 00000019 RoleBasedAuth A SECJ0305I: The role-based authorization check failed for admin-authz operation Server:getProcessType. The user UNAUTHENTICATED (unique ID: unauthenticated) was not granted any of the following required roles: deployer, operator, configurator, monitor, administrator, adminsecuritymanager, auditor.
Just out of curiosity I left the treads running on the server - they never returned in 11 hours. [1/15/09 17:48:15:936 EST] 0000000e ThreadMonitor W WSVR0605W: Thread "ORB.thread.pool : 1" (00000022) has been active for 670586 milliseconds and may be hung. There is/are 2 thread(s) in total in the server that may be hung. Q) Do we see this in WebSphere 6.1?
The alt #1 workaround allows the tests to continue and emit the base server exception below: [junit] org.omg.CORBA.MARSHAL: Unable to read value from underlying bridge : Mismatched serialization UIDs : Source (RepId RMI:junit.framework.AssertionFailedError:2E921E9DEB124A7F:0000000000000001) = 0000000000000001 whereas Target (RepId RMI:junit.framework.AssertionFailedError:048A3675A6FD5463:A02206A4C2A2FD62) = A02206A4C2A2FD62 vmcid: IBM minor code: 896 completed: No) Also, there are (2) two nested deadlocks in our code. The second deadlock occurs periodically after the first has been fixed during testModifyItem() in ConcurrencyManager.releasesDeferredLock() 1/19/09 16:17:33:719 EST] 00000019 TimeoutManage I WTRN0006W: Transaction 0000011EF0BF93D7000000010000000F3FC544039A0DB96F61B909BB432079E1295E672A0000011EF0BF93D7000000010000000F3FC544039A0DB96F61B909BB432079E1295E672A00000001 has timed out after 120 seconds. [1/19/09 16:17:33:719 EST] 00000019 TimeoutManage I WTRN0124I: When the timeout occurred the thread with which the transaction is, or was most recently, associated was Thread[ORB.thread.pool : 0,5,main]. The stack trace of this thread when the timeout occurred was: java.lang.Thread.sleep(Native Method) java.lang.Thread.sleep(Thread.java:850) org.eclipse.persistence.internal.helper.ConcurrencyManager.releaseDeferredLock(ConcurrencyManager.java:429) org.eclipse.persistence.internal.helper.ConcurrencyManager.checkDeferredLock(ConcurrencyManager.java:193) org.eclipse.persistence.internal.identitymaps.CacheKey.checkDeferredLock(CacheKey.java:176) org.eclipse.persistence.internal.identitymaps.IdentityMapManager.getFromIdentityMapWithDeferredLock(IdentityMapManager.java:686) org.eclipse.persistence.internal.sessions.IdentityMapAccessor.getFromIdentityMapWithDeferredLock(IdentityMapAccessor.java:424) org.eclipse.persistence.internal.queries.ExpressionQueryMechanism.checkCacheForObject(ExpressionQueryMechanism.java:728) org.eclipse.persistence.queries.ReadObjectQuery.checkEarlyReturnImpl(ReadObjectQuery.java:229) org.eclipse.persistence.queries.ObjectLevelReadQuery.checkEarlyReturn(ObjectLevelReadQuery.java:620) org.eclipse.persistence.queries.ObjectLevelReadQuery.checkEarlyReturn(ObjectLevelReadQuery.java:644) org.eclipse.persistence.queries.DatabaseQuery.execute(DatabaseQuery.java:613) org.eclipse.persistence.queries.ObjectLevelReadQuery.execute(ObjectLevelReadQuery.java:891) org.eclipse.persistence.queries.ReadObjectQuery.execute(ReadObjectQuery.java:397) org.eclipse.persistence.queries.ObjectLevelReadQuery.executeInUnitOfWork(ObjectLevelReadQuery.java:954) org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.internalExecuteQuery(UnitOfWorkImpl.java:2651) org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1181) org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1165) org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1125) org.eclipse.persistence.internal.jpa.EntityManagerImpl.executeQuery(EntityManagerImpl.java:571) org.eclipse.persistence.internal.jpa.EntityManagerImpl.findInternal(EntityManagerImpl.java:516) org.eclipse.persistence.internal.jpa.EntityManagerImpl.find(EntityManagerImpl.java:431) org.eclipse.persistence.internal.jpa.EntityManagerImpl.find(EntityManagerImpl.java:339) com.ibm.ws.jpa.management.JPATxEmInvocation.find(JPATxEmInvocation.java:211) com.ibm.ws.jpa.management.JPAEntityManager.find(JPAEntityManager.java:175) org.eclipse.persistence.testing.tests.jpa.relationships.RelationshipModelJUnitTestSuite.testModifyItem(RelationshipModelJUnitTestSuite.java:119) sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:45) sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37) java.lang.reflect.Method.invoke(Method.java:599) junit.framework.TestCase.runTest(TestCase.java:154) junit.framework.TestCase.runBare(TestCase.java:127) org.eclipse.persistence.testing.framework.junit.JUnitTestCase.runBareServer(JUnitTestCase.java:391) org.eclipse.persistence.testing.framework.server.TestRunnerBean.runTest(TestRunnerBean.java:87) org.eclipse.persistence.testing.framework.server.EJSRemote0SLTestRunner_ed643271.runTest(EJSRemote0SLTestRunner_ed643271.java) org.eclipse.persistence.testing.framework.server._EJSRemote0SLTestRunner_ed643271_Tie.runTest(_EJSRemote0SLTestRunner_ed643271_Tie.java) org.eclipse.persistence.testing.framework.server._EJSRemote0SLTestRunner_ed643271_Tie._invoke(_EJSRemote0SLTestRunner_ed643271_Tie.java) com.ibm.CORBA.iiop.ServerDelegate.dispatchInvokeHandler(ServerDelegate.java:622) while (true) { // 2612538 - the default size of Map (32) is appropriate Map recursiveSet = new IdentityHashMap(); if (isBuildObjectOnThreadComplete(currentThread, recursiveSet)) {// Thread job done. lockManager.releaseActiveLocksOnThread(); removeDeferredLockManager(currentThread); AbstractSessionLog.getLog().log(SessionLog.FINER, "deferred_locks_released", currentThread.getName()); return; } else {// Not done yet, wait and check again. try { ----> Thread.sleep(1); } catch (InterruptedException ignoreAndContinue) { } } }
How to remote debug EclipseLink on WebSphere 7 using Eclipse (without an Eclipse Server Plugin for WebSphere) http://wiki.eclipse.org/EclipseLink/Examples/JPA/WebSphere7_Web_Tutorial#Remote_Debugging_EclipseLink_in_WebSphere_7.0_from_Eclipse_3.4
Created attachment 123234 [details] Alt#1 patch (for discussion and unblocking only) until Alt#3 is worked out This patch is for internal discussion only and will not be checked in until a full fix addressing the deadlock issue is discussed in alt #3 http://wiki.eclipse.org/EclipseLink/Bugs/259993#Alternative_1:_Short_Fix:_Do_Not_Wait_Indefinitely
This issue stays as Target Milestone=1.1 until we determine the severity of the underlying issue - starting with this section. http://wiki.eclipse.org/EclipseLink/Bugs/259993#The_activeThread_on_release.28.29_is_different_than_the_one_on_acquire.28.29
As of an update from 3202 to 3266 I am unable to reproduce the deadlock in this bug anymore. Both breakpoints are no longer hit. If I debug the acquire() and release() code - it works fine now. The change for bug# 261910 affected the same areas around readObjectQuiery() that were causing a deadlock that this bug# 259993 was experiencing. It looks like rev 3266 (20090122:0934) may have fixed this deadlock that occurs on a readQuery inside an em.find() after a recent persist/merge of the entity being referenced. I am unable to reproduce the deadlock that I could debug just before an SVN update to 3266 (I was previously at rev 3202) >20090122:1048: merging with deadlock during readlock during merge 261910 http://fisheye2.atlassian.com/changelog/eclipselink/?cs=3266 I am retesting all configurations, but the current configuration that works is - compile test and eclipselink.jar with the IBM J9 jvm in WebSphere 7 - run server in debug mode - run tests with/without eclipse debugger attached Previously before 3266 I would hit the following breakpoint in an infinite loop until I modified depth --> 0 in the Eclipse debugger. while (cacheKey.isAcquired() && (objectFromCache == null)) { cacheKey.getMutex().wait(MUTEX_STILL_ACQUIRED_WAIT_MILLISECONDS_TIME); } mutex ConcurrencyManager (id=287) activeThread ThreadPool$Worker (id=133) depth 1 lockedByMergeManager true numberOfReaders 0 numberOfWritersWaiting 0 ownerCacheKey HardCacheWeakIdentityMap$ReferenceCacheKey (id=288) -------------test results--------------------- >prereq: - I installed as a service - but it is still scriptable with server-start/stop - I installed with no admin security (no more popups) - I went into the admin console and selected the debug flag - the server runs a bit slower and can be attached to by a debug client - also I compiled the test source and eclipselink.jar using the J9 jvm from websphere I will try with regular SUN JDK 1.6.0_07 compiled code >With Eclipse Debugger attached - using F8 step through server-run-test: [echo] Running test eclipselink-relationships-model on websphere Warning: Reference compile.server.path has not been set at runtime, but was found during build file parsing, attempting to resolve. Future versions of Ant may support referencing ids defined in non-executed targets. Warning: Reference compile.path has not been set at runtime, but was found during build file parsing, attempting to resolve. Future versions of Ant may support referencing ids defined in non-executed targets. [junit] Running org.eclipse.persistence.testing.tests.jpa.relationships.ServerTestSuite [junit] org.omg.CORBA.MARSHAL: Unable to read value from underlying bridge : Mismatched serialization UIDs : Sou rce (RepId RMI:junit.framework.AssertionFailedError:2E921E9DEB124A7F:0000000000000001) = 0000000000000001 whereas Target (RepId RMI:junit.framework.AssertionFailedError:048A3675A6FD5463:A02206A4C2A2FD62) = A02206A4C2A2FD62 vmcid: IBM mino r code: 896 completed: No) [junit] org.omg.CORBA.MARSHAL: Unable to read value from underlying bridge : Mismatched serialization UIDs : Sou rce (RepId RMI:junit.framework.AssertionFailedError:2E921E9DEB124A7F:0000000000000001) = 0000000000000001 whereas Target (RepId RMI:junit.framework.AssertionFailedError:048A3675A6FD5463:A02206A4C2A2FD62) = A02206A4C2A2FD62 vmcid: IBM mino r code: 896 completed: No) [junit] Tests run: 13, Failures: 0, Errors: 2, Time elapsed: 153.484 sec [junit] Test org.eclipse.persistence.testing.tests.jpa.relationships.ServerTestSuite FAILED Total time: 6 minutes 17 seconds >With no debugger attached - only WebSphere server running in debug mode [junit] Tests run: 13, Failures: 0, Errors: 2, Time elapsed: 18.095 sec >stop server >restart server >now I hang again on the following (I did a build but I am turning off auto-compilation in eclipse and rebuilding) [1/22/09 12:06:30:643 EST] 00000019 SystemOut O [EL Finest]: 2009.01.22 12:06:30.643--UnitOfWork(970209748)--Thread(Thread[ORB.thread.pool : 0,5,main])--Execute query ReadObjectQuery(referenceClass=OneToOneVirtualAttributeHolder sql="SELECT O_O_VIRTUALID, VIRTUAL_ID FROM O_O_VIRTUAL WHERE (O_O_VIRTUALID = ?)") >attach debugger - no break points hit >rerun test with debugger attached - working >rerun test with debugger attached but all breakpoints disabled except at the 2 deadlock while loops - no bp hit - working >rerun test with debugger above 2nd time to verify theory of every second run hangs - working [junit] Tests run: 13, Failures: 0, Errors: 2, Time elapsed: 18.673 sec >rerun test with debugger detached again - working [junit] Tests run: 13, Failures: 0, Errors: 2, Time elapsed: 18.283 sec >stop server >restart server >rerun test without debugger detached - working [junit] Tests run: 13, Failures: 0, Errors: 2, Time elapsed: 18.174 sec >rerun test without debugger detached - working [junit] Tests run: 13, Failures: 0, Errors: 2, Time elapsed: 13.9 sec >stop server >restart server >rerun test without debugger detached - working [junit] Tests run: 13, Failures: 0, Errors: 2, Time elapsed: 17.102 sec >recompile test/eclipselink.jar using the SUN JDK 1.6.0_07 instaed of the IBM J9 1.6.0 JVM - pending I have asked Yiping to reverify with 3266.
Correction I was previously failing with svn rev 3220 (16 Jan 2009), 3266 is working fine. >stop server >recompile test/eclipselink.jar using the SUN JDK 1.6.0_07 instead of the IBM J9 1.6.0 JVM (we still run the server on J9 though) >restart server >rerun test (server in debug mode) without debugger detached - working [junit] Tests run: 13, Failures: 0, Errors: 2, Time elapsed: 28.303 sec >stop server >We should reverify that we are failing with the current test code using an old eclipselink.jar take old jar from 3220 (14 Jan 2009) and retest - expect to hang with a deadlock >restart server >rerun test (server in debug mode) without debugger detached - hanging = OK ---> as expected on an old 3220 label [junit] Running org.eclipse.persistence.testing.tests.jpa.relationships.ServerTestSuite Terminate batch job (Y/N)? y >stop server >restart server to recover deadlocked threads >stop server
After Yiping tested - it is verified that we no longer get the first deadlock in http://wiki.eclipse.org/EclipseLink/Bugs/259993#Timeout_during_UnitOfWorkIdentityMapAccessor.getAndCloneCacheKeyFromParent.28.29 but we still get the 2nd deadlock (that i intermittently saw only once) in http://wiki.eclipse.org/EclipseLink/Bugs/259993#Timeout_during_ConcurrencyManager.releaseDeferredLock.28.29 when running ant server-test-fieldaccess-advanced I see the following new finer log message from 3226 at the end which allows us to continue and loop in ConcurrencyManager [1/22/09 15:35:44:016 EST] 0000001e SystemOut O [EL Finer]: 2009.01.22 15:35:44.016--UnitOfWork(1534811003)--Thread(Thread[ORB.thread.pool : 0,5,main])--begin unit of work commit [1/22/09 15:35:44:031 EST] 0000001e SystemOut O [EL Finer]: 2009.01.22 15:35:44.031--UnitOfWork(1534811003)--Thread(Thread[ORB.thread.pool : 0,5,main])--initialize identitymaps [1/22/09 15:35:44:031 EST] 0000001e SystemOut O [EL Finer]: 2009.01.22 15:35:44.031--UnitOfWork(1534811003)--Thread(Thread[ORB.thread.pool : 0,5,main])--TX afterCompletion callback, status=COMMITTED [1/22/09 15:35:44:031 EST] 0000001e SystemOut O [EL Finer]: 2009.01.22 15:35:44.031--ServerSession(282857692)--Thread(Thread[ORB.thread.pool : 0,5,main])--client acquired [1/22/09 15:35:44:031 EST] 0000001e SystemOut O [EL Finest]: 2009.01.22 15:35:44.031--UnitOfWork(259264372)--Thread(Thread[ORB.thread.pool : 0,5,main])--Execute query ReadAllQuery(referenceClass=Employee sql="SELECT t1.EMP_ID, t2.EMP_ID, t1.GENDER, t1.L_NAME, t1.PAY_SCALE, t1.ROOM_NUM, t2.SALARY, t1.STATUS, t1.VERSION, t1.F_NAME, t1.START_DATE, t1.END_DATE, t1.FORMER_COMPANY, t1.FORMER_START_DATE, t1.FORMER_END_DATE, t1.DEPT_ID, t1.ADDR_ID, t1.MANAGER_EMP_ID, t0.ID, t0.NAME FROM CMP3_FA_DEPT t0, CMP3_FA_SALARY t2, CMP3_FA_EMPLOYEE t1 WHERE ((t2.EMP_ID = t1.EMP_ID) AND (t0.ID (+) = t1.DEPT_ID))") [1/22/09 15:35:44:031 EST] 0000001e SystemOut O [EL Finest]: 2009.01.22 15:35:44.031--ServerSession(282857692)--Thread(Thread[ORB.thread.pool : 0,5,main])--reconnecting to external connection pool [1/22/09 15:35:44:031 EST] 0000001e SystemOut O [EL Fine]: 2009.01.22 15:35:44.031--ServerSession(282857692)--Connection(1384141440)--Thread(Thread[ORB.thread.pool : 0,5,main])--SELECT t1.EMP_ID, t2.EMP_ID, t1.GENDER, t1.L_NAME, t1.PAY_SCALE, t1.ROOM_NUM, t2.SALARY, t1.STATUS, t1.VERSION, t1.F_NAME, t1.START_DATE, t1.END_DATE, t1.FORMER_COMPANY, t1.FORMER_START_DATE, t1.FORMER_END_DATE, t1.DEPT_ID, t1.ADDR_ID, t1.MANAGER_EMP_ID, t0.ID, t0.NAME FROM CMP3_FA_DEPT t0, CMP3_FA_SALARY t2, CMP3_FA_EMPLOYEE t1 WHERE ((t2.EMP_ID = t1.EMP_ID) AND (t0.ID (+) = t1.DEPT_ID)) [1/22/09 15:35:44:047 EST] 0000001e SystemOut O [EL Finest]: 2009.01.22 15:35:44.047--UnitOfWork(259264372)--Thread(Thread[ORB.thread.pool : 0,5,main])--Register the existing object Employee: 3834 --->[1/22/09 15:35:44:047 EST] 0000001e SystemOut O [EL Finer]: 2009.01.22 15:35:44.047--Thread(Thread[ORB.thread.pool : 0,5,main])--Thread "ORB.thread.pool : 0" has acquired a deferred lock on object : null in order to avoid deadlock. server-run-test: [echo] Running test eclipselink-advanced-field-access-model on websphere Warning: Reference compile.server.path has not been set at runtime, but was found during build file parsing, attempting to resolve. Future versions of Ant may support referencing ids defined in non-executed targets. Warning: Reference compile.path has not been set at runtime, but was found during build file parsing, attempting to resolve. Future versions of Ant may support referencing ids defined in non-executed targets. [junit] Running org.eclipse.persistence.testing.tests.jpa.fieldaccess.advanced.ServerTestSuite [junit] org.omg.CORBA.MARSHAL: Unable to read value from underlying bridge : Mismatched serialization UIDs : Sou rce (RepId RMI:junit.framework.AssertionFailedError:2E921E9DEB124A7F:0000000000000001) = 0000000000000001 whereas Target (RepId RMI:junit.framework.AssertionFailedError:048A3675A6FD5463:A02206A4C2A2FD62) = A02206A4C2A2FD62 vmcid: IBM mino r code: 896 completed: No) [junit] org.omg.CORBA.MARSHAL: Unable to read value from underlying bridge : Mismatched serialization UIDs : Sou rce (RepId RMI:junit.framework.AssertionFailedError:2E921E9DEB124A7F:0000000000000001) = 0000000000000001 whereas Target (RepId RMI:junit.framework.AssertionFailedError:048A3675A6FD5463:A02206A4C2A2FD62) = A02206A4C2A2FD62 vmcid: IBM mino r code: 896 completed: No) [junit] org.omg.CORBA.MARSHAL: Unable to read value from underlying bridge : Mismatched serialization UIDs : Sou rce (RepId RMI:junit.framework.AssertionFailedError:2E921E9DEB124A7F:0000000000000001) = 0000000000000001 whereas Target (RepId RMI:junit.framework.AssertionFailedError:048A3675A6FD5463:A02206A4C2A2FD62) = A02206A4C2A2FD62 vmcid: IBM mino r code: 896 completed: No) [junit] org.omg.CORBA.MARSHAL: Unable to read value from underlying bridge : Mismatched serialization UIDs : Sou rce (RepId RMI:junit.framework.AssertionFailedError:2E921E9DEB124A7F:0000000000000001) = 0000000000000001 whereas Target (RepId RMI:junit.framework.AssertionFailedError:048A3675A6FD5463:A02206A4C2A2FD62) = A02206A4C2A2FD62 vmcid: IBM mino r code: 896 completed: No) [junit] >> SERVER (id=4773e3aa, host=mobrien-pc2.st-mdc.ca.oracle.com) TRACE START: [junit] >> javax.ejb.EJBTransactionRolledbackException [junit] >> at com.ibm.ejs.csi.BeanManaged.postInvoke(BeanManaged.java:329) [junit] >> at com.ibm.ejs.csi.TransactionControlImpl.postInvoke(TransactionControlImpl.java:561) [junit] >> at com.ibm.ejs.container.EJSContainer.postInvoke(EJSContainer.java:4512) [junit] >> at org.eclipse.persistence.testing.framework.server.EJSRemote0SLTestRunner_ed643271.runTest(EJS Remote0SLTestRunner_ed643271.java) [junit] >> at org.eclipse.persistence.testing.framework.server._EJSRemote0SLTestRunner_ed643271_Tie.runTes t(_EJSRemote0SLTestRunner_ed643271_Tie.java) [junit] >> at org.eclipse.persistence.testing.framework.server._EJSRemote0SLTestRunner_ed643271_Tie._invok e(_EJSRemote0SLTestRunner_ed643271_Tie.java) [junit] >> at com.ibm.CORBA.iiop.ServerDelegate.dispatchInvokeHandler(ServerDelegate.java:622) [junit] >> at com.ibm.CORBA.iiop.ServerDelegate.dispatch(ServerDelegate.java:475) [junit] >> at com.ibm.rmi.iiop.ORB.process(ORB.java:504) [junit] >> at com.ibm.CORBA.iiop.ORB.process(ORB.java:1571) [junit] >> at com.ibm.rmi.iiop.Connection.respondTo(Connection.java:2771) [junit] >> at com.ibm.rmi.iiop.Connection.doWork(Connection.java:2640) [junit] >> at com.ibm.rmi.iiop.WorkUnitImpl.doWork(WorkUnitImpl.java:63) [junit] >> at com.ibm.ejs.oa.pool.PooledThread.run(ThreadPool.java:118) [junit] >> at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1527) [junit] >> SERVER (id=4773e3aa, host=mobrien-pc2.st-mdc.ca.oracle.com) TRACE END. [junit] ) [junit] org.omg.CORBA.MARSHAL: Unable to read value from underlying bridge : Mismatched serialization UIDs : Sou rce (RepId RMI:junit.framework.AssertionFailedError:2E921E9DEB124A7F:0000000000000001) = 0000000000000001 whereas Target (RepId RMI:junit.framework.AssertionFailedError:048A3675A6FD5463:A02206A4C2A2FD62) = A02206A4C2A2FD62 vmcid: IBM mino r code: 896 completed: No) [junit] org.omg.CORBA.MARSHAL: Unable to read value from underlying bridge : Mismatched serialization UIDs : Sou rce (RepId RMI:junit.framework.AssertionFailedError:2E921E9DEB124A7F:0000000000000001) = 0000000000000001 whereas Target (RepId RMI:junit.framework.AssertionFailedError:048A3675A6FD5463:A02206A4C2A2FD62) = A02206A4C2A2FD62 vmcid: IBM mino r code: 896 completed: No) [junit] Exception Description: Cannot persist detached object [Employee: 3828]. [junit] Class> org.eclipse.persistence.testing.models.jpa.fieldaccess.advanced.Employee Primary Key> [3828]) -->deadlock here Summarizing: For some models that failed on both deadlocks - they fully pass, some other models only fail on the 2nd deadlock - so we have progress Before 3226 ----------- Hangs ant server-test-relationships ant server-test-datetime ? ant server-test-fieldaccess-advanced After 3226 ----------- Hangs ant server-test-fieldaccess-advanced OK ant server-test-relationships ant server-test-datetime
>3rd time through internalTestUpdateAllProjects Daemon Thread [ORB.thread.pool : 1] (Suspended (breakpoint at line 434 in ConcurrencyManager)) ConcurrencyManager.releaseDeferredLock() line: 434 HardCacheWeakIdentityMap$ReferenceCacheKey(CacheKey).releaseDeferredLock() line: 413 ObjectBuilder.buildObject(ObjectBuildingQuery, AbstractRecord, AbstractSession, Vector, ClassDescriptor, JoinedAttributeManager) line: 769 ObjectBuilder.buildWorkingCopyCloneNormally(ObjectBuildingQuery, AbstractRecord, UnitOfWorkImpl, Vector, ClassDescriptor, JoinedAttributeManager) line: 574 ObjectBuilder.buildObjectInUnitOfWork(ObjectBuildingQuery, JoinedAttributeManager, AbstractRecord, UnitOfWorkImpl, Vector, ClassDescriptor) line: 544 ObjectBuilder.buildObject(ObjectBuildingQuery, AbstractRecord, JoinedAttributeManager) line: 485 ObjectBuilder.buildObject(ObjectLevelReadQuery, AbstractRecord) line: 437 ReadAllQuery(ObjectLevelReadQuery).buildObject(AbstractRecord) line: 571 ReadAllQuery.registerResultInUnitOfWork(Object, UnitOfWorkImpl, AbstractRecord, boolean) line: 903 ReadAllQuery.executeObjectLevelReadQuery() line: 489 ReadAllQuery(ObjectLevelReadQuery).executeDatabaseQuery() line: 930 ReadAllQuery(DatabaseQuery).execute(AbstractSession, AbstractRecord) line: 664 ReadAllQuery(ObjectLevelReadQuery).execute(AbstractSession, AbstractRecord) line: 891 ReadAllQuery.execute(AbstractSession, AbstractRecord) line: 457 ReadAllQuery(ObjectLevelReadQuery).executeInUnitOfWork(UnitOfWorkImpl, AbstractRecord) line: 954 RepeatableWriteUnitOfWork(UnitOfWorkImpl).internalExecuteQuery(DatabaseQuery, AbstractRecord) line: 2664 RepeatableWriteUnitOfWork(AbstractSession).executeQuery(DatabaseQuery, AbstractRecord, int) line: 1181 RepeatableWriteUnitOfWork(AbstractSession).executeQuery(DatabaseQuery, AbstractRecord) line: 1165 RepeatableWriteUnitOfWork(AbstractSession).executeQuery(DatabaseQuery, List) line: 1139 EJBQueryImpl.executeReadQuery() line: 392 EJBQueryImpl.getResultList() line: 592 EntityManagerJUnitTestSuite.internalTestUpdateAllProjects(Class) line: 3742 EntityManagerJUnitTestSuite.testUpdateAllSmallProjects() line: 3720 NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method] NativeMethodAccessorImpl.invoke(Object, Object[]) line: 45 DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 37 Method.invoke(Object, Object...) line: 599 EntityManagerJUnitTestSuite(TestCase).runTest() line: 154 EntityManagerJUnitTestSuite(TestCase).runBare() line: 127 EntityManagerJUnitTestSuite(JUnitTestCase).runBareServer() line: 391 TestRunnerBean.runTest(String, String, Properties) line: 87 EJSRemote0SLTestRunner_ed643271.runTest(String, String, Properties) line: not available _EJSRemote0SLTestRunner_ed643271_Tie.runTest(InputStream, ResponseHandler) line: not available _EJSRemote0SLTestRunner_ed643271_Tie._invoke(String, InputStream, ResponseHandler) line: not available WSExtendedServerDelegate(ServerDelegate).dispatchInvokeHandler(InterceptorManager, ServerRequestInfoImpl, String, Object, ServerRequestReader) line: 622 WSExtendedServerDelegate(ServerDelegate).dispatch(ServerRequest) line: 475 ORB(ORB).process(ServerRequest) line: 504 ORB.process(ServerRequest) line: 1571 Connection.respondTo(RequestHandler, IIOPReader) line: 2771 Connection.doWork(RequestHandler, IIOPReader) line: 2640 WorkUnitImpl.doWork() line: 63 PooledThread.run() line: 118 ThreadPool$Worker.run() line: 1527 this ConcurrencyManager (id=169) activeThread ThreadPool$Worker (id=96) depth 1 lockedByMergeManager false numberOfReaders 0 numberOfWritersWaiting 0 ownerCacheKey HardCacheWeakIdentityMap$ReferenceCacheKey (id=170) recursiveSet IdentityHashMap<K,V> (id=185) currentThread ThreadPool$Worker (id=162) lockManager DeferredLockManager (id=187) activeLocks Vector<E> (id=192) deferredLocks Vector<E> (id=200) isThreadComplete true threadDepth 1 depth 1 iterations 0 [1/22/09 16:17:10:366 EST] 0000001b SystemOut O [EL Finer]: 2009.01.22 16:17:10.366--Thread(Thread[ORB.thread.pool : 1,5,main])--Thread "ORB.thread.pool : 1" has acquired a deferred lock on object : org.eclipse.persistence.testing.models.jpa.fieldaccess.advanced.LargeProject@1da81da8 in order to avoid deadlock. ConcurrencyManager public static boolean isBuildObjectOnThreadComplete(Thread thread, Map recursiveSet) { // the active thread may be set to null at anypoint // if added for CR 2330 if (activeThread != null) { DeferredLockManager currentLockManager = getDeferredLockManager(activeThread); if (currentLockManager == null) { --> return false; public void releaseDeferredLock() throws ConcurrencyException { while (true) { // 2612538 - the default size of Map (32) is appropriate Map recursiveSet = new IdentityHashMap(); if (isBuildObjectOnThreadComplete(currentThread, recursiveSet)) {// Thread job done. lockManager.releaseActiveLocksOnThread(); removeDeferredLockManager(currentThread); AbstractSessionLog.getLog().log(SessionLog.FINER, "deferred_locks_released", currentThread.getName()); return; } else {// Not done yet, wait and check again. try { --> Thread.sleep(DEFERRED_LOCK_SLEEP_MILLISECONDS_TIME); AbstractSessionLog.getLog().log(SessionLog.FINEST, "timeout_waiting_for_deferred_locks_release", iterations++ );//currentThread.getName()); } catch (InterruptedException ignoreAndContinue) { ignoreAndContinue.printStackTrace(); } } } Again, same issue here except that we are no longer lockedByMergeManager=true, but our depth is still > 0 so isAcquired() will return true. If while we are debugging we set depth=0 - we exit the loop we get [junit] Exception Description: Max number of attempts to lock object: org.eclipse.persistence.testing.models.jpa.fie ldaccess.advanced.LargeProject@1da81da8 exceded. Failed to clone the object.) 2) However late we get the original first deadlock on UnitOfWorkItentityMapAccessor for a getReference call (which is more restrictive than a find()) public void testSetFieldForPropertyAccessWithRefresh() { ... employee = em.getReference(Employee.class, employee.getId()); We need the same type of fix that does not lock the mutex during a read in getReference() mutex ConcurrencyManager (id=593) activeThread ThreadPool$Worker (id=96) depth 1 lockedByMergeManager true numberOfReaders 0 numberOfWritersWaiting 0 ownerCacheKey HardCacheWeakIdentityMap$ReferenceCacheKey (id=569) Daemon Thread [ORB.thread.pool : 0] (Suspended (breakpoint at line 154 in UnitOfWorkIdentityMapAccessor)) UnitOfWorkIdentityMapAccessor.getAndCloneCacheKeyFromParent(Vector, Class, boolean, ClassDescriptor) line: 154 UnitOfWorkIdentityMapAccessor.getFromIdentityMap(Vector, Class, boolean, ClassDescriptor, boolean) line: 116 UnitOfWorkIdentityMapAccessor(IdentityMapAccessor).getFromIdentityMap(Vector, Class, boolean) line: 327 UnitOfWorkIdentityMapAccessor(IdentityMapAccessor).getFromIdentityMap(Vector, Class) line: 310 RepeatableWriteUnitOfWork(UnitOfWorkImpl).getReference(Class, Object) line: 5375 EntityManagerImpl.getReference(Class<T>, Object) line: 973 JPATxEmInvocation.getReference(Class<T>, Object) line: 262 JPATxEntityManager(JPAEntityManager).getReference(Class<T>, Object) line: 188 EntityManagerJUnitTestSuite.testSetFieldForPropertyAccessWithRefresh() line: 3346 NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method] NativeMethodAccessorImpl.invoke(Object, Object[]) line: 45 DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 37 Method.invoke(Object, Object...) line: 599 EntityManagerJUnitTestSuite(TestCase).runTest() line: 154 EntityManagerJUnitTestSuite(TestCase).runBare() line: 127 EntityManagerJUnitTestSuite(JUnitTestCase).runBareServer() line: 391 TestRunnerBean.runTest(String, String, Properties) line: 87 EJSRemote0SLTestRunner_ed643271.runTest(String, String, Properties) line: not available _EJSRemote0SLTestRunner_ed643271_Tie.runTest(InputStream, ResponseHandler) line: not available _EJSRemote0SLTestRunner_ed643271_Tie._invoke(String, InputStream, ResponseHandler) line: not available WSExtendedServerDelegate(ServerDelegate).dispatchInvokeHandler(InterceptorManager, ServerRequestInfoImpl, String, Object, ServerRequestReader) line: 622 WSExtendedServerDelegate(ServerDelegate).dispatch(ServerRequest) line: 475 ORB(ORB).process(ServerRequest) line: 504 ORB.process(ServerRequest) line: 1571 Connection.respondTo(RequestHandler, IIOPReader) line: 2771 Connection.doWork(RequestHandler, IIOPReader) line: 2640 WorkUnitImpl.doWork() line: 63 PooledThread.run() line: 118 ThreadPool$Worker.run() line: 1527
There are 2 deadlocks here 1) readObjectQuery without merge changes 2) readObjectQuery with merge changes The deadlock in (1) has been fixed in bug# 261910 in rev 3266 yesterday 20090122 and was part of bug# 259993. http://bugs.eclipse.org/261910 The deadlock in (2) is being fixed as the 2nd part of 259993 - in progress see: http://bugs.eclipse.org/259993 http://wiki.eclipse.org/EclipseLink/Bugs/259993 See possibly related new GlassFish bug# 262157
Rev 3322 : Retesting with http://fisheye2.atlassian.com/changelog/eclipselink/?cs=3322 The fix in ref 3322 has reversed test results. Where before we were only hanging in ConcurrencyManager, we now only hang in UnitOfWorkIdentityMapAccessor. ant server-test-fieldaccess-advanced We only hang on 1 test instead of 2 - in UnitOfWorkIdentityMapAccessor.getAndCloneCacheKeyFromParent(). We no longer hang on the secondary timeout in ConcurrencyManager.releaseDeferredLock() ant server-test-relationships We hang on 7 tests - all in the same place in UnitOfWorkIdentityMapAccessor.getAndCloneCacheKeyFromParent(). We no longer hang on ConcurrencyManager.releaseDeferredLock() Summary: No longer occurs http://wiki.eclipse.org/EclipseLink/Bugs/259993#Timeout_during_ConcurrencyManager.releaseDeferredLock.28.29 Still occurring - depth on single thread remains == 1 http://wiki.eclipse.org/EclipseLink/Bugs/259993#Timeout_during_UnitOfWorkIdentityMapAccessor.getAndCloneCacheKeyFromParent.28.29
Created attachment 124091 [details] Hanging tests 7 + 1 for UnitOfWorkIdentityMapAccessor.getAndCloneCacheKeyFromParent The following stack and variable trace is for each of the 7 (server-test-fieldaccess-advanced) and 1 (server-test-relationships) remaining hung test threads. - we no longer see 2 threads causing issues with ConcurrencyManager.releaseDeferredLock() - when debugging the tests, we exit the hung while loop by manually changing mutex.depth from 1 to 0 so we can continue to the next test.
New Deadlock #3 I had the following suite commented - when it is enabled we now get a new 3rd type of hang waiting indefinitely for a notify() or notifyAll() on the current thread during a find/ReadObjectQuery suite.addTest(VirtualAttributeTestSuite.suite()); Daemon Thread [ORB.thread.pool : 1] (Running) Daemon Thread [ORB.thread.pool : 0] (Suspended) Object.wait(long, int) line: not available [native method] ConcurrencyManager(Object).wait() line: 167 ConcurrencyManager.acquireReadLock() line: 242 [local variables unavailable] ConcurrencyManager.checkReadLock() line: 230 HardCacheWeakIdentityMap$ReferenceCacheKey(CacheKey).checkReadLock() line: 180 IdentityMapManager.getFromIdentityMap(Vector, Class, boolean, ClassDescriptor) line: 610 IdentityMapAccessor.getFromIdentityMap(Vector, Class, boolean, ClassDescriptor) line: 345 ExpressionQueryMechanism.checkCacheForObject(AbstractRecord, AbstractSession) line: 730 ReadObjectQuery.checkEarlyReturnImpl(AbstractSession, AbstractRecord) line: 229 ReadObjectQuery(ObjectLevelReadQuery).checkEarlyReturn(AbstractSession, AbstractRecord) line: 620 ReadObjectQuery(ObjectLevelReadQuery).checkEarlyReturn(AbstractSession, AbstractRecord) line: 644 ReadObjectQuery(DatabaseQuery).execute(AbstractSession, AbstractRecord) line: 613 ReadObjectQuery(ObjectLevelReadQuery).execute(AbstractSession, AbstractRecord) line: 891 ReadObjectQuery.execute(AbstractSession, AbstractRecord) line: 397 ReadObjectQuery(ObjectLevelReadQuery).executeInUnitOfWork(UnitOfWorkImpl, AbstractRecord) line: 954 RepeatableWriteUnitOfWork(UnitOfWorkImpl).internalExecuteQuery(DatabaseQuery, AbstractRecord) line: 2690 RepeatableWriteUnitOfWork(AbstractSession).executeQuery(DatabaseQuery, AbstractRecord, int) line: 1181 RepeatableWriteUnitOfWork(AbstractSession).executeQuery(DatabaseQuery, AbstractRecord) line: 1165 RepeatableWriteUnitOfWork(AbstractSession).executeQuery(DatabaseQuery) line: 1125 EntityManagerImpl.executeQuery(ReadObjectQuery, LockModeType, UnitOfWork) line: 626 EntityManagerImpl.findInternal(ClassDescriptor, AbstractSession, Object, LockModeType, Map) line: 570 EntityManagerImpl.find(Class<T>, Object, LockModeType, Map) line: 474 EntityManagerImpl.find(Class<T>, Object) line: 374 JPATxEmInvocation.find(Class<T>, Object) line: 211 JPATxEntityManager(JPAEntityManager).find(Class<T>, Object) line: 175 VirtualAttributeTestSuite.testReadVirtualAttribute() line: 79 NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method] NativeMethodAccessorImpl.invoke(Object, Object[]) line: 45 DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 37 Method.invoke(Object, Object...) line: 599 VirtualAttributeTestSuite(TestCase).runTest() line: 154 VirtualAttributeTestSuite(TestCase).runBare() line: 127 VirtualAttributeTestSuite(JUnitTestCase).runBareServer() line: 396 TestRunnerBean.runTest(String, String, Properties) line: 87 EJSRemote0SLTestRunner_ed643271.runTest(String, String, Properties) line: not available _EJSRemote0SLTestRunner_ed643271_Tie.runTest(InputStream, ResponseHandler) line: not available _EJSRemote0SLTestRunner_ed643271_Tie._invoke(String, InputStream, ResponseHandler) line: not available WSExtendedServerDelegate(ServerDelegate).dispatchInvokeHandler(InterceptorManager, ServerRequestInfoImpl, String, Object, ServerRequestReader) line: 622 WSExtendedServerDelegate(ServerDelegate).dispatch(ServerRequest) line: 475 ORB(ORB).process(ServerRequest) line: 504 ORB.process(ServerRequest) line: 1571 Connection.respondTo(RequestHandler, IIOPReader) line: 2771 Connection.doWork(RequestHandler, IIOPReader) line: 2640 WorkUnitImpl.doWork() line: 63 PooledThread.run() line: 118 ThreadPool$Worker.run() line: 1527 public synchronized void acquireReadLock() throws ConcurrencyException { // Cannot check for starving writers as will lead to deadlocks. while ((this.activeThread != null) && (this.activeThread != Thread.currentThread())) { try { -->holding wait(); } catch (InterruptedException exception) { throw ConcurrencyException.waitWasInterrupted(exception.getMessage()); } } this.numberOfReaders++; }
After several debug sessions here are some new facts: - we intermittently fail with 2 types of deadlocks - we always fail with deadlock type 1 - working around the deadlock using any type of workaround like the alt# 1 patch - causes a 3rd or 4th type of deadlock on a wait() for another thread later. 1) "Intermittently" there are 2 threads running, on occassion we only see a single thread and only see the 8 failures mentioned above. - only UnitOfWorkIdentityMapAccessor deadlocks Daemon Thread [RT=0:P=942386:O=0:WSTCPTransportConnection[addr=10.156.52.98,port=1511,local=2813]] (Running) Daemon Thread [ORB.thread.pool : 0] (Running) - ConcurrencyManager deadlocks Daemon Thread [RT=0:P=942386:O=0:WSTCPTransportConnection[addr=10.156.52.98,port=1511,local=2813]] (Running) Daemon Thread [ORB.thread.pool : 0] (Running) Daemon Thread [RT=1:P=942386:O=0:WSTCPTransportConnection[addr=10.156.52.98,port=1512,local=9104]] (Running) Daemon Thread [ORB.thread.pool : 1] (Running) 2) When a deadlock occurs on ConcurrencyManager, this is due to to the fact that the currentThread != the activeThread. This agrees with our original theory that the acquire and release threads are different and that we are not checking the thread on release - like we are on acquire. [1/29/09 10:56:05:127 EST] 00000025 SystemOut O [EL Finer]: 2009-01-29 10:56:05.127--Thread(Thread[ORB.thread.pool : 0,5,main])--Thread "ORB.thread.pool : 0" has acquired a deferred lock on object : null in order to avoid deadlock. [1/29/09 10:59:30:006 EST] 00000025 SystemOut O __ConcurrencyManager.releaseDeferredLock() deadlock: currentThread: Thread[ORB.thread.pool : 0,5,main] activeThread: Thread[ORB.thread.pool : 1,5,main] 3) A 3rd or 4th type of deadlock on a wait() occurs as an affect of breaking out of the deadlock in 1 or 2 above because the thread is still not released - it will not occur under normal functioning. In the javadoc for wait() in ConcurrencyManager.acquireDeferredLock():190 we state that "If another thread is also waiting on the acquire then a deadlock could occur" - this is what is happening here. Two deadlocked threads (thread 0 isHung==true --------------------------------------------- Daemon Thread [RT=0:P=942386:O=0:WSTCPTransportConnection[addr=10.156.52.98,port=1511,local=2813]] (Running) Daemon Thread [ORB.thread.pool : 0] (Suspended) Object.wait(long, int) line: not available [native method] ConcurrencyManager(Object).wait() line: 167 ConcurrencyManager.acquire(boolean) line: 89 ConcurrencyManager.acquire() line: 75 [local variables unavailable] ConcurrencyManager.acquireDeferredLock() line: 198 EntityManagerJUnitTestSuite.testUpdateAllProjectsWithNullTeamLeader() line: 3913 [local variables unavailable] Daemon Thread [RT=1:P=942386:O=0:WSTCPTransportConnection[addr=10.156.52.98,port=1512,local=9104]] (Running) Daemon Thread [ORB.thread.pool : 1] (Suspended) Object.wait(long, int) line: not available [native method] this ThreadPool (id=313) activeThreads 1 maximumPoolSize_ 50 minimumPoolSize_ 10 name "ORB.thread.pool" (id=329) poolSize_ 2 threads_ HashMap<K,V> (id=333) elementCount 2 elementData HashMap$Entry<K,V>[16] (id=357) [1] HashMap$Entry<K,V> (id=363) key ThreadPool$Worker (id=166) isHung false name "ORB.thread.pool : 1" (id=316) threadNumber "00000026" (id=319) uniqueId 102 [8] HashMap$Entry<K,V> (id=367) key ThreadPool$Worker (id=139) isHung true lock Thread$ThreadLock (id=251) name "ORB.thread.pool : 0" (id=253) threadNumber "00000025" (id=376) uniqueId 100 - Currently debugging earlier in AbstractSynchronizationLister.afterCompletion()
Created attachment 124639 [details] eclipselink core patch (pre code-review) There are 3 places where we have thread wait issues, here are the fixes in core. I aligned these changes with exiting thread checks and wait timeouts already in the code. We are seeing this thread wait behavior when we run get two different server threads for writing and reading. 1) A find after changes were already committed Reproduction: ant server-test-relationships Where: UnitOfWorkIdentityMapAccessor.getAndCloneCacheKeyFromParent() Fix: Same fix as CacheKey.waitForObject() for a wait(10) call http://fisheye2.atlassian.com/browse/eclipselink/trunk/foundation/org.eclipse.persistence.core/src/org/eclipse/persistence/internal/identitymaps/CacheKey.java?r1=1817&r2=3322 We use CacheKey.MAX_WAIT_TRIES on the same wait(5) call Effect: throw same ConcurrencyException.maxTriesLockOnBuildObjectExceded 2) A find or getReference that causes pending changes to commit Reproduction: ant server-test-relationships Reproduction: ant server-test-fieldaccess-advanced Where: ConcurrencyManager.releaseDeferredLock() Fix: In WriteLockManager.acquireLocksForClone() We use the same fix as WriteLockManager.acquireRequiredLocks() where we also verify the threads are different before a wait() && (activeCacheKey.getMutex().getActiveThread() != Thread.currentThread()) also We preemptively check for different current/active threads and switch them on the after AbstractSynchronizationListener.afterCompletion() callback. and log a warning Effect: - no more thread wait issue - threads are switched and new activeThread is released - warning emitted when threads were different on afterCompletion callback 3) A find that acquires a readLock with no pending changes Same issue as https://bugs.eclipse.org/bugs/show_bug.cgi?id=261910#c2 Reproduction: ant server-test-relationships Where: ConcurrencyManager.acquireReadLock() line: 243 Fix: Same fix as WriteLockManager.acquireLocksForClone() using MAXTRIES Except we throw a different maxTriesAcquireReadLockExceded a ConcurrencyException.maxTriesLockOnCloneExceded Effect: - we exit by throwing a maxTriesAcquireReadLockExceded Other options: ------------- I am only using option 2 below - so there is no configuration changes requires of developers. 2) switch activeThread to currentThread this.activeThread = currentThread; this.release(); 3) For deferred lock deadlock - disable caching using the persistence.xml property "eclipselink.cache.shared.default"="false" 4) Configure cache isolation using James /** Reads and unit of work merges can occur concurrently. */ getServerSession().getLogin().setCacheTransactionIsolation(DatabaseLogin.CONCURRENT_READ_WRITE); Gordon /** Writes to the cache (merge, object build/refresh will be synchronized * as will cache access (cloning) based on when access is required. */ getServerSession().getLogin().setCacheTransactionIsolation(DatabaseLogin.SYNCRONIZED_OBJECT_LEVEL_READ_WRITE); 5) Disable deferred locks - only for debugging the cause - DeferredLockManager.SHOULD_USE_DEFERRED_LOCKS = false; >result = 6) Use lazy relationships with no join fetching (that requires deferred locks) - use batch reading - but one of the tests only involves a single entity with an embeddable Testing: ----------------- LRG: before fix#3 [junit] Tests run: 6647, Failures: 0, Errors: 8, Time elapsed: 3,514.685 sec Total time: 58 minutes 50 seconds - server test hung only on issue #3 - I will retest the lrg tomorrow as it takes an hour. After fix #3: we have no more thread wait issues - we used to fail with 8 errors here. [junit] Tests run: 30, Failures: 0, Errors: 4, Time elapsed: 296.177 sec [junit] Test org.eclipse.persistence.testing.tests.jpa.relationships.ServerTestSuite FAILED This suite is unchanged (but without any wait loops) [junit] Tests run: 203, Failures: 0, Errors: 28, Time elapsed: 553.018 sec [junit] Test org.eclipse.persistence.testing.tests.jpa.fieldaccess.advanced.ServerTestSuite FAILED Note: the bug wiki still needs to be updated with this patch's results
Cause found while debugging with Gordon and Peter. How? added extra breakpoints on all (non-setLifecycle()) direct modifier of uow.lifecycle to pick up callbacks from the server during the previous commit before a find() 1) We acquire locks in beforeCompletion() 2) During a commit WebSphere 7 is attempting to clearForClose the entityManager before the transaction has completed. WebSphere JUnit jpa.test [Remote Java Application] IBM J9 VM[localhost:7777] Daemon Thread [ORB.thread.pool : 1] (Suspended (breakpoint at line 5339 in UnitOfWorkImpl)) RepeatableWriteUnitOfWork(UnitOfWorkImpl).clearForClose(boolean) line: 5339 RepeatableWriteUnitOfWork.clearForClose(boolean) line: 126 EntityManagerImpl.clear() line: 220 JPAEMPool.putEntityManager(EntityManager) line: 147 JPATxEntityManager.closeTxEntityManager(EntityManager, boolean) line: 299 JPATxEmInvocation.afterCompletion(int) line: 87 ComponentContextSynchronizationWrapper.afterCompletion(int) line: 83 RegisteredSyncs(RegisteredSyncs).coreDistributeAfter(int) line: 357 RegisteredSyncs.distributeAfter(int) line: 317 TransactionImpl(TransactionImpl).distributeAfter(int) line: 2933 TransactionImpl(TransactionImpl).postCompletion(int, boolean, boolean) line: 2892 TransactionImpl(TransactionImpl).postCompletion(int) line: 2805 TransactionImpl(TransactionImpl).commitXAResources() line: 1775 TransactionImpl.stage1CommitProcessing() line: 497 TransactionImpl(TransactionImpl).processCommit() line: 978 TransactionImpl(TransactionImpl).commit() line: 913 TranManagerImpl.commit() line: 369 TranManagerSet(TranManagerSet).commit() line: 161 UserTransactionImpl.commit() line: 293 UserTransactionWrapper.commit() line: 305 JEEPlatform.commitTransaction() line: 74 VirtualAttributeTestSuite(JUnitTestCase).commitTransaction(EntityManager) line: 209 VirtualAttributeTestSuite.testInsertVirtualAttribute() line: 65 3) consequently, the state of the unitOfWork is modified from 4 (MergePending) back to 0 (Birth). 4) During the commit before in the AbstractSynchronizationListener.afterCompletion() callback - the uow.isMergePending() is now 0 instead of 4, so no mergeClonesAfterCompletion() happens. - this results no release of the lock acquired before the em was cleared in mid commit. Fix: Ignore/defer any request to uow.clear() in uow.clearForClose() if the state/lifecyle of the unitOfWork is *Pending or one of (1,2 or 4), the clear() will be done later on release(). We no longer have issues with a find that has issues with a non-released previous commit. Tested: The following suites are OK now ant server-test-relationships [junit] Tests run: 30, Failures: 0, Errors: 0, Time elapsed: 39.021 sec ant server-test-fieldaccess-advanced [junit] Tests run: 203, Failures: 0, Errors: 2, Time elapsed: 77.011 sec Code-review, Java SE tests and full regression tests are in progress.
Created attachment 125187 [details] eclipselink core patch (pre-code review)
Created attachment 125268 [details] eclipselink jpa test 5 of 6 tests complete (postPersist callback is pending)
Created attachment 125287 [details] eclipselink core patch (code review 2) 1) UnitOfWorkImpl.mergeClonesAfterCompletion() change moved to MergeManager.mergeChanges() 2) UnitOfWorkImpl.clearForClose() unchanged from previous patch
Created attachment 125318 [details] eclipselink core patch (code review 3) Review 3: https://bugs.eclipse.org/bugs/show_bug.cgi?id=259993 1) UnitOfWorkImpl.mergeClonesAfterCompletion() - I moved the different thread code block back to UnitOfWorkImpl from MergeManager - I assume that the activeThread != null After James/Gordon's suggestion did the following MergeManager change - cache the currentThread as MergeManager.lockThread on the 3 acquire calls in WriteLockManager - if this thread is different in uow.mergeClonesAfterCompletion - change the activeThread on all lockManagers 2) UnitOfWorkImpl.clearForClose() - Andrei's suggestion that we still clear the identityMaps has been added as a "Partial clear()" when the clearForClose flag is true I split out the part of clear() that clears the identityMaps and run this from clear() as well as clearForClose() in our *Pending state case. Test results: ------------- With the 2nd last patch https://bugs.eclipse.org/bugs/attachment.cgi?id=125187&action=diff ant server-test-relationships [junit] Tests run: 30, Failures: 0, Errors: 0, Time elapsed: 39.021 sec ant server-test-fieldaccess-advanced [junit] Tests run: 203, Failures: 0, Errors: 2, Time elapsed: 77.011 sec With the current patch (we get 1 extra error in test-fieldaccess-advanced) ant server-test-relationships [junit] Tests run: 30, Failures: 0, Errors: 0, Time elapsed: 44.788 sec ant server-test-fieldaccess-advanced [junit] Tests run: 203, Failures: 0, Errors: 3, Time elapsed: 89.341 sec thank you /michael
Created attachment 125441 [details] eclipselink core patch (code review 3b) - mergeManager NPE check on was7 afterCompletion callback This patch is the same as the previous version except for a null check in uow.mergeClonesAfterCompletion() >The null pointer exception was on a WAS7 callback to afterCompletion where the mergeManager was null - I added back the !null check. WebSphere 7 Tests now match the results we have for WebSphere 6.1 [junit] Running org.eclipse.persistence.testing.tests.jpa.relationships.ServerTestSuite [junit] Tests run: 30, Failures: 0, Errors: 0, Time elapsed: 97.437 sec and [junit] Running org.eclipse.persistence.testing.tests.jpa.fieldaccess.advanced.ServerTestSuite [junit] Tests run: 203, Failures: 0, Errors: 2, Time elapsed: 164.829 sec
Created attachment 125455 [details] eclipselink core patch (code review 4) - 1 of 3 sets of lockThread kept in writeLockManager.acquire*() functions Code Review: Gordon/Andrei 1) We only require the lockThread set on acquires that are involved in inserts/updates and not reads. Keep: public void acquireRequiredLocks(MergeManager mergeManager, UnitOfWorkChangeSet changeSet) Remove: public Map acquireLocksForClone(Object objectForClone, ClassDescriptor descriptor, CacheKey cacheKey, AbstractSession session, UnitOfWorkImpl unitOfWork) public CacheKey acquireLockAndRelatedLocks(Object objectForClone, Map lockedObjects, Map refreshedObjects, CacheKey cacheKey, ClassDescriptor descriptor, AbstractSession session, UnitOfWorkImpl unitOfWork) 2) Of the 3 lifecycle states that require us to do a partial clear() that involves only the cache - the CommitTransactionPending==2 state may not be required. However, we will never reach lifecycle state 2 because the entityManager.clear(true) call in the first part of the if is the only part that calls clearForClose(true). If we are in a transaction then the clearForClose() code will not be called. We are ok with/without the check on lifecyle state 2==CommitTransactionPending. The other two CommitPending and MergePending are still required Testing: ok [junit] Running org.eclipse.persistence.testing.tests.jpa.relationships.ServerTestSuite [junit] Tests run: 30, Failures: 0, Errors: 0, Time elapsed: 98.124 sec [junit] Running org.eclipse.persistence.testing.tests.jpa.fieldaccess.advanced.ServerTestSuite [junit] Tests run: 203, Failures: 0, Errors: 2, Time elapsed: 162.483 sec
Code review #4 complete - ready for checkin as soon as I verify JPA/Core LRG results - these tests are in progress today and tomorrow JPA-LRG tests results: I am not getting 0,0 results for the non-server SE tests - triaging part1 = clearForClose() clear only cache when in state *Pending part2 = switch activeThread during mergeClonesAfterCompletion() to acquired thread saved on MergeManager Triaging: IBM J9 JVM test combinations |clean | w/part1 | w/part2 | all changes ------------+--------------------------------------- branch 1.1 |?pend | 36,152 | 36,152 | 36,153 trunk |8,18 | 8,17 | ?pend | 8,17 branch 1.1 compiled and run under IBM J9 (with last patch) ---------------------------------------------------------- >ant test-jpa [junit] Tests run: 1064, Failures: 36, Errors: 156, Time: 1,871.345 sec >after refresh to rev 3439 without recompile [junit] Tests run: 1064, Failures: 36, Errors: 153, Time: 1,478.587 sec >after switch to local db, and log back to info >after compile under sun jdk [junit] Tests run: 1064, Failures: 36, Errors: 157, Time: 730.414 sec >comment out part 2 of fix (thread switch) >recompile under ibm j9 [junit] Tests run: 1064, Failures: 36, Errors: 152, Time: 940.399 sec >comment out part 1 of fix (clearForClose) only, reenable part 2 [junit] Tests run: 1064, Failures: 36, Errors: 152, Time: 896.875 sec >comment out everything ..pending - expect the same 36,152 trunk 2.0 clean 20090211: compiled and run under IBM J9 (rev 3439 (no modifications)) --------------------------------------------------------------------------------- >ant test-jpa [junit] Tests run: 1087, Failures: 8, Errors: 18, Time: 1,829.168 sec >after switch to local db, and log back to info >after compile under sun jdk - new issues with jaxp compiler path >switch back to ibm - recompile [junit] Tests run: 1087, Failures: 9, Errors: 18, Time: 964.189 sec >adding part 1 of fix (clearForClose) only >recompile under ibm j9 [junit] Tests run: 1087, Failures: 8, Errors: 17, Time: 1,006.544 sec >add only part 2 of fix (thread switch) ..pending - expect the same 8,17 >add full fix to trunk [junit] Tests run: 1087, Failures: 8, Errors: 17, Time: 978.588 sec summary: -------- an aside-->switching the database from a hosted server to a local (a min dual-core) or in my case 2nd pc on the same subnet - cuts transaction times in half - the tests run in 15 minn instead of 30 From the results it looks like there is no new failures on the [trunk] branch with these changes. The 1.1 branch is having issues with numbers that are not related to these 2 changes - testing core LRG tomorrow
New tests numbers - JPA/Core LRG ok 1.1 (with local db clean) New tests in jpa/advanced/concurrency/LifecycleJUnitTest JPA LRG Without fix (- expected errors on new LifecycleJUnitTest suite if fix is disabled) [junit] Tests run: 1071, Failures: 0, Errors: 3, Time elapsed: 896.499 sec With fix [junit] Tests run: 1071, Failures: 0, Errors: 3, Time elapsed: 935.753 sec Core LRG With/Without fix [junit] Tests run: 6647, Failures: 0, Errors: 0, Time elapsed: 1,523.895 sec
>ant server-test-lrg WebSphere 7.0 Total time: 35 minutes 11 seconds 1045 0 28 97.32% 508.332 Name Tests Errors Failures Time(s) Time Stamp Host org.eclipse.persistence.testing.tests.jpa.advanced 248 6 0 94.782 2009-02-12T19:31:08 mfobrien-PC3 org.eclipse.persistence.testing.tests.jpa.datatypes 17 0 0 4.399 2009-02-12T19:32:41 mfobrien-PC3 org.eclipse.persistence.testing.tests.jpa.datetime 7 0 0 2.943 2009-02-12T19:34:03 mfobrien-PC3 org.eclipse.persistence.testing.tests.jpa.ddlgeneration 9 1 0 6.633 2009-02-12T19:35:26 mfobrien-PC3 org.eclipse.persistence.testing.tests.jpa.fieldaccess.advanced 203 2 0 95.392 2009-02-12T19:36:56 mfobrien-PC3 org.eclipse.persistence.testing.tests.jpa.fieldaccess.relationships 16 0 0 12.135 2009-02-12T19:40:01 mfobrien-PC3 org.eclipse.persistence.testing.tests.jpa.inheritance 41 1 0 35.441 2009-02-12T19:41:37 mfobrien-PC3 org.eclipse.persistence.testing.tests.jpa.inherited 51 1 0 12.866 2009-02-12T19:43:37 mfobrien-PC3 org.eclipse.persistence.testing.tests.jpa.jpaadvancedproperties 10 10 0 2.628 2009-02-12T19:45:13 mfobrien-PC3 org.eclipse.persistence.testing.tests.jpa.jpql 273 0 0 145.442 2009-02-12T19:46:44 mfobrien-PC3 org.eclipse.persistence.testing.tests.jpa.lob 5 1 0 2.145 2009-02-12T19:50:38 mfobrien-PC3 org.eclipse.persistence.testing.tests.jpa.relationships 30 0 0 49.495 2009-02-12T19:52:07 mfobrien-PC3 org.eclipse.persistence.testing.tests.jpa.sessionbean 5 2 0 8.675 2009-02-12T19:54:27 mfobrien-PC3 org.eclipse.persistence.testing.tests.jpa.timestamptz 5 0 0 1.758 2009-02-12T20:01:47 mfobrien-PC3 org.eclipse.persistence.testing.tests.jpa.validation 3 0 0 3.767 2009-02-12T19:56:01 mfobrien-PC3 org.eclipse.persistence.testing.tests.jpa.xml 108 4 0 23.578 2009-02-12T19:59:38 mfobrien-PC3 org.eclipse.persistence.testing.tests.jpa.xml.merge.inherited 14 0 0 6.253 2009-02-12T19:57:46 mfobrien-PC3 >results: we have the same # of total errors+failures (28 instead of 27) but they are all errors in WAS 7.0 All but one of these failure to error transitions are due to new serialization requirements on WAS... WebSphere 6.1 (for comparison) http://stame02.us.oracle.com/eclipselink/1.1.0-release/2009-02-10/server-test-lrg-websphere-6.1.html 1045 6 21 97.42% 544.358 Name Tests Errors Failures Time(s) Time Stamp Host org.eclipse.persistence.testing.tests.jpa.advanced 248 3 2 79.121 2009-02-10T20:14:20 stame10 org.eclipse.persistence.testing.tests.jpa.datatypes 17 0 0 5.182 2009-02-10T20:15:57 stame10 org.eclipse.persistence.testing.tests.jpa.datetime 7 0 0 3.400 2009-02-10T20:17:25 stame10 org.eclipse.persistence.testing.tests.jpa.ddlgeneration 9 1 0 8.118 2009-02-10T20:19:02 stame10 org.eclipse.persistence.testing.tests.jpa.fieldaccess.advanced 203 0 2 104.429 2009-02-10T20:20:38 stame10 org.eclipse.persistence.testing.tests.jpa.fieldaccess.relationships 16 0 0 15.935 2009-02-10T20:23:48 stame10 org.eclipse.persistence.testing.tests.jpa.inheritance 41 0 1 41.463 2009-02-10T20:25:34 stame10 org.eclipse.persistence.testing.tests.jpa.inherited 51 0 1 22.055 2009-02-10T20:27:41 stame10 org.eclipse.persistence.testing.tests.jpa.jpaadvancedproperties 10 10 0 5.170 2009-02-10T20:29:32 stame10 org.eclipse.persistence.testing.tests.jpa.jpql 273 0 0 139.433 2009-02-10T20:31:06 stame10 org.eclipse.persistence.testing.tests.jpa.lob 5 1 0 4.359 2009-02-10T20:34:50 stame10 org.eclipse.persistence.testing.tests.jpa.relationships 30 0 0 48.656 2009-02-10T20:36:19 stame10 org.eclipse.persistence.testing.tests.jpa.sessionbean 5 2 0 13.413 2009-02-10T20:38:33 stame10 org.eclipse.persistence.testing.tests.jpa.timestamptz 5 0 0 3.860 2009-02-10T20:45:40 stame10 org.eclipse.persistence.testing.tests.jpa.validation 3 0 0 6.590 2009-02-10T20:40:14 stame10 org.eclipse.persistence.testing.tests.jpa.xml 108 4 0 33.863 2009-02-10T20:43:42 stame10 org.eclipse.persistence.testing.tests.jpa.xml.merge.inherited 14 0 0 9.311 2009-02-10T20:41:52 stame10
Created attachment 125662 [details] eclipselink jpa test (extra tests and setup was missing table creation) Added 2 extra SE unit tests that test commit with/without a clear after. I was missing the schema generation in the setup - the first run would fail on missing tables - thank you Guy - setup() change reviewed by Tom. All 5 tests run fine after running setup() to clean the db.
>EclipseLink 1.1 branch fix checked into SVN 3447 http://fisheye2.atlassian.com/changelog/eclipselink/?cs=3447 >With test case additions and changes in... http://fisheye2.atlassian.com/changelog/eclipselink/?cs=3455 http://fisheye2.atlassian.com/changelog/eclipselink/?cs=3461 Trunk commit is pending for Tuesday after the 1.1 branch full regression tests are run on all application servers. A review mail to the dev group for trunk will be sent out before commit. The wiki page will be updated to reflect final changes after the servers come back on line next tue.
Created attachment 125910 [details] EclipseLink full trunk patch (off rev 3439)
Created attachment 125911 [details] EclipseLink full trunk patch (off rev 3439)
Comment on attachment 125910 [details] EclipseLink full trunk patch (off rev 3439) double submitted patch
Patch checked into Trunk in rev 3472 http://fisheye2.atlassian.com/changelog/eclipselink/?cs=3472
The WebSphere 6.1 server tests are reduced from 6 failures, 28 errors to 6 failures, 27 errors with this transaction. org.eclipse.persistence.testing.tests.jpa.advanced | EntityManagerJUnitTestSuite | testGetReferenceUsedInUpdate passes now.
This bug still exists in the case where the transaction rolls back. UnitOfWorkImpl.afterExternalTransactionRollback() does not include the code to update the lock owning threads but it does attempt to release locks. The lock owner updating code should be abstracted into a separate thread. Support to switch deferred locks is needed as well. It could easily be added to this bug or an additional bug could be opened.
I think we need a new bug - as 259993 is fixed as far as WebSphere 7 is concerned regarding a prematurely cleared entitymanager. The 259993 bug never really had anything to do with mismatched lock threads - and it occurs on WebLogic. The 2nd part of the fix for 259993 that we put in as an extra fix along with the base 259993 clearForClose() fix for WAS, does not have a test case and we were not able to reproduce it - so we currently have no way to test any change until we create a unit server test on WebLogic. 1) The change for WebSphere in clearForClose() is reproducable and has server tests. 2) The change for mismatched threads in mergeClonesAfterCompletion() needs a reproduction test case If we reopen the bug I think we should remove the 2nd part of the original fix that is unrelated to this bug - restest on WebSphere 7 , close the bug and reopen a bug specifically for this thread locking issue.
This bug has two different fixes one of which does not need to be updated. A new bug https://bugs.eclipse.org/bugs/show_bug.cgi?id=272022 has been created to isolated the required additional updates.
>See additional workaround for the fact that WebSphere 7 caches the EntityManager Impls and uses clear() instead of close() on them in bug# 316531
The Eclipselink project has moved to Github: https://github.com/eclipse-ee4j/eclipselink