Bug 350649 - Offline test and example failures
Summary: Offline test and example failures
Status: CLOSED FIXED
Alias: None
Product: EMF
Classification: Modeling
Component: cdo.core
Version: 4.1
Hardware: All All
Importance: P3 normal
Target Milestone: ---
Assignee: Caspar D. CLA
QA Contact: Eike Stepper CLA
URL:
Whiteboard:
Keywords:
Depends on: 349804
Blocks:
 
Reported: 2011-06-29 01:17 EDT by Caspar D. CLA
Modified: 2012-09-21 07:16 EDT (History)

See Also:
stepper: review+


Attachments
Patch v1 (12.26 KB, patch)
2011-06-29 06:46 EDT, Caspar D. CLA
Patch v2 (16.86 KB, patch)
2011-06-30 02:19 EDT, Caspar D. CLA

Description Caspar D. CLA 2011-06-29 01:17:02 EDT
Many (indeed most) of our offline tests are currently failing.

Some issues I've already identified:

1. Expected revision counts are off due to use of test-specific
   resource folder (#getResourcePath).

2. TimestampAuthority.endCommit fails to notify threads suspended
   in waitForCommit.

Investigating further.
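
Note on issue 2: a thread that blocks in a monitor-based wait is only released promptly if the code that changes the awaited state also signals the monitor. The sketch below shows that pattern in isolation. It is not the CDO implementation: CommitClock and its field are hypothetical, and only the method names endCommit and waitForCommit are borrowed from the description above.

```java
/** Hypothetical stand-in for a timestamp authority; not the CDO class. */
final class CommitClock {
    private long lastFinished; // guarded by "this"

    /** Records a finished commit and wakes every thread blocked in waitForCommit. */
    synchronized void endCommit(long timeStamp) {
        if (timeStamp > lastFinished) {
            lastFinished = timeStamp;
        }
        notifyAll(); // without this call, waiters stay suspended until they time out
    }

    /** Blocks until a commit with at least the given time stamp has finished, or the timeout elapses. */
    synchronized boolean waitForCommit(long timeStamp, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (lastFinished < timeStamp) { // loop guards against spurious wakeups
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false;
            }
            try {
                wait(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

With the notifyAll() call removed, a waiter in this sketch would only return once its timeout expires, which resembles the symptom described in issue 2.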
Comment 1 Caspar D. CLA 2011-06-29 01:21:03 EDT
3. Something is invoking raw replication for MEMStore, which doesn't
   support raw mode. This causes an endless string of OpNotSupportedEx's,
   which never bubble up to the client, it seems.
Comment 2 Caspar D. CLA 2011-06-29 01:31:39 EDT
@3.. my mistake. I enabled the raw tests for the MEMStore explicitly.
Comment 3 Caspar D. CLA 2011-06-29 01:43:12 EDT
4. An apparent bug in AbstractSyncingTest.checkEvent. The PollingTimeouter
   checks the length of the events array. But if the condition fails the
   first time, how could it ever pass later? The (growing) list of events
   inside the TestListener is converted to an array only once; of course, 
   the length of this array will not change later.
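
Note on issue 4: the stale-array effect can be reproduced outside the test framework. In the sketch below, the polling loop asks for a fresh snapshot on every iteration instead of testing an array that was created once. EventRecorder and awaitCount are hypothetical names, not the TestListener or PollingTimeouter classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical event collector; illustrates polling against a growing list. */
final class EventRecorder {
    private final List<Object> events = new ArrayList<>();

    synchronized void record(Object event) {
        events.add(event);
    }

    /** Returns a fresh copy of the events on every call. */
    synchronized Object[] snapshot() {
        return events.toArray();
    }

    /** Polls until at least "expected" events have arrived or the timeout elapses. */
    boolean awaitCount(int expected, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        // The buggy variant would call snapshot() once before the loop and
        // then test the length of that fixed array, which never changes.
        while (snapshot().length < expected) {
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```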
Comment 4 Caspar D. CLA 2011-06-29 04:47:30 EDT
5. Bug 349804
Comment 5 Caspar D. CLA 2011-06-29 06:19:24 EDT
6. In the logic that creates a ChangeSetData instance after raw replication,
   it appears to have been overlooked that the constructed delta could (in
   fact: is likely to) represent a compound change; that is, a change from
   rev m to rev n, where n > m+1. The delta ends up being applied to the
   clone's revision m, raising its revision number by 1, while in fact
   the state conveyed belongs to a later revision. When the client makes
   later modifications to that revision and commits them, an "attempt to
   modify historical revision" error gets thrown.

   This causes the failure of testSynchronizationMasterCloneWithReplication.
Comment 6 Caspar D. CLA 2011-06-29 06:39:30 EDT
@6. Ultimately the client sets the version of the new revision (which
    carries the state conveyed by the delta) by calling adjustForCommit
    (AbstractCDORevision). This is where the +1 assumption is hardcoded.
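
Note on issue 6: the version mismatch can be pictured with a toy model. The sketch below contrasts a hardcoded +1 with taking the target version from the delta. VersionedState and both methods are hypothetical, and this is only an illustration of the problem, not a description of how Patch v2 changes adjustForCommit.

```java
/** Hypothetical immutable revision: a version number plus some state. */
final class VersionedState {
    final int version;
    final String value;

    VersionedState(int version, String value) {
        this.version = version;
        this.value = value;
    }

    /** Hardcoded assumption: the delta spans exactly one revision. */
    VersionedState applyAssumingSingleStep(String newValue) {
        return new VersionedState(version + 1, newValue);
    }

    /** Uses the version the delta leads to, so a jump from m to n > m+1 is preserved. */
    VersionedState applyWithTarget(String newValue, int targetVersion) {
        if (targetVersion <= version) {
            throw new IllegalArgumentException("target version must be newer than " + version);
        }
        return new VersionedState(targetVersion, newValue);
    }
}
```

For a compound delta from version 2 to version 5, the first method labels the resulting state as version 3 even though it carries the state of version 5, so a later commit based on it looks like a change to a historical revision.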
Comment 7 Caspar D. CLA 2011-06-29 06:46:41 EDT
Created attachment 198812
Patch v1
Comment 8 Caspar D. CLA 2011-06-29 06:47:27 EDT
Patch v1 addresses issues 1, 2, 3, 4 and 5.
Comment 9 Caspar D. CLA 2011-06-30 02:19:20 EDT
Created attachment 198872
Patch v2
Comment 10 Caspar D. CLA 2011-06-30 02:21:08 EDT
Patch v2 addresses 1 through 6, as well as:

7. CDOStoreImpl.set calls getRevision where it should use
   getRevisionForReading. The former does not load the revision
   if necessary, so that an NPE arises if a setter is called
   on an object in PROXY state.
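
Note on issue 7: the difference between the two accessors can be sketched with a generic lazy holder. LazyHolder, peek and forReading are hypothetical names chosen for illustration; they only mirror the behavior described above (one accessor returns whatever is cached, the other loads on demand) and are not the CDOStoreImpl code.

```java
import java.util.function.Supplier;

/** Hypothetical lazily loaded value; "proxy" here means "not loaded yet". */
final class LazyHolder<T> {
    private final Supplier<T> loader;
    private T loaded; // null while the holder is still a proxy

    LazyHolder(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Returns the cached value without loading; null for a proxy. */
    T peek() {
        return loaded;
    }

    /** Loads the value on first access, then returns it. */
    T forReading() {
        if (loaded == null) {
            loaded = loader.get();
        }
        return loaded;
    }
}
```

A setter that dereferences the result of peek() on a proxy fails with a NullPointerException, whereas one that goes through forReading() triggers the load first.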
Comment 11 Caspar D. CLA 2011-06-30 04:38:26 EDT
There are some problems with the examples as well.

8. After starting up the OfflineExampleMaster, OfflineExampleClone,
   and OfflineExampleClient (in that order), it turns out that the
   clone has returned a rootResourceID of NULL to the client. When
   the client actually tries to perform some ops, an NPE follows.
Comment 12 Eike Stepper CLA 2011-07-02 07:16:19 EDT
Changing to 4.1 to ensure that the fix will "last". Please clone this bugzilla
to 4.0 if you want a maintenance fix, too.
Comment 13 Eike Stepper CLA 2011-07-02 07:29:21 EDT
Note: your changes in TimeStampAuthority will partially conflict with a recent fix from Egidijus. IIRC Egidijus had the same fix for *parts* of your changes.
Comment 14 Eike Stepper CLA 2011-07-02 07:30:56 EDT
Can it be that you forgot to revert your changes in AllTestsMEMOffline? See your comment #2 ...
Comment 15 Eike Stepper CLA 2011-07-02 07:37:47 EDT
Go ahead and commit to *trunk* .

Is it true that we're now down to 16/1 failures out of 34 tests?

What about problem 8 in comment #11 ?
Comment 16 Caspar D. CLA 2011-07-04 01:58:00 EDT
Committed revision 8579.
Comment 17 Caspar D. CLA 2011-07-04 01:59:04 EDT
Cloned for 4.0 as Bug 351046
Comment 18 Caspar D. CLA 2011-07-04 02:04:34 EDT
(In reply to comment #15)

> Is it true that we're now down to 16/1 failures out of 34 tests?

No. I get 0 errors, 1 failure out of 34 tests. And that single failure
only happens when that test is run as part of the suite. When I re-run
it separately, it always passes. :-S

> What about problem 8 in comment #11 ?

I think I caused the problem myself by debugging multiple threads
simultaneously and suspending the thread that was supposed to set the
rootResourceID -- not 100% sure. :S
Comment 19 Caspar D. CLA 2011-07-04 02:08:03 EDT
Cloned for 4.0 as Bug 351046
Comment 20 Eike Stepper CLA 2011-07-05 14:17:50 EDT
Strange: every odd-numbered test case n passes but blocks for a while during teardown, and every following test case n+1 fails during setup:

org.eclipse.emf.cdo.tests.config.impl.ConfigTestException: Error in OfflineTest.testClientCommits [Combined, DBStore: H2 (offline), JVM, Native]
	at org.eclipse.emf.cdo.tests.config.impl.ConfigTest.runBare(ConfigTest.java:516)
	at junit.framework.TestResult$1.protect(TestResult.java:110)
	at junit.framework.TestResult.runProtected(TestResult.java:128)
	at junit.framework.TestResult.run(TestResult.java:113)
	at junit.framework.TestCase.run(TestCase.java:124)
	at org.eclipse.net4j.util.tests.AbstractOMTest.run(AbstractOMTest.java:260)
	at junit.framework.TestSuite.runTest(TestSuite.java:243)
	at org.eclipse.emf.cdo.tests.config.impl.ConfigTestSuite$TestWrapper.runTest(ConfigTestSuite.java:126)
	at junit.framework.TestSuite.run(TestSuite.java:238)
	at junit.framework.TestSuite.runTest(TestSuite.java:243)
	at junit.framework.TestSuite.run(TestSuite.java:238)
	at junit.framework.TestSuite.runTest(TestSuite.java:243)
	at junit.framework.TestSuite.run(TestSuite.java:238)
	at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestReference.run(JUnit3TestReference.java:130)
	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
Caused by: java.lang.IllegalStateException: Not active: ManagedContainer
	at org.eclipse.net4j.util.lifecycle.LifecycleUtil.checkActive(LifecycleUtil.java:72)
	at org.eclipse.net4j.util.lifecycle.Lifecycle.checkActive(Lifecycle.java:190)
	at org.eclipse.net4j.util.container.ManagedContainer.getElement(ManagedContainer.java:273)
	at org.eclipse.net4j.util.container.ManagedContainer.getElement(ManagedContainer.java:265)
	at org.eclipse.net4j.jvm.JVMUtil.getAcceptor(JVMUtil.java:36)
	at org.eclipse.emf.cdo.tests.config.impl.SessionConfig$Net4j$JVM.getAcceptor(SessionConfig.java:492)
	at org.eclipse.emf.cdo.tests.config.impl.SessionConfig$Net4j.startTransport(SessionConfig.java:253)
	at org.eclipse.emf.cdo.tests.config.impl.ConfigTest.startTransport(ConfigTest.java:279)
	at org.eclipse.emf.cdo.tests.AbstractCDOTest.doSetUp(AbstractCDOTest.java:57)
	at org.eclipse.net4j.util.tests.AbstractOMTest.setUp(AbstractOMTest.java:159)
	at org.eclipse.emf.cdo.tests.config.impl.ConfigTest.setUp(ConfigTest.java:694)
	at org.eclipse.net4j.util.tests.AbstractOMTest.runBare(AbstractOMTest.java:213)
	at org.eclipse.emf.cdo.tests.config.impl.ConfigTest.runBare(ConfigTest.java:507)
	... 18 more
Comment 21 Eike Stepper CLA 2011-07-05 14:21:52 EDT
This is the stack trace of the main thread while it is blocked:
Thread [main] (Suspended)	
	owns: Scenario  (id=77)	
	waiting for: Worker$WorkerThread  (id=38)	
	Object.wait(long) line: not available [native method]	
	Worker$WorkerThread(Thread).join(long) line: 1194	
	RepositorySynchronizer(Worker).doDeactivate() line: 120	
	RepositorySynchronizer(QueueWorker<E>).doDeactivate() line: 110	
	RepositorySynchronizer.doDeactivate() line: 206	
	RepositorySynchronizer(Lifecycle).deactivate() line: 125	
	RepositoryConfig$OfflineConfig$1(SynchronizableRepository).stopSynchronization() line: 402	
	RepositoryConfig$OfflineConfig$1(SynchronizableRepository).doDeactivate() line: 376	
	RepositoryConfig$OfflineConfig$1(Lifecycle).deactivate() line: 125	
	LifecycleUtil.deactivate(Object, boolean) line: 206	
	LifecycleUtil.deactivate(Object) line: 196	
	AllTestsDBH2Offline$H2Offline$ReusableFolder(RepositoryConfig).deactivateRepositories() line: 318	
	AllTestsDBH2Offline$H2Offline$ReusableFolder(RepositoryConfig$OfflineConfig).deactivateRepositories() line: 558	
	AllTestsDBH2Offline$H2Offline$ReusableFolder(RepositoryConfig).tearDown() line: 284	
	Scenario.tearDown() line: 230	
	OfflineTest(ConfigTest).doTearDown() line: 709	
	OfflineTest(AbstractCDOTest).doTearDown() line: 66	
	OfflineTest(AbstractOMTest).tearDown() line: 180	
	OfflineTest(AbstractOMTest).runBare() line: 224	
	OfflineTest(ConfigTest).runBare() line: 507	
	TestResult$1.protect() line: 110	
	TestResult.runProtected(Test, Protectable) line: 128	
	TestResult.run(TestCase) line: 113	
	OfflineTest(TestCase).run(TestResult) line: 124	
	OfflineTest(AbstractOMTest).run(TestResult) line: 260	
	JUnit3TestReference.run(TestExecution) line: 130	
	TestExecution.run(ITestReference[]) line: 38	
	RemoteTestRunner.runTests(String[], String, TestExecution) line: 467	
	RemoteTestRunner.runTests(TestExecution) line: 683	
	RemoteTestRunner.run() line: 390	
	RemoteTestRunner.main(String[]) line: 197
Comment 22 Eike Stepper CLA 2011-07-05 14:36:00 EDT
Committed revision 8605:
- trunk/plugins/org.eclipse.emf.cdo.tests
- trunk/plugins/org.eclipse.net4j.util
Comment 23 Eike Stepper CLA 2011-07-05 14:36:40 EDT
Okay, I found it. It was a temporary deadlock between the main thread and the RepositorySynchronizer thread. After removing the picky synchronized modifiers from all methods in Scenario.java, I get the same results as you: only 1 failure, and that only when the test is executed in the suite.
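
Note: the stall matches a common pattern in which one thread holds a monitor while joining, with a timeout, a worker that needs the same monitor to finish. The sketch below reproduces that pattern under stated assumptions: TeardownStall and its methods are hypothetical, and what the worker actually needed from the Scenario monitor is not stated in this report.

```java
/** Hypothetical reproduction of a timed join performed while holding a monitor. */
final class TeardownStall {
    private final Object monitor = new Object();

    /** Starts a worker that needs the monitor once before it can terminate. */
    private Thread newWorker() {
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(100); // give the caller time to take the monitor first
            } catch (InterruptedException e) {
                return;
            }
            synchronized (monitor) {
                // last step of the worker; requires the monitor
            }
        });
        worker.start();
        return worker;
    }

    /** Joins while holding the monitor; returns true if the worker terminated in time. */
    boolean stopHoldingMonitor(long joinTimeoutMillis) {
        Thread worker = newWorker();
        synchronized (monitor) {
            return join(worker, joinTimeoutMillis);
        }
    }

    /** Joins without holding the monitor; returns true if the worker terminated in time. */
    boolean stopWithoutMonitor(long joinTimeoutMillis) {
        return join(newWorker(), joinTimeoutMillis);
    }

    private static boolean join(Thread worker, long timeoutMillis) {
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }
}
```

In the first variant the join can only end by timing out, which is why the stall is temporary rather than permanent; dropping the monitor around the join lets the worker finish on its own.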
Comment 24 Eike Stepper CLA 2011-07-05 14:37:37 EDT
I think we can resolve this for now. Thanks!
Comment 25 Eike Stepper CLA 2012-09-21 07:16:36 EDT
Closing.