
Bug 331878

Summary: [Discovery][Zookeeper] Zookeeper unit tests crash the VM with system exit 10 due to FileNotFoundException
Product: [RT] ECF
Reporter: Markus Kuppe <bugs.eclipse.org>
Component: ecf.providers
Assignee: ecf.core-inbox <ecf.core-inbox>
Status: RESOLVED FIXED
QA Contact:
Severity: normal
Priority: P3
CC: ahmed.aadel, wim.jongman
Version: 3.5.0
Target Milestone: 3.5.0
Hardware: All
OS: All
Whiteboard:
Attachments:
Description                Flags
mylyn/context/zip          none
zoodiscovery Test Patch    none

Description Markus Kuppe CLA 2010-12-05 14:42:13 EST
2010-12-05 20:32:05,033 - FATAL [pool-1-thread-3:ZooKeeperServer@262] - Severe unrecoverable error, exiting
java.io.FileNotFoundException: /tmp/zookeeperData/version-2/snapshot.0 (No such file or directory)
	at java.io.FileOutputStream.open(Native Method)
	at java.io.FileOutputStream.<init>(FileOutputStream.java:179)
	at java.io.FileOutputStream.<init>(FileOutputStream.java:131)
	at org.apache.zookeeper.server.persistence.FileSnap.serialize(FileSnap.java:224)
	at org.apache.zookeeper.server.persistence.FileTxnSnapLog.save(FileTxnSnapLog.java:211)
	at org.apache.zookeeper.server.ZooKeeperServer.takeSnapshot(ZooKeeperServer.java:260)
	at org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:255)
	at org.apache.zookeeper.server.ZooKeeperServer.startup(ZooKeeperServer.java:366)
	at org.eclipse.ecf.provider.zookeeper.core.ZooDiscoveryContainer.startStandAlone(ZooDiscoveryContainer.java:185)
	at org.eclipse.ecf.provider.zookeeper.core.ZooDiscoveryContainer$3.run(ZooDiscoveryContainer.java:164)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)

Calling System.exit() in org.apache.zookeeper.server.ZooKeeperServer.takeSnapshot() is IMO not appropriate for a library bundle. It might be best to prevent Zookeeper from exiting by installing a special security manager.
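
The security manager idea could look roughly like the following. This is a minimal sketch, not code from ECF or ZooKeeper: the class name NoExitSecurityManager is hypothetical, and it assumes a runtime that still allows installing a security manager (true for the Java 6 VM in this report; Java 17 deprecated the Security Manager, and later releases restrict or disable installing one at run time).

```java
import java.security.Permission;

/**
 * Sketch: a SecurityManager that vetoes System.exit() and permits everything else.
 * NoExitSecurityManager is a hypothetical name, not part of ECF or ZooKeeper.
 */
final class NoExitSecurityManager extends SecurityManager {

    @Override
    public void checkExit(int status) {
        // Runtime.exit() consults checkExit() first; throwing here aborts the exit.
        throw new SecurityException("System.exit(" + status + ") blocked");
    }

    @Override
    public void checkPermission(Permission perm) {
        // Allow all other operations; this manager only guards against exit.
    }

    @Override
    public void checkPermission(Permission perm, Object context) {
        // Same as above for context-based checks.
    }

    public static void main(String[] args) {
        // Assumption: the JVM permits installing a security manager at run time.
        System.setSecurityManager(new NoExitSecurityManager());
        try {
            System.exit(10);
        } catch (SecurityException e) {
            System.out.println("Exit prevented: " + e.getMessage());
        }
    }
}
```

Note that the thread calling System.exit() then sees a SecurityException instead, so the caller still has to cope with a server that failed to start.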
Comment 1 Markus Kuppe CLA 2010-12-05 15:12:04 EST
The following code snippet fixes the crash for me:

diff --git a/providers/bundles/org.eclipse.ecf.provider.zookeeper/src/org/eclipse/ecf/provider/zookeeper/core/ZooDiscoveryContainer.java b/providers/bundles/org.eclipse.ecf.provi
index 7c07073..657455f 100644
--- a/providers/bundles/org.eclipse.ecf.provider.zookeeper/src/org/eclipse/ecf/provider/zookeeper/core/ZooDiscoveryContainer.java
+++ b/providers/bundles/org.eclipse.ecf.provider.zookeeper/src/org/eclipse/ecf/provider/zookeeper/core/ZooDiscoveryContainer.java
@@ -180,6 +180,13 @@ public class ZooDiscoveryContainer extends AbstractDiscoveryContainerAdapter {
                else if (this.zooKeeperServer != null
                                && !this.zooKeeperServer.isRunning())
                        try {
+                               ZooDiscoveryContainer.this.zooKeeperServer = new ZooKeeperServer();
+                               FileTxnSnapLog fileTxnSnapLog = new FileTxnSnapLog(conf
+                                               .getZookeeperDataFile(), conf.getZookeeperDataFile());
+                               ZooDiscoveryContainer.this.zooKeeperServer
+                                               .setTxnLogFactory(fileTxnSnapLog);
+                               ZooDiscoveryContainer.this.zooKeeperServer.setTickTime(conf
+                                               .getTickTime());
                                this.zooKeeperServer.startup();
                                return;
                        } catch (Exception e) {

Wim/Ahmed, are you going to look into it?
Comment 2 Markus Kuppe CLA 2010-12-05 15:12:18 EST
Created attachment 184563 [details]
mylyn/context/zip
Comment 3 Wim Jongman CLA 2010-12-05 17:11:33 EST
I have seen my Yazafatutu.com zookeeper crash with this bug once. Since then it has made over a million discoveries without a hitch. I contacted the Zookeeper guys about this and they blamed the /tmp location of zookeeper's discovery files. I suggest we first change the location where ZD stores its files and then see if this fixes things.

The parameter to add is -Dzoodiscovery.dataDir='/some/place/less/volatile'
Comment 5 Markus Kuppe CLA 2010-12-06 03:10:06 EST
(In reply to comment #3)
> I have seen my Yazafatutu.com zookeeper crash with this bug once. Since then
> it has made over a million discoveries without a hitch. I contacted the
> Zookeeper guys about this and they blamed the /tmp location of zookeeper's
> discovery files. I suggest we first change the location where ZD stores its
> files and then see if this fixes things.
> 
> The parameter to add is -Dzoodiscovery.dataDir='/some/place/less/volatile'

What is so different about /tmp that it would cause Zookeeper to crash? I'm rather inclined to assume that the default handling in either Zookeeper or ZooDiscovery is off. The bug disappears if the default gets overwritten by the property, and crashes otherwise.
Have you tried setting zoodiscovery.dataDir to e.g. /tmp/zookeeper?
Comment 6 Wim Jongman CLA 2010-12-06 03:59:50 EST
> What is so different about /tmp that it would cause Zookeeper to crash? I'm

I have a hard time with this argument as well. Could there be processes cleaning /tmp? 

> rather inclined to assume that the default handling in either Zookeeper or
> ZooDiscovery is off. 

The default handling of what? What do you mean by "off"? The handling of the dataDir?

> The bug disappears if the default gets overwritten by the
> property, and crashes otherwise.

You mean that the property setting fixed it?
Comment 7 Markus Kuppe CLA 2010-12-06 07:34:14 EST
(In reply to comment #6)
> > What is so different about /tmp that it would cause Zookeeper to crash? I'm
> 
> I have a hard time with this argument as well. Could there be processes
> cleaning /tmp? 
> 
> > rather inclined to assume that the default handling in either Zookeeper or
> > ZooDiscovery is off. 
> 
> The default handling of what? What do you mean by "off"? The handling of the
> dataDir?

default == no system property is set
off == broken

> > The bug disappears if the default gets overwritten by the
> > property, and crashes otherwise.
> 
> You mean that the property setting fixed it?

Setting the property causes an NPE. Apparently, whatever it is set to, Zoodiscovery always prepends "/tmp". E.g. zoodiscovery.datadir=/home/markus ends up being /tmp/home/markus.
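
For illustration, this is one way such a prefix can arise in Java. The ZooDiscovery source is not shown in this report, so treating it as the cause is an assumption. On a Unix-like system, java.io.File joins parent and child even when the child is absolute. The class and method names below (DataDirResolver, naive, resolve) are hypothetical.

```java
import java.io.File;

/** Sketch only: DataDirResolver is a hypothetical helper, not ZooDiscovery code. */
final class DataDirResolver {

    /** Joins base and configured value unconditionally. */
    static File naive(File base, String configured) {
        return new File(base, configured);
    }

    /** Uses an absolute configured path as-is and resolves relative names against base. */
    static File resolve(File base, String configured) {
        File candidate = new File(configured);
        return candidate.isAbsolute() ? candidate : new File(base, configured);
    }

    public static void main(String[] args) {
        File tmp = new File("/tmp");
        System.out.println(naive(tmp, "/home/markus"));   // on Linux: /tmp/home/markus
        System.out.println(resolve(tmp, "/home/markus")); // on Linux: /home/markus
    }
}
```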
Comment 8 Ahmed Aadel CLA 2010-12-06 09:27:18 EST
Actually, zoodiscovery should work without setting the directory where data and logs are stored. It defaults to the temp dir.
The stack trace in the first post indicates that zoodiscovery successfully (by calling Configuration.configure()) wrote data to the default directory /tmp/zookeeperData/. Then, when it tries to start a stand-alone server and needs to read that same data to feed the underlying zookeeper, the file (or the whole directory) no longer exists. I think it is swept away while still being referenced (which makes me wonder!).

Anyway, the default temp directory can be set using system property "zoodiscovery.tempDir"
Comment 9 Markus Kuppe CLA 2010-12-06 09:36:27 EST
(In reply to comment #8)
> Actually, zoodiscovery should be working without setting the dir where data and
> log are stored. It defaults to temp dir.
> Looking at the stack trace (the first post) indicates that zoodiscovery has
> successfully (by calling Configuration.configure()) written data to the default
> directory /tmp/zookeeperData/, then when trying to start a stand-alone server,
> the time when it indeed needs reading that same data to feed underlying
> zookeeper, that file (or whole dir) is no more existent. I think it is swept
> away while still being pointed (which let me wonder!).

If the folder indeed gets deleted, it's deleted by ZooDiscovery/Zookeeper itself.
Comment 10 Ahmed Aadel CLA 2010-12-06 10:37:01 EST
To narrow down the possibilities, please try setting only the temp directory using "zoodiscovery.tempDir".
Comment 11 Markus Kuppe CLA 2010-12-06 14:16:51 EST
(In reply to comment #10)
> To narrow possibilities, please try setting only the temp directory using
> "zoodiscovery.tempDir".

With the property set, all unit tests crash with:

java.lang.NullPointerException
	at org.eclipse.ecf.provider.zookeeper.core.internal.Configuration.clean(Configuration.java:215)
	at org.eclipse.ecf.provider.zookeeper.core.internal.Configuration.configure(Configuration.java:77)
	at org.eclipse.ecf.provider.zookeeper.core.ZooDiscoveryContainer.init(ZooDiscoveryContainer.java:123)
	at org.eclipse.ecf.provider.zookeeper.core.ZooDiscoveryContainer.registerService(ZooDiscoveryContainer.java:400)
	at org.eclipse.ecf.tests.discovery.DiscoveryTest.registerService(DiscoveryTest.java:65)
	at org.eclipse.ecf.tests.discovery.DiscoveryTest.addListenerRegisterAndWait(DiscoveryTest.java:82)
	at org.eclipse.ecf.tests.discovery.DiscoveryTest.addServiceListener(DiscoveryTest.java:94)
	at org.eclipse.ecf.tests.discovery.DiscoveryTest.testAddServiceListenerIServiceListener(DiscoveryTest.java:213)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at junit.framework.TestCase.runTest(TestCase.java:168)
	at junit.framework.TestCase.runBare(TestCase.java:134)
	at junit.framework.TestResult$1.protect(TestResult.java:110)
	at junit.framework.TestResult.runProtected(TestResult.java:128)
	at junit.framework.TestResult.run(TestResult.java:113)
	at junit.framework.TestCase.run(TestCase.java:124)
	at junit.framework.TestSuite.runTest(TestSuite.java:232)
	at junit.framework.TestSuite.run(TestSuite.java:227)
	at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestReference.run(JUnit3TestReference.java:130)
	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
	at org.eclipse.pde.internal.junit.runtime.RemotePluginTestRunner.main(RemotePluginTestRunner.java:62)
	at org.eclipse.pde.internal.junit.runtime.CoreTestApplication.run(CoreTestApplication.java:23)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.eclipse.equinox.internal.app.EclipseAppContainer.callMethodWithException(EclipseAppContainer.java:587)
	at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:198)
	at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:110)
	at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:79)
	at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:369)
	at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:179)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:619)
	at org.eclipse.equinox.launcher.Main.basicRun(Main.java:574)
	at org.eclipse.equinox.launcher.Main.run(Main.java:1407)
	at org.eclipse.equinox.launcher.Main.main(Main.java:1383)
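
The trace above points at Configuration.clean, whose source is not included in this report. One common way for a directory-cleaning routine to throw a NullPointerException is that File.listFiles() returns null when the path does not exist or is not a directory. Whether that is what happens here is an assumption. A null-safe variant, with the hypothetical name DirCleaner, might look like this:

```java
import java.io.File;

/** Sketch only: DirCleaner is a hypothetical stand-in for a clean-up routine. */
final class DirCleaner {

    /** Deletes everything below dir but keeps dir itself; does nothing if dir is unusable. */
    static void clean(File dir) {
        File[] children = dir.listFiles();
        if (children == null) {
            // listFiles() returns null for a missing path, a plain file, or an I/O error.
            return;
        }
        for (File child : children) {
            if (child.isDirectory()) {
                clean(child); // empty the subdirectory before deleting it
            }
            child.delete();
        }
    }

    public static void main(String[] args) {
        // Safe even if the directory was never created.
        clean(new File(System.getProperty("java.io.tmpdir"), "zookeeperData"));
        System.out.println("clean finished");
    }
}
```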
Comment 12 Ahmed Aadel CLA 2010-12-09 06:44:57 EST
Frankly, I can't reproduce your case. I gave the tests a few more runs and everything "seems" to be fine. I also tested with some of the examples Wim provided, which give good coverage:
org.eclipse.ecf.examples.remoteservices.hello.host and its consumer, with the host configured with a system property pointing to the directory where zooDiscovery should create the files it needs.
In my case, on Windows, this was a path like this:
-Dzoodiscovery.tempDir="C:\Documents and Settings\name\Bureaublad\workingspace"
Indeed, zooDiscovery did create the needed files:
-C:\Documents and Settings\name\Bureaublad\workingspace\
|__zookeeperData (whose name can be changed using: "zoodiscovery.dataDir")
 +__version-2 (where data and logs go)
 |__ zoo (configuration file)

What I can say is that when zooDiscovery configures a zooKeeperServer to run and finds that the data directory (e.g. zookeeperData in the above case, which is the default name, or another name given by the system property "zoodiscovery.dataDir") already exists, it cleans it and reuses it. A zooKeeper server is always started locally, except when using the configuration flavor "zoodiscovery.flavor.centralized" with its value pointing to an IP other than the local one, which makes the local zooDiscovery just a client.

All in all:
- "zoodiscovery.tempDir" is a property to configure a "temp" directory other than the default one designated by the system property "java.io.tmpdir". In the case above this is: "C:\Documents and Settings\name\Bureaublad\workingspace"
- "zoodiscovery.dataDir" configures the directory where working data is stored. In the above case it was not set, so it defaults to "zookeeperData".

Hope this helps.
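
The two properties summarized in this comment can be combined as in the sketch below. It only mirrors the defaults described here ("java.io.tmpdir" and "zookeeperData"). The class name ZooDirs and the method dataDir are hypothetical, and the real Configuration class may resolve the path differently.

```java
import java.io.File;
import java.util.Properties;

/** Sketch only: ZooDirs is a hypothetical helper mirroring the defaults described above. */
final class ZooDirs {

    /** Resolves the data directory from the given properties. */
    static File dataDir(Properties props) {
        // "zoodiscovery.tempDir" overrides the JVM-wide temp directory.
        String temp = props.getProperty("zoodiscovery.tempDir",
                props.getProperty("java.io.tmpdir"));
        // "zoodiscovery.dataDir" overrides the default directory name.
        String name = props.getProperty("zoodiscovery.dataDir", "zookeeperData");
        return new File(temp, name);
    }

    public static void main(String[] args) {
        // Try: java -Dzoodiscovery.tempDir=/var/tmp ZooDirs.java
        System.out.println(dataDir(System.getProperties()));
    }
}
```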
Comment 13 Wim Jongman CLA 2010-12-09 07:36:41 EST
(In reply to comment #12)
Could the problem be that the ZD tests are running in parallel, or that one test finishes after the next one starts?
Comment 14 Markus Kuppe CLA 2010-12-09 08:44:58 EST
(In reply to comment #13)
> (In reply to comment #12)
> Could this be the problem that the ZD tests are running in parallel or that one
> finishes after the other test starts?

The test framework runs tests sequentially. So unless ZooDiscovery cleans up asynchronously, I don't see how a race condition could be the cause.
Comment 15 Markus Kuppe CLA 2010-12-09 08:46:43 EST
(In reply to comment #12)
> Frankly, I can't reproduce your case. I gave the tests other shots, everything
> "seems" to be fine. I tested with some of the nicely covering examples Wim

[...]

> In my Windows os case, this was some path like this: 

So this issue might only come up on Linux/Unix? It occurs on my local Linux machine as well as the "official" build machine (which happens to run Linux too) [0].

[0] https://build.ecf-project.org/hudson/job/C-HEAD-discovery.zookeeper.feature/
Comment 16 Wim Jongman CLA 2010-12-09 10:33:45 EST
(In reply to comment #15)
> (In reply to comment #12)
> > Frankly, I can't reproduce your case. I gave the tests other shots, everything
> > "seems" to be fine. I tested with some of the nicely covering examples Wim
> 
> [...]
> 

what does [...] mean?
Comment 17 Markus Kuppe CLA 2010-12-09 10:37:00 EST
(In reply to comment #16)
> what does [...] mean?

Only that I left something from the original post out in the quote.
Comment 18 Wim Jongman CLA 2010-12-09 11:07:51 EST
Does it fail every time? It seems to be included in the 3.4 release and we did not make any changes after that. 

Also, was the 3.4 build on the old builder or already on this one? If the new machine is much faster, that could explain the problem?

I do not see any references in zookeeper for an async cleanup. I am running the build and keeping logs to see if it always fails at the same place.
Comment 19 Ahmed Aadel CLA 2010-12-09 11:41:42 EST
(In reply to comment #15)
> (In reply to comment #12)
> > Frankly, I can't reproduce your case. I gave the tests other shots, everything
> > "seems" to be fine. I tested with some of the nicely covering examples Wim
> 
> [...]
> 
> > In my Windows os case, this was some path like this: 
> 
> So this issue might only come up on Linux/Unix? It occurs on my local Linux
> machine as well as the "official" build machine (which happens to run Linux
> too) [0].
> 
> [0]
> https://build.ecf-project.org/hudson/job/C-HEAD-discovery.zookeeper.feature/

That is a guess I cannot afford. 

To narrow things down, I am thinking of making (for test purposes) every starting zooDiscovery instance use a unique data directory path (and therefore no explicit clean up), so we can see whether the problem (files being deleted while still in use) gets solved. I'll try to make a test patch so that you (Markus, please) can apply and test it on your side (as I can't reproduce the case). What do you think, Wim?

By the way, what has changed recently that could have made the same, previously green tests suddenly break? Anyway, this is a good case to solve, and it will make zooDiscovery even more robust :)
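
The unique-data-directory idea from this comment can be sketched as follows. This is not the attached test patch. The class name UniqueDataDir and the "zd" prefix plus random UUID naming are assumptions chosen for illustration.

```java
import java.io.File;
import java.util.UUID;

/** Sketch only: UniqueDataDir is a hypothetical helper, not the attached patch. */
final class UniqueDataDir {

    /** Creates a fresh, uniquely named directory below tempRoot and returns it. */
    static File create(File tempRoot) {
        // A random UUID makes name collisions between container instances very unlikely.
        String name = "zd" + UUID.randomUUID().toString().replace("-", "");
        File dir = new File(tempRoot, name);
        if (!dir.mkdirs()) {
            throw new IllegalStateException("Could not create " + dir);
        }
        return dir;
    }

    public static void main(String[] args) {
        File root = new File(System.getProperty("java.io.tmpdir"));
        System.out.println(create(root));
    }
}
```

Because no two instances share a directory, one instance cleaning up can no longer delete files another instance still uses. The trade-off is that old directories accumulate unless something removes them on shutdown.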
Comment 20 Wim Jongman CLA 2010-12-09 12:16:44 EST
Hi Ahmed, yes, sure. Can you run the test locally to see what happens?
Comment 21 Markus Kuppe CLA 2010-12-09 16:06:12 EST
(In reply to comment #19)

> To factor things out, I think of making (for test purposes) any starting
> zooDiscovery instance use a unique data directory path (therefore, no explicit
> clean up) so we can see whether the problem (files being deleted while still in
> use)get solved. I'll try and make a test-patch of it so that you (Markus,
> please) can test-apply it on your side. (as I can't reproduce the case.)

Sure, will try the patch.

Btw. when I debugged zoodiscovery, it looked like disconnected ZooDiscoveryContainer instances were still in use by the unit tests (e.g. calling doStart(...) on those instances).
Comment 22 Ahmed Aadel CLA 2010-12-14 06:30:10 EST
Created attachment 185123 [details]
zoodiscovery Test Patch 

Hi Markus, would you please give this patch a shot? I tested it locally and everything still works.
(zoodiscovery Test Patch -- attached)
Comment 23 Markus Kuppe CLA 2010-12-14 11:28:07 EST
Ahmed, with your patch the FNFE and the subsequent crash are replaced by:

!ENTRY org.eclipse.osgi 4 0 2010-12-14 17:21:54.560
!MESSAGE An unexpected runtime error has occurred.
!STACK 0
java.lang.ArrayIndexOutOfBoundsException: 1
	at org.eclipse.ecf.provider.zookeeper.core.internal.Configuration.<init>(Configuration.java:63)
	at org.eclipse.ecf.provider.zookeeper.core.internal.Configuration.<init>(Configuration.java:55)
	at org.eclipse.ecf.provider.zookeeper.core.internal.Configurator.createConfig(Configurator.java:41)
	at org.eclipse.ecf.provider.zookeeper.core.ZooDiscoveryContainer.init(ZooDiscoveryContainer.java:122)
	at org.eclipse.ecf.provider.zookeeper.core.ZooDiscoveryContainer.connect(ZooDiscoveryContainer.java:327)
	at org.eclipse.ecf.provider.discovery.CompositeDiscoveryContainer.addContainer(CompositeDiscoveryContainer.java:387)
	at org.eclipse.ecf.internal.provider.discovery.Activator$1.getService(Activator.java:108)
	at org.eclipse.osgi.internal.serviceregistry.ServiceUse$1.run(ServiceUse.java:123)
	at java.security.AccessController.doPrivileged(Native Method)
	at org.eclipse.osgi.internal.serviceregistry.ServiceUse.getService(ServiceUse.java:121)
	at org.eclipse.osgi.internal.serviceregistry.ServiceRegistrationImpl.getService(ServiceRegistrationImpl.java:468)
	at org.eclipse.osgi.internal.serviceregistry.ServiceRegistry.getService(ServiceRegistry.java:468)
	at org.eclipse.osgi.framework.internal.core.BundleContextImpl.getService(BundleContextImpl.java:594)
	at org.osgi.util.tracker.ServiceTracker.addingService(ServiceTracker.java:450)
	at org.osgi.util.tracker.ServiceTracker$Tracked.customizerAdding(ServiceTracker.java:979)
	at org.osgi.util.tracker.ServiceTracker$Tracked.customizerAdding(ServiceTracker.java:1)
	at org.osgi.util.tracker.AbstractTracked.trackAdding(AbstractTracked.java:262)
	at org.osgi.util.tracker.AbstractTracked.trackInitial(AbstractTracked.java:185)
	at org.osgi.util.tracker.ServiceTracker.open(ServiceTracker.java:348)
	at org.osgi.util.tracker.ServiceTracker.open(ServiceTracker.java:283)
	at org.eclipse.ecf.tests.discovery.Activator.getDiscoveryLocator(Activator.java:73)
	at org.eclipse.ecf.tests.discovery.DiscoveryServiceTest.getDiscoveryLocator(DiscoveryServiceTest.java:49)
	at org.eclipse.ecf.tests.discovery.AbstractDiscoveryTest.setUp(AbstractDiscoveryTest.java:88)
	at org.eclipse.ecf.tests.discovery.DiscoveryServiceTest.setUp(DiscoveryServiceTest.java:41)
	at junit.framework.TestCase.runBare(TestCase.java:132)
	at junit.framework.TestResult$1.protect(TestResult.java:110)
	at junit.framework.TestResult.runProtected(TestResult.java:128)
	at junit.framework.TestResult.run(TestResult.java:113)
	at junit.framework.TestCase.run(TestCase.java:124)
	at junit.framework.TestSuite.runTest(TestSuite.java:232)
	at junit.framework.TestSuite.run(TestSuite.java:227)
	at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestReference.run(JUnit3TestReference.java:130)
	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
	at org.eclipse.pde.internal.junit.runtime.RemotePluginTestRunner.main(RemotePluginTestRunner.java:62)
	at org.eclipse.pde.internal.junit.runtime.CoreTestApplication.run(CoreTestApplication.java:23)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.eclipse.equinox.internal.app.EclipseAppContainer.callMethodWithException(EclipseAppContainer.java:587)
	at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:198)
	at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:110)
	at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:79)
	at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:369)
	at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:179)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:621)
	at org.eclipse.equinox.launcher.Main.basicRun(Main.java:576)
	at org.eclipse.equinox.launcher.Main.run(Main.java:1409)
	at org.eclipse.equinox.launcher.Main.main(Main.java:1385)
Comment 24 Markus Kuppe CLA 2010-12-14 11:43:49 EST
The parameter for org.eclipse.ecf.provider.zookeeper.core.internal.Configuration.Configuration(String) is [org.eclipse.ecf.provider.discovery.CompositeDiscoveryContainer], which indicates that this happens only if ZooDiscovery is used in combination with the CompositeDiscoveryContainer (CDC).
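
An ArrayIndexOutOfBoundsException with index 1 is what Java reports when code splits a string on a delimiter and reads element [1] although the delimiter is absent. Whether Configuration(String) does exactly this is an assumption, since its source is not shown here. The sketch below assumes a hypothetical "<flavor>=<value>" format and fails with a descriptive message instead. FlavorSpec and the example IP are illustrative only.

```java
/** Sketch only: FlavorSpec and the "<flavor>=<value>" format are assumptions. */
final class FlavorSpec {
    final String key;
    final String value;

    private FlavorSpec(String key, String value) {
        this.key = key;
        this.value = value;
    }

    /** Parses "<flavor>=<value>" and rejects input that lacks either part. */
    static FlavorSpec parse(String spec) {
        if (spec == null) {
            throw new IllegalArgumentException("Configuration string must not be null");
        }
        int eq = spec.indexOf('=');
        if (eq <= 0 || eq == spec.length() - 1) {
            throw new IllegalArgumentException("Expected <flavor>=<value> but got: " + spec);
        }
        return new FlavorSpec(spec.substring(0, eq).trim(), spec.substring(eq + 1).trim());
    }

    public static void main(String[] args) {
        FlavorSpec ok = parse("zoodiscovery.flavor.centralized=192.168.0.1");
        System.out.println(ok.key + " -> " + ok.value);
        try {
            parse("org.eclipse.ecf.provider.discovery.CompositeDiscoveryContainer");
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```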
Without CDC deployed, the following error appears on the console:

java.io.IOException: Not able to find valid snapshots in /tmp/zdd4f4bf67803fa4993b0fced4470066fa3/version-2
	at org.apache.zookeeper.server.persistence.FileSnap.deserialize(FileSnap.java:104)
	at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:124)
	at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:193)
	at org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:240)
	at org.apache.zookeeper.server.ZooKeeperServer.startup(ZooKeeperServer.java:366)
	at org.eclipse.ecf.provider.zookeeper.core.ZooDiscoveryContainer.startStandAlone(ZooDiscoveryContainer.java:183)
	at org.eclipse.ecf.provider.zookeeper.core.ZooDiscoveryContainer$3.run(ZooDiscoveryContainer.java:162)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
java.net.BindException: Address already in use
	at sun.nio.ch.Net.bind(Native Method)
	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
	at org.apache.zookeeper.server.NIOServerCnxn$Factory.<init>(NIOServerCnxn.java:145)
	at org.apache.zookeeper.server.NIOServerCnxn$Factory.<init>(NIOServerCnxn.java:126)
	at org.eclipse.ecf.provider.zookeeper.core.ZooDiscoveryContainer.startStandAlone(ZooDiscoveryContainer.java:199)
	at org.eclipse.ecf.provider.zookeeper.core.ZooDiscoveryContainer$3.run(ZooDiscoveryContainer.java:162)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Comment 25 Markus Kuppe CLA 2010-12-17 10:12:09 EST
Since the patch prevents ZooDiscovery from crashing the runtime, I have committed the patch to master:
http://git.eclipse.org/c/ecf/org.eclipse.ecf.git/commit/?id=e48cd9ad87ec64fa1ed442624302caaf0c50f69d

Leaving the bug open for Ahmed to decide if this fully addresses the issue.
Comment 26 Ahmed Aadel CLA 2010-12-28 07:10:40 EST
Thanks Markus. Please keep this bug open. I'm not confident yet about the patch's soundness/completeness looking at the following:

<omitted lines from comment #23>
>Ahmed, with your patch the FNFE and the subsequent crash is replaced by:
java.lang.ArrayIndexOutOfBoundsException: 1
</omitted lines>

<omitted lines from comment #24>
java.io.IOException: Not able to find valid snapshots in
/tmp/zdd4f4bf67803fa4993b0fced4470066fa3/version-2
</omitted lines>

For the record, could you please provide some details about where the bug occurs? I mean OS, test bundle, parameters used.
Comment 27 Markus Kuppe CLA 2010-12-28 08:46:02 EST
(In reply to comment #26)
> Thanks Markus. Please keep this bug open. I'm not confident yet about the
> patch's soundness/completeness looking at the following:
> 
> <omitted lines from comment #23>
> >Ahmed, with your patch the FNFE and the subsequent crash is replaced by:
> java.lang.ArrayIndexOutOfBoundsException: 1
> </omitted lines>
> 
> <omitted lines from comment #24>
> java.io.IOException: Not able to find valid snapshots in
> /tmp/zdd4f4bf67803fa4993b0fced4470066fa3/version-2
> </omitted lines>
> 
> For the record, could you please provide some bug whereabouts. I mean os, test
> bundle, parameters used..

Still the same environment: Ubuntu Linux 10.10 x86_64 with Java 1.6 and the checked-in Zookeeper .launch.
Comment 28 Wim Jongman CLA 2011-02-15 15:38:16 EST
(In reply to comment #27)

Any news on this? Is the patch holding?
Comment 29 Wim Jongman CLA 2011-03-04 03:23:20 EST
Hi, I see no more problems in the unit tests. Setting to fixed.
Thanks, Markus and Ahmed.