| Summary: | CDO Server fails after losing connection with a client | | |
|---|---|---|---|
| Product: | [Modeling] EMF | Reporter: | Alena Repina <alena> |
| Component: | cdo.core | Assignee: | Project Inbox <emf.cdo-inbox> |
| Status: | CLOSED WORKSFORME | QA Contact: | |
| Severity: | normal | | |
| Priority: | P3 | CC: | caspar_d, stepper |
| Version: | 4.2 | | |
| Target Milestone: | --- | | |
| Hardware: | PC | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Attachments: | | | |

Created attachment 207591 [details]
CDO server configuration
Created attachment 207617 [details]
CDO server configuration
Caspar, you're the expert for reconnecting sessions. Do those include a HeartBeatProtocol channel?

Today CDO server failed again and this time I've got an exception in logs:
java.nio.BufferUnderflowException
at java.nio.Buffer.nextGetIndex(Buffer.java:480)
at java.nio.DirectByteBuffer.getShort(DirectByteBuffer.java:529)
at org.eclipse.net4j.signal.SignalProtocol.handleBuffer(SignalProtocol.java:194)
at org.eclipse.spi.net4j.Channel$ReceiverWork.run(Channel.java:352)
at org.eclipse.net4j.util.concurrent.QueueRunner.work(QueueRunner.java:26)
at org.eclipse.net4j.util.concurrent.QueueRunner.work(QueueRunner.java:1)
at org.eclipse.net4j.util.concurrent.QueueWorker.doWork(QueueWorker.java:81)
at org.eclipse.net4j.util.concurrent.QueueWorker.work(QueueWorker.java:72)
at org.eclipse.net4j.util.concurrent.Worker$WorkerThread.run(Worker.java:206)
Now I'm not sure that a bad connection between the CDO server and the CDO client can really cause the CDO server to fail. Nevertheless, I'm going to try the RecoverySession you mentioned.
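
For context on the trace above: its top frames show `DirectByteBuffer.getShort` raising the exception, which the JDK does whenever fewer than two bytes remain in the buffer. The sketch below reproduces only that JDK behaviour with plain `java.nio`. The class `UnderflowDemo` and the helper `readShortUnderflows` are made up for illustration and are not Net4j code, and the sketch says nothing about why the server ended up with a short buffer.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

// Hypothetical demo class, not part of Net4j or CDO.
class UnderflowDemo {

    // Attempts a relative 2-byte read, as a protocol would when reading a
    // short-sized header field. Returns true if the read underflows.
    static boolean readShortUnderflows(ByteBuffer buffer) {
        try {
            buffer.getShort();
            return false;
        } catch (BufferUnderflowException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Two bytes remaining: getShort() succeeds.
        ByteBuffer complete = ByteBuffer.allocateDirect(2);
        // One byte remaining: getShort() throws BufferUnderflowException.
        ByteBuffer truncated = ByteBuffer.allocateDirect(1);

        System.out.println("complete underflows:  " + readShortUnderflows(complete));
        System.out.println("truncated underflows: " + readShortUnderflows(truncated));
    }
}
```

If the exception is indeed a follow-up of a broken connection, a receiver seeing an empty or truncated buffer would fit this pattern, but that remains an assumption until the server side is debugged.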
The content of attachment 207617 [details] has been deleted by Eclipse Webmaster <webmaster@eclipse.org> who provided the following reason: Requested by poster. The token used to delete this attachment was generated at 2011-11-29 09:23:18 EST.

The content of attachment 207591 [details] has been deleted by Eclipse Webmaster <webmaster@eclipse.org> who provided the following reason: Requested by poster. The token used to delete this attachment was generated at 2011-11-29 10:27:41 EST.

Created attachment 207660 [details]
CDO server configuration
Created attachment 207709 [details]
Last 200 lines of CDO logs
Today CDO server failed again. Reconnecting session didn't help. This is the tail of the latest CDO log.
(In reply to comment #3)
> Caspar, you're the expert for reconnecting sessions. Do those include a
> HeartBeatProtocol channel?

Yes, optionally, see RecoveringCDOSessionImpl.createTCPConnector. It is configured by the RecoveringCDOSessionConfiguration instance, see #setHeartBeatEnabled(boolean) there.

(In reply to comment #4)
> Today CDO server failed again and this time I've got an exception in logs:
> java.nio.BufferUnderflowException

This *can* be a follow-up problem after your TCP connection has timed out.

> Now I'm not sure that bad connection between CDO server and CDO client really
> can cause CDO server fail.

No, that shouldn't make the server fail; only the client with the timed-out connection should fail, of course. What makes you think that the server fails?

I'm not sure what caused the server to fail. Usually I start the CDO server in the morning and find that the CDO server process is down the next morning. Every time, the log ends with something like this:

Connection-Keep-Alive-DBStore@5 [debug] DB connection keep-alive task activated

Usually I don't see any exceptions in the log, unfortunately.

You can add a LifecycleEventAdapter via repository.addListener() and set a breakpoint in doDeactivate() to see who's causing the deactivation.

Moving all open issues to 4.2. Open bugs can be ported to 4.1 maintenance after they've been fixed in master.

No activity or ping here for a year. Please reopen this bug if you feel a need.
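
The listener suggestion above comes down to recording who triggers a deactivation. A minimal sketch of that idea follows, using hypothetical stand-in types (`Component`, `Listener`, `StackRecordingListener`, `shutdownPath`) instead of the real Net4j classes, so that it runs with the JDK alone. With CDO the listener would be registered via repository.addListener() as described in the comment; the exact callback names are not shown here and should be checked against the Net4j sources.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; NOT the Net4j lifecycle API.
class DeactivationTracer {

    interface Listener {
        void deactivated(String componentName);
    }

    // A tiny component that notifies its listeners when it is deactivated.
    static class Component {
        private final String name;
        private final List<Listener> listeners = new ArrayList<>();

        Component(String name) {
            this.name = name;
        }

        void addListener(Listener listener) {
            listeners.add(listener);
        }

        void deactivate() {
            for (Listener listener : listeners) {
                listener.deactivated(name);
            }
        }
    }

    // Captures the call stack at the moment of deactivation, which is the
    // information a breakpoint in the listener would reveal in a debugger.
    static class StackRecordingListener implements Listener {
        final List<StackTraceElement[]> traces = new ArrayList<>();

        @Override
        public void deactivated(String componentName) {
            StackTraceElement[] trace = new Throwable().getStackTrace();
            traces.add(trace);
            System.err.println(componentName + " deactivated by:");
            for (StackTraceElement frame : trace) {
                System.err.println("  at " + frame);
            }
        }
    }

    // Stands in for whatever code path shuts the component down.
    static void shutdownPath(Component component) {
        component.deactivate();
    }

    // Returns true if any frame of the trace belongs to the given method.
    static boolean traceMentions(StackTraceElement[] trace, String methodName) {
        for (StackTraceElement frame : trace) {
            if (frame.getMethodName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Component repository = new Component("Repository[demo]");
        StackRecordingListener listener = new StackRecordingListener();
        repository.addListener(listener);
        shutdownPath(repository);
        System.out.println("caller found: "
                + traceMentions(listener.traces.get(0), "shutdownPath"));
    }
}
```

Logging the stack instead of setting a breakpoint has the advantage that it also works on a server that only fails overnight, when nobody is attached with a debugger.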
Build Identifier:

I've got a CDO server on a Linux machine and a CDO client on Mac OS X. I regularly find that the CDO server has failed overnight. The tail of the CDO log contains the following message:

Socket channel closed: java.nio.channels.SocketChannel[connected local=/10.120.68.101:2036 remote=/10.253.26.231:58722]

It says that the channel between the CDO server (10.120.68.101:2036) and the CDO client (10.253.26.231:58722) is closed even though they are connected. After that the TCP connection is deactivated and the server fails. I have attached the CDO server configuration, and here is the tail of the CDO logs:

Thread-7 [debug] Ordering server operation INTEREST WRITE java.nio.channels.SocketChannel[connected local=/10.120.68.101:2036 remote=/10.253.26.231:58722] = true
TCPSelector [debug] Executing server operation INTEREST WRITE java.nio.channels.SocketChannel[connected local=/10.120.68.101:2036 remote=/10.253.26.231:58722] = true
TCPSelector [debug] Setting interest READ|WRITE (was read)
TCPSelector [debug] Writing java.nio.channels.SocketChannel[connected local=/10.120.68.101:2036 remote=/10.253.26.231:58722]
TCPSelector [debug.buffer] Writing 5 bytes (EOS) 00 00 00 65 01
TCPSelector [debug.buffer] Retaining Buffer@50[RELEASED]
TCPSelector [debug] Ordering server operation INTEREST WRITE java.nio.channels.SocketChannel[connected local=/10.120.68.101:2036 remote=/10.253.26.231:58722] = false
TCPSelector [debug] Executing server operation INTEREST WRITE java.nio.channels.SocketChannel[connected local=/10.120.68.101:2036 remote=/10.253.26.231:58722] = false
TCPSelector [debug] Setting interest READ (was read|write)
TCPSelector [debug] Reading java.nio.channels.SocketChannel[connected local=/10.120.68.101:2036 remote=/10.253.26.231:58722]
TCPSelector [debug.buffer] Obtained Buffer@47[INITIAL]
TCPSelector [debug.buffer] Retaining Buffer@47[RELEASED]
TCPSelector [debug] Socket channel closed: java.nio.channels.SocketChannel[connected local=/10.120.68.101:2036 remote=/10.253.26.231:58722]
Thread-8 [debug.lifecycle] Deactivating TCPServerConnector[10.253.26.231:58,722]
Thread-8 [debug.lifecycle] Deactivating Channel[Control, SERVER]
Thread-8 [debug.lifecycle] Deactivating ChannelReceiveSerializer@41
Thread-8 [debug.connector] Setting state DISCONNECTED (was connected) for TCPServerConnector[null:0]
Thread-8 [debug.lifecycle] Deactivating Channel[1, SERVER, cdo]
Thread-8 [debug.lifecycle] Deactivating ChannelReceiveSerializer@45
Thread-8 [debug.lifecycle] Deactivating SignalProtocol[cdo]
Thread-8 [debug.lifecycle] Deactivating Session[2]
Thread-8 [debug.acceptor] Removed connector TCPServerConnector[null:0]
Connection-Keep-Alive-DBStore@5 [debug] DB connection keep-alive task activated

Reproducible: Always