Bug 325105

Summary: Websocket connection is not closed on client disconnect, after a while sendMessage blocks
Product: [RT] Jetty
Reporter: Tom <tderks>
Component: server
Assignee: Greg Wilkins <gregw>
Status: RESOLVED FIXED
QA Contact:
Severity: critical
Priority: P3
CC: jetty-inbox, mgorovoy
Version: unspecified
Target Milestone: 7.1.x
Hardware: PC
OS: Linux
Whiteboard:
Attachments:
Description (Flags)
Stacktrace of exception thrown when server continues after it hung for 10 minutes (Flags: none)
java.net.SocketException: Connection timed out (Flags: none)

Description Tom CLA 2010-09-13 08:08:46 EDT
Build Identifier: 

A websocket connection is not closed after maxIdleTime after a client disconnect.
After a few minutes the socket buffer fills up and sendMessage blocks, hanging the server for all websocket clients.

Reproducible: Always

Steps to Reproduce:
1. Connect a lot of clients to the server (e.g. 10,000).
2. Connect and disconnect quickly, e.g. by clicking refresh/forward/back in the browser (reproducible within 30 seconds of clicking for me).
3. A websocket connection is not closed after maxIdleTime. After about 5 to 10 minutes the server hangs.
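
The blocking described above can be illustrated outside Jetty. The following is a minimal sketch in plain Java NIO, not Jetty's code; `StalledPeerDemo` and `fillUntilStalled` are hypothetical names used only for illustration. It connects a peer over loopback that never reads, then writes to it through a non-blocking channel. Once the socket buffers are full the write returns 0, which is the point where a blocking write (and presumably a blocking sendMessage) would stall instead.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

final class StalledPeerDemo {

    /**
     * Writes to a peer that never reads and returns the number of bytes the
     * socket accepted before the first write that made no progress.
     */
    static long fillUntilStalled() {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Bind to an ephemeral port on the loopback interface.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));

            // silentPeer stands in for the vanished client: it connects but never reads.
            try (SocketChannel silentPeer = SocketChannel.open(server.getLocalAddress());
                 SocketChannel sender = server.accept()) {

                // Non-blocking mode lets us observe the stall instead of hanging in it.
                sender.configureBlocking(false);
                ByteBuffer chunk = ByteBuffer.allocate(64 * 1024);
                long accepted = 0;
                while (true) {
                    chunk.clear();
                    int written = sender.write(chunk);
                    if (written == 0) {
                        // Buffers are full; a blocking write would wait here.
                        return accepted;
                    }
                    accepted += written;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The exact number depends on the operating system's buffer sizes.
        System.out.println("bytes accepted before stall: " + fillUntilStalled());
    }
}
```

The writes succeed for a while even though nobody is reading, which matches the report that the hang only shows up some minutes after the client has gone away.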
Comment 1 Tom CLA 2010-09-13 08:12:23 EDT
Created attachment 178735 [details]
Stacktrace of exception thrown when server continues after it hung for 10 minutes
Comment 2 Tom CLA 2010-09-13 08:15:49 EDT
This happens in both Jetty 7.1.4.v20100610 and the latest 7.2.0-SNAPSHOT.
Comment 3 Greg Wilkins CLA 2010-09-14 07:47:52 EDT
I've fixed the handling of calling onDisconnect, at least for idle timeouts.
Checked in at r2273.

I need to add some unit tests to check for direct closes and sending after close.
Comment 4 Greg Wilkins CLA 2010-09-14 08:09:16 EDT
r2274 added tests for close and for sending after onDisconnect.
All looks good now.
Comment 5 Tom CLA 2010-09-14 09:02:05 EDT
I just tried it with trunk r2274, but I can still reproduce the problem.
It now gives a socket timeout instead of a broken pipe exception, though.
Comment 6 Tom CLA 2010-09-14 09:03:08 EDT
Created attachment 178822 [details]
java.net.SocketException: Connection timed out
Comment 7 Greg Wilkins CLA 2010-09-14 20:26:16 EDT
Tom,

Does the server still hang?
Are you getting onDisconnect calls?

Can you perhaps attach a test harness that demonstrates this?
Comment 8 Tom CLA 2010-09-15 06:43:39 EDT
Yes, the server still hangs.
onDisconnect isn't called for the connection that later causes the problem.

I did some further testing and I can only reproduce the problem with an LVS load balancer in between (not distributing to multiple servers). Without it, connections are closed instantly, probably because the browser sends a proper disconnect.

With the load balancer in between, the connections from the browser are not closed instantly, but only after the websocket timeout. However, sometimes the connection doesn't time out, which causes the problems.

I have no experience with test harnesses, sorry.
Comment 9 Tom CLA 2010-09-15 08:25:44 EDT
Seems to be an issue related to the DR (direct routing) mode of the LVS load balancer.
Since traffic from the server goes directly to the client and not through the load balancer, the load balancer doesn't see close commands/ACKs.

I'm not sure whether the end result is caused by the timeouts in the load balancer (which is not strictly TCP compliant anymore) or by Jetty not handling this unusual situation correctly. I don't know if this issue can also occur through normal packet loss or network problems (though that would at least be a lot less likely).
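
When close packets never reach the server, as in the load balancer setup described above, the server can only notice a dead peer by the passage of time. The following is a minimal sketch of one possible application-level guard, not Jetty's implementation; `IdleReaper`, `touch` and `reap` are hypothetical names. It assumes that only traffic received from the peer counts as activity, since writes to a vanished peer can still succeed into the socket buffer for a while.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class IdleReaper {
    private final long maxIdleMillis;
    // Connection id -> time (in milliseconds) of the last data received from that peer.
    private final Map<String, Long> lastActivity = new ConcurrentHashMap<>();

    IdleReaper(long maxIdleMillis) {
        this.maxIdleMillis = maxIdleMillis;
    }

    /** Records that data was received from the given connection. */
    void touch(String connectionId, long nowMillis) {
        lastActivity.put(connectionId, nowMillis);
    }

    /** Removes and returns every connection that has been silent for longer than maxIdleMillis. */
    List<String> reap(long nowMillis) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastActivity.entrySet()) {
            Long seen = entry.getValue();
            if (nowMillis - seen > maxIdleMillis) {
                // Remove only if no newer activity was recorded in the meantime.
                if (lastActivity.remove(entry.getKey(), seen)) {
                    expired.add(entry.getKey());
                }
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        IdleReaper reaper = new IdleReaper(30_000);
        reaper.touch("a", 0);
        reaper.touch("b", 25_000);
        // "a" has been silent for 40 s, "b" for 15 s.
        System.out.println(reaper.reap(40_000)); // prints [a]
    }
}
```

A caller would run `reap` periodically and force-close every returned connection, so that a peer that disappeared without a visible close cannot hold a full send buffer indefinitely.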