Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.
Bug 356907 - High CPU / Mem usage, BUFFER_OVERFLOW exception thrown
Summary: High CPU / Mem usage, BUFFER_OVERFLOW exception thrown
Status: CLOSED FIXED
Alias: None
Product: Jetty
Classification: RT
Component: server (show other bugs)
Version: 7.4.5   Edit
Hardware: PC Linux
: P3 normal (vote)
Target Milestone: 7.5.x   Edit
Assignee: Greg Wilkins CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-09-07 06:18 EDT by Even Andre Fiskvik CLA
Modified: 2013-01-30 10:59 EST (History)
1 user (show)

See Also:


Attachments
java heap dump (49.14 KB, text/plain)
2011-09-07 06:19 EDT, Even Andre Fiskvik CLA
no flags Details
java heap dump #2 (47.78 KB, text/plain)
2011-09-07 06:19 EDT, Even Andre Fiskvik CLA
no flags Details
java heap dump #3 (48.20 KB, text/plain)
2011-09-07 06:20 EDT, Even Andre Fiskvik CLA
no flags Details
java heap dump #4 (46.71 KB, text/plain)
2011-09-07 06:20 EDT, Even Andre Fiskvik CLA
no flags Details
java heap dump #5 (47.06 KB, text/plain)
2011-09-07 06:20 EDT, Even Andre Fiskvik CLA
no flags Details
java heap dump #6 (48.90 KB, text/plain)
2011-09-07 06:20 EDT, Even Andre Fiskvik CLA
no flags Details
java heap dump #7 (46.35 KB, text/plain)
2011-09-07 06:20 EDT, Even Andre Fiskvik CLA
no flags Details
debuglogs_2ndtime.tar.gz (1.13 MB, application/x-gzip)
2011-09-15 03:36 EDT, Even Andre Fiskvik CLA
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Even Andre Fiskvik CLA 2011-09-07 06:18:39 EDT
Build Identifier: 

Not sure what causes the BUFFER_OVERFLOW exception:

java.lang.IllegalStateException: Status = BUFFER_OVERFLOW HandshakeStatus = NEED_UNWRAP|bytesConsumed = 0 bytesProduced = 0 151 16660
	at org.eclipse.jetty.io.nio.SslSelectChannelEndPoint.unwrap(SslSelectChannelEndPoint.java:780)
	at org.eclipse.jetty.io.nio.SslSelectChannelEndPoint.fill(SslSelectChannelEndPoint.java:346)
	at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:289)
	at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:214)
	at org.eclipse.jetty.server.HttpConnection.handle(HttpConnection.java:411)
	at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:535)
	at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:40)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:529)
	at java.lang.Thread.run(Thread.java:619)


At the same time there is high / abnormal CPU usage of the java process.
Normal is 3-4%, now uses 50-100% CPU.
The memory usage also seems pretty high:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND     
 1057 root      18   0 2561m 2.0g  10m S 86.5 25.4 509:54.18 java    

Tasks: 203 total,   1 running, 202 sleeping,   0 stopped,   0 zombie
Cpu(s): 11.7%us,  0.2%sy,  0.0%ni, 88.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   8174016k total,  7443224k used,   730792k free,   458424k buffers
Swap:  6094840k total,    16248k used,  6078592k free,  1722180k cached




Reproducible: Sometimes

Steps to Reproduce:
1. Having the service run for about a month, low number of active sessions < 40
Comment 1 Even Andre Fiskvik CLA 2011-09-07 06:19:27 EDT
Created attachment 202877 [details]
java heap dump
Comment 2 Even Andre Fiskvik CLA 2011-09-07 06:19:48 EDT
Created attachment 202878 [details]
java heap dump #2
Comment 3 Even Andre Fiskvik CLA 2011-09-07 06:20:00 EDT
Created attachment 202879 [details]
java heap dump #3
Comment 4 Even Andre Fiskvik CLA 2011-09-07 06:20:16 EDT
Created attachment 202880 [details]
java heap dump #4
Comment 5 Even Andre Fiskvik CLA 2011-09-07 06:20:29 EDT
Created attachment 202881 [details]
java heap dump #5
Comment 6 Even Andre Fiskvik CLA 2011-09-07 06:20:43 EDT
Created attachment 202882 [details]
java heap dump #6
Comment 7 Even Andre Fiskvik CLA 2011-09-07 06:20:58 EDT
Created attachment 202883 [details]
java heap dump #7
Comment 8 Greg Wilkins CLA 2011-09-08 03:55:23 EDT
Thanks for the thread dumps, but unfortunately they have not revealed anything useful.

It would be good if you could run your server with JMX enabled, then you would be able to use jconsole or similar to dump the server state - which will give a detailed view of each end point in the selector.  Hopefully this will tell us what is looping.

Better yet... maybe 7.5.1 will have fixed the problem.

At the very least, please keep us informed as to the frequency of recurrences
Comment 9 Even Andre Fiskvik CLA 2011-09-15 03:34:57 EDT
I just got the issue today as well (about a week after the last occurrence). 
This time it seems the heap dump is much larger and with more BLOCKED threads.

I have added the heap dump logs in an attachment called "debuglogs_2ndtime.tar.gz".
Comment 10 Even Andre Fiskvik CLA 2011-09-15 03:36:04 EDT
Created attachment 203387 [details]
debuglogs_2ndtime.tar.gz
Comment 11 Even Andre Fiskvik CLA 2011-09-15 03:37:39 EDT
Regarding last comment, the exception I got this time is:

 .12.100, connection state CONNECTED
2011-09-15 07:54:35.259:WARN::handle failed
java.nio.BufferOverflowException
        at java.nio.DirectByteBuffer.put(DirectByteBuffer.java:311)
        at java.nio.DirectByteBuffer.put(DirectByteBuffer.java:290)
        at sun.nio.ch.IOUtil.write(IOUtil.java:70)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
        at org.eclipse.jetty.io.nio.ChannelEndPoint.flush(ChannelEndPoint.java:219)
        at org.eclipse.jetty.io.nio.SelectChannelEndPoint.flush(SelectChannelEndPoint.java:282)
        at org.eclipse.jetty.io.nio.SslSelectChannelEndPoint.flush(SslSelectChannelEndPoint.java:638)
        at org.eclipse.jetty.io.nio.SslSelectChannelEndPoint.fill(SslSelectChannelEndPoint.java:462)
        at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:289)
        at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:214)
        at org.eclipse.jetty.server.HttpConnection.handle(HttpConnection.java:411)
        at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:535)
        at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:40)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:529)
        at java.lang.Thread.run(Thread.java:619)
Comment 12 Greg Wilkins CLA 2011-09-15 23:18:33 EDT
It looks like something was going on with logback as well that caused all the BLOCKED threads.  Perhaps it would be worthwhile showing one of these dumps to the logback guys to get their thoughts.

But eitherway, those blocked threads would not be causing the high CPU and other symptoms. They are likely to be either a symptom or unrelated, but not a cause.


I'll analyse these dumps a bit more today, but getting a JMX dump would also be really valuable as it has a lot more info.

Alternately, are you able to run a SNAPSHOT, in which case I will add some extra debug instrumentation so that we can capture a lot more info should we get a BUFFER_OVERFLOW.


regards
Comment 13 Greg Wilkins CLA 2011-09-16 01:20:47 EDT
This latest Buffer overflow is very strange!   It looks like a JVM bug to me, as the contract of Channel.write should not allow a buffer overflow exception.  The buffer it is copying to is entirely hidden from us and if it is full, it should just return and say that 0 bytes were written.

What JVM are you using?
Comment 14 Even Andre Fiskvik CLA 2011-10-19 08:39:57 EDT
I was using an old version of the JDK, now updated to jdk1.7.0 but the problem still persists.
Comment 15 Greg Wilkins CLA 2011-11-07 19:51:34 EST
Hmmm still very strange.

I'm working on the jetty-7.6 branch which has a re factored approach to SSL and NIO connections - specifically to deal with a bunch of issues that we have had.

It might be worthwhile trying that branch out.   (Note that it is not currently passing all tests, but it close so worth trying... or waiting a few days if you want a clean build).

regards
Comment 16 Greg Wilkins CLA 2012-01-10 21:02:01 EST
can you try 7.6.0.RC3 ?
Comment 17 Even Andre Fiskvik CLA 2012-01-11 01:43:56 EST
I'm currently testing "7.5.4.v20111024" to see if it will fix the issue. Only been testing it for some days so far, do you still think I should move to testing Jetty 7-6 branch?
Comment 18 Even Andre Fiskvik CLA 2012-01-16 08:35:12 EST
My problems were not fixed in 7.5.4.v20111024, testing out 7.6.0.RC4 now.
If this does not help I will remove the logger framework from the code and see if that changes anything.
Comment 19 Even Andre Fiskvik CLA 2012-01-17 05:27:19 EST
Tested out 7.6.0.RC4 and updated all other libraries (still running cometd 2.3.1) and the issue has become much worse now. After an hour running or so, the CPU usage boosts up to 300-400% on the server.

Testing yo turn off all logging I've added to the project now to see if that helps.
Comment 20 Jesse McConnell CLA 2012-08-23 15:03:49 EDT
Is this still happening after that latest jvm update?  the last 1.7u6 seems to have addressed a number of outstanding odd jvm issues
Comment 21 Jesse McConnell CLA 2013-01-30 10:59:30 EST
over a year with no additional info and we have squashed many issues in 7.6 so calling this one, please reopen if needed