Bug 360500 - Hudson is unavailable (Error 503 "Service unavailable")
Status: RESOLVED FIXED
Alias: None
Product: Community
Classification: Eclipse Foundation
Component: CI-Jenkins
Version: unspecified
Hardware: All All
Importance: P3 normal
Target Milestone: ---
Assignee: Eclipse Webmaster CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-10-11 03:20 EDT by Matthias Sohn CLA
Modified: 2013-06-05 09:03 EDT

See Also:


Description Matthias Sohn CLA 2011-10-11 03:20:31 EDT
Hudson seems to be unavailable, https://hudson.eclipse.org/hudson/ responds with 
Error 503 "Service unavailable"
Comment 1 Nicolas Bros CLA 2011-10-11 03:46:38 EDT
I'm getting this error too.
Comment 2 Denis Roy CLA 2011-10-11 08:56:21 EDT
I've restarted the master.
Comment 3 Nicolas Bros CLA 2011-10-11 10:14:00 EDT
It worked for an hour, and now it has stopped responding: the page doesn't load, and after a few minutes I got:

Bad Gateway!
The proxy server received an invalid response from an upstream server.
The proxy server could not handle the request GET /hudson/job/emffacet-nightly/.
Reason: Error reading from remote server
If you think this is a server error, please contact the webmaster.

Error 502

hudson.eclipse.org
Tue Oct 11 10:10:46 2011
Apache/2.2.10 (Linux/SUSE)
Comment 4 Martin Oberhuber CLA 2011-10-11 10:40:42 EDT
Still bad gateway for me too.
Comment 5 Eclipse Webmaster CLA 2011-10-11 10:45:37 EDT
Not a good start to the upgrade.  Here's what I'm seeing in the logs:

[Winstone 2011/10/11 10:14:47] - Error within request handler thread
java.lang.OutOfMemoryError: PermGen space

I've cranked up the memory limit to 2.5G and restarted the process.

-M.
Comment 6 Gunnar Wagenknecht CLA 2011-10-11 10:48:18 EDT
(In reply to comment #5)
> I've cranked up the memory limit to 2.5G and restarted the process.

That won't help with the PermGen space error. What command line do you use?
Comment 7 Sebastian Zarnekow CLA 2011-10-11 10:49:08 EDT
Did you increase the PermGen size or the heap size?

-XX:MaxPermSize=192m or something should do the trick.
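
To make the heap-versus-PermGen distinction from the comments above concrete, here is a minimal sketch that maps an OutOfMemoryError log line to the JVM option governing the exhausted region. The class `OomHint` and its method `flagFor` are hypothetical names made up for illustration and are not part of Hudson. The sketch assumes the standard HotSpot messages "PermGen space" and "Java heap space".

```java
import java.util.Optional;

/** Hypothetical helper (not part of Hudson): suggests which JVM option an OOM line points at. */
final class OomHint {
    private static final String MARKER = "java.lang.OutOfMemoryError:";

    /** Returns the option that sizes the exhausted region, or empty if the line is not recognized. */
    static Optional<String> flagFor(String logLine) {
        int idx = logLine.indexOf(MARKER);
        if (idx < 0) {
            return Optional.empty();
        }
        String reason = logLine.substring(idx + MARKER.length()).trim();
        if (reason.startsWith("PermGen space")) {
            // Raising -Xmx does not enlarge the permanent generation.
            return Optional.of("-XX:MaxPermSize");
        }
        if (reason.startsWith("Java heap space")) {
            return Optional.of("-Xmx");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String line = "java.lang.OutOfMemoryError: PermGen space";
        System.out.println(flagFor(line).orElse("unrecognized"));
    }
}
```

For the log line quoted in comment 5, this points at the PermGen limit rather than the heap limit.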
Comment 8 Eclipse Webmaster CLA 2011-10-11 10:50:13 EDT
I just noticed that.  So here are the options:

-Xms1000m -Xmx2500m -XX:MaxPermSize=1024m

-M.
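
One way to confirm which of these options a JVM actually started with is to read its input arguments through the standard `java.lang.management` API. The following is only a sketch: the class `JvmFlags` and the method `valueOf` are hypothetical names, and the code reports on the JVM it runs in, so it would have to run inside the Hudson process (or be adapted to a remote JMX connection) to say anything about Hudson itself.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

/** Hypothetical helper: looks up the value of a JVM option among the startup arguments. */
final class JvmFlags {
    /**
     * Returns the text following the given prefix in the last matching argument,
     * or null if no argument starts with that prefix. The last match is used
     * because a later option on the command line overrides an earlier one.
     */
    static String valueOf(List<String> jvmArgs, String prefix) {
        String value = null;
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) {
                value = arg.substring(prefix.length());
            }
        }
        return value;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, e.g. -Xmx2500m (not the program arguments).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("-Xms: " + valueOf(jvmArgs, "-Xms"));
        System.out.println("-Xmx: " + valueOf(jvmArgs, "-Xmx"));
        System.out.println("MaxPermSize: " + valueOf(jvmArgs, "-XX:MaxPermSize="));
    }
}
```

A value of `null` means the option was not given on the command line, in which case the JVM default applies.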
Comment 9 Sebastian Zarnekow CLA 2011-10-11 10:52:24 EDT
-XX:MaxPermSize=1024m seems to be far too large. IIRC the default is 64m and 256m should be enough for _most_ apps on a 64-bit VM; 128m is usually sufficient for a 32-bit VM.
Comment 10 Eclipse Webmaster CLA 2011-10-11 10:54:22 EDT
Well, I can turn it down, but since that requires a restart (of which we've already had several today), is there a risk from using a value with this much 'overkill'?

-M.
Comment 11 Martin Oberhuber CLA 2011-10-11 10:55:09 EDT
Whoa that's a lot of mem :) 
You might want to consider

 -XX:+HeapDumpOnOutOfMemoryError

to understand who's eating up all your mem... this will generate a heap dump which you can then analyze in Eclipse MAT.

Does Hudson have any recommendations for the amount of memory to provide? Note that on the Eclipse Infocenter we found a memory leak with this approach, where giving it all the memory in the world wouldn't have sufficed since it leaked HTTP sessions.
Comment 12 Sebastian Zarnekow CLA 2011-10-11 10:58:11 EDT
(In reply to comment #10)
> Well, I can turn it down, but since that requires a restart(of which we've
> already had several today), is there a risk from using a value with this much
> 'overkill'?
> 
> -M.

If I'm not mistaken, the PermGen memory will reduce the available memory for the heap, but I'm not 100% sure about that.
Comment 13 Eclipse Webmaster CLA 2011-10-11 11:04:38 EDT
Ok, I've turned the PermGen setting down to 256m and added the heap dump option.  I'll wait for a while before restarting with these options so at least some builds will happen.

-M.
Comment 14 Matthias Sohn CLA 2011-10-11 17:04:01 EDT
(In reply to comment #13)
> Ok, I've turned the PermGen setting down to 256m and added the heap dump
> option.  I'll wait for a while before restarting with these options so at least
> some builds will happen.
> 
> -M.

You may want to consider using VisualVM [1] to monitor how much PermGen is actually used by Hudson. This should help to find the right VM settings.

[1] http://download.oracle.com/javase/6/docs/technotes/guides/visualvm/monitor_tab.html
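
As a programmatic complement to watching the process in VisualVM, the per-pool usage figures are also exposed through the standard `MemoryPoolMXBean` API. The sketch below is an assumption-laden illustration: `PoolReport` and `describe` are hypothetical names, the code only reports on the JVM it runs in, and the pool names depend on the JVM version and garbage collector (on the HotSpot VMs of this era the permanent generation typically appears under a name containing "Perm Gen").

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.Locale;

/** Hypothetical helper: prints used and maximum size for every memory pool of this JVM. */
final class PoolReport {
    private static final long MB = 1024L * 1024L;

    /** Formats one pool; a negative max means the pool has no defined maximum. */
    static String describe(String name, long usedBytes, long maxBytes) {
        if (maxBytes < 0) {
            return String.format(Locale.ROOT, "%s: %d MB used (no maximum defined)", name, usedBytes / MB);
        }
        return String.format(Locale.ROOT, "%s: %d MB used of %d MB", name, usedBytes / MB, maxBytes / MB);
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            System.out.println(describe(pool.getName(), usage.getUsed(), usage.getMax()));
        }
    }
}
```

Comparing the "used" figure of the permanent generation pool against its maximum over time gives a rough basis for choosing -XX:MaxPermSize.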
Comment 15 Matthias Sohn CLA 2011-10-11 17:35:49 EDT
Could you also check the sandbox Hudson's VM settings? It also behaves strangely: I repeatedly observed hanging build jobs, and the UI is unresponsive. Now I also got an OOM error due to insufficient PermGen space:

org.apache.commons.jelly.JellyTagException: jar:file:/opt/users/hudsonbuild/.hudson/war/WEB-INF/lib/hudson-core-2.1.0.jar!/hudson/model/View/index.jelly:35:46:  PermGen space
	at org.apache.commons.jelly.impl.TagScript.handleException(TagScript.java:728)
	at org.apache.commons.jelly.impl.TagScript.run(TagScript.java:290)
	at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
	at org.kohsuke.stapler.jelly.CallTagLibScript$1.run(CallTagLibScript.java:98)
	at org.apache.commons.jelly.tags.define.InvokeBodyTag.doTag(InvokeBodyTag.java:91)
	at org.apache.commons.jelly.impl.TagScript.run(TagScript.java:270)
	at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
	at org.kohsuke.stapler.jelly.ReallyStaticTagLibrary$1.run(ReallyStaticTagLibrary.java:99)
	at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
	at org.kohsuke.stapler.jelly.ReallyStaticTagLibrary$1.run(ReallyStaticTagLibrary.java:99)
	at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
	at org.apache.commons.jelly.tags.core.CoreTagLibrary$2.run(CoreTagLibrary.java:105)
	at org.kohsuke.stapler.jelly.CallTagLibScript.run(CallTagLibScript.java:119)
	at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
	at org.kohsuke.stapler.jelly.CompressTag.doTag(CompressTag.java:44)
	at org.apache.commons.jelly.impl.TagScript.run(TagScript.java:270)
	at org.kohsuke.stapler.jelly.JellyViewScript.run(JellyViewScript.java:63)
	at org.kohsuke.stapler.jelly.DefaultScriptInvoker.invokeScript(DefaultScriptInvoker.java:63)
	at org.kohsuke.stapler.jelly.DefaultScriptInvoker.invokeScript(DefaultScriptInvoker.java:53)
	at org.kohsuke.stapler.jelly.JellyClassTearOff.serveIndexJelly(JellyClassTearOff.java:72)
	at org.kohsuke.stapler.jelly.JellyFacet.handleIndexRequest(JellyFacet.java:114)
	at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:551)
	at org.kohsuke.stapler.Stapler.invoke(Stapler.java:640)
	at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:606)
	at org.kohsuke.stapler.Stapler.invoke(Stapler.java:640)
	at org.kohsuke.stapler.Stapler.invoke(Stapler.java:478)
	at org.kohsuke.stapler.Stapler.service(Stapler.java:160)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:45)
	at winstone.ServletConfiguration.execute(ServletConfiguration.java:249)
...
Caused by: java.lang.OutOfMemoryError: PermGen space
Comment 16 Martin Oberhuber CLA 2011-10-12 06:57:28 EDT
(In reply to comment #14)
> You may consider to use VisualVM [1] to monitor how much PermGen is actually
> used by Hudson. This should help to find the right VM settings.

Thanks for the idea. In fact, Eclipse MAT won't help diagnose PermGen issues, so looking at the running process with VisualVM may be better advice here.

In my experience, the default -XX:MaxPermSize=64m is too small for most applications and -XX:MaxPermSize=256m is ample for all I've done so far. Note that in one case I got "out of PermGen" due to an overall "ulimit -v 1024000" on any process although the -XX:MaxPermSize was large enough.
Comment 17 Matthias Sohn CLA 2011-10-20 01:38:33 EDT
Updating this bug to reflect some private communication I had with webmaster.

Thanks for updating the gerrit-trigger plugin to its latest release, 2.3.1, as I requested. Unfortunately, this version does not seem to match the version of the git plugin.
We are now hitting NoSuchMethod exceptions at runtime [1].

You told me that updating the git plugin would also require updating Hudson again.

I tried to analyze this problem but failed, since I am struggling to compile the gerrit-trigger plugin. Hence, in order not to do the wrong thing here, I filed https://issues.jenkins-ci.org/browse/JENKINS-11411 to get advice from the experts.

[1] https://hudson.eclipse.org/sandbox/job/jgit.gerrit/901/console
Comment 18 Matthias Sohn CLA 2011-10-20 07:46:51 EDT
2011/10/18 Webmaster(Matt Ward) <webmaster@eclipse.org>
> Hi Matthias,
>
> We're currently running 2.0.1 for the git plugin(latest is 2.1.1_1).  
> If you really want me to update the plugin, I'm going to have to 
> update hudson as well(since the Git plugin is pretty adamant about 
> needing 2.1.2).

Steffen Pingel reported [1] that he is using this combination successfully
on http://ci.mylyn.org/:

 Hudson 2.1.2
 Hudson GIT Plugin 2.1.1_1
 Gerrit Trigger 2.3.2-SNAPSHOT (private-09/27/2011 13:52-jenkins)

Hence I would suggest updating Hudson to 2.1.2 and the Git plugin to 2.1.1_1
to fix this problem.

[1] http://dev.eclipse.org/mhonarc/lists/egit-dev/msg02430.html
Comment 19 Matthias Sohn CLA 2011-10-20 12:36:09 EDT
Updating to

Hudson 2.1.2
Hudson GIT Plugin 2.1.1_1

fixed the problem.