Bug 371025 - Hudson (at least slave one) appears to have "stopped" communicating
Status: CLOSED WORKSFORME
Alias: None
Product: Community
Classification: Eclipse Foundation
Component: CI-Jenkins
Version: unspecified
Hardware: PC Linux
Importance: P3 normal
Target Milestone: ---
Assignee: Eclipse Webmaster CLA
Reported: 2012-02-08 18:45 EST by David Williams CLA
Modified: 2012-03-01 14:40 EST
Description David Williams CLA 2012-02-08 18:45:12 EST
After the restart this afternoon, some of my jobs ran, and ran fine, but then one ran on "slave 1". As far as I can tell, the job is actually finished, but it still shows as "running" and if I try to look at console output, it just spins and spins. 

https://hudson.eclipse.org/hudson/job/indigo.runReports/371/console
Comment 1 David Williams CLA 2012-02-08 19:31:37 EST
I ended up forcibly "killing" the build.

https://hudson.eclipse.org/hudson/view/Repository%20Aggregation/job/indigo.runReports/371/

There is hardly anything in its "console log"; compare with the console log of
https://hudson.eclipse.org/hudson/view/Repository%20Aggregation/job/indigo.runReports/370/

So I wonder if it was "hung" because it could not write to the log as usual?
Comment 2 Denis Roy CLA 2012-02-08 21:53:08 EST
New job threads seem to stall with this:

FATAL: cannot assign instance of hudson.EnvVars to field hudson.plugins.git.GitSCM$3.val$environment of type hudson.EnvVars in instance of hudson.plugins.git.GitSCM$3
java.lang.ClassCastException: cannot assign instance of hudson.EnvVars to field hudson.plugins.git.GitSCM$3.val$environment of type hudson.EnvVars in instance of hudson.plugins.git.GitSCM$3
	at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2032)
	at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1212)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1953)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1871)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1947)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1871)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
	at hudson.remoting.UserRequest.deserialize(UserRequest.java:178)
	at hudson.remoting.UserRequest.perform(UserRequest.java:98)
	at hudson.remoting.UserRequest.perform(UserRequest.java:48)
	at hudson.remoting.Request$2.run(Request.java:283)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:619)
Comment 3 David Williams CLA 2012-02-09 01:10:41 EST
Another job (this time on 'master') seems to be "hung" in a similar way. This time it did print the log, but it seems to be waiting for something else to happen (such as checking triggers).

https://hudson.eclipse.org/hudson/job/juno.runAggregator/271/console
Comment 4 David Williams CLA 2012-02-09 08:44:17 EST
Another "hosed up" observation: there is a "juno.runAggregator" in the queue, saying that it can't run because there is already a "juno.runAggregator" job running (#272 and #271 respectively), but I do not see any job #271 running anywhere. Am I seeing something wrong?
Comment 5 Denis Roy CLA 2012-02-09 08:53:02 EST
You're seeing that right...   I think Hudson is hosed.  Matt should be in at any moment to save us all.
Comment 6 Eclipse Webmaster CLA 2012-02-09 11:32:20 EST
I suspect that, as with all our Hudson issues, a restart is about the only solution (it appears to be the only hammer we have...). However, I've just set one of the Hudson team up with access, so I'll see if they can get some data (a stack trace or similar) that may help explain what's happening.

-M.
Comment 7 Eclipse Webmaster CLA 2012-03-01 14:40:16 EST
Hudson has been restarted, and we've made a couple of other changes as suggested by the Hudson dev team.

I'm going to close this as 'worksforme', but please reopen if it stops working again.

-M.