For over 24 hours, builds of a variety of jobs have been stalling for more than 4 hours each. Please restart.
Presently there are more than 20 jobs in the queue waiting for the next available executor. Meanwhile, all servers except hudson-slave1 are idle. Several of the queued jobs are not restricted to a specific build environment, so it seems the load balancer, if there is one, is doing a poor job.
I've 'restarted' the slave. While the jobs may not be restricted, Hudson is set up to limit jobs on most of the nodes, simply because: 1) The test machines are for testing (mostly UI); they don't have the same selection of build tools. 2) Fastlane is meant for the release crunch. So that leaves Hudson with the master and a slave to spread the jobs across. But I see that the master was set to limit the jobs it ran, so I've removed that restriction. -M.
I had a job that I really wanted to run, so I changed it to run on master. It still ended up in the queue and didn't reach the master until the queue had been processed to a point where it was small enough. At that point, the master kicked in and processed my job. My question is, why didn't that happen immediately? Why did my master job need to wait for jobs on the slave?
My suspicion after watching not very much happening was that access to some critical resource such as a password file was being serialized, so the very aberrant slave-1 builds could stall everything else. This might also correlate with your problems accessing dev.eclipse.org.
Slave1 seems to be in an even worse mode than before. Jobs aren't getting anywhere, and trying to view their respective logs yields a 404. The jobs reported as running for each project on the overview page don't exist when you look at the projects individually.
At this time, the entire Hudson infra has been rebooted. Several builds have completed successfully, no hangs, no mess. Looks like we're good for another couple of weeks. Perhaps we should schedule a reboot of the Hudson servers every weekend.
I've just done two builds and got

Building remotely on hudson-slave1
hudson.util.IOException2: remote file operation failed: /opt/users/hudsonbuild/workspace/buckminster-mdt-ocl-core-3.1-nightly at hudson.remoting.Channel@7fc24c83:hudson-slave1
    at hudson.FilePath.act(FilePath.java:749)
    at hudson.FilePath.act(FilePath.java:735)
    at hudson.scm.CVSSCM.isUpdatable(CVSSCM.java:676)
    at hudson.scm.CVSSCM.checkout(CVSSCM.java:361)
    at hudson.model.AbstractProject.checkout(AbstractProject.java:1118)
    at hudson.model.AbstractBuild$AbstractRunner.checkout(AbstractBuild.java:480)
    at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:412)
    at hudson.model.Run.run(Run.java:1337)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
    at hudson.model.ResourceController.execute(ResourceController.java:88)
    at hudson.model.Executor.run(Executor.java:140)
Caused by: java.io.IOException: Remote call on hudson-slave1 failed
    at hudson.remoting.Channel.call(Channel.java:638)
    at hudson.FilePath.act(FilePath.java:742)
    ... 10 more
Caused by: java.lang.LinkageError: loader (instance of hudson/remoting/RemoteClassLoader): attempted duplicate class definition for name: "hudson/model/ModelObject"
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClassCond(ClassLoader.java:632)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:616)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:466)
    at hudson.remoting.RemoteClassLoader.loadClassFile(RemoteClassLoader.java:151)
    at hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:131)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClassCond(ClassLoader.java:632)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:616)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:466)
    at hudson.remoting.RemoteClassLoader.loadClassFile(RemoteClassLoader.java:151)
    at hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:131)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    at java.lang.Class.getDeclaredFields0(Native Method)
    at java.lang.Class.privateGetDeclaredFields(Class.java:2291)
    at java.lang.Class.getDeclaredField(Class.java:1880)
    at java.io.ObjectStreamClass.getDeclaredSUID(ObjectStreamClass.java:1610)
    at java.io.ObjectStreamClass.access$700(ObjectStreamClass.java:52)
    at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:425)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:413)
    at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:310)
    at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:547)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1583)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1496)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1732)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1947)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1871)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1947)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1871)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
    at hudson.remoting.UserRequest.deserialize(UserRequest.java:178)
    at hudson.remoting.UserRequest.perform(UserRequest.java:98)
    at hudson.remoting.UserRequest.perform(UserRequest.java:48)
    at hudson.remoting.Request$2.run(Request.java:270)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Archiving artifacts
(In reply to comment #7)
[snip]
> java.lang.LinkageError: loader (instance of hudson/remoting/RemoteClassLoader):
> attempted duplicate class definition for name: "hudson/model/ModelObject" at

That looks a lot like this: http://issues.hudson-ci.org/browse/HUDSON-6604
> Perhaps we should schedule a reboot of the Hudson servers every weekend.

From the bug I linked to in the previous comment: "We are restarting hudson each Sunday afternoon to evade problems with memory leaks"

I thought that was funny.
I've restarted slave1 since other jobs were failing with that exception (the bug mentions that as well). Let me know if you still encounter this.
No change. Same exception on build 299.
(In reply to comment #11)
> No change. Same exception on build 299.

Yeah, getting the issue with wst build as well.
I think I managed to kick slave1 cleanly. Dave, I think your build is running OK now, isn't it?
(In reply to comment #13)
> I think I managed to kick slave1 cleanly. Dave, I think your build is running
> OK now, isn't it?

Yep, we are good.
The MDT build seems to be humming along as well. *fingers crossed*
Yes looks good. Ta. But I think the swtbot-e37 started before you did a clean restart. Might it yet dirty the restart?
*** Bug 337630 has been marked as a duplicate of this bug. ***
slave 1 is stuck again.
No need for 2 bugs to talk about the same problem.
Slave 1 needs a restart again. Log from https://hudson.eclipse.org/hudson/me/my-views/view/Obeo/job/emf-eef-master/31/console :

Started by user sbouchet
Building remotely on hudson-slave1
hudson.util.IOException2: remote file operation failed: /opt/users/hudsonbuild/workspace/emf-eef-master at hudson.remoting.Channel@3e1af961:hudson-slave1
    at hudson.FilePath.act(FilePath.java:754)
    at hudson.FilePath.act(FilePath.java:740)
    at hudson.scm.CVSSCM.isUpdatable(CVSSCM.java:439)
    at hudson.scm.CVSSCM.checkout(CVSSCM.java:310)
    at hudson.model.AbstractProject.checkout(AbstractProject.java:1229)
    at hudson.model.AbstractBuild$AbstractRunner.checkout(AbstractBuild.java:507)
    at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:424)
    at hudson.model.Run.run(Run.java:1367)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
    at hudson.model.ResourceController.execute(ResourceController.java:88)
    at hudson.model.Executor.run(Executor.java:145)
Caused by: java.io.IOException: Remote call on hudson-slave1 failed
    at hudson.remoting.Channel.call(Channel.java:659)
    at hudson.FilePath.act(FilePath.java:747)
    ... 10 more
Caused by: java.lang.LinkageError: loader (instance of hudson/remoting/RemoteClassLoader): attempted duplicate class definition for name: "hudson/model/AbstractProject"
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClassCond(ClassLoader.java:632)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:616)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:466)
    at hudson.remoting.RemoteClassLoader.loadClassFile(RemoteClassLoader.java:151)
    at hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:131)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2427)
    at java.lang.Class.getDeclaredMethod(Class.java:1935)
    at java.io.ObjectStreamClass.getPrivateMethod(ObjectStreamClass.java:1382)
    at java.io.ObjectStreamClass.access$1700(ObjectStreamClass.java:52)
    at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:438)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:413)
    at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:310)
    at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:547)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1583)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1496)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1732)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1947)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1871)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1947)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1871)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
    at hudson.remoting.UserRequest.deserialize(UserRequest.java:178)
    at hudson.remoting.UserRequest.perform(UserRequest.java:98)
    at hudson.remoting.UserRequest.perform(UserRequest.java:48)
    at hudson.remoting.Request$2.run(Request.java:283)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Archiving artifacts
Sending e-mails to: goulwen.lefur@obeo.fr stephane.bouchet@obeo.fr
[DEBUG] Skipping watched dependency update for build: emf-eef-master #31 due to result: FAILURE
Finished: FAILURE
Ok I've restarted the slave process. -M.
thanks, it did it :)
Once again, slave 1 needs to be restarted:

hudson.util.IOException2: remote file operation failed: /opt/users/hudsonbuild/workspace/mylyn-docs-intent-0.7-nightly at hudson.remoting.Channel@4d150188:hudson-slave1
    at hudson.FilePath.act(FilePath.java:754)
    at hudson.FilePath.act(FilePath.java:740)
    at hudson.plugins.git.GitSCM.gerRevisionToBuild(GitSCM.java:843)
    at hudson.plugins.git.GitSCM.checkout(GitSCM.java:620)
    at hudson.model.AbstractProject.checkout(AbstractProject.java:1229)
    at hudson.model.AbstractBuild$AbstractRunner.checkout(AbstractBuild.java:507)
    at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:424)
    at hudson.model.Run.run(Run.java:1367)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
    at hudson.model.ResourceController.execute(ResourceController.java:88)
    at hudson.model.Executor.run(Executor.java:145)
Caused by: java.io.IOException: Remote call on hudson-slave1 failed
    at hudson.remoting.Channel.call(Channel.java:659)
    at hudson.FilePath.act(FilePath.java:747)
    ... 10 more
Caused by: java.lang.LinkageError: loader (instance of hudson/remoting/RemoteClassLoader): attempted duplicate class definition for name: "hudson/model/Run"
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClassCond(ClassLoader.java:632)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:616)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:466)
    at hudson.remoting.RemoteClassLoader.loadClassFile(RemoteClassLoader.java:151)
    at hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:131)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    at java.lang.Class.getDeclaredFields0(Native Method)
    at java.lang.Class.privateGetDeclaredFields(Class.java:2291)
    at java.lang.Class.getDeclaredField(Class.java:1880)
    at java.io.ObjectStreamClass.getDeclaredSUID(ObjectStreamClass.java:1610)
    at java.io.ObjectStreamClass.access$700(ObjectStreamClass.java:52)
    at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:425)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:413)
    at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:310)
    at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:547)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1583)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1496)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1732)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1947)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1871)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1947)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1871)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
    at hudson.remoting.UserRequest.deserialize(UserRequest.java:178)
    at hudson.remoting.UserRequest.perform(UserRequest.java:98)
    at hudson.remoting.UserRequest.perform(UserRequest.java:48)
    at hudson.remoting.Request$2.run(Request.java:283)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
I've restarted the slave process. -M.
Hudson slave1 seems to be stuck again. I think it needs to be restarted.
I've disconnected the slave, rebooted the host and reconnected it. -M.
All builds launched on hudson-slave1 are taking much more time than usual and seem to be stuck. I think slave 1 needs a restart.
*** Bug 366413 has been marked as a duplicate of this bug. ***
The slave was restarted a few hours ago.
(In reply to comment #29)
> The slave was restarted a few hours ago.

Since yesterday's slave1 failure, our build still fails.

[ERROR] Internal error: java.lang.RuntimeException: org.apache.maven.MavenExecutionException: Could not setup plugin ClassRealm: Plugin org.eclipse.dash.maven:eclipse-signing-maven-plugin:1.0.3 or one of its dependencies could not be resolved: Failure to find org.eclipse.dash.maven:eclipse-signing-maven-plugin:jar:1.0.3 in http://repo1.maven.org/maven2 was cached in the local repository, resolution will not be reattempted until the update interval of central has elapsed or updates are forced -> [Help 1]
org.apache.maven.InternalErrorException: Internal error: java.lang.RuntimeException: org.apache.maven.MavenExecutionException: Could not setup plugin ClassRealm
    at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:163)
    at org.apache.maven.cli.MavenCli.execute(MavenCli.java:445)
unrelated, sorry.
Created attachment 208371 [details] log for "cannot assign instance of hudson.model.StreamBuildListener" Hi, I cannot build on slave1. I get the following error message at the beginning of the build: FATAL: cannot assign instance of hudson.model.StreamBuildListener to field hudson.scm.subversion.WorkspaceUpdater$UpdateTask.listener of type hudson.model.TaskListener in instance of hudson.scm.SubversionSCM$CheckOutTask (see attached log)
re-opening as per the previous comment
I've restarted slave1. -M.
I think Slave 1 needs to be restarted: when launching my build, I get the following exception (full stack available at https://hudson.eclipse.org/hudson/job/mylyn-docs-intent-0.7-nightly/212/console)

hudson.util.IOException2: remote file operation failed: /opt/users/hudsonbuild/workspace/mylyn-docs-intent-0.7-nightly at hudson.remoting.Channel@5c5e1f94:hudson-slave1
    at hudson.FilePath.act(FilePath.java:754)
    at hudson.FilePath.act(FilePath.java:740)
    at hudson.plugins.git.GitSCM.gerRevisionToBuild(GitSCM.java:843)
    at hudson.plugins.git.GitSCM.checkout(GitSCM.java:620)
    at hudson.model.AbstractProject.checkout(AbstractProject.java:1229)
    at hudson.model.AbstractBuild$AbstractRunner.checkout(AbstractBuild.java:507)
    at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:424)
    at hudson.model.Run.run(Run.java:1367)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
    at hudson.model.ResourceController.execute(ResourceController.java:88)
    at hudson.model.Executor.run(Executor.java:145)
Caused by: java.io.IOException: Remote call on hudson-slave1 failed
    at hudson.remoting.Channel.call(Channel.java:659)
    at hudson.FilePath.act(FilePath.java:747)
    ... 10 more
Caused by: java.lang.LinkageError: loader (instance of hudson/remoting/RemoteClassLoader): attempted duplicate class definition for name: "hudson/model/AbstractProject"
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClassCond(ClassLoader.java:632)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:616)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:466)
    at hudson.remoting.RemoteClassLoader.loadClassFile(RemoteClassLoader.java:151)
    at hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:131)
(In reply to comment #35)
> I think Salve 1 needs to be restarted :
> when launching my build, I get the following exception (full stack available at
> https://hudson.eclipse.org/hudson/job/mylyn-docs-intent-0.7-nightly/212/console)
> [...]
> Caused by: java.lang.LinkageError: loader (instance of
> hudson/remoting/RemoteClassLoader): attempted duplicate class definition for
> name: "hudson/model/AbstractProject"

The same error shows up during SCM polls, which makes Hudson silently fail to trigger builds upon new commits.
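Since the failure is silent during polling, it may help to scan console or polling log text for the signature of this error. Below is a minimal Python sketch; the regular expression is taken from the messages quoted in this bug, while the function name and the idea of running it over saved log text are assumptions, not anything Hudson provides.

```python
import re

# Signature copied from the console output quoted in this bug.
# The helper below is hypothetical, not part of Hudson.
DUPLICATE_CLASS = re.compile(
    r'java\.lang\.LinkageError: loader \(instance of\s+'
    r'hudson/remoting/RemoteClassLoader\):\s+'
    r'attempted duplicate class definition for\s+name: "([^"]+)"'
)


def find_duplicate_class_errors(console_text):
    """Return the class names reported in duplicate-definition LinkageErrors."""
    return DUPLICATE_CLASS.findall(console_text)


sample = (
    'Caused by: java.lang.LinkageError: loader (instance of '
    'hudson/remoting/RemoteClassLoader): attempted duplicate class '
    'definition for name: "hudson/model/AbstractProject"'
)
print(find_duplicate_class_errors(sample))
```

A non-empty result on a recent log would be a hint that the slave is in the state described here and probably needs a restart.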
This still seems to be a problem. Please restart.
PLEASE restart!
Eike: At least the config of the other machines seems to have been fixed, so for MDT/OCL I was able to just force the build to be on master.
Builds are currently taking a lot longer than usual on hudson-slave1. See the [cross-project-issues-dev] thread '"Build timed out, aborting" with JUnit tests'. Maybe Hudson needs to be restarted again?
Earlier I restarted slave1 and reduced the number of executors to 3.
emf-core has taken 3.5 hours so far when it usually takes 30 mins. fastlane seems weird too.
Slave 1 currently has a couple of typically 5-minute jobs reaching the four-hour point. So does slave 8.
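The rule of thumb being applied in these reports (a job is suspect once it has run for several times its usual duration) can be sketched in Python as follows. This is only an illustration: the function, the factor of 4, the timestamp, and the second job name are assumptions, and the start times would have to come from wherever you can read them.

```python
from datetime import datetime, timedelta


def find_stuck_builds(running, usual_minutes, now, factor=4.0):
    """Return names of builds running longer than factor x their usual duration.

    running: dict of job name -> start time (datetime)
    usual_minutes: dict of job name -> typical duration in minutes
    Jobs with no known typical duration are skipped.
    """
    stuck = []
    for job, started in running.items():
        usual = usual_minutes.get(job)
        if usual is None:
            continue
        if now - started > timedelta(minutes=usual * factor):
            stuck.append(job)
    return sorted(stuck)


now = datetime(2000, 1, 1, 12, 0)  # arbitrary timestamp for the example
running = {
    "emf-core": now - timedelta(hours=3, minutes=30),
    "quick-job": now - timedelta(minutes=4),  # hypothetical job name
}
usual = {"emf-core": 30, "quick-job": 5}
print(find_stuck_builds(running, usual, now))
```

With the emf-core figures above (3.5 hours against a usual 30 minutes) the job is flagged, while a 5-minute job that has run for 4 minutes is not.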
There were 3 linuxtools jobs that had been consuming a ton of CPU cycles, and had been running since March 3. I've killed them.
Hudson seems unhappy. Two MDT/OCL jobs just failed with stupid timeout/connection errors. slave1 has many very long running jobs (Hudson test bed at over 24 hours).
Fixed ... for now.
(In reply to comment #47)
> Fixed ... for now.

Slave1 has long running jobs again, one has been running for two days.
I've killed off the jobs that were taking forever. Let's see what it does with the new jobs.
I'm seeing intermittent weird results: tests timing out through lack of synchronization. Also, Hudson web access has been really, really slow most of the day.
Just had two stupid failures on slave1; master is much better.

Caused by: java.lang.LinkageError: loader (instance of hudson/remoting/RemoteClassLoader): attempted duplicate class definition for name: "hudson/model/AbstractProject"
(In reply to comment #51) Ok, I've restarted slave1. -M
Fastlane is failing for stupid Hudson-like reasons (channel exceptions): https://hudson.eclipse.org/hudson/job/buckminster-mdt-ocl-core-3.2-master/678/console
I've restarted fastlane. -M.