| Summary: | Shared instance (actually its slaves) are all offline | ||
|---|---|---|---|
| Product: | Community | Reporter: | David Williams <david_williams> |
| Component: | CI-Jenkins | Assignee: | CI Admin Inbox <ci.admin-inbox> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | critical | ||
| Priority: | P3 | CC: | daniel_megert, webmaster |
| Version: | unspecified | ||
| Target Milestone: | --- | ||
| Hardware: | PC | ||
| OS: | Linux | ||
| Whiteboard: | |||
|
Description
David Williams
With my semi-admin privileges, I tried to start hudson-slave4, but it simply said the following in the log:

    java.io.IOException: Unexpected termination of the channel
        at hudson.remoting.Channel$ReaderThread.run(Channel.java:1030)
    Caused by: java.io.EOFException
        at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2554)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
        at hudson.remoting.Channel$ReaderThread.run(Channel.java:1024)

(Well, it said more than that, but I've "lost" it already, and that's what it ended with.)

Note: I've marked this as blocker since we've already lost tests for two "nightly builds" ... and our Neon M1 stabilization builds are starting with Sunday night's build. Even the nightlies would be blocker worthy, but the stabilization I-builds are "emergency blockers" ... if there is such a thing.

I'm not seeing any delays in the response of the main page, and as of right now (2pm EST) almost all the slaves are reporting 'idle'. My guess is that the weekly slave restarts that ran this morning probably 'unblocked' anything that was stuck.

-M.

(In reply to Eclipse Webmaster from comment #3)
> I'm not seeing any delays in the response of the main page, and as of right
> now (2pm EST) almost all the slaves are reporting 'idle'.

If they are idle then something is still wrong, since we are waiting on further test results. See e.g. http://download.eclipse.org/eclipse/downloads/drops4/N20150801-1500/

Well, the Mac slave is running a build, and I did have to restart the Windows slave due to a crash of the slave process.

-M.

(In reply to Eclipse Webmaster from comment #5)
> Well the Mac slave is running a build and I did have to restart the windows
> slave, due to a crash of the slave process.
>
> -M.

k, thanks Matt! Let's see how it goes.

Mac and Linux tests ran for N20150731-2000 and N20150801-1500, but not Windows, since it failed quickly with the typical

    hudson.remoting.Channel@35322e51:windows7tests
    hudson.util.IOException2: remote file operation failed: <https://hudson.eclipse.org/hudson/job/ep46N-unit-win32/ws/> at hudson.remoting.Channel@35322e51:windows7tests
        at hudson.FilePath.act(FilePath.java:754)

(And the Windows machine is not part of the "auto restart", AFAIK.)

Since the Windows slave was restarted, I've restarted the tests on the Windows machine for N20150731-2000 and N20150801-1500, and they appear to be running normally, so I will declare this "fixed" (even though the tests won't be complete for some time ... on Monday).

(In reply to David Williams from comment #7)
> Mac and Linux tests ran for N20150731-2000, and N20150801-1500 (but not
> Windows, since it failed quickly with the typical
>
> hudson.remoting.Channel@35322e51:windows7tests
> hudson.util.IOException2: remote file operation failed:
> <https://hudson.eclipse.org/hudson/job/ep46N-unit-win32/ws/> at
> hudson.remoting.Channel@35322e51:windows7tests
> at hudson.FilePath.act(FilePath.java:754)
>
> (And, the Windows machine is not part of the "auto restart", AFAIK)
>
> Since Windows slave was restarted, I've restarted the tests for Windows
> machine for N20150731-2000 and N20150801-1500, and appears to be running
> normally, so, will declare this "fixed" (even though tests won't be complete
> for some time ... on Monday).

Update: there are still no Windows test results for the mentioned builds, and so far only Linux test results for I20150802-2000. So either it is still (or again) broken, or it takes very long, which is also bad for us.

Looks like the results are slowly arriving:
Windows test results for N20150731-2000
Mac test results for I20150802-2000

Still missing are Windows test results for N20150801-1500 and I20150802-2000.

(In reply to Dani Megert from comment #9)
> Looks like the results slowly arrive:
> Windows test results for N20150731-2000
> Mac test results for I20150802-2000
>
> Still missing are Windows test results for N20150801-1500 and I20150802-2000.

I commented on these on the platform-releng-dev list ... it just takes Windows a long time to catch up, since a) it is a slow machine, and b) it can only run one test build at a time, because we need a dedicated display on Windows.

That might be another problem :) but not this bug.

Thanks all, |