Hi, Cannot open https://hudson.eclipse.org/rcptt/
*** Bug 516950 has been marked as a duplicate of this bug. ***
hipp3 and RCPTT HIPP are back online.
https://hudson.eclipse.org/rcptt/ is unavailable again
It is available now. Are there any problems with the hipp? It is very difficult to use this instance when it is unstable.
The host machine (hipp3) is having issues. We are still investigating. RCPTT HIPP is online, therefore closing.
https://hudson.eclipse.org/rcptt/ is unavailable now, error message is shown:
> This CI instance is currently unavailable. It may be turned off, or it may be unresponsive. Members of the project can restart this service using the HIPP Control tools in their Eclipse Foundation account (login required).
> If the problem persists, please contact the project team on their forum or file a bug.
Hipp3 is experiencing an issue that is generating GPFs, we are investigating. -M.
Currently unavailable
*** Bug 517433 has been marked as a duplicate of this bug. ***
Hipp3 rebooted itself again about 30 minutes ago which caused the downtime. Same situation as comment 7, and we are still investigating. In the meantime, the hipps have been brought back online
Thanks for clarification
(In reply to Derek Toolan from comment #10)
> Hipp3 rebooted itself again about 30 minutes ago which caused the downtime.
> Same situation as comment 7, and we are still investigating.
>
> In the meantime, the hipps have been brought back online

I still can't reach the JGit HIPP. Did it crash again?
(In reply to Matthias Sohn from comment #12)
> (In reply to Derek Toolan from comment #10)
> > Hipp3 rebooted itself again about 30 minutes ago which caused the downtime.
> > Same situation as comment 7, and we are still investigating.
> >
> > In the meantime, the hipps have been brought back online
>
> I still can't reach the JGit HIPP. Did it crash again ?

It did, and it's the same fault again. I brought the hipps back online once again.
I could reach it for a short time and now requests are timing out so it seems to be gone again
retrying once more it seems it's responding but very slowly
The system became unresponsive so we've had to completely restart it. I'm currently running some tests on the memory to try and isolate the issue. Once the tests are complete we'll restart the instances hosted on this node. -M.
Any ETA when we can expect Hudson to be back online ?
the second retry to restart the JGit HIPP succeeded so it's back online
After several hours the system restarted itself while running the memory tests. While not conclusive, we're going to go forward presuming that some of the RAM is bad. I'll have to try and find a donor system so we can try replacing the RAM. All hipps hosted on this machine have been restarted. -M.
I've found a donor with compatible RAM that we can use. The catch is there will be some performance impact as the donor only has 64G (the current system has 128G). I'm going to swap the RAM Monday June 5th at 8am EDT, and I expect the server to be offline for about an hour or so. Once the swap is finished we'll test hipp3's RAM on the donor, and if we find a fault we'll look into replacing the DIMM(s) in question. If not, and hipp3 remains unstable, we'll engage with the hardware vendor about a repair/replacement. -M.
So the memory swap was successful, however hipp3 has faulted a couple of times today so I'm not convinced that memory is the culprit. The donor is still running tests on hipp3's RAM so I'll wait for that to finish but it's looking like there is something else responsible. Once the test results are in I'll look into getting in touch with the vendor and we'll go from there. -M.
Unfortunately HIPP3 is unavailable again, we will need to wait for webmaster to bring it back online.
*** Bug 518112 has been marked as a duplicate of this bug. ***
*** Bug 518106 has been marked as a duplicate of this bug. ***
*** Bug 518115 has been marked as a duplicate of this bug. ***
#1) The host has been restarted. #2) I think this indicates that RAM was not the cause of these outages. We'll engage with the hardware vendor. -M.
I'm not sure if it's related but our kura-develop job is failing the Sonar step due to a JDBC connection problem [1]. [1] https://hudson.eclipse.org/kura/job/kura-develop/1076/consoleFull
(In reply to Cristiano De Alti from comment #27) I don't think that's related to the issue at hand. I've spoken to our vendor and they are recommending we upgrade our firmware. As such I'm going to shut the system down Friday June 16th at 3:30pm EDT and I'll replace the RAM and run the updates at the same time. I'm expecting the system will be down for about an hour and a half while all of this happens. -M.
Since the firmware updates the system has been much more stable. Closing as 'resolved', please re-open if something goes wrong. -M.
*** Bug 519731 has been marked as a duplicate of this bug. ***
hipp3 is down again. :(
*** Bug 519752 has been marked as a duplicate of this bug. ***
(In reply to Frederic Gurr from comment #31)
> hipp3 is down again. :(

then reopening
hipp3 is back online.
HIPP3 is unavailable now (try to use https://hudson.eclipse.org/rcptt/)
(In reply to Viktoria Dlugopolskaya from comment #35)
> HIPP3 is unavailable now (try to use https://hudson.eclipse.org/rcptt/)

hipp3 is back online.