| Summary: | Platform servers need to be restarted? | | |
|---|---|---|---|
| Product: | Community | Reporter: | Andrey Loskutov <loskutov> |
| Component: | Servers | Assignee: | Eclipse Webmaster <webmaster> |
| Status: | CLOSED INVALID | QA Contact: | |
| Severity: | major | | |
| Priority: | P3 | CC: | akurtakov, christian.dietrich.opensource, daniel_megert, Lars.Vogel, mikael.barbero, mistria |
| Version: | unspecified | | |
| Target Milestone: | --- | | |
| Hardware: | PC | | |
| OS: | Linux | | |
| See Also: | https://bugs.eclipse.org/bugs/show_bug.cgi?id=540124 | | |
| Whiteboard: | | | |

Description
Andrey Loskutov
Probably this is also the reason why official builds did not run for Linux: http://download.eclipse.org/eclipse/downloads/drops4/I20181012-1800/testResults.php

I've managed to restart Platform UI Gerrit, will see if this helps (I guess not, but who knows).
I've re-triggered the build for https://ci.eclipse.org/platform/job/eclipse.platform.ui-Gerrit/16088/ - if this does not succeed, we still have a problem.

I see nothing special on the infra side. The job has been configured for a long time to abort when stuck. The configured timeout may need to be tweaked.

Note however that yesterday, a change has been made by Lars to the job config:

clean verify -Pbuild-individual-bundles -Pbree-libs

has been changed to

clean verify -Pbuild-individual-bundles -Pbree-libs -fae

Asking maven to fail at end may be the cause of the timeout.

(In reply to Mikaël Barbero from comment #3)
> I see nothing special on the infra side. The job has been configured for a
> long time to abort when stuck. The configured timeout may need to be tweaked.
>
> Note however that yesterday, a change has been made by Lars to the job
> config:
>
> clean verify -Pbuild-individual-bundles -Pbree-libs
>
> has been changed to
>
> clean verify -Pbuild-individual-bundles -Pbree-libs -fae
>
> Asking maven to fail at end may be the cause of the timeout.

You can see that https://ci.eclipse.org/releng/job/ep410I-unit-cen64-gtk3/ also times out, and so does https://ci.eclipse.org/pde/job/eclipse.pde.ui-Gerrit/ . So something is fishy with infra IMHO.

(In reply to Andrey Loskutov from comment #2)
> I've managed to restart Platform UI Gerrit, will see if this helps (I guess
> not, but who knows).
> I've re-triggered the build for
> https://ci.eclipse.org/platform/job/eclipse.platform.ui-Gerrit/16088/ - if
> this does not succeed, we still have a problem.

The build is still running (for 1.5 hours now) => no, Gerrit restart didn't help :-(

(In reply to Mikaël Barbero from comment #3)
> I see nothing special on the infra side. The job has been configured for a
> long time to abort when stuck. The configured timeout may need to be tweaked.
>
> Note however that yesterday, a change has been made by Lars to the job
> config:

The last successful Platform UI build on Gerrit was two days ago, so the failures started before the job changes.

Please also note, the *official* Platform builds have not succeeded on Linux since the I20181012-1800 build. So something got broken on Friday or after I20181010-1800.

(In reply to Andrey Loskutov from comment #5)
> (In reply to Andrey Loskutov from comment #2)
> > I've managed to restart Platform UI Gerrit, will see if this helps (I guess
> > not, but who knows).
> > I've re-triggered the build for
> > https://ci.eclipse.org/platform/job/eclipse.platform.ui-Gerrit/16088/ - if
> > this does not succeed, we still have a problem.
>
> The build is still running (for 1.5 hours now) => no, Gerrit restart didn't
> help :-(

What do you mean by gerrit restart? You restarted the Jenkins instance? Or just the connection between Gerrit and Jenkins?

(In reply to Mikaël Barbero from comment #6)
> (In reply to Andrey Loskutov from comment #5)
> > (In reply to Andrey Loskutov from comment #2)
> > > I've managed to restart Platform UI Gerrit, will see if this helps (I guess
> > > not, but who knows).
> > > I've re-triggered the build for
> > > https://ci.eclipse.org/platform/job/eclipse.platform.ui-Gerrit/16088/ - if
> > > this does not succeed, we still have a problem.
> >
> > The build is still running (for 1.5 hours now) => no, Gerrit restart didn't
> > help :-(
>
> What do you mean by gerrit restart? You restarted the Jenkins instance? Or
> just the connection between Gerrit and Jenkins?

If I only knew... I've clicked on the "Restart" icon shown next to the "CI Control: Eclipse Platform: " entry on the https://accounts.eclipse.org/users/aloskutov page.

There is no hint what it is supposed to do, but it looks like it restarted Jenkins running on https://ci.eclipse.org/platform/job/eclipse.platform.ui-Gerrit/, because just after hitting this button I got a 502 error on that URL.

(In reply to Andrey Loskutov from comment #7)
> If I only knew... I've clicked on the "Restart" icon shown next to the
> "CI Control: Eclipse Platform: " entry on the
> https://accounts.eclipse.org/users/aloskutov page.
>
> There is no hint what it is supposed to do, but it looks like it restarted
> Jenkins running on
> https://ci.eclipse.org/platform/job/eclipse.platform.ui-Gerrit/, because
> just after hitting this button I got a 502 error on that URL.

It restarts the Jenkins instance. You've done the right thing. I'm investigating.

I'm still seeing no issue with platform's JIPP. I'm still investigating.

I've noted that in all aborted builds, the job is stuck in the test:

----- testGetWorkbenchWindows
testGetWorkbenchWindows: setUp...

Has it been changed recently so that it cannot be run non-interactively?

See

https://ci.eclipse.org/platform/job/eclipse.platform.ui-Gerrit/16082/console
https://ci.eclipse.org/platform/job/eclipse.platform.ui-Gerrit/16083/console
https://ci.eclipse.org/platform/job/eclipse.platform.ui-Gerrit/16084/console
https://ci.eclipse.org/platform/job/eclipse.platform.ui-Gerrit/16088/console

(In reply to Alexander Kurtakov from comment #4)
> You can see that https://ci.eclipse.org/releng/job/ep410I-unit-cen64-gtk3/
> also times out, and so does
> https://ci.eclipse.org/pde/job/eclipse.pde.ui-Gerrit/ . So something is fishy
> with infra IMHO.

For PDE, for all aborted builds, the jobs stay stuck in org.eclipse.ui.tests.smartimport.AllTests. Could it be related?

Would you please check whether there isn't some kind of inode exhaustion? 'df -i' should give info.
Also can the machine be rebooted? (just in case :).

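For anyone who wants to script the inode check suggested above instead of reading 'df -i' output by hand, here is a minimal Python sketch. It assumes a Unix-like build machine and reads the filesystem's inode counters through os.statvfs. The function name inode_usage and the default path "/" are illustrative choices, not part of any Eclipse infra tooling.

```python
import os


def inode_usage(path="/"):
    """Return (total, free, used_fraction) inode figures for the filesystem holding path.

    Hypothetical helper for illustration only.
    """
    st = os.statvfs(path)
    total = st.f_files   # total number of inodes on the filesystem
    free = st.f_ffree    # number of free inodes
    # Some filesystems report 0 total inodes; usage is then unknown.
    used_fraction = None if total == 0 else (total - free) / total
    return total, free, used_fraction
```

A used_fraction close to 1.0 on the workspace filesystem would point to inode exhaustion; None means the filesystem does not report inode totals.
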
(In reply to Mikaël Barbero from comment #10)
> I've noted that in all aborted builds, the job is stuck in the test:
>
> ----- testGetWorkbenchWindows
> testGetWorkbenchWindows: setUp...
>
> Has it been changed recently so that it cannot be run non-interactively?
>
> See
>
> https://ci.eclipse.org/platform/job/eclipse.platform.ui-Gerrit/16082/console
> https://ci.eclipse.org/platform/job/eclipse.platform.ui-Gerrit/16083/console
> https://ci.eclipse.org/platform/job/eclipse.platform.ui-Gerrit/16084/console
> https://ci.eclipse.org/platform/job/eclipse.platform.ui-Gerrit/16088/console

I'm not aware of changes in the related code. Running the test from the IDE, it passes for me on GTK 3.22 / RHEL 7.4. Maybe there was some change in Tycho?

(In reply to Alexander Kurtakov from comment #12)
> Would you please check whether there isn't some kind of inode exhaustion?
> 'df -i' should give info.

No inode exhaustion.

> Also can the machine be rebooted? (just in case :).

There are several other projects running on this machine. Rebooting it for no *visible* reason would be harsh.

We are going to rerun the last tests that succeeded in order to answer the question: is it a machine issue or a change in platform?

https://ci.eclipse.org/releng/job/ep410I-unit-cen64-gtk3/41/console - if it succeeds it's an infra issue, if it doesn't it's a platform change.

(In reply to Alexander Kurtakov from comment #16)
> https://ci.eclipse.org/releng/job/ep410I-unit-cen64-gtk3/41/console - if it
> succeeds it's an infra issue, if it doesn't it's a platform change.

heh, actually the opposite

See bug 540124 comment 8 about a possible SWT change which might be causing this hangup in some configs / tests.

https://ci.eclipse.org/releng/job/ep410I-unit-cen64-gtk3/41/ is successful.

The issue will most probably be better discussed on bug 540124.
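
When comparing several aborted builds like the ones linked in this bug, it can help to extract the last test that started from each console log. The sketch below is a rough illustration in Python. It assumes the log prints one "<testName>: setUp..." line per test, as in the excerpt quoted in this bug, and that the console text has already been saved locally. The function name last_started_test is hypothetical.

```python
import re

# Matches lines such as "testGetWorkbenchWindows: setUp..." (format assumed from the excerpt).
SETUP_LINE = re.compile(r"^(\w+): setUp\.\.\.$")


def last_started_test(console_text):
    """Return the name of the last test that logged a setUp line, or None if there is none."""
    last = None
    for line in console_text.splitlines():
        match = SETUP_LINE.match(line.strip())
        if match:
            last = match.group(1)
    return last
```

If every aborted build reports the same test name, the hang is probably tied to that test or its environment rather than to a random infrastructure hiccup.
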