Follow-up on bug 499399.

One area I've noticed recently is the permanently unstable interaction between Bugzilla, Gerrit, Hudson and the mail server (in all possible combinations). Contributing to Eclipse becomes a game of "Russian roulette", and most contributors aren't even aware of what goes wrong and why.

If we stick with the current ecosystem, we should *at least* think about increasing the transparency of the services' state:

1) Committers should be able to see the services state.
2) There should be a way to trigger some basic infra checks (*not* by posting on a mailing list "I'm alone with XYZ issue?").
3) Web masters should be *automatically* notified if some service doesn't work as expected.
Are you aware of the infra status page: https://dev.eclipse.org/committers/help/status.php ? Would making this page more visible already help?
(In reply to Stephan Herrmann from comment #1)
> Are you aware of the infra status page:
> https://dev.eclipse.org/committers/help/status.php
> ?
>
> Would making this page more visible already help?

Hmm... Not really. One needs to know all these server names and their data in detail to understand whether the values are OK or not.

I'm not an admin, so I don't know whether there are plugins for Hudson/Bugzilla/Gerrit that show the current state of those services in a human-understandable format (green == OK, red == DOWN).

As far as I understood, the last Hudson issue was some "over-filled" queue in Hudson...
See also bug 500047 (hudson hangs), which depends on bug 500044 (nexus hangs?).

This is exactly the non-transparent state I mean that we have today. Looking over the current status at https://dev.eclipse.org/committers/help/status.php doesn't help me: there are 5 matches for "DEAD, offline or not reporting", but none of them seems to have any relationship to hudson or nexus: barney, dbapislave, php-vm3, php-vm3, www-vm1, www-vm3.

Instead, I would like to see at least 3 errors right now:

<Alert> HIPP instance of https://hudson.eclipse.org/platform/ not operable (6 jobs out of max. 6 are hanging)
<Alert> https://repo.eclipse.org/ is down
<Alert> Nexus (where is it???) is down
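To illustrate the kind of human-readable summary asked for here (OK vs. DOWN per service instead of raw per-host data), a minimal Python sketch follows. Everything in it is an assumption made for illustration: the function names (`probe_http`, `check_services`, `render`), the service labels, and the idea that a plain HTTP GET is a sufficient health check. It is not how the Eclipse infrastructure is actually monitored, and a simple GET would not detect a hung job queue like the one described above.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class ServiceState:
    name: str    # hypothetical service label, e.g. "hudson"
    ok: bool     # True means the probe succeeded
    detail: str  # short human-readable reason


def probe_http(url: str, timeout: float = 5.0) -> Tuple[bool, str]:
    """Assumed health check: a plain HTTP GET; any error counts as DOWN."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400, f"HTTP {resp.status}"
    except (urllib.error.URLError, OSError) as exc:
        return False, str(exc)


def check_services(
    services: Dict[str, str],
    probe: Callable[[str], Tuple[bool, str]] = probe_http,
) -> List[ServiceState]:
    """Probe every service once and collect the results."""
    states = []
    for name, url in services.items():
        ok, detail = probe(url)
        states.append(ServiceState(name, ok, detail))
    return states


def render(states: List[ServiceState]) -> List[str]:
    """One line per service: 'OK' or 'DOWN', then the name and the reason."""
    return [f"{'OK' if s.ok else 'DOWN':<4} {s.name}: {s.detail}" for s in states]


if __name__ == "__main__":
    # URLs taken from this report; the labels are made up.
    services = {
        "hudson": "https://hudson.eclipse.org/platform/",
        "repo": "https://repo.eclipse.org/",
    }
    for line in render(check_services(services)):
        print(line)
```

The `probe` parameter is injectable so that a deeper check (for example, one that inspects a build queue) could replace the plain GET without changing the reporting code.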
> 1) Committers should be able to see the services state.

http://status.eclipse.org/ runs in a different country and monitors Eclipse.org regularly.

> 2) There should be a way to trigger some basic infra checks (*not* by
> posting on a mailing list "I'm alone with XYZ issue?").

As above. I'm open to adding additional checks and monitors where it makes sense.

> 3) Web masters should be *automatically* notified if some service doesn't
> work as expected.

We do have some automated monitors, but we typically get faster notifications from humans.

Andrey, do you feel that we've done enough to improve the state of the infra to close this as fixed?
I'll be optimistic and close this as FIXED. Please reopen if you have additional suggestions!