Bug 499400 - Increase the transparency of the services state
Summary: Increase the transparency of the services state
Status: RESOLVED FIXED
Alias: None
Product: Community
Classification: Eclipse Foundation
Component: Servers
Version: unspecified
Hardware: All All
Importance: P3 enhancement
Target Milestone: ---
Assignee: Eclipse Webmaster CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 499399
Reported: 2016-08-09 03:05 EDT by Andrey Loskutov CLA
Modified: 2018-09-06 14:14 EDT (History)
Description Andrey Loskutov CLA 2016-08-09 03:05:10 EDT
Follow up on bug 499399.

One area I've noticed recently is the permanently unstable interaction between Bugzilla, Gerrit, Hudson and the mail server (in all possible combinations). Contributing to Eclipse becomes a game of "Russian roulette", and most contributors aren't even aware of what goes wrong and why.

If we are to stick with the current ecosystem, we should *at least* think about increasing the transparency of the services state:

1) Committers should be able to see the services state.
2) There should be a way to trigger some basic infra checks (*not* by posting on a mailing list "I'm alone with XYZ issue?").
3) Web masters should be *automatically* notified if some service doesn't work as expected.
Comment 1 Stephan Herrmann CLA 2016-08-11 11:58:37 EDT
Are you aware of the infra status page:
  https://dev.eclipse.org/committers/help/status.php
?

Would making this page more visible already help?
Comment 2 Andrey Loskutov CLA 2016-08-12 05:02:21 EDT
(In reply to Stephan Herrmann from comment #1)
> Are you aware of the infra status page:
>   https://dev.eclipse.org/committers/help/status.php
> ?
> 
> Would making this page more visible already help?

Hmm... Not really. One needs to know all these server names and their data in detail to understand whether the values are OK or not.

I'm not an admin, so I don't know whether there are plugins for Hudson/Bugzilla/Gerrit that show their current service state in a human-readable format (green == OK, red == DOWN). As far as I understood, the last Hudson issue was an "over-filled" queue in Hudson...
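
To illustrate the "green == OK, red == DOWN" idea, here is a minimal sketch that reduces raw readings to one colour per service. The Metric type, the service names, the readings and the limits are all hypothetical; nothing here reflects what the actual status page measures.

```python
from dataclasses import dataclass


@dataclass
class Metric:
    service: str   # hypothetical service name
    value: float   # raw reading, e.g. number of busy build slots
    limit: float   # assumed threshold at which the service counts as DOWN


def summarize(metrics):
    """Map each service to "green" or "red"; any reading at its limit makes it red."""
    state = {}
    for m in metrics:
        colour = "red" if m.value >= m.limit else "green"
        # Once a service is red, a later healthy reading must not reset it.
        if state.get(m.service) != "red":
            state[m.service] = colour
    return state


# Made-up readings, for illustration only.
readings = [
    Metric("hudson", value=6, limit=6),
    Metric("gerrit", value=0.4, limit=5.0),
]
for service, colour in summarize(readings).items():
    print(f"{service}: {colour}")
```

The point of the sketch is only that the thresholds live in one place, so a reader of the summary does not need to know what a normal value looks like for each server.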
Comment 3 Andrey Loskutov CLA 2016-08-22 05:28:48 EDT
See also bug 500047 (hudson hangs) which depends on bug 500044 (nexus hangs?). 

This is exactly the non-transparent state I mean that we have today.

Grokking the current status at https://dev.eclipse.org/committers/help/status.php doesn't help me: there are 5 matches for "DEAD, offline or not reporting", but none of them seems to be related to Hudson or Nexus:
barney, dbapislave, php-vm3, php-vm3, www-vm1, www-vm3.

Instead, I would like to see at least 3 errors right now:

<Alert> HIPP instance of https://hudson.eclipse.org/platform/ not operable (6 jobs out of max. 6 are hanging)
<Alert> https://repo.eclipse.org/ is down
<Alert> Nexus (where is it???) is down
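
Alerts like the ones above could, as a rough sketch, come from a plain HTTP probe. The function names probe and alerts are made up for this example, and "answers with an HTTP status below 400" is only an assumed definition of "up"; a hanging job queue, as in the HIPP case, would need a service-specific check instead.

```python
import urllib.error
import urllib.request


def probe(url, timeout=5.0):
    """Return True if the URL answers with an HTTP status below 400."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # HTTP errors, connection failures, timeouts and malformed URLs
        # are all treated as "down" in this sketch.
        return False


def alerts(urls, check=probe):
    """Return one '<Alert> ... is down' line for every URL whose check fails."""
    return [f"<Alert> {url} is down" for url in urls if not check(url)]
```

Calling alerts(["https://repo.eclipse.org/"]) would return an empty list while the site responds and a single "<Alert> https://repo.eclipse.org/ is down" line otherwise. The check parameter is there so that a different test per service can be swapped in.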
Comment 4 Denis Roy CLA 2017-11-15 09:34:37 EST
> 1) Committers should be able to see the services state.

http://status.eclipse.org/ runs in a different country and monitors Eclipse.org regularly.


> 2) There should be a way to trigger some basic infra checks (*not* by
> posting on a mailing list "I'm alone with XYZ issue?").

As above. I'm open to adding additional checks and monitors where it makes sense.

> 3) Web masters should be *automatically* notified if some service doesn't
> work as expected.

We do have some automated monitors, but we typically get faster notifications from humans.

Andrey, do you feel that we've done enough to improve the state of the infra to close this as fixed?
Comment 5 Denis Roy CLA 2018-09-06 14:14:15 EDT
I'll be optimistic and close this as FIXED. Please reopen if you have additional suggestions!