| Summary: | signing service unreliable | ||
|---|---|---|---|
| Product: | Community | Reporter: | Christian Dietrich <christian.dietrich.opensource> |
| Component: | Servers | Assignee: | Eclipse Webmaster <webmaster> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | normal | ||
| Priority: | P3 | CC: | denis.roy, mikael.barbero |
| Version: | unspecified | ||
| Target Milestone: | --- | ||
| Hardware: | PC | ||
| OS: | Mac OS X | ||
| Whiteboard: | |||
|
Description
Christian Dietrich
Any update here? I once again see errors from the TSA: "Server returned HTTP response code: 400 for URL: http://sha256timestamp.ws.symantec.com/sha256/timestamp"

One way to mitigate this would be to implement https://github.com/eclipse-cbi/org.eclipse.cbi/issues/27. This item is not planned yet.

As I don't see the problems in reruns: do we run the job at a bad time of night, or do the outages happen throughout the day?

Here is the list of previously recorded TSA downtime (UTC):

- 2021-05-03: 06:38:48, 06:51:36
- 2021-05-05: 06:29:52, between 17:12:21 and 17:17:04
- 2021-05-06: 09:29:05, 15:15:36, 22:55:30, 23:02:29
- 2021-05-07: 00:33:39, 03:17:40, 08:20:39, 08:48:15
- 2021-05-09: 06:49:05
- 2021-05-10: 09:09:04, 15:11:19
- 2021-05-11: 06:53:38, 07:41:09, 08:46:27, 11:15:57, 14:49:59, 15:13:58, 15:17:05, 15:25:57, 22:41:17
- 2021-05-12: 06:27:14, 08:18:05, 13:14:54, 14:17:43, 20:55:16
- 2021-05-13: 22:14:46
- 2021-05-14: 01:02:27, 04:17:34, 11:37:46
- 2021-05-15: 23:12:00
- 2021-05-16: 22:29:23, 20:23:38, 22:29:23
- 2021-05-17: 04:23:30, 16:33:36, between 22:26:33 and 22:38:19
- 2021-05-20: between 01:53:39 and 01:55:16

(Note that timestamps in your Jenkins logs, e.g. https://ci.eclipse.org/xtext/job/releng/job/sign-and-deploy/983/console, are in EDT, i.e. UTC-4.)

You can see that most of the downtimes are episodic, and a single retry usually makes them unnoticeable to projects. The 3 longer downtimes match the issues you've been experiencing.

Again, we may be able to mitigate this with https://github.com/eclipse-cbi/org.eclipse.cbi/issues/27, and/or you could increase your curl retry count (currently --retry 3) or increase the delay between retries (currently 10 seconds, --retry-delay 10).
I'd advise going one step further: remove the --retry-delay parameter completely and let curl use its exponential backoff algorithm:

"When curl is about to retry a transfer, it will first wait one second and then for all forthcoming retries it will double the waiting time until it reaches 10 minutes which then will be the delay between the rest of the retries. By using --retry-delay you disable this exponential backoff algorithm. See also --retry-max-time to limit the total time allowed for retries."

With a --retry count set to 8 or 10, you should be able to alleviate those issues altogether without any penalty.

OK, I have increased the count to 10 and removed the deploy config. Let's see what happens.

Assuming fixed. Please reopen if not. |
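
The retry advice above can be sanity-checked with a short calculation. The sketch below is illustrative only: it models the backoff exactly as the quoted curl text describes it (1 second, doubling, capped at 10 minutes) and compares the total waiting time against the three longer downtime windows from the list. It assumes the first failed attempt happens right at the start of an outage and ignores the time each request itself takes; the function names are hypothetical and not part of curl or any Eclipse tooling.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

# The three longer downtime windows quoted in the list above (UTC).
LONG_OUTAGES = [
    ("2021-05-05 17:12:21", "2021-05-05 17:17:04"),
    ("2021-05-17 22:26:33", "2021-05-17 22:38:19"),
    ("2021-05-20 01:53:39", "2021-05-20 01:55:16"),
]


def backoff_delays(retries, first=1, cap=600):
    """Seconds waited before each retry: 1, 2, 4, ... capped at `cap`."""
    delays = []
    delay = first
    for _ in range(retries):
        delays.append(min(delay, cap))
        delay = min(delay * 2, cap)
    return delays


def fixed_delays(retries, delay):
    """Seconds waited before each retry when a constant delay is used."""
    return [delay] * retries


def outage_seconds(start, end):
    """Length of a downtime window in whole seconds."""
    delta = datetime.strptime(end, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)
    return int(delta.total_seconds())


def outlasts(delays, outage):
    """True if the total waiting time is at least as long as the outage."""
    return sum(delays) >= outage


if __name__ == "__main__":
    strategies = {
        "--retry 3 --retry-delay 10": fixed_delays(3, 10),
        "--retry 8 (backoff)": backoff_delays(8),
        "--retry 10 (backoff)": backoff_delays(10),
    }
    for name, delays in strategies.items():
        covered = [
            outlasts(delays, outage_seconds(start, end))
            for start, end in LONG_OUTAGES
        ]
        print(f"{name}: waits up to {sum(delays)}s, outlasts outages: {covered}")
```

Under these assumptions, the three windows last 283, 706 and 97 seconds. Three retries at a fixed 10 seconds wait 30 seconds in total and outlast none of them; 8 retries with backoff wait up to 255 seconds and outlast only the shortest; 10 retries wait up to 1023 seconds and outlast all three. A --retry-max-time value, as mentioned in the quoted text, would put an upper bound on that total.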