Bug 541306

Summary: Loading problem of p2 metadata on jenkins.eclipse.org for Wild Web Developer
Product: Community
Reporter: Gautier de SAINT MARTIN LACAZE <gautier.desaintmartinlacaze>
Component: CI-Jenkins
Assignee: CI Admin Inbox <ci.admin-inbox>
Status: RESOLVED FIXED
QA Contact:
Severity: blocker
Priority: P3
CC: denis.roy, mistria, webmaster
Version: unspecified
Target Milestone: ---
Hardware: PC
OS: Linux
Whiteboard:

Description Gautier de SAINT MARTIN LACAZE CLA 2018-11-19 08:21:07 EST
Hello, 

For Wild Web Developer, we have a recurring problem on our instance of jenkins.eclipse.org.

The build can't load content.jar from http://download.eclipse.org/eclipse/updates/4.9/
I tried the same build on my laptop and had no problem there.

----
[ERROR] Failed to resolve target definition /home/jenkins/workspace/Wildwebdeveloper_master-ROJ4BPCU4Q5VTI7VBBSR4KWQVLVOSBZHJPOZ6FPJL3JEULICJSWQ/target-platform/target-platform.target: Failed to load p2 metadata repository from location http://download.eclipse.org/eclipse/updates/4.9/: Unable to read repository at http://download.eclipse.org/eclipse/updates/4.9. Unable to read repository at http://download.eclipse.org/eclipse/updates/4.9/R-4.9-201809060745/content.xml.xz. Read timed out -> [Help 1]
----

See : https://jenkins.eclipse.org/wildwebdeveloper/job/Wildwebdeveloper/job/master/18/consoleFull

Is there a problem with the infrastructure, or did I make a mistake in the target platform?
Comment 1 Denis Roy CLA 2018-11-19 10:04:31 EST
This is likely on our end. Under certain conditions, the downloads filesystem can become overly stressed, leading to a timeout. For now, perhaps increase the timeouts or retries?
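
For anyone who wants to try the "increase the timeouts or retries" idea outside the build tool, here is a minimal sketch in Python. It is not a Tycho or p2 feature. The function name `fetch_with_retries` and its parameters are hypothetical, and the defaults (3 attempts, 30 second timeout, doubling backoff) are arbitrary assumptions, not values recommended anywhere in this bug.

```python
import time
import urllib.request


def fetch_with_retries(url, attempts=3, timeout=30.0, backoff=2.0,
                       opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch `url`, retrying when the connection fails or times out.

    `opener` and `sleep` are injectable only so the function can be
    exercised without touching the network (an assumption of this sketch).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            with opener(url, timeout=timeout) as response:
                return response.read()
        except OSError as error:
            # urllib.error.URLError and socket timeouts are OSError subclasses.
            last_error = error
            if attempt < attempts - 1:
                # Wait backoff, 2*backoff, 4*backoff, ... between attempts.
                sleep(backoff * (2 ** attempt))
    raise last_error
```

A call such as `fetch_with_retries("http://download.eclipse.org/eclipse/updates/4.9/")` could be run from the build agent as a pre-flight check. It would only mask intermittent timeouts and would not help if the agent cannot reach the server at all.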
Comment 2 Denis Roy CLA 2018-11-19 10:16:35 EST
Mind you -- I've examined your failure rates, and that frequency is not at all expected from the system load we're seeing. I think something else may be at play. I'll look into it.
Comment 3 Mickael Istria CLA 2018-11-19 11:05:05 EST
FWIW, this build runs on a Kubernetes agent -> https://github.com/eclipse/wildwebdeveloper/blob/master/Jenkinsfile#L6 . Maybe that affects the filesystem/network resolution.
Comment 4 Mickael Istria CLA 2018-11-21 11:17:01 EST
This issue is becoming a blocker for Wild Web Developer. We cannot deliver snapshots to the community members who are eager to test them, and this feedback is extremely important in the last weeks before 2018-12.

> For now, perhaps increase the timeouts or retries? 

I didn't find a way to control that with Tycho.
Comment 5 Denis Roy CLA 2018-11-21 12:26:17 EST
(In reply to Mickael Istria from comment #3)
> FWIW, this build runs on a Kubernetes agent ->
> https://github.com/eclipse/wildwebdeveloper/blob/master/Jenkinsfile#L6 .
> Maybe that affects the filesystem/network resolution.

That is, indeed, the issue. For some reason, the pod is not even connecting to download.e.o.

Can you run your build on the master for now?
Comment 7 Mickael Istria CLA 2018-11-21 13:41:35 EST
(In reply to Denis Roy from comment #6)
> https://jenkins.eclipse.org/wildwebdeveloper/job/Wildwebdeveloper/job/master/
> 27/ is successful.

Awesome! Is this based on a tweak on build-side or a fix on cluster-side? Basically, is there anything that's worth for users to know about this fix?
Comment 8 Denis Roy CLA 2018-11-21 14:20:10 EST
> Awesome! Is this based on a tweak on build-side or a fix on cluster-side?
> Basically, is there anything that's worth for users to know about this fix?


Embarrassingly, I don't know how I fixed this.

Other pods on the same OpenShift node were having similar issues. I rsh'd into the WWD pod and ran wget http://download.eclipse.org/..... and that worked. Since then, access to download.e.o has been working.
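
The manual wget check described above can be approximated by a small probe that tests only whether a TCP connection can be opened. That helps separate "the pod cannot connect at all" from "the server is slow to respond". This is a rough sketch, not the command that was actually run. The function name `can_connect` and the 5 second default timeout are assumptions made for illustration.

```python
import socket
from urllib.parse import urlsplit


def can_connect(url, timeout=5.0, connect=socket.create_connection):
    """Return True if a TCP connection to the URL's host and port succeeds.

    `connect` is injectable only so the function can be exercised
    without a real network (an assumption of this sketch).
    """
    parts = urlsplit(url)
    # Fall back to the scheme's default port when the URL has none.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with connect((parts.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, DNS failures and timeouts.
        return False
```

Running `can_connect("http://download.eclipse.org/")` from inside the affected pod and again from a known-good machine would show whether the failure is specific to that pod or node.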
Comment 9 Mickael Istria CLA 2018-11-21 14:37:34 EST
(In reply to Denis Roy from comment #8)
> Embarrassingly, I don't know how I fixed this.
> Other pods on the same OpenShift node were having similar issues. I rsh'd
> into the WWD pod and ran wget http://download.eclipse.org/..... and that
> worked. Since then, access to download.e.o has been working.

OK, so the problem seems to be on the OpenShift side, and there is nothing to improve in the project build itself.
Thanks for this magic trick then ;)
Comment 10 Denis Roy CLA 2018-11-21 14:43:31 EST
Exactly. Now -- we have been making changes to our core routing recently, so there may have been something stale at play with that node specifically. If the issue comes back, please reopen and we'll look into it deeper.