
Bug 329240

Summary: Redirects from p2 repos at download.eclipse.org causes issues when behind "restrictive" corporate firewall
Product: Community Reporter: Anders Hammar <anders.g.hammar>
Component: Website Assignee: phoenix.ui <phoenix.ui-inbox>
Status: RESOLVED NOT_ECLIPSE QA Contact:
Severity: normal    
Priority: P3 CC: denis.roy, gunnar, pascal
Version: unspecified   
Target Milestone: ---   
Hardware: All   
OS: All   
Whiteboard: stalebug
Attachments:
Description Flags
a log of the reproduce scenario none

Description Anders Hammar CLA 2010-11-02 04:50:59 EDT
Build Identifier: 

In a corporate environment where Internet access is very restrictive and each URL must be opened individually, the redirects when accessing P2 repos at download.eclipse.org cause great problems. Because the client behind the firewall does not control which URL ends up being used, access fails.
This is a problem when accessing the Galileo P2 repos at eclipse.org directly, and even when using a Galileo p2 repo mirror. The latter is because the Galileo release repo references the Eclipse 3.5 Updates repo (which is then pulled from eclipse.org even if the Galileo release repo being accessed is a mirror).
I have not tried any Helios repo, but any redirect that can't be controlled by the client is an issue.

I fully understand that you want to be conservative with the eclipse.org bandwidth. To help you, we're using Nexus Pro to cache these P2 repos and reduce the burden on you. We can even use a mirror closer to us. But the redirects, which are uncontrolled from our perspective, break the entire setup, as we can only allow access to a few URLs specified in advance.

Reproducible: Always

Steps to Reproduce:
For example the Galileo Releases repo from a mirror:
1. Access the Galileo Releases repo at http://ftp.ing.umu.se/mirror/eclipse/releases/galileo/
2. After a while, there is access to http://download.eclipse.org/eclipse/updates/3.5/artifacts.jar, which results in a redirect
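
The failure mode in these steps can be sketched as a small check: given the response to a request, decide whether it is a redirect to a host the firewall has not opened. This is only an illustration in Python. The allowlist contents and the mirror host in the usage note are assumptions, not values taken from the attached log.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical allowlist: the only hosts the firewall has opened in advance.
ALLOWED_HOSTS = {"download.eclipse.org", "ftp.ing.umu.se"}

# HTTP status codes that tell the client to fetch a different URL.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def redirect_target(request_url, status, location):
    """Return the absolute redirect target, or None if this is not a redirect."""
    if status not in REDIRECT_CODES or not location:
        return None
    # Location may be relative, so resolve it against the requested URL.
    return urljoin(request_url, location)


def blocked_by_firewall(request_url, status, location, allowed_hosts=ALLOWED_HOSTS):
    """True if the response redirects to a host outside the allowlist."""
    target = redirect_target(request_url, status, location)
    if target is None:
        return False
    return urlsplit(target).hostname not in allowed_hosts
```

For example, a 302 from http://download.eclipse.org/eclipse/updates/3.5/artifacts.jar to a made-up host such as mirror.example.org would be reported as blocked, while a 200 response or a redirect that stays on download.eclipse.org would not.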

I'll attach a log of this scenario.
Comment 1 Anders Hammar CLA 2010-11-02 04:54:24 EDT
Created attachment 182194 [details]
a log of the reproduce scenario

The requests log was captured when setting this up in an instance of the Nexus Pro repository manager.
Comment 2 Denis Roy CLA 2010-11-11 13:44:10 EST
What is the root cause, the redirect or an overly restrictive firewall?

The redirects are done when our bandwidth is saturated.  If you'd like to donate bandwidth, I'll gladly remove the redirects  :)
Comment 3 Anders Hammar CLA 2010-11-12 02:16:37 EST
(In reply to comment #2)
> What is the root cause, the redirect or an overly restrictive firewall?

I guess this depends on whose point of view we're talking about. We would say the root cause is the redirect, but I'm sure you'd say it's a restrictive firewall. :-)
I wouldn't go so far as saying "overly" though. Corporations working in some areas can't allow full/uncontrolled Internet access.

But regardless, what I find strange is that there is mirroring functionality built into P2, which we can handle (actually Nexus Pro allows us to specify which mirrors to use). But this initial redirect is just not possible to cope with.
And the prime issue here is that there seems to be no way around this. We've tried using one of your mirrors (the one closest to us), but as the Galileo release P2 repo (which we access at the mirror) includes a reference to the Eclipse 3.5 update site (which points at eclipse.org), we run into the redirect anyway, with no way to control it.

> The redirects are done when our bandwidth is saturated.  If you'd like to
> donate bandwidth, I'll gladly remove the redirects  :)

I fully understand this. I work within the Maven area and I'm very aware that bandwidth costs money. The irony here is that we're trying to help by setting up an internal "P2 caching proxy" via the P2 proxy feature of Sonatype Nexus Pro. Today, all developers download their own updates through a separate Internet connection. We want to move to an internal proxy for this instead (this is where Nexus Pro comes into play). This would reduce the load from all these 50+ devs to just one central access per artifact.

One thing I don't get is why you've added this initial redirect for P2, when (as I understand things) P2 has internal handling of mirrors. The P2 internal mirror feature can be handled so that we only access pre-defined mirrors (where we can have access through the firewall). But I might be missing something here?
Comment 4 Gunnar Wagenknecht CLA 2010-11-12 02:52:25 EST
(In reply to comment #3)
> One thing I don't get is why you've added this initial redirect for P2, when
> (as I understand things) P2 has internal handling of mirrors. The P2 internal
> mirror feature can be handled so that we only access pre-defined mirrors (where
> we can have access through the firewall). But I might be missing something
> here?

The p2 mirror functionality only supports a few limited use cases. There are several issues which prevent a broader adoption.

* Metadata is not read from mirrors. Whenever p2 checks for artifacts/content (xml/jar) it checks the *original* location, not the mirrors. Thus, the "up-to-date" checks alone always hit the Eclipse.org infrastructure.

* The list of mirrors is located within the content/artifacts metadata. This makes it impossible for webmasters (IT staff) to configure mirrors. Plus, there is no easy way to place the list of mirrors into the metadata. It's hidden somewhere in the build process. Thus, many p2 repos come without mirror support.

* There is also no way to configure mirrors from the outside. For example, I run builds consuming p2 stuff. There is no way to *force* p2 to read bits from a very specific *internal* mirror. It always goes to the remote location.
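
To illustrate the second point: the mirror list is advertised through a property inside the artifact metadata itself, so it can only be discovered by fetching and reading that metadata. The Python sketch below assumes the property is named p2.mirrorsURL and sits in the repository's properties element. The sample XML is a trimmed, made-up fragment, not the contents of a real repository.

```python
import xml.etree.ElementTree as ET

# Made-up, trimmed artifacts.xml fragment for illustration only.
SAMPLE_ARTIFACTS_XML = """
<repository name='Example' version='1'>
  <properties size='2'>
    <property name='p2.compressed' value='true'/>
    <property name='p2.mirrorsURL' value='http://mirrors.example.org/list?format=xml'/>
  </properties>
</repository>
"""


def mirrors_url(artifacts_xml):
    """Return the value of the p2.mirrorsURL property, or None if absent."""
    root = ET.fromstring(artifacts_xml)
    for prop in root.findall("./properties/property"):
        if prop.get("name") == "p2.mirrorsURL":
            return prop.get("value")
    return None
```

A repository whose build never wrote that property yields None here, which corresponds to the "many p2 repos come without mirror support" case.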

Luckily, p2 transports today are based heavily on HTTP. HTTP is a protocol that many IT people understand very well. Thus, using the possibilities of HTTP such as proxies, redirects, etc. is one way out of the bandwidth dilemma.

I'm pretty sure the list of mirrors can be generated somehow in an easily consumable format which can be handed out to corporate IT staff to let them configure firewalls appropriately.
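
As a rough idea of what such a consumable format could look like, the Python sketch below reduces a list of mirror URLs to a sorted, de-duplicated list of hostnames that a firewall team could open. The function name and the input list are hypothetical, and no such list is published in this bug.

```python
from urllib.parse import urlsplit


def firewall_allowlist(mirror_urls):
    """Reduce mirror URLs to a sorted list of unique hostnames."""
    hosts = set()
    for url in mirror_urls:
        parts = urlsplit(url.strip())
        # Skip entries that are not recognisable http/https/ftp URLs.
        if parts.scheme in ("http", "https", "ftp") and parts.hostname:
            hosts.add(parts.hostname.lower())
    return sorted(hosts)
```

Calling it with http://ftp.ing.umu.se/mirror/eclipse/ and http://download.eclipse.org/ gives the two hostnames download.eclipse.org and ftp.ing.umu.se. The list would still have to be regenerated whenever a mirror is added.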
Comment 5 Anders Hammar CLA 2010-11-12 03:03:29 EST
(In reply to comment #4)
> Luckily, p2 transports today are based heavily on HTTP. HTTP is a very
> well understood protocol by many IT people out there. Thus, using the
> possibilities of HTTP such as proxies, redirects, etc. is one way out of the
> bandwidth dilemma.
> 
> I'm pretty sure the list of mirrors can be generated somehow in some easy
> consumable format which can be handed out to corporate IT staff to let them
> configure firewalls appropriately.

If the redirect is pre-defined I'm fine with that. But today it's not. We're getting redirected to one place and I have no way of ensuring that it will not change all of a sudden to a different url.
Btw, the redirect is not even to the mirror closest to us.
Comment 6 Denis Roy CLA 2011-04-06 15:18:56 EDT
When all of Eclipse.org's bandwidth is gone, I turn on a set of redirects to deflect the heavy traffic away from us.  As Gunnar mentioned, although p2 does deal with mirrors, it always pulls the catalogs from the home site, and sometimes it even considers the home site to be a valid mirror.

If your corporation has a large number of Eclipse developers, perhaps having an internal mirror of download.eclipse.org would make sense.  Regardless, if many mirrors do not work because of firewall restrictions, how does p2 behave when it fails to get a response from these mirrors?  Software installs must be very slow?

I'm not sure how I can resolve this issue.  For the most part, we are not redirecting traffic.
Comment 7 Anders Hammar CLA 2011-04-07 04:25:29 EDT
(In reply to comment #6)

> If your corporation has a large number of Eclipse developers, perhaps having an
> internal mirror of download.eclipse.org would make sense.  Regardless, if many
> mirrors do not work because of firewall restrictions, how does p2 behave when
> it fails to get a response from these mirrors?  Software installs must be very
> slow?

The irony of the whole situation is that we're trying to decrease the load on eclipse.org by setting up a Nexus Pro instance that would proxy P2 repos, so the artifacts would be cached and only downloaded once for all developers. As we've run into these issues, we have had to set up full mirrors of some download sites, which has resulted in a higher load on the eclipse site (although just one download of each artifact, instead of each developer downloading the artifacts directly from eclipse.org).

Please understand that I fully appreciate the reason you do redirects to handle your load, but the problem is that it doesn't work well with some enterprises' restrictive firewall rules.

> I'm not sure how I can resolve this issue.  For the most part, we are not
> redirecting traffic.

Yes, I know. Everything was working fine for a long time, but then we had this issue. And just having this issue once is bad enough as it has an impact on all developers of the organization.

One solution is, as you mention, for us to set up a full internal mirror instead of proxying eclipse.org. The drawback is that we need to maintain this mirror whenever there are new releases, which we don't have to do when we proxy.

Would a solution where proxies (like Nexus Pro) are not redirected work for you? My thinking here is that these proxies will actually help decrease the load on eclipse.org as the artifacts are cached. (I guess a check on the user agent would be a way of doing that.)
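
A minimal Python sketch of that user-agent idea, purely as an assumption about how such a rule might look: the marker string "nexus" and the function name are hypothetical, and nothing here reflects how eclipse.org actually decides to redirect.

```python
# Hypothetical substrings identifying caching proxies in the User-Agent header.
CACHING_PROXY_MARKERS = ("nexus",)


def should_redirect(user_agent, bandwidth_saturated):
    """Redirect only when bandwidth is saturated and the client is not a caching proxy."""
    if not bandwidth_saturated:
        return False
    agent = (user_agent or "").lower()
    # Caching proxies fetch each artifact once, so they are left on the home site.
    return not any(marker in agent for marker in CACHING_PROXY_MARKERS)
```

A user agent is trivial to spoof, so a rule like this would rely on clients being honest about what they are.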

Another solution would be for us to use a mirror close to us that doesn't do redirects. However, that doesn't work for the reproducible scenario in this issue, as the (mirrored) Galileo Releases P2 site references eclipse.org for some artifacts (and we're back to the redirect problem).
Comment 8 Gunnar Wagenknecht CLA 2011-04-07 05:24:42 EDT
Isn't it possible to configure the firewall to allow the proxy, but not the developer machines, access to all those URLs? That would be a perfect reason for your developers to use that proxy.
Comment 9 Anders Hammar CLA 2011-04-07 05:44:53 EDT
(In reply to comment #8)
> Isn't it possible to configure the firewall to allow the proxy access to all
> those URLs but not the developer machines? That would be a perfect reason for
> your developers to use that proxy.

It would probably be possible if we had that list of mirrors. One downside is that we would have to maintain it whenever a new URL is added.
Comment 10 Gunnar Wagenknecht CLA 2011-04-07 05:49:48 EDT
My assumption was that your firewall does inspect the HTTP traffic and only allows connection to certain websites based on a whitelist. It may be possible to allow http://*.jar URLs to any host for the proxy only.
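
The rule described here could be sketched roughly as follows in Python. This is not real firewall configuration syntax. The proxy address and the allowlisted host are placeholders chosen for the example.

```python
from urllib.parse import urlsplit

PROXY_ADDRESS = "10.0.0.5"               # hypothetical internal address of the caching proxy
ALLOWED_HOSTS = {"download.eclipse.org"}  # hypothetical existing whitelist


def request_allowed(source_ip, url):
    """Whitelisted hosts are open to everyone; http://*.jar on any host only to the proxy."""
    parts = urlsplit(url)
    if parts.scheme != "http" or not parts.hostname:
        return False
    if parts.hostname in ALLOWED_HOSTS:
        return True
    # Wildcard rule: any host, but only .jar paths and only from the proxy.
    return source_ip == PROXY_ADDRESS and parts.path.lower().endswith(".jar")
```

With this rule a developer machine can still reach whitelisted hosts but not arbitrary mirrors, while the proxy can follow a redirect to any host as long as the target path ends in .jar.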
Comment 11 Anders Hammar CLA 2011-04-07 06:39:32 EDT
(In reply to comment #10)
> My assumption was that your firewall does inspect the HTTP traffic and only
> allows connection to certain websites based on a whitelist. It may be possible
> to allow http://*.jar URLs to any host for the proxy only.

I would have to check with the firewall people about this. But I have a feeling that they will not allow it, as the URL space would be too loosely defined and would let a trojan (or whatever) that gets in send info home.
Comment 12 Eclipse Genie CLA 2015-01-10 01:05:24 EST
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet.

If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.
Comment 13 Anders Hammar CLA 2015-01-12 03:27:47 EST
I'm no longer with the corp where this issue was seen, so I don't know if it is still a problem. I think the ticket can be closed though, as no one else has shown any interest in it.
Comment 14 Denis Roy CLA 2015-01-12 11:15:34 EST
Let's close this one as NOT_ECLIPSE.  We use p2 redirects as a last resort, and I haven't done them in years since we've added a ton of bandwidth recently.