| Summary: | p2 meta-data files are being served from mirrors | | |
|---|---|---|---|
| Product: | Community | Reporter: | David Williams <david_williams> |
| Component: | Cross-Project | Assignee: | Eclipse Webmaster <webmaster> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | normal | | |
| Priority: | P3 | CC: | am2605, contact, irbull, mario.pierro, mober.at+eclipse, pwebster, raulfortes, scott, stephan.herrmann |
| Version: | unspecified | | |
| Target Milestone: | --- | | |
| Hardware: | PC | | |
| OS: | Windows Server 2008 | | |
| Whiteboard: | | | |
|
Description
David Williams
Even with the current scheme I've experienced some painfully long times of just fetching the metadata. This is time the user actually waits, eager to select the software to be installed (as opposed to the actual download when everybody will be at the coffee machine anyway). So if turning off redirection for metadata possibly creates a new bottleneck this could hurt. Are you saying retry-next-mirror is not working for metadata? Is that the point?
>
> Are you saying retry-next-mirror is not working for metadata?
> Is that the point?
No, the point is the settings on the Eclipse Foundation's webserver(s).
p2 never gets a chance to retry-next-mirror; it doesn't get that far. To be clear, I mean it doesn't get that far when it gets a 404 error while trying to get the initial content.jar.
p2 has never used the "mirrorsURL" list for this type of metadata ... only uses it once it has figured out what it wants, and goes to get the artifact(s).
When p2 decides it needs a whole bunch of content.jar/xml files, I'm not sure of the effect on performance. I'd think most of the time it'd be "pass or fail", not slower. But it could be failing on some content.jar requests, yet still find what it needs elsewhere?
I forgot to mention, another reason I opened this bug is that when using the b3 aggregator, many repositories show up as "no valid repository" at the provided URL ... yet on the build machine, the b3 aggregator works fine. This could be a bug in the aggregator, or something, but it also made me think it is failing to retrieve anything from the auto-mirrored site, while on build.eclipse.org it can get to download.eclipse.org just fine.
I guess the best test would be for someone to stay up late :) and write a script to try and wget many of the content.jar's provided for indigo or helios, and see how many of them fail.
But I don't think going to eclipse.org to get these content.jar files would be that much slower than getting them from mirrors, since presumably they are relatively small ... but I guess that's something else that could be investigated explicitly (or evaluated by someone who actually knows more about this stuff than I do :)
well ... I wrote a simple script to directly "get" many of the content.jar files for Indigo M3 contributions, about 50 of them. And apparently some really are invalid ... some projects "contribute" more than one URL for some reason, so those failures are not a matter of "bad mirrors".

Here are some "stats" on the results.

All 50 jars totaled approx. 5 Megs (what's that ... 100K apiece, on average?).
When I ran the script on build.eclipse.org, none went to a mirror.
When I ran it here in North Carolina, most were retrieved from download.eclipse.org (as expected) but 6 were automatically redirected to ftp.osuosl.org.
None resulted in 404 errors.

I'll attach the simple script. Note many URLs that would require a compositeContent.jar were simply omitted from the list, to save myself a little editing or programming. More complicated (and complete) tests could be made ... but I think I'd want to do it in Java instead of bash :)

I'll attach the script ... if others wanted to run it occasionally, from other locations? It'd be interesting to hear the results (if any 404's occurred and/or different mirrors used).

Assigning to webmasters to reduce spamming so many on the auto cc list with each message ... but that doesn't mean they are solely responsible for "fixing" if they really want these requests to go to mirrors.

Created attachment 182870 [details]
simple bash script to check content.jar files for Indigo M3.
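
The attached script is not reproduced in this thread. As a rough illustration of the kind of check described above, here is a minimal sketch in Python (not the attached bash script). It fetches each URL, follows redirects, and records which host finally served the file. The function names, the `fetcher` parameter, and the three summary categories are assumptions made for this sketch; only the host name download.eclipse.org comes from the thread.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit

ORIGIN_HOST = "download.eclipse.org"  # origin host named in this thread


def fetch(url, timeout=30):
    """Fetch url, following redirects; return (final_url, status, size)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.url, resp.status, len(resp.read())
    except urllib.error.HTTPError as err:
        # err.url is the URL that finally answered with the error status.
        return err.url, err.code, 0
    except OSError:
        # DNS failure, refused connection, timeout and similar.
        return url, None, 0


def check(urls, fetcher=fetch):
    """Return one (url, served_by_host, status, size) tuple per URL."""
    rows = []
    for url in urls:
        final_url, status, size = fetcher(url)
        rows.append((url, urlsplit(final_url).hostname, status, size))
    return rows


def summarize(rows):
    """Count files served by the origin, served by a mirror, and failed."""
    summary = {"origin": 0, "mirror": 0, "failed": 0}
    for _url, host, status, _size in rows:
        if status != 200:
            summary["failed"] += 1
        elif host == ORIGIN_HOST:
            summary["origin"] += 1
        else:
            summary["mirror"] += 1
    return summary
```

Calling `summarize(check(urls))` with a list of content.jar URLs would perform real HTTP requests; passing a stub as `fetcher` keeps the counting logic testable offline.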
(In reply to comment #3)
> I'll attach the script
> ... if others wanted to run it occasionally, from other locations? it'd be
> interesting to hear the results (if any 404's occurred and/or different mirrors
> used.

Here's my mileage from Berlin:

All found at download.eclipse.org except for the following, which were fetched from http://ftp.osuosl.org :
http://download.eclipse.org/birt/update-site/4.0-interim/content.jar
http://download.eclipse.org/eclipse/updates/3.7milestones/S-3.7M3-201010281441/content.jar
http://download.eclipse.org/eclipse/updates/3.7milestones/S-3.7M3-201010281441/content.jar
http://download.eclipse.org/eclipse/updates/3.7milestones/compositeContent.jar
http://download.eclipse.org/tools/gef/updates/releases/content.jar
http://download.eclipse.org/webtools/downloads/drops/R3.3.0/S-3.3.0M3-20101104191817/repository/content.jar
(duplicates are duplicates in the script :)

Total time: real 2m47.507s, with a high for emf: 558K in 18s.
I ran it twice and results were actually the same.

(In reply to comment #6)
From Salzburg, Austria it's very similar:
- 2:12 minutes total download time
- High for EMF Releases (571K in 12 seconds) and EMF Milestones (501K in 17 seconds)
- Same 6 files redirected to osuosl.org as Stephan found

I agree with Stephan that the end user experience of working with the "Install new software" dialog is still not breathtaking in terms of performance. Especially when a couple of composites are enabled, such as when starting with the Eclipse 4.1 M3 SDK from http://download.eclipse.org/e4/sdk/ .

> And, I'm not sure this is such a good idea.
It's all just a seasonal thing -- September and October are our busiest months of the year. We're seeing a return to normalcy, so these redirects will be turned off soon.
I figured it would be better to redirect high-traffic files to stable mirrors with gobs of bandwidth, rather than having our users wait 2 minutes to fetch meta-data files from d.e.o at 27K/sec...
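
For a sense of scale, the figures quoted in this thread can be turned into rates with a little arithmetic. The sketch below is only a worked calculation; it assumes "K" means kilobytes and treats the "approx. 5 Megs" total as 5000 KB, neither of which the thread states precisely.

```python
def rate_kb_per_s(size_kb, seconds):
    """Average transfer rate in KB/s."""
    return size_kb / seconds


def seconds_at_rate(size_kb, rate_kb_s):
    """Time in seconds needed to move size_kb at a constant rate."""
    return size_kb / rate_kb_s


# Per-file samples reported from Berlin and Salzburg in the comments above.
samples = {
    "EMF, Berlin": (558, 18),
    "EMF Releases, Salzburg": (571, 12),
    "EMF Milestones, Salzburg": (501, 17),
}
for label, (size_kb, secs) in samples.items():
    print(f"{label}: {rate_kb_per_s(size_kb, secs):.1f} KB/s")

# Assumption: the ~5 MB of content.jar files is taken as 5000 KB,
# fetched at the quoted 27K/sec.
total_seconds = seconds_at_rate(5000, 27)
print(f"about {total_seconds:.0f} s ({total_seconds / 60:.1f} minutes)")
```

Under these assumptions the slowest reported samples sit close to the quoted 27K/sec, and the whole set at that rate takes on the order of three minutes, the same range as the 2 to 3 minute totals reported above.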
(In reply to comment #8)
> > And, I'm not sure this is such a good idea.
>
> It's all just a seasonal thing -- September and October are our busiest months
> of the year. We're seeing a return to normalcy, so these redirects will be
> turned off soon.
>
> I figured it would be better to redirect high-traffic files to stable mirrors
> with gobs of bandwidth, rather than having our users wait 2 minutes to fetch
> meta-data files from d.e.o at 27K/sec...

Ok, well, now that we know it's intentional, and desirable, I guess the next step is to assess whether anything in p2 (or our process) needs to change. Maybe not, I'm just asking.

Let's assume for now the 404 error is rare and doesn't need any sort of fix or fallback behavior. That leaves
1) should these meta-data type files be signed?
2) does p2 handle the relative URLs correctly when compositeContent.jar is fetched from a mirror?

The first is a general security type question to everyone ... the second is the more important question, and I hope the p2 team knows off the top of their head, so I'll "assign" the bug to Pascal to help get his attention.

Well, maybe 404 is not so rare after all ... here's a comment from the platform newsgroup. This is the third person having issues ... the "workaround" he refers to is going to a non-eclipse site.

= = = = = = =
Thanks for the workaround Shane. I have been having the exact same issue since about Friday. Here's the output of wget from my command line.

C:\Users\andrew>wget http://download.eclipse.org/releases/helios/compositeContent.jar
SYSTEM_WGETRC = c:/progra~1/wget/etc/wgetrc
syswgetrc = C:\Program Files\gnuwin32/etc/wgetrc
--2010-11-10 10:25:54-- http://download.eclipse.org/releases/helios/compositeContent.jar
Resolving download.eclipse.org... 206.191.52.47
Connecting to download.eclipse.org|206.191.52.47|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://www.gtlib.gatech.edu/pub/eclipse/releases/helios/compositeContent.jar [following]
--2010-11-10 10:25:55-- http://www.gtlib.gatech.edu/pub/eclipse/releases/helios/compositeContent.jar
Resolving www.gtlib.gatech.edu... 128.61.111.10, 128.61.111.11, 128.61.111.9
Connecting to www.gtlib.gatech.edu|128.61.111.10|:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
2010-11-10 10:25:56 ERROR 403: Forbidden.

I've literally wasted 3 days on this!
= = = = =

And, now I am not getting a 404, but several hours after making a change to /releases/indigo/compositeContent.jar I am still getting the "old" version from www.gtlib.gatech.edu. Not good.

$ wget http://download.eclipse.org/releases/indigo/compositeContent.jar
--2010-11-12 13:43:07-- http://download.eclipse.org/releases/indigo/compositeContent.jar
Resolving download.eclipse.org... 206.191.52.47
Connecting to download.eclipse.org|206.191.52.47|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://www.gtlib.gatech.edu/pub/eclipse/releases/indigo/compositeContent.jar [following]
--2010-11-12 13:43:07-- http://www.gtlib.gatech.edu/pub/eclipse/releases/indigo/compositeContent.jar
Resolving www.gtlib.gatech.edu... 128.61.111.11, 128.61.111.9, 128.61.111.10, ...
Connecting to www.gtlib.gatech.edu|128.61.111.11|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 413 [application/x-java-archive]
Saving to: `compositeContent.jar.2'

I've removed the redirects. Ah well, I guess a crawling download is better than a broken one.

(In reply to comment #12)
> I've removed the redirects. Ah well, I guess a crawling download is better
> than a broken one.

Thank you. Especially for these 8 specific files ... people depend on them for downstream builds, usually immediately after the URL is "made available" and it would be very hard (if not impossible) for them to know they are getting "old" stuff.
I have a problem with a repository:

"Unable to read repository at http://download.eclipse.org/eclipse/updates/3.6.
http://download.eclipse.org/eclipse/updates/3.6 is not a valid repository location."

I am using 3.6.1 for Linux 64-bit. Any idea?

[]'s
Raul

Are the meta-data files still being served from mirrors? Some of our users were unable to install the plugins because of a missing EMF dependency. Running wget to fetch compositeContent.jar results in a redirect to the gatech.edu mirror, which seems to be down.

>wget http://download.eclipse.org/releases/helios/compositeContent.jar
--15:05:51-- http://download.eclipse.org/releases/helios/compositeContent.jar
=> `compositeContent.jar'
Resolving download.eclipse.org... 206.191.52.47
Connecting to download.eclipse.org|206.191.52.47|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://www.gtlib.gatech.edu/pub/eclipse/releases/helios/compositeContent.jar [following]
--15:05:51-- http://www.gtlib.gatech.edu/pub/eclipse/releases/helios/compositeContent.jar
=> `compositeContent.jar'
Resolving www.gtlib.gatech.edu... 128.61.111.9, 128.61.111.10, 128.61.111.11
Connecting to www.gtlib.gatech.edu|128.61.111.9|:80... failed: Connection timed out.
Connecting to www.gtlib.gatech.edu|128.61.111.10|:80... failed: Connection timed out.
Connecting to www.gtlib.gatech.edu|128.61.111.11|:80... failed: Connection timed out.

Any ideas?

I was able to fix this by:
* Disabling the existing http://download.eclipse.org/releases/helios update site (or http://download.eclipse.org/releases/indigo for Eclipse 3.7)
* Adding one of the mirror sites directly, e.g. http://ftp-stud.fht-esslingen.de/pub/Mirrors/eclipse/releases/helios/ (or http://ftp-stud.fht-esslingen.de/pub/Mirrors/eclipse/releases/indigo/ for Eclipse 3.7)

If the download.eclipse.org update site is not disabled, the install will proceed but it will be too slow to be usable.
I suppose this is because the broken mirror is still being accessed first, switching to the new mirror after it has timed out.

So, basically it seems that the redirection system in eclipse.org uses mirrors which are not working, and nothing can be done automatically from the client side to prevent this...

According to a cross-project posting, the current issue was fixed by "I've redirected these to OSU OSL".

But ... the point of this bug was that the metadata itself (those 8 files) should not ever come from mirrors. See comment 12.

I guess it works most of the time ... but ... that's not really the way it was designed to work and will sometimes fail outright or (maybe worse) appear to work but be out of date.

FYI, besides the 8 files originally mentioned, p2.index should not come from mirrors either (with new things I've learned since this was opened).

I understand the reasoning behind redirecting to a mirror, but ... there are risks involved when doing so ... for these 9 file names.

This is about to bite us again.

While not "publicly released" yet, nor yet tied in to Indigo SR2 -- until Friday at 9 AM ... I wanted to do an early test of the platform's "access" for p2. Hence (knowing the secret location :) I pointed p2 to
http://download.eclipse.org/eclipse/updates/3.7/R-3.7.2-201202080800/
and received "no repo found".

Having learned from experience, I tried
http://download.eclipse.org/eclipse/updates/3.7/R-3.7.2-201202080800/content.jar
from a browser and wget and could see this request was being incorrectly redirected to a mirror that did not have that file:

$ wget http://download.eclipse.org/eclipse/updates/3.7/R-3.7.2-201202080800/content.jar
--2012-02-22 22:13:28-- http://download.eclipse.org/eclipse/updates/3.7/R-3.7.2-201202080800/content.jar
Resolving download.eclipse.org... 206.191.52.47
Connecting to download.eclipse.org|206.191.52.47|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://ftp.osuosl.org/pub/eclipse/eclipse/updates/3.7/R-3.7.2-201202080800/content.jar [following]
--2012-02-22 22:13:28-- http://ftp.osuosl.org/pub/eclipse/eclipse/updates/3.7/R-3.7.2-201202080800/content.jar
Resolving ftp.osuosl.org... 64.50.233.100, 64.50.236.52
Connecting to ftp.osuosl.org|64.50.233.100|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2012-02-22 22:13:28 ERROR 404: Not Found.

Particularly disappointing, since my mini test script shows there are 8 mirrors containing the artifacts

number of http mirrors: 8 for /eclipse/updates/3.7/R-3.7.2-201202080800/

but no way to get to them without the content.jar.

If this kind of thing happens Friday, users won't get updates as expected, either, at all ... or, perhaps "inaccurate" or "partial" updates.

Not sure why this was assigned to Pascal ... seems a "webmasters" problem to solve.

This issue is happening again now for users of our plugins, as an EMF dependency needs to be downloaded from download.eclipse.org when they are installed. As mentioned in my previous comment, a workaround is to add one of the mirror sites to the list of available update sites - but it requires all download.eclipse.org update sites to be disabled - leaving users with unstable settings once the installation has finished.

Haven't seen any issues for a while, so will re-close as fixed. Be sure to say if others see issues ... or if I am misunderstanding. |
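
The rule argued for in this bug, that p2 metadata files are answered by the origin while artifacts may go to mirrors, could be expressed as a simple file-name check. The following is a minimal sketch under stated assumptions, not the Eclipse webserver's actual configuration. The function name and the `NEVER_MIRROR` set are hypothetical, and the set holds only the three names that actually appear in this thread, although the thread speaks of nine.

```python
import posixpath
from urllib.parse import urlsplit

# Only the file names that appear in this thread. The thread refers to nine
# such names in total, so this set is deliberately incomplete.
NEVER_MIRROR = frozenset({"content.jar", "compositeContent.jar", "p2.index"})


def may_redirect_to_mirror(url, never_mirror=NEVER_MIRROR):
    """Return False for metadata files that the origin should serve itself."""
    name = posixpath.basename(urlsplit(url).path)
    return name not in never_mirror
```

For example, `may_redirect_to_mirror("http://download.eclipse.org/releases/helios/compositeContent.jar")` is false, while a URL ending in some other file name is still eligible for redirection. Matching on the base name alone is a simplification; a real rule might also need to consider the directory.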