
Bug 463967

Summary: Oomph should not use cGit resources
Product: [Tools] Oomph
Reporter: Denis Roy <denis.roy>
Component: Setup
Assignee: Ed Merks <Ed.Merks>
Status: RESOLVED FIXED
QA Contact:
Severity: major
Priority: P3
CC: Ed.Merks, give.a.damus, julian.enoch, malaperle, mknauer, stepper, wayne.beaton
Version: 1.5.0   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
See Also: https://bugs.eclipse.org/bugs/show_bug.cgi?id=499732
https://bugs.eclipse.org/bugs/show_bug.cgi?id=499741
https://bugs.eclipse.org/bugs/show_bug.cgi?id=501194
Whiteboard:
Bug Depends on:    
Bug Blocks: 459836    
Attachments (description: flags):
Strace of an HTTP request for a static file: none
Strace of an HTTP request for a cached cGit resource: none

Description Denis Roy CLA 2015-04-06 09:35:12 EDT
As I understand it, Oomph queries an XML file from cGit (https://git.eclipse.org/c/...). cGit calls are relatively expensive for static content, since cGit must examine the repo, translate values and pull data from compressed objects. Having millions of people doing this for static content could impact the performance of cGit and our Git repos in general.

Please have a mechanism that pulls the cGit content once every 10 minutes/half-hour/hour/day and outputs it to the oomph download directory on download.eclipse.org instead.
Comment 1 Denis Roy CLA 2015-04-06 09:49:40 EDT
You're probably getting a faster response from cGit (git.eclipse.org) than download.eclipse.org, and that is by design, since we want Git resources to go out before bulk downloads from our server, which are mirrored everywhere.

But if the installer starts pulling from cGit, then Git as a whole will slow down, including committer pushes and Gerrit.

The current scenario is like using the 12-item Express lane at the grocery store for your basket full of groceries.

The compromise here is to put static content on www.eclipse.org. It will be served with top-priority bandwidth without costing an arm and a leg in CPU cycles and file system calls to our most important file system - Git.
Comment 2 Ed Merks CLA 2015-04-06 10:23:27 EDT
A few questions for you...

My impression was that URLs like this:

http://git.eclipse.org/c/oomph/org.eclipse.oomph.git/plain/setups/org.eclipse.setup

only update their content every 10 minutes, i.e., I have to wait at most 10 minutes to see new content in a browser.  So I've never had the impression that I'm doing a query that computes the contents on each request.  How does this process actually work?

Does it help alleviate the concern that we only make a HEAD request and pull down the file content only when the etag changes?  That only seems to happen when the content really changes.  If I recall correctly the timestamp changes every 10 minutes, again as if some daemon were updating a file. What explains that apparent behavior?

We could certainly implement some type of job that does mirroring to a different site.  Note that it's not just the content hosted by the Oomph project but also the project setups contributed by other projects (and referenced in the project index) that will run into this issue. But then, I imagine the job to mirror the main indexes to a different host could be like a crawler that figures out which other projects' setups are referenced and hence also need to be mirrored.  We'd need to work with you to know how best to implement and run something like that...
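For illustration, the HEAD-then-compare-ETag check described above could look roughly like the sketch below. It is not Oomph's actual code: the class and method names (EtagCheck, fetchEtag, needsDownload) are hypothetical, and it assumes the server sends an ETag header.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

final class EtagCheck {
    /** Sends a HEAD request and returns the ETag header, or null if the server sent none. */
    static String fetchEtag(URL url) throws IOException {
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        try {
            con.setRequestMethod("HEAD"); // headers only, no body is transferred
            return con.getHeaderField("ETag");
        } finally {
            con.disconnect();
        }
    }

    /** Returns true when the content has to be downloaded again. */
    static boolean needsDownload(String cachedEtag, String remoteEtag) {
        // Without an ETag on either side there is nothing to compare, so download.
        if (cachedEtag == null || remoteEtag == null) {
            return true;
        }
        return !cachedEtag.equals(remoteEtag);
    }
}
```

A client would call fetchEtag first and download the full file only when needsDownload returns true.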
Comment 3 Denis Roy CLA 2015-04-06 11:09:11 EDT
In bug 453438 I had disabled cGit's cache since we were having display issues.  I've since re-enabled it, but I can't guarantee that the cache will always be available.

Regardless, even issuing a HEAD request causes our web servers to invoke cGit (a C executable) in CGI mode, which is a very expensive way of returning an ETag or a 304 for static content. cGit is a great tool to allow humans to browse Git repositories.  It is not adapted to allow millions of machines to fetch a static file.

cGit:
$ time wget http://git.eclipse.org/c/oomph/org.eclipse.oomph.git/plain/setups/org.eclipse.setup
[snip]
real    0m0.167s


Static file on www.eclipse.org:
$ time wget http://www.eclipse.org/robots.txt
real    0m0.063s


Static file on download.eclipse.org:
$ time wget http://download.eclipse.org/errors/403.html
real    0m0.061s


One client won't feel 1/10th of a second, but one million cGit requests represent an extra 100,000 seconds of CPU crunch time for us, or over 27 hours.


In Europe you'll likely have different values since you'll have additional latency from the Internet, but that latency does not represent the essence of this bug -- CPU crunch time caused by forking cGit processes from a web server.
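The arithmetic behind the 27-hour figure can be reproduced with a small sketch. The class name CgitOverhead is hypothetical, and the 0.1 s per-request overhead is the rounded difference between the wget timings above (0.167 s versus roughly 0.06 s).

```java
final class CgitOverhead {
    /** Extra server time in seconds for a number of requests, given the per-request overhead. */
    static double extraSeconds(long requests, double overheadSecondsPerRequest) {
        return requests * overheadSecondsPerRequest;
    }

    public static void main(String[] args) {
        // 0.167 s via cGit versus about 0.06 s for a static file, rounded to 0.1 s.
        double seconds = extraSeconds(1_000_000L, 0.1);
        System.out.printf("%.0f seconds = %.1f hours%n", seconds, seconds / 3600.0);
    }
}
```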
Comment 4 Denis Roy CLA 2015-04-06 11:09:57 EDT
(In reply to Ed Merks from comment #2)
>  Note that it's not just the content hosted by the Oomph
> project but also the project setups contributed by other projects (and
> referenced in the project index) that will run into this issue. 

Do you have links to that content?  How often is it fetched?
Comment 5 Denis Roy CLA 2015-04-06 11:47:35 EDT
Created attachment 252176 [details]
Strace of an HTTP request for a static file

In case anyone is interested in how web servers work, here is an annotated strace of an Apache web process which is asked for a static file (a jar file, in this case).

Easy fork for the web server -- check permissions, send the file, append to the Apache log.
Comment 6 Denis Roy CLA 2015-04-06 11:50:14 EDT
Created attachment 252177 [details]
Strace of an HTTP request for a cached cGit resource

By contrast, here's a request for a cGit resource which happens to be cached.  This is more than 3 times the CPU work for the server, which must fork a process.

We could use fastCGI to avoid the fork and make cGit faster and more scalable, but I'm not sure that would be compatible with the central caching store, since we have three git.eclipse.org web servers.
Comment 7 Ed Merks CLA 2015-04-07 01:22:24 EDT
Don't worry, we'll address the issue; it's certainly not as severe as p2's constantly greedy approach (which Oomph's offline support helps significantly address). 

In particular, note that we already build a local "mirror" on the user's machine.  The process is essentially visiting the index I referred to earlier, which will visit most of what you see in this folder:

http://git.eclipse.org/c/oomph/org.eclipse.oomph.git/plain/setups/

So primarily we need to mirror that folder, as well as all the href="..." you see in here in the project catalog:

http://git.eclipse.org/c/oomph/org.eclipse.oomph.git/plain/setups/org.eclipse.projects.setup

Most of those project references are also in Eclipse's git. We should also keep in mind that the Simple installer doesn't need the project setups, so its needs are dramatically reduced, i.e., just the main index and the product catalog. I.e., just 4 web-hosted resources; p2 does *many* times that per update site. This will be the primary use case for the majority of the users.

If we were to "mirror" the projects as well, preserving directory structure to some other host, a simple VM arg, -Dorg.oomph.redirection.all=http://git.eclipse.org/c/->http://some.host/some.root.folder/, in the *.ini would just work and require no code changes in Oomph (and of course we'd build the installer product to include that and include tasks to install all products that way).

We'll look at the steps involved in producing a mirror, and ask you about where to host such a mirror/caching daemon.  Then we can experiment easily with where the content is ultimately hosted.
Comment 8 Denis Roy CLA 2015-04-07 14:26:46 EDT
Thanks, Ed.
Comment 9 Ed Merks CLA 2015-04-08 06:33:44 EDT
I've been working on the initial prototype for creating a mirror of all the reachable setups located under http://git.eclipse.org/c/ and it struck me that it's just as easy to mirror the directory structure into a zip as it is to do so into a folder.  The advantage of mirroring into a zip is that there is only a single resource for clients to download, so only one HTTP access would be involved.   Either case can be easily handled via either of these approaches:

  -Doomph.redirection.setups=http://git.eclipse.org/c/->file:/D:/temp/setup-mirror/
  -Doomph.redirection.setups=http://git.eclipse.org/c/->archive:file:/D:/temp/setups.zip!/
  
I tested both ways to ensure that the installer no longer accesses cGit with these redirections in place, and that works well.
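As a rough picture of what such a "from->to" redirection does, here is a minimal prefix-rewrite sketch. It is an illustration only and says nothing about how Oomph implements redirections internally; the Redirection class and its methods are hypothetical names.

```java
final class Redirection {
    final String from;
    final String to;

    private Redirection(String from, String to) {
        this.from = from;
        this.to = to;
    }

    /** Parses a value of the form "sourcePrefix->targetPrefix". */
    static Redirection parse(String value) {
        int arrow = value.indexOf("->");
        if (arrow < 0) {
            throw new IllegalArgumentException("Expected 'from->to' but got: " + value);
        }
        return new Redirection(value.substring(0, arrow), value.substring(arrow + 2));
    }

    /** Rewrites the URI if it starts with the source prefix, otherwise returns it unchanged. */
    String apply(String uri) {
        return uri.startsWith(from) ? to + uri.substring(from.length()) : uri;
    }
}
```

Because only the prefix is swapped, the path below http://git.eclipse.org/c/ is preserved inside the folder or zip, which is why the mirror has to keep the original directory structure.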

So a question for Denis.  I assume it would be better for the web server if clients only download (or checked Etags/timestamps) of just a single larger file than many smaller files.  Also, because it's a zip, fewer bytes are downloaded in the case it needs to be downloaded (because the ETag/timestamp is different).  I.e., the uncompressed contents are 930kb in total versus the zip which is 108kb.   Is this a good idea?
Comment 10 Denis Roy CLA 2015-04-08 07:46:03 EDT
> So a question for Denis.  I assume it would be better for the web server if
> clients only download (or checked Etags/timestamps) of just a single larger
> file than many smaller files.

Absolutely. The TCP overhead is not negligible when the payload is only a few hundred bytes per request. I am pushing for this type of behaviour in p2 -- it's like using Twitter to write a novel.



> Also, because it's a zip, fewer bytes are
> downloaded in the case it needs to be downloaded (because the ETag/timestamp
> is different).  I.e., the uncompressed contents are 930kb in total versus
> the zip which is 108kb.   Is this a good idea?

Yes.  Our servers can compress text content on-the-fly, but jar/zip/png removes that overhead.


Many thanks for looking into this.  I appreciate it.
Comment 11 Denis Roy CLA 2015-04-08 07:48:27 EDT
> > So a question for Denis.  I assume it would be better for the web server if
> > clients only download (or checked Etags/timestamps) of just a single larger
> > file than many smaller files.

It will also be faster for the clients, as the request/response latency will only happen once per transaction.
Comment 12 Ed Merks CLA 2015-04-08 10:27:34 EDT
I've done an initial commit of the prototype so that I can test it more easily on my Linux virtual box, most importantly to test that File.renameTo works atomically, i.e., that it can rename to an existing file, which does not work on Windows (and isn't expected to work).

http://git.eclipse.org/c/oomph/org.eclipse.oomph.git/commit/?id=e9d03f79db6daaa87d1bf96c01e066ebdc3135b5
Comment 13 Ed Merks CLA 2015-04-08 10:53:54 EDT
Okay, so renameTo does work for renaming the temporary result to the proper location, which is atomic according to the following:

http://stackoverflow.com/questions/17715726/in-java-1-6-file-renameto-atomic-on-linux

So the basic idea is that the mirror utility can create the zip file the first time.  In all cases, it loads all the resources reachable from the main index into a resource set, finds all the resources with URIs that normalize to https://git.eclipse.org/c/... and saves each of those, preserving the path structure, into the zip file.  It uses EMF's "save only if changed" option so that nothing is written to the zip file if it's the same as what's already in the zip file.  The timestamp of the result can be checked to see whether the result differs from the original copy.  If not, nothing needs to be done.  If so, the temporary zip can be renamed to the original location, atomically.
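The "write to a temporary file, then rename atomically, but only if something changed" step can be sketched as follows. This is a simplified stand-in, not the mirror utility's code: it compares whole files byte by byte instead of using EMF's "save only if changed" option, it uses java.nio's Files.move with ATOMIC_MOVE rather than File.renameTo, and the name AtomicPublish is hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

final class AtomicPublish {
    /**
     * Replaces target with newContent only when the bytes differ.
     * Returns true if the target was replaced.
     */
    static boolean publishIfChanged(Path target, byte[] newContent) throws IOException {
        if (Files.exists(target) && Arrays.equals(Files.readAllBytes(target), newContent)) {
            return false; // unchanged: leave the file and its timestamp alone
        }
        // Write next to the target so the rename stays on one file system.
        Path temp = Files.createTempFile(target.toAbsolutePath().getParent(), "setups", ".tmp");
        try {
            Files.write(temp, newContent);
            // Whether ATOMIC_MOVE replaces an existing target is implementation
            // specific; it does on Linux, where it maps to rename(2).
            Files.move(temp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            Files.deleteIfExists(temp); // nothing left to delete after a successful move
        }
        return true;
    }
}
```

Leaving an unchanged file untouched keeps its timestamp stable, so clients that check the timestamp do not download it again.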

I've tested the installer with the redirection options and verified that no https://git.eclipse.org URIs are accessed (when redirecting to a local zip file).  Of course in the real life scenario, the zip will be on an Eclipse host, and will be downloaded once, as needed, where "as needed" will be determined by the time stamp which the mirror utility ensures is unchanged for unchanged content.

The mirror utility could be fired via a crontab or it could be left running as a daemon that wakes up periodically.  The latter approach avoids the significant repeated startup overhead of an Equinox application.

Note that the mirror utility is part of the installer product, so unpackaging the installer's download provides a self-contained parcel for everything that's needed.

What should be the next steps for further testing?  Given that these resources are much like web content (the user sees nothing useful in the UI until it's been downloaded), I feel that www.eclipse.org would be the best place to host it.  Where should it go?  Could we manually place it at that location so I can further test?
Comment 14 Denis Roy CLA 2015-04-08 14:38:41 EDT
> What should be the next steps for further testing?  Given that these
> resources are much like web content (the user sees nothing useful in the UI
> until it's been downloaded), I feel that www.eclipse.org would be the best
> place to host it.  Where should it go?  Could we manually place it at that
> location so I can further test?


What is "it", the zip file?

If you can arrange to put it somewhere in the oomph downloads area, I can arrange to have it served from www.eclipse.org for optimal bandwidth allocation.

This makes it easy for you to examine/replace the content while giving this file top priority.
Comment 15 Eike Stepper CLA 2015-04-09 12:27:54 EDT
*** Bug 464301 has been marked as a duplicate of this bug. ***
Comment 16 Ed Merks CLA 2015-05-28 09:25:22 EDT
The work for this is complete now.
Comment 17 Denis Roy CLA 2016-08-16 11:54:36 EDT
Reopening.

I think Oomph is "falling back" to cGit when it cannot fetch  https://www.eclipse.org/setups/setups.zip for whatever reason.

Although the above scenario is not supposed to happen, it has happened today. The result is that git.eclipse.org and dev.eclipse.org are both overwhelmed by the load that is caused, which means our Git repositories are offline.

The top-50 requests to http[s]://git.eclipse.org are virtually all oomph-related:

  17359 /c/oomph/org.eclipse.oomph.git/plain/setups/org.eclipse.setup
   1924 /c/tycho/org.eclipse.tycho.git/plain/setup/Tycho.setup
   1887 /c/ease/org.eclipse.ease.core.git/plain/releng/org.eclipse.ease.releng/oomph/ease.setup
   1883 /c/emf/org.eclipse.emf.eson.git/plain/releng/org.eclipse.emf.eson.releng/ESON.setup
   1872 /c/egit/egit.git/plain/tools/oomph/EGit.setup
   1870 /c/viatra/org.eclipse.viatra.git/plain/releng/org.eclipse.viatra.setup/VIATRAEMF.setup
   1864 /c/acceleo/org.eclipse.acceleo.git/plain/releng/org.eclipse.acceleo.releng/Acceleo.setup
   1859 /c/nebula/org.eclipse.nebula.git/plain/oomph.setup
   1858 /c/amalgam/org.eclipse.amalgam.git/plain/releng/org.eclipse.amalgam.releng/Amalgam.setup
   1857 /c/jubula/org.eclipse.jubula.core.git/plain/org.eclipse.jubula.project.configuration/oomph/jubula.setup
   1848 /c/scout/oomph.git/plain/Scout.setup
   1815 /c/ecoretools/org.eclipse.ecoretools.git/plain/org.eclipse.emf.ecoretools.build/EcoreTools.setup
   1673 /c/jsdt/webtools.jsdt.git/plain/releng/org.eclipse.wst.jsdt.releng/JSDT.setup
   1602 /c/platform/eclipse.platform.text.git/plain/org.eclipse.text.releng/platformText.setup
   1576 /c/platform/eclipse.platform.swt.git/plain/bundles/org.eclipse.swt.tools/Oomph/platformSwt.setup
   1566 /c/platform/eclipse.platform.resources.git/plain/bundles/org.eclipse.core.resources.releng/platformResources.setup
   1564 /c/platform/eclipse.platform.ua.git/plain/org.eclipse.ua.releng/platformUa.setup
   1559 /c/platform/eclipse.platform.runtime.git/plain/bundles/org.eclipse.core.runtime.releng/platformRuntime.setup
   1539 /c/platform/eclipse.platform.ui.git/plain/releng/org.eclipse.ui.releng/platformUi.setup
   1537 /c/platform/eclipse.platform.team.git/plain/bundles/org.eclipse.team.releng/platformTeam.setup
   1512 /c/oomph/org.eclipse.oomph.git/plain/setups/org.eclipse.products.setup
   1421 /c/oomph/org.eclipse.oomph.git/plain/setups/models/Resources.ecore
   1010 /c/oomph/org.eclipse.oomph.git/plain/setups/org.eclipse.applications.setup
   1008 /c/oomph/org.eclipse.oomph.git/plain/setups/redirectable.products.setup
    988 /c/oomph/org.eclipse.oomph.git/plain/setups/models/Targlets.ecore
    982 /c/oomph/org.eclipse.oomph.git/plain/setups/models/Predicates.ecore
    955 /c/oomph/org.eclipse.oomph.git/plain/setups/models/Mylyn.ecore
    954 /c/oomph/org.eclipse.oomph.git/plain/setups/models/PDE.ecore
    952 /c/oomph/org.eclipse.oomph.git/plain/setups/models/SetupWorkingSets.ecore
    940 /c/oomph/org.eclipse.oomph.git/plain/setups/models/JDT.ecore
    932 /c/oomph/org.eclipse.oomph.git/plain/setups/models/WorkingSets.ecore
    927 /c/oomph/org.eclipse.oomph.git/plain/setups/com.github.projects.setup
    925 /c/oomph/org.eclipse.oomph.git/plain/setups/models/Maven.ecore
    923 /c/oomph/org.eclipse.oomph.git/plain/setups/redirectable.projects.setup
    923 /c/oomph/org.eclipse.oomph.git/plain/setups/models/Git.ecore
    916 /c/oomph/org.eclipse.oomph.git/plain/setups/models/Projects.ecore
    915 /c/oomph/org.eclipse.oomph.git/plain/setups/org.eclipse.projects.setup
    912 /c/oomph/org.eclipse.oomph.git/plain/setups/models/Workbench.ecore
    898 /c/oomph/org.eclipse.oomph.git/plain/setups/models/SetupTarglets.ecore
    895 /c/oomph/org.eclipse.oomph.git/plain/setups/models/ProjectSet.ecore
    891 /c/oomph/org.eclipse.oomph.git/plain/setups/models/Launching.ecore
    741 /c/cdt/org.eclipse.cdt.git/plain/releng/CDT.setup
    737 /c/oomph/org.eclipse.oomph.git/plain/setups/org.eclipse.all.product.setup
    737 /c/oomph/org.eclipse.oomph.git/plain/setups/interim/products/PapyrusWithCDO.setup
    737 /c/cdo/cdo.git/plain/plugins/org.eclipse.emf.cdo.explorer.ui/CDOExplorer.setup
    733 /c/recommenders/org.eclipse.recommenders.git/plain/tools/oomph/recommenders.setup
    732 /c/cdo/cdo.git/plain/plugins/org.eclipse.emf.cdo.server.product/CDOServer.setup
    731 /c/emf/org.eclipse.emf.git/plain/releng/org.eclipse.emf.releng/EMF.setup
    719 /c/oomph/org.eclipse.oomph.git/plain/setups/interim/E4Tools.setup
    716 /c/tracecompass/org.eclipse.tracecompass.git/plain/TraceCompass.setup


The single most common user-agent generates about 17x as many requests as the runner-up (97162 versus 5680), which leads me to believe these are requests from Eclipse clients:

  97162 "Apache-HttpClient/4.3.6
   5680 "Mozilla/5.0 (Windows NT
   2316 "Mozilla/5.0 (compatible; bingbot/2.0;
   2063 "Mozilla/5.0 (Macintosh; Intel
   1680 "Mozilla/5.0 (compatible; Googlebot/2.1;
   1602 "Jakarta Commons-HttpClient/3.1" 
   1497 "Mozilla/5.0 (X11; Linux
    972 "Mozilla/5.0 (X11; Fedora;
    942 "Mozilla/5.0 (X11; Ubuntu;
    602 "Mozilla/5.0 (compatible; DotBot/1.1;
    577 "Mozilla/5.0 (compatible; Baiduspider/2.0;
    313 "Java/1.8.0_65"  
    256 "Mozilla/5.0 (compatible; MSIE
    117 "Mozilla/4.0 (compatible;)" 
    115 "Mozilla/5.0 (Linux; Android
     84 "ltx71 - (http://ltx71.com/)"
     76 "Java/1.8.0_101"  
     72 "Mozilla/5.0 (compatible; Yahoo!

Please discontinue using cGit in this fashion!
Comment 18 Ed Merks CLA 2016-08-18 08:45:54 EDT
The fix is committed to master:

http://git.eclipse.org/c/oomph/org.eclipse.oomph.git/commit/?id=6b8afcf45a14c07ec831774ba668ff1bf3cd7e73

The fixes include ensuring that the local cached version of setups.zip is used even if the web server returns a 404.  This was the primary cause of the massive overuse of git.eclipse.org this week.  Mind you, I don't believe that zip was ever missing, but the website was definitely returning 404 for it.  In this case we did fall back to access git.eclipse.org directly (based on the premise that if a resource is no longer available on the web, we should stop using the local cached version).  But the setups.zip is crucial, and cannot ever be removed, so a 404 is bogus, and it's best to use the cache.
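The decision described in this paragraph can be pictured with a small sketch. It is a simplification, not the committed code: SetupsSource and its enum are hypothetical names, and treating every non-200 status the same way as a 404 is an assumption made for brevity.

```java
final class SetupsSource {
    enum Source { DOWNLOAD, LOCAL_CACHE, UNAVAILABLE }

    /** Chooses where setups.zip comes from, given the HTTP status and whether a cached copy exists. */
    static Source choose(int httpStatus, boolean cachedCopyExists) {
        if (httpStatus == 200) {
            return Source.DOWNLOAD;
        }
        // A 404 (or, by assumption here, any other non-200 status) keeps the
        // cached copy in use; there is no fallback to git.eclipse.org.
        return cachedCopyExists ? Source.LOCAL_CACHE : Source.UNAVAILABLE;
    }
}
```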

The fix also ensures that git.eclipse.org URIs can only be accessed by the SetupArchiver application; there is no magical option to subvert this.  The new logic ensures that setups.zip is always consulted (except by the SetupArchiver).  So even if downstream users of Oomph author their own setup catalogs that reuse direct git.eclipse.org resources, and they don't use an archive as we do by default, they will not be permitted direct access to git.eclipse.org. In such a case, they will only have access to setups in the Oomph catalog, or to setups that they make accessible by using the SetupArchiver application to build their own archive and/or cache it locally.

When direct access to git.eclipse.org is detected other than by the SetupArchiver application, it is blocked (IOException is thrown), unless it's already available in the local cache.  Also, a single warning is logged explaining how to use the SetupArchiver application to make the URI accessible by ensuring it's available in the local cache; it includes the full command line invocation that the user can copy and paste to the console/terminal.  
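A minimal sketch of such a guard, for illustration only: GitAccessGuard and its parameters are hypothetical names, and the one-time warning with the SetupArchiver command line is left out.

```java
import java.io.IOException;
import java.net.URI;

final class GitAccessGuard {
    /** Throws IOException for git.eclipse.org URIs unless the archiver asks or a cached copy exists. */
    static void check(URI uri, boolean isArchiver, boolean cachedLocally) throws IOException {
        boolean isGitHost = "git.eclipse.org".equalsIgnoreCase(uri.getHost());
        if (isGitHost && !isArchiver && !cachedLocally) {
            throw new IOException("Direct access to " + uri + " is blocked; archive it first");
        }
    }
}
```

Callers would invoke check before opening a connection, so a blocked URI fails the same way an unreachable one would.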

Note that the SetupArchiver application has been moved so that it's present both in the Eclipse Installer product and in any Eclipse installation containing Oomph setup.  As such, any committer developing a new setup can relatively easily use it.  Though generally folks develop using a local version in the Git clone and even this won't be an issue.  It's definitely an issue for us when someone wants to add a setup to the catalog, but we can easily run the SetupArchiver and then test their contribution before committing it.   And of course, once committed, the SetupArchiver Hudson job will kick in within 5 minutes to make a new setups.zip available.

And finally, when we communicate with Eclipse servers, we use our own User-Agent, eclipse/oomph/<oomph-version>.  When used from the installer it will be eclipse/oomph/installer/<oomph-version> and from the SetupArchiver eclipse/oomph/archiver/<oomph-version>.
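The three User-Agent forms follow one pattern, which can be sketched as below. The helper name OomphUserAgent is hypothetical and only illustrates the string layout.

```java
final class OomphUserAgent {
    /** Builds "eclipse/oomph/<version>" or "eclipse/oomph/<component>/<version>". */
    static String of(String component, String version) {
        return component == null || component.isEmpty()
            ? "eclipse/oomph/" + version
            : "eclipse/oomph/" + component + "/" + version;
    }
}
```

With java.net.HttpURLConnection the result would be passed as the value of setRequestProperty("User-Agent", ...).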
Comment 19 Denis Roy CLA 2016-08-19 09:39:58 EDT
I don't understand everything you wrote but I appreciate the fix, Ed.
Comment 20 Denis Roy CLA 2016-09-13 13:18:42 EDT
Have newer versions of the installer already been published?  We had another Git outage related to this:

https://bugs.eclipse.org/bugs/show_bug.cgi?id=501194#c3
Comment 21 Ed Merks CLA 2016-09-13 13:30:36 EDT
(In reply to Denis Roy from comment #20)
> Have newer versions of the installer already been published?  We had another
> Git outage related to this:
> 
> https://bugs.eclipse.org/bugs/show_bug.cgi?id=501194#c3

No, the old installers are still in use and of course the old installations need to be updated with the latest.

But please read your email. I'll repeat the contents here:

Guys,

Is there something blocking access to the setups.zip depending on the agent?  The latest version uses eclipse/oomph/installer/1.5.0.qualifier as the agent, but the old installations will use ECF's default which produces a forbidden exception as described below.   Older installations not being able to update the setups archive is problematic at best, and at worst, anyone who doesn't have this in their cache at all will hammer cGit again.

Regards,
Ed


-------- Forwarded Message --------
Subject: 	Re: Can't disable oomph completely
Date: 	Tue, 13 Sep 2016 17:24:21 +0200
From: 	Sergei Karimov <>


I compared requests using Fiddler and made a test program to demonstrate the issue. It seems www.eclipse.org doesn't like User-Agent: Apache-HttpClient/4.3.6 (java 1.5)

package delme;

import java.net.HttpURLConnection;
import java.net.URL;

public class Main {
    public static void main(String[] args) throws Exception {
        final URL url = new URL("http://www.eclipse.org/setups/setups.zip");
        final HttpURLConnection con = (HttpURLConnection) url.openConnection();
        // This header value is sent from Eclipse and a 403 is received.
        con.setRequestProperty("User-Agent", "Apache-HttpClient/4.3.6 (java 1.5)");
        // This header value is sent from a browser and a 200 is received.
        // con.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko");
        con.connect();
        final int responseCode = con.getResponseCode();
        System.out.println("responseCode=" + responseCode);
    }
}
Comment 22 Denis Roy CLA 2016-09-14 15:12:52 EDT
> Guys,
> 
> Is there something blocking access to the setups.zip depending on the agent?

That was our bad.  A wrong rule made its way in there. It has since been pulled.