| Summary: | [repository] Repository update policy | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | [Eclipse Project] Equinox | Reporter: | Igor Fedorenko <igor> | ||||||
| Component: | p2 | Assignee: | Pascal Rapicault <pascal> | ||||||
| Status: | RESOLVED WONTFIX | QA Contact: | |||||||
| Severity: | enhancement | ||||||||
| Priority: | P3 | CC: | bsd, christian.k.2510, contact, d.nachev, digulla, d_a_carver, eclipse, Ed.Merks, g.watson, hmalphettes, irbull, jan.sievers, joerg83reichert, kodstark, markus.kell.r, mike, pascal.rapicault, pascal, sbouchet, steffen.kriese, susan, t-oberlies, thomas | ||||||
| Version: | 3.7 | ||||||||
| Target Milestone: | --- | ||||||||
| Hardware: | PC | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | |||||||||
| Attachments: |
|
||||||||
|
Description
Igor Fedorenko
I guess some may not be familiar with Maven repository update check policy ;-) so let me explain expected behaviour * First time p2 needs to access remote repository, I expect it to download remote repository metadata and cache it locally * Any subsequent request to the repository, even after p2 jvm restart, I expect p2 to access remote repository according to configured update policy ** "never", means NO remote access if local cache is present. ** "daily", means check remote for updates if last update check was performed more than 24 hours ago. ** "always" check for remote updates once per build (tycho is likely to access the same repo multiple times during the build). * I also expect to be able to force remote repository update from command line, via a well-known maven invocation parameter (this means P2 needs to provide Tycho an API to force update based on Maven cli parameters) Is the cache using timestamps today? Often, a simple HEAD request is enough to get a timestamp and size of the remote file. I'd like a strategy that, if this information is unchanged, didn't download the jar/xml files. I believe p2 already does HEAD trick and skips download if it is reasonably sure nothing changed. This is not enough for Tycho. Larger builds can reference many repositories, especially if composite repositories are taken into account. I have reports from Tycho users that Tycho sometimes takes >1 minute just to check nothing changed. It is very common to run the same build over and over again, when trying to fix that pesky unit test failure, for example. Waiting >1 minute just to run a single test becomes frustrating. (In reply to comment #3) > I have reports from Tycho users that Tycho sometimes takes >1 minute just to > check nothing changed. I don't think that's specific to Tycho. We see this too. And what's worse, we also see this when updating the IDE. I'm fairly certain that I see the content.jar and compositeContent.jar being downloaded even though I've done the same operation fairly recently. This would indicate that the HEAD trick isn't used or at least that it isn't used in an optimal way. (In reply to comment #4) > (In reply to comment #3) > > I have reports from Tycho users that Tycho sometimes takes >1 minute just to > > check nothing changed. > > I don't think that's specific to Tycho. We see this too. And what's worse, we > also see this when updating the IDE. I'm fairly certain that I see the > content.jar and compositeContent.jar being downloaded even though I've done the > same operation fairly recently. This would indicate that the HEAD trick isn't > used or at least that it isn't used in an optimal way. As far as I can tell from the code, HEAD is properly used, but please take an additional look as I may have missed something obvious. I'm thinking that the message is also probably the same whether you download or just do a HEAD. Finally it use to be that the foundation would redirect to ec2 even for the metadata files which could add to the aspect of slowness (which is also what we see in tycho builds) As indicated during today's call, I'm working on this and my first goal is to provide the complete implementation of the "core" functionalities. We briefly discussed the ability to start exposing these policies in the UI but it is not clear who / how / if it helps anyone to be able to control these. The other thing that was discussed was the ability to have the repo itself carry information about a "default" update policy since a producer knows what he does with a given repo. For example a lot of the children repos are never changed (e.g. N / I / M repos for the SDK) and as such it could make sense to flag them as "never" check. In many cases (platform and release train composites in particular), it would be sufficient to just check the composite. If it hasn't changed, then there's no need to check it's children. In fact, just enabling that would boost performance a lot. And it would give the publisher some level of control too. If he indeed does change a child but doesn't change the composite, then all he needs to do is to touch the composite. That changes the timestamp which in turn triggers a new check. (In reply to comment #6) > In many cases (platform and release train composites in particular), it would > be sufficient to just check the composite. If it hasn't changed, then there's > no need to check it's children. I'd prefer that this wasn't the default: I have some composite repos where the children *do* change independently of the composite. It would be nice to have a never-check flag. No need to worry here. First because I don't know if I'll have the time to go that far, and second because I think this "policy" needs to be mentioned in each child and not a default in the parent. (In reply to comment #4) > This would indicate that the HEAD trick isn't > used or at least that it isn't used in an optimal way. I've debugged this once and from what I saw, the HEAD request simply took a few seconds per repository and sometimes, it timed out. So even if you don't download anything, any request over the net can fail or can be slow when the server is on the other side of the globe. That's why there must be an "offline" mode. There appears to be two problems here. 1. Are we doing HEAD requests properly and 2. A new API for managing your 'update policy' (never, daily, etc...). If #1 is really a problem (which I'm not sure it is), then we could address it during the RC builds. However #2 won't likely be looked at in 3.7 (we are passed API freeze). Pascal, I'm moving this out. Feel free to put it back on the plan, especially if someone has a reproducible test case that demonstrates our HEAD requests are not working properly. (In reply to comment #10) > There appears to be two problems here. 1. Are we doing HEAD requests properly > and 2. A new API for managing your 'update policy' (never, daily, etc...). If > #1 is really a problem (which I'm not sure it is), then we could address it > during the RC builds. I do think that doing one HEAD request per child of a composite repo is the main reason for the perceived slow performance. Here is what I did to get some data: As an example tycho project which uses the Helios p2 repo, I built the bootstrap part of tycho itself [1] with HTTP tracing enabled and with a pre-filled local repository: mvn clean install \ -Dorg.apache.commons.logging.Log=org.apache.commons.logging.impl.SimpleLog \ -Dorg.apache.commons.logging.logging.simplelog.showdatetime=true \ -Dorg.apache.commons.logging.simplelog.log.httpclient.wire=debug \ -Dorg.apache.commons.logging.simplelog.log.org.apache.commons.httpclient=debug you can see in the build log (atached) that there are - 23 HEAD requests to (composite)content/artifacts.jar and - 23 GET requests to p2.index which seemingly always result in HTTP 404 NOT FOUND sent to download.eclipse.org only to find out that none of the child repos changed. For a roundtrip time of 0.5 sec for each request, this results in > 20 sec overhead to find out nothing has changed on every build. Also, it seems that mirrors do not ease this situation (they are not used on metadata level?) This suggests that the number of roundtrips should be reduced. 1. What is p2.index used for? Is this really needed? 2. As a first step towards update policy, what do you think about adding a switch that tells the client not to traverse all child repos when checking if something changed (keeping default behavior as is)? [DEBUG] header - >> "HEAD http://download.eclipse.org/releases/helios/compositeContent.jar HTTP/1.1[\r][\n]" [DEBUG] header - >> "HEAD http://download.eclipse.org/technology/epp/packages/helios/compositeContent.jar HTTP/1.1[\r][\n]" [DEBUG] header - >> "HEAD http://download.eclipse.org/technology/epp/packages/helios/R/content.jar HTTP/1.1[\r][\n]" [DEBUG] header - >> "HEAD http://download.eclipse.org/technology/epp/packages/helios/SR1/content.jar HTTP/1.1[\r][\n]" [DEBUG] header - >> "HEAD http://download.eclipse.org/technology/epp/packages/helios/SR2/content.jar HTTP/1.1[\r][\n]" [DEBUG] header - >> "HEAD http://download.eclipse.org/releases/helios/201006230900/content.jar HTTP/1.1[\r][\n]" [DEBUG] header - >> "HEAD http://download.eclipse.org/releases/helios/201009240900/content.jar HTTP/1.1[\r][\n]" [DEBUG] header - >> "HEAD http://download.eclipse.org/releases/helios/201102250900/content.jar HTTP/1.1[\r][\n]" [DEBUG] header - >> "HEAD http://download.eclipse.org/releases/helios/compositeArtifacts.jar HTTP/1.1[\r][\n]" [DEBUG] header - >> "HEAD http://download.eclipse.org/technology/epp/packages/helios/compositeArtifacts.jar HTTP/1.1[\r][\n]" [DEBUG] header - >> "HEAD http://download.eclipse.org/technology/epp/packages/helios/R/artifacts.jar HTTP/1.1[\r][\n]" [DEBUG] header - >> "HEAD http://download.eclipse.org/technology/epp/packages/helios/SR1/artifacts.jar HTTP/1.1[\r][\n]" [DEBUG] header - >> "HEAD http://download.eclipse.org/technology/epp/packages/helios/SR2/artifacts.jar HTTP/1.1[\r][\n]" [DEBUG] header - >> "HEAD http://download.eclipse.org/releases/helios/201006230900/compositeArtifacts.jar HTTP/1.1[\r][\n]" [DEBUG] header - >> "HEAD http://download.eclipse.org/eclipse/updates/3.6/compositeArtifacts.jar HTTP/1.1[\r][\n]" [DEBUG] header - >> "HEAD http://download.eclipse.org/eclipse/updates/3.6/R-3.6-201006080911/artifacts.jar HTTP/1.1[\r][\n]" [DEBUG] header - >> "HEAD http://download.eclipse.org/eclipse/updates/3.6/R-3.6.1-201009090800/artifacts.jar HTTP/1.1[\r][\n]" [DEBUG] header - >> "HEAD http://download.eclipse.org/eclipse/updates/3.6/R-3.6.2-201102101200/artifacts.jar HTTP/1.1[\r][\n]" [DEBUG] header - >> "HEAD http://download.eclipse.org/releases/helios/201006230900/aggregate/artifacts.jar HTTP/1.1[\r][\n]" [DEBUG] header - >> "HEAD http://download.eclipse.org/releases/helios/201009240900/compositeArtifacts.jar HTTP/1.1[\r][\n]" [DEBUG] header - >> "HEAD http://download.eclipse.org/releases/helios/201009240900/aggregate/artifacts.jar HTTP/1.1[\r][\n]" [DEBUG] header - >> "HEAD http://download.eclipse.org/releases/helios/201102250900/compositeArtifacts.jar HTTP/1.1[\r][\n]" [DEBUG] header - >> "HEAD http://download.eclipse.org/releases/helios/201102250900/aggregate/artifacts.jar HTTP/1.1[\r][\n]" [DEBUG] header - >> "GET http://download.eclipse.org/releases/helios/p2.index HTTP/1.1[\r][\n]" [DEBUG] header - >> "GET http://download.eclipse.org/technology/epp/packages/helios/p2.index HTTP/1.1[\r][\n]" [DEBUG] header - >> "GET http://download.eclipse.org/technology/epp/packages/helios/R/p2.index HTTP/1.1[\r][\n]" [DEBUG] header - >> "GET http://download.eclipse.org/technology/epp/packages/helios/SR1/p2.index HTTP/1.1[\r][\n]" [DEBUG] header - >> "GET http://download.eclipse.org/technology/epp/packages/helios/SR2/p2.index HTTP/1.1[\r][\n]" [DEBUG] header - >> "GET http://download.eclipse.org/releases/helios/201006230900/p2.index HTTP/1.1[\r][\n]" [DEBUG] header - >> "GET http://download.eclipse.org/releases/helios/201009240900/p2.index HTTP/1.1[\r][\n]" [DEBUG] header - >> "GET http://download.eclipse.org/releases/helios/201102250900/p2.index HTTP/1.1[\r][\n]" [DEBUG] header - >> "GET http://download.eclipse.org/releases/helios/p2.index HTTP/1.1[\r][\n]" [DEBUG] header - >> "GET http://download.eclipse.org/technology/epp/packages/helios/p2.index HTTP/1.1[\r][\n]" [DEBUG] header - >> "GET http://download.eclipse.org/technology/epp/packages/helios/R/p2.index HTTP/1.1[\r][\n]" [DEBUG] header - >> "GET http://download.eclipse.org/technology/epp/packages/helios/SR1/p2.index HTTP/1.1[\r][\n]" [DEBUG] header - >> "GET http://download.eclipse.org/technology/epp/packages/helios/SR2/p2.index HTTP/1.1[\r][\n]" [DEBUG] header - >> "GET http://download.eclipse.org/releases/helios/201006230900/p2.index HTTP/1.1[\r][\n]" [DEBUG] header - >> "GET http://download.eclipse.org/eclipse/updates/3.6/p2.index HTTP/1.1[\r][\n]" [DEBUG] header - >> "GET http://download.eclipse.org/eclipse/updates/3.6/R-3.6-201006080911/p2.index HTTP/1.1[\r][\n]" [DEBUG] header - >> "GET http://download.eclipse.org/eclipse/updates/3.6/R-3.6.1-201009090800/p2.index HTTP/1.1[\r][\n]" [DEBUG] header - >> "GET http://download.eclipse.org/eclipse/updates/3.6/R-3.6.2-201102101200/p2.index HTTP/1.1[\r][\n]" [DEBUG] header - >> "GET http://download.eclipse.org/releases/helios/201006230900/aggregate/p2.index HTTP/1.1[\r][\n]" [DEBUG] header - >> "GET http://download.eclipse.org/releases/helios/201009240900/p2.index HTTP/1.1[\r][\n]" [DEBUG] header - >> "GET http://download.eclipse.org/releases/helios/201009240900/aggregate/p2.index HTTP/1.1[\r][\n]" [DEBUG] header - >> "GET http://download.eclipse.org/releases/helios/201102250900/p2.index HTTP/1.1[\r][\n]" [DEBUG] header - >> "GET http://download.eclipse.org/releases/helios/201102250900/aggregate/p2.index HTTP/1.1[\r][\n]" --- [1] https://github.com/sonatype/sonatype-tycho/tree/master/tycho-p2-resolver Created attachment 196566 [details]
maven build log with HTTP trace enabled
In Tycho, we don't want p2 doing any remote requests on each build unless the user explicitly asks for this. It does not make sense to check for remote changes each time I rerun the build when troubleshooting a unit test failure. (In reply to comment #11) > 1. What is p2.index used for? Is this really needed? The p2.index file provides information about the repository location. By default, p2 will request content.jar, then compositeContent.jar, then site.xml, etc.... (unless it has "learned" otherwise). The p2.index can provide a different ordering (use the site.xml first, no matter what). This can actually help performance too. For example, if I have my own format content.ian, p2 will actually try "content.jar, content.xml, compositeContent.jar, compositeContent.xml, site.xml and finally content.ian"). It will wait for a 404 on each one. With a p2.index I can put my format first and tell p2 to stop looking after that. So in essence, we should recommend that all repositories that do not provide the content.jar should provide a p2.index to limit the number of accesses. Even if they do provide a content.jar, providing the p2.index will not increase the number. If content.jar is provided there will be two accesses: p2.index FAIL, content.jar SUCCESS If it isn't, and if p2 index is lacking, there will be at least three: p2.index FAIL, content.jar FAIL, ... If p2.index is present, and what it appoints first hand is present, then there will be two accesses: p2.index SUCCESS, whatever p2.index points to SUCCESS I had another look at the build log. For each artifact and metadata repo there is a GET request for p2.index. This means that per (child) repo, we typically have 4 roundtrips on the client side (GET p2.index, HEAD content, GET p2.index, HEAD artifact). As HEAD already optimizes bandwidth, the problem we have here IMHO is latency. Regardless whether we add p2.index or not, I guess we cannot get below 4 roundtrips per repo if I got Ian right and this works as designed. Now if we multiply this factor 4 with the 11 child repos of helios which are all traversed, we get the observed performance problem of ~40 roundtrips before any operation. And it gets worse with every child repo added. Two separate points: 1. Concretely for the helios/indigo repos, is it necessary for the heavily used release train repo to be composed of 11 child repos? Could this be collapsed into one or maximum 3 child repos (release, SR1, SR2) ? I guess the additional one-time effort when aggregating would be worth the performance gain for all the clients. I guess this is relevant for tycho as well as any eclipse user trying to install/update. 2. As Igor mentioned, for tycho we need at least a way to switch this behavior off completely (offline mode using local cache only). A more sophisticated update policy would also be nice but being able to go offline is probably the first thing we need so at least have a chance to skip the ~20 sec penalty on each build if we know nothing changed or just don't care for now. Created attachment 196722 [details] Partial patch When this bug got first opened, I started work on a patch that I'm now attaching. The patch consists in adding policies in the cache manager such that it will not check the remote. I believe that the changes in the CacheManager are ready to go. I stopped on this work because I did not had the time to figure out how to have the change percolate all the way up to the repository / repository manager level so these policies could get controlled from these APIs. In the meantime, one possibility for tycho would be to provide its own implementation of the cachemanager (it is a provisioning agent server) that builds upon this patch and adds any additional level of control necessary, if necessary. As for the number of composite, I recommend to have a bug opened against cross-project component (https://bugs.eclipse.org/bugs/enter_bug.cgi?product=Community). I think the issue not only is visible in Tycho but also in p2 in general. A second bug should be opened against cross-project to have the p2.index file added there. Thomas / Ian / Tobias, could you please take care of that. Thx. Tycho already provides cache manager implementation [1]. Granted, I have not looked at this code for over a year now, but IIRC CacheManager does not provide enough flexibility to provide per-repository update check policy we need in Tycho. And some remote calls, like p2.index access, never go through the cache manager. My information could be outdated, so if p2 developers can suggest how Tycho cache manager implementation can be improved, I'd be happy to admit I was wrong ;-) [1] https://github.com/sonatype/sonatype-tycho/blob/master/tycho-p2-resolver/tycho-p2-resolver-impl/org.eclipse.tycho.p2.resolver.impl/src/main/java/org/eclipse/tycho/p2/impl/resolver/TychoP2RepositoryCacheManager.java I must be missing something in the complete requirement for tycho. The cache manager sole "API" is createCache(URI), so nothing would prevent Tycho to completely reimplement this code to do what it wants like implementing policy and checking on a per-repo basis the policy that needs to be applied. Is there a need to persist the policy that was last set or something like that? I think the new patch makes it a bit easier (some visibility may need to be tweaked to enable overwrite) and should allow tycho to create an instance of the cache manager that is fed with the repos and their policy. (In reply to comment #18) > Tycho already provides cache manager implementation [1]. Granted, I have not > looked at this code for over a year now, but IIRC CacheManager does not provide > enough flexibility to provide per-repository update check policy we need in > Tycho. And some remote calls, like p2.index access, never go through the cache > manager. I can confirm the problem with p2.index always being fetched from remote even when in maven offline mode. (simply run 'mvn -o install' on a tycho project with HTTP tracing switched on as described above) The problem here is we can't override some of the private methods in the p2 CacheManager we extend. I will check if the patch solves this or needs more tweaking. (In reply to comment #17) > As for the number of composite, I recommend to have a bug opened against > cross-project component > (https://bugs.eclipse.org/bugs/enter_bug.cgi?product=Community). opened bug 347403 (In reply to comment #19) Implementing this in Tycho feels like the right thing, because in the end we want the Maven repository update policies applied to p2 - and IMHO Tycho is the project that brings Maven and p2 together. Once this is done, and if there is more general interest in the Tycho cache implementaiton, we may reconsider bringing it back to p2. For now, however, I also rather see this something to be implemented in Tycho. Any objections against changing the component? Obviously we would still make small changes in p2 in order to reuse p2 internals for the Tycho cache implementation. Please create another bug, because this bug is still relevant in the context of p2 outside of the tycho scenarios. As for the p2.index, I can't think of a reason why we would not be able to have it go through the cache manager as well. Ian, wdyt? (In reply to comment #23) > Please create another bug, because this bug is still relevant in the context of > p2 outside of the tycho scenarios. > I've opened this bug specifically because there was no p2 API we could use from Tycho to implement remote update check policy, and now you are saying we need to open another bug for this? Not fair! On a more serious note, I am almost certain it was not possible to suppress first remote HEAD request from Tycho CacheManager implementation, only subsequent requests, so each build p2 checks all repositories once. But if CacheManager allows this now, I am fine keeping update check policy implementation in Tycho. I have opened bug #347448. Move all 3.8 bugs to Juno. *** Bug 221595 has been marked as a duplicate of this bug. *** *** Bug 291230 has been marked as a duplicate of this bug. *** *** Bug 347650 has been marked as a duplicate of this bug. *** (In reply to comment #24) > (In reply to comment #23) > > Please create another bug, because this bug is still relevant in the context > of > > p2 outside of the tycho scenarios. > > > > I've opened this bug specifically because there was no p2 API we could use from > Tycho to implement remote update check policy, and now you are saying we need to > open another bug for this? Not fair! This is a misunderstanding: Pascal's comment was about my proposal of moving this bug to Tycho. This bug shall stay on the p2 component to track the valid enhancement request of introducing an update policy in p2. So for re-attempting to use the cache manager to get rid of the remaining request to p2.index, we would need to open a new bug in Tycho. AFAIK this bug doesn't exist yet. > So for re-attempting to use the cache manager to get rid of the remaining > request to p2.index, we would need to open a new bug in Tycho. AFAIK this bug > doesn't exist yet. Created bug 357357 (In reply to comment #28 and comment #29) These duplicates look like this bug is becoming a kitchen sink bug that collects all cases where updates don't happen as they should. Originally, this bug was about *too many* update queries, but the other bugs are for the opposite. And this bug is marked as enhancement while the others are clearly bugs. Is there any reason for not reopening those bugs? (In reply to comment #32) > Originally, this bug was about *too many* update queries, but the other bugs > are for the opposite. And this bug is marked as enhancement while the others > are clearly bugs. Is there any reason for not reopening those bugs? Just to clarify. Originally, this enhancement request was about externally configurable repository metadata update policy, so that host application (ide, build system, etc) and ultimately the user can decide how often p2 should be allowed to make remote repository metadata requests. I still believe this is a valid enhancement request regardless of possible ways to optimize implementation of the remote requests. There is lots of discussion here (and a partial patch), so I'll put this on Kepler for now. This issue looks dead. Tycho is working well for many people these days. For me the repository cache works well. |