Bug 347448 - Emphasize p2.index should be included in p2 repository sites
Summary: Emphasize p2.index should be included in p2 repository sites
Status: RESOLVED FIXED
Alias: None
Product: Community
Classification: Eclipse Foundation
Component: Cross-Project
Version: unspecified
Hardware: PC Mac OS X - Carbon (unsup.)
Importance: P3 normal
Target Milestone: ---
Assignee: David Williams CLA
Duplicates: 307075 350030
Reported: 2011-05-27 09:41 EDT by Pascal Rapicault CLA
Modified: 2012-03-02 01:22 EST
CC List: 9 users
Description Pascal Rapicault CLA 2011-05-27 09:41:31 EDT
The p2.index file provides a hint to p2 so it does not do its typical scanning to discover the sort of repo we are dealing with (http://wiki.eclipse.org/Equinox/p2/p2_index). 
As a follow-up to bug #347403, making p2.index files available for each repository (including children) would greatly reduce the number of requests to the Foundation server and also improve the experience for end users.
Comment 1 David Williams CLA 2011-05-27 11:34:17 EDT
Why just Indigo? Why not Helios? Why not all? 

And, would it kill 'ya to attach the files you want put there? :) 

I say that (with good humor) mostly since the documentation provided at 
http://wiki.eclipse.org/Equinox/p2/p2_index
is sparse, at best. 
For example, the section on 
http://wiki.eclipse.org/Equinox/p2/p2_index#Jar_vs_XML_extension 
is completely empty. 
And, while the description talks about "order to search", there is not one example of ordered files. Seems one would always want to provide a complete list? Just vary the order? 

And, since it does document clearly "if its wrong it will break the repository", I'd prefer to get your concrete attachments. 

But, sincerely appreciate the p2 team's help improving performance!
Comment 2 David Williams CLA 2011-05-27 13:45:29 EDT
I found this comment in bug 310441#c2 interesting ... 

<quote>
The p2.index file is an optional control file that is used in rare cases (where
you have multiple different repositories at the same location and you want to
instruct p2 which repo to load). 
</quote>

Sounds like the purpose of this file is evolving :)
Comment 3 Ian Bull CLA 2011-05-27 14:20:49 EDT
(In reply to comment #2)
> I found this comment in bug 310441#c2 interesting ... 
> 
> <quote>
> The p2.index file is an optional control file that is used in rare cases (where
> you have multiple different repositories at the same location and you want to
> instruct p2 which repo to load). 
> </quote>
> 
> Sounds like the purpose of this file is evolving :)

No, nothing is evolving, this is still optional.  The index file will tell p2 which 'type' of repository to load if there is more than one repository at a given location. If there is only one 'type' of repository at a given location this file is not needed as p2 will 'figure it out'.  However, the file can 'help' p2 'figure it out' faster in this case.
Comment 4 John Arthorne CLA 2011-05-27 15:07:53 EDT
I'm not sure it would actually speed up much in this case. The release train already uses the highest priority file format. In fact the only extra round trips are due to searching for the p2.index file :(
Comment 5 Ian Bull CLA 2011-05-27 15:43:16 EDT
(In reply to comment #4)
> I'm not sure it would actually speed up much in this case. The release train
> already uses the highest priority file format. In fact the only extra round
> trips are due to searching for the p2.index file :(

Yes, in this case you're probably right. IIRC compositeContent.jar is third on our search order (behind content.jar and content.xml).  With a p2.index we would find it on the 4th request (p2.index, content.jar, content.xml, compositeContent.jar).  We could move that up to number 2. If the child repos are content.jar, then we would find them with two requests.

However, if the child repos are again more composite repos, then the p2.index file may make sense.
Comment 6 John Arthorne CLA 2011-05-27 16:18:31 EDT
(In reply to comment #5)
> Yes, in this case you're probably right. IIRC compositeContent.jar is third on
> our search order (behind content.jar and content.xml).  With a p2.index we
> would find it on the 4th request (p2.index, content.jar, content.xml,
> compositeContent.jar).  We could move that up to number 2. If the child repos
> are content.jar, then we would find them with two requests.

I wasn't seeing any hits on content.jar and content.xml when I traced it, but that's because we remember the "last successful suffix" and try that one first. So after the first search, we hit the right file every time (except for the extra lookup of the p2.index file itself). Maybe we could also remember when a repository doesn't have an index file... I'll open a separate bug for that.

So short answer: Adding a p2.index file will improve performance on the first repository lookup only, but still well worth doing.
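
The request counts discussed in comments 5 and 6 can be sketched with a small simulation. This is a simplified model under stated assumptions, not p2's actual lookup code: the default order follows the "IIRC" order from comment 5 (with compositeContent.xml appended as an assumption), `probe_sequence` and its arguments are hypothetical names, and the index is modelled as a plain list of file names to try.

```python
# Simplified model of metadata repository discovery; not p2's real implementation.
# Default order as recalled in comment 5; the last entry is an assumption.
DEFAULT_ORDER = ["content.jar", "content.xml",
                 "compositeContent.jar", "compositeContent.xml"]


def probe_sequence(files_on_server, remembered=None):
    """Return the (file, status) requests made until a metadata file is found.

    files_on_server maps file names to content; for "p2.index" the content is
    modelled as the list of file names to try. remembered is the "last
    successful suffix" from an earlier lookup, if any.
    """
    has_index = "p2.index" in files_on_server
    requests = [("p2.index", 200 if has_index else 404)]
    if has_index:
        order = list(files_on_server["p2.index"])
    else:
        order = list(DEFAULT_ORDER)
        if remembered in order:
            # Try the previously successful name first.
            order.remove(remembered)
            order.insert(0, remembered)
    for name in order:
        found = name in files_on_server
        requests.append((name, 200 if found else 404))
        if found:
            break
    return requests


composite = {"compositeContent.jar": b""}
indexed = {"p2.index": ["compositeContent.jar"], "compositeContent.jar": b""}

print(len(probe_sequence(composite)))
print(len(probe_sequence(composite, remembered="compositeContent.jar")))
print(len(probe_sequence(indexed)))
```

In this model a first lookup of a composite repository without an index takes four requests, while either a remembered suffix or a p2.index file brings it down to two, which is the "first lookup only" benefit described above.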
Comment 7 Pascal Rapicault CLA 2011-05-27 16:41:27 EDT
The problem is more visible in the case of builds as described in bug #337022.
Comment 8 Pascal Rapicault CLA 2011-05-27 16:41:56 EDT
This is a follow up of bug #347403.
Comment 9 Pascal Rapicault CLA 2011-05-27 16:45:23 EDT
It can also be done to Helios and other repositories. I just wanted to not scare everybody off. And as for the scarcity of the page, I actually just created it to open this bug :) I will improve it.
Comment 10 David Williams CLA 2011-05-27 16:51:37 EDT
> 
> So short answer: Adding a p2.index file will improve performance on the first
> repository lookup only, but still well worth doing.

What? For Indigo? We are past RC2 now, right? Time for blocking or critical bugs only, IMHO. I don't have the impression this is blocking or critical, but instead a little tweak. I don't mind trying things like this at the start of Juno ... maybe "back-porting" to Indigo and Helios maintenance but, IMHO this idea needs a little more time to flesh out. I'm not exactly opposed to the concept (though, conceptually does seem a bit of a work-around hack) ... but I am reluctant to change anything (especially "central" things) this late in the release. (Unless blocking or critical regression and obviously safe, of course).
Comment 11 David Williams CLA 2011-05-27 16:52:24 EDT
(In reply to comment #9)
> ... And as for the scarcity of the page, I actually just
> created it to open this bug :) I will improve it.

I appreciate your honesty. :)
Comment 12 John Arthorne CLA 2011-06-22 09:40:37 EDT
*** Bug 350030 has been marked as a duplicate of this bug. ***
Comment 13 Alex Blewitt CLA 2011-06-22 09:51:17 EDT
From bug 350030, here's what the contents of the p2.index file should be for the http://download.eclipse.org/releases/indigo/

version=1
metadata.repository.factory.order=compositeContent.jar,!
artifact.repository.factory.order=compositeArtifacts.jar,!

I have used this file in several P2 sites in the past without any problems, and it will reduce the number of 404s seen by the server, both (a) when requesting this file and (b) by not subsequently hitting content.jar with a HEAD request, at least for first-time connections.
Comment 14 Alex Blewitt CLA 2011-06-22 09:52:59 EDT
(NB there's a bug in the wiki page - artifact.repository.factory.order= compositeContent.xml, ! should be compositeArtifacts.xml)
Comment 15 John Arthorne CLA 2011-06-22 13:10:01 EDT
*** Bug 307075 has been marked as a duplicate of this bug. ***
Comment 16 John Arthorne CLA 2011-06-22 13:11:16 EDT
I think the most likely approach now is to start doing this in Juno immediately, and possibly add to Indigo SR1 if all goes well.
Comment 17 David Williams CLA 2011-09-21 02:50:00 EDT
I just want to say I have not lost track of this bug :) 
and am still willing to see how to implement it for Juno, since it would appear to reduce a few empty trips to eclipse.org. 

But, I've got to ask ... a) "hand editing" (or, placement) of files is usually a bad idea. If p2 "wants" these files, why don't p2 publishers publish them? (I realize they were originally seen as "rare" to handle special cases, but seems they are not advocated as "normal" to reduce a few file guesses.) and

b) why does p2 use so many files to begin with? content.xml vs. compositeContent.xml? What the heck? Why doesn't p2 just use one file, and ... like xml was designed to do ... describe its own content with tags? (I'm asking, seriously, if it is worth opening a p2 enhancement for that? Am I missing something obvious?)
Comment 18 David Williams CLA 2011-09-21 02:56:15 EDT
... are not advocated as "normal" ==> ... are _now_ advocated as "normal"
Comment 19 David Williams CLA 2012-03-01 03:17:33 EST
(In reply to comment #13)
> From bug 350030, here's what the contents of the p2.index file should be for
> the http://download.eclipse.org/releases/indigo/
> 
> version=1
> metadata.repository.factory.order=compositeContent.jar,!
> artifact.repository.factory.order=compositeArtifacts.jar,!
> 
> I have used this file in several P2 sites in the past without any problems, and
> will reduce the amount of 404s seen by the server when (a) requesting this
> file, and (b) in not hitting the content.jar with a HEAD subsequently, at least
> for first time connections.

Just to not leave a misleading post here, I think this information is in error, as far as I can tell. 

for .../releases/indigo I think it would be 

version=1
metadata.repository.factory.order=compositeContent.xml,\!
artifact.repository.factory.order=compositeArtifacts.xml,\!

I do not think there is a "jar" factory at all. I tried some local tests, and using "jar" seemed to fail, so I don't think that's correct at all ... if it worked for you, you may have already had some information cached about what worked before. A good test has to start completely fresh. 

Or ... else I'm seeing something wrong ... but, pretty sure 'xml' is the correct value to use and have tried to clarify this in 
http://wiki.eclipse.org/Equinox/p2/p2_index
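
To make the file format discussed here concrete, the following is a minimal sketch of reading a p2.index file. It is not p2's own parser: `parse_p2_index` is a hypothetical helper, and it assumes that a `!` entry (written `\!` when escaped) means "stop, do not try any other repository type", which is my reading of the wiki page rather than something stated in this bug.

```python
def parse_p2_index(text):
    """Parse p2.index text into the version plus ordered factory lists."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()

    def order(key):
        # "\!" is the escaped form of "!"; treat both the same.
        entries = [e.strip().replace("\\!", "!")
                   for e in props.get(key, "").split(",")]
        entries = [e for e in entries if e]
        # Assumption: a "!" entry means "stop here, do not try other types".
        return {"order": [e for e in entries if e != "!"],
                "stop": "!" in entries}

    return {
        "version": props.get("version"),
        "metadata": order("metadata.repository.factory.order"),
        "artifact": order("artifact.repository.factory.order"),
    }


example = r"""version=1
metadata.repository.factory.order=compositeContent.xml,\!
artifact.repository.factory.order=compositeArtifacts.xml,\!
"""
print(parse_p2_index(example))
```

The sketch only checks the shape of the file; whether a given name such as `compositeContent.xml` is accepted by p2 still has to be verified against a completely fresh client, as noted above.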
Comment 20 David Williams CLA 2012-03-01 03:26:49 EST
another question on the "docs" and effects (time) of 404 errors. 

The doc at http://wiki.eclipse.org/Equinox/p2/p2_index says
"
Given that a composite repository is just a repository that refers to other repositories, the full benefit of p2.index can only be achieved if every child repo has the file. 
"

But, I know of at least one case in our common repo where I don't think this is true ... we have a bit of a different structure, so that (some of) the "artifacts" are in their own directory, with an artifacts.jar file. (no content, no composites). 

I think when p2 goes to look there for artifacts, it looks for p2.index, gets 404, and then tries the artifacts.jar file. So, I could imagine it might take a little longer for p2 to find the p2.index file, read it, find out it should look for the artifacts.jar file, and then read the artifacts.jar file. Is there some "hidden cost" of 404 errors? If not, this would be one case where a p2.index file doesn't really do any good. (Probably one of the few cases). 

Clarifications welcome.
Comment 21 David Williams CLA 2012-03-01 03:40:21 EST
(In reply to comment #19)
> (In reply to comment #13)

> 
> for .../releases/indigo I think it would be 
> 
> version=1
> metadata.repository.factory.order=compositeContent.xml,\!
> artifact.repository.factory.order=compositeArtifacts.xml,\!
> 

For completeness, I'll document here that the p2.index files for the three 
.../releases/indigo/<datetimedirectory>/ directories would be 

version=1
metadata.repository.factory.order=content.xml,\!
artifact.repository.factory.order=compositeArtifacts.xml,\!
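
As a rough illustration of how such files could be derived rather than hand-written, here is a sketch that builds p2.index text from the file names found in a repository directory. `p2_index_for` is a hypothetical helper, not part of p2 or any publisher; it assumes, following comment 19, that the value is always the `.xml` form even when only a `.jar` exists on disk, and it prefers the composite type when both are present, which is an assumption of this sketch.

```python
def p2_index_for(filenames):
    """Build p2.index text for a directory containing the given file names."""
    names = set(filenames)

    def factory(composite, simple):
        # Assumption: prefer the composite form if both kinds are present.
        for base in (composite, simple):
            if base + ".jar" in names or base + ".xml" in names:
                return base + ".xml"  # ".xml" form used even for .jar files
        return None

    lines = ["version=1"]
    metadata = factory("compositeContent", "content")
    artifact = factory("compositeArtifacts", "artifacts")
    if metadata:
        lines.append("metadata.repository.factory.order=" + metadata + ",\\!")
    if artifact:
        lines.append("artifact.repository.factory.order=" + artifact + ",\\!")
    return "\n".join(lines) + "\n"


# A dated child directory shaped like the ones described above.
print(p2_index_for(["content.jar", "compositeArtifacts.jar"]))
```

For a directory holding `content.jar` and `compositeArtifacts.jar`, the sketch reproduces the three lines listed above; other repository shapes should be checked by hand before publishing.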

So, after studying this a while (and trying some modest local tests) I'm fine adding these 4 files to the main 4 directories in common repo. Heck, I'd even do it for Helios, given the number of hits still going on there! 

Any objections? 

If this happens to change the date/time stamps so mirrors appear invalidated, and back down to zero, I know how to 'touch' them gently to reset to the original time. 

Comments welcome. 

P.S. After going to all this trouble ... I hope someone has some profound before/after measurements :) [I know, just kidding, I think we've already established "wouldn't be profound" ... but, if anyone can, please do the performance tests so maybe would motivate others?]
Comment 22 Denis Roy CLA 2012-03-01 13:35:54 EST
In addition to all the metadata JAR files and their potential XML equivalents, now we have p2.index?  Who comes up with these ideas?  :)



> I'm not sure it would actually speed up much in this case. The release train
> already uses the highest priority file format. In fact the only extra round
> trips are due to searching for the p2.index file :(

From yesterday's logfile for download.eclipse.org...  BTW these are all "404 Not Found" :

   hits file
 264757 /releases/indigo/p2.index
 237864 /releases/indigo/201202240900/p2.index
 221080 /technology/epp/packages/indigo/SR2/p2.index
 220684 /releases/indigo/201109230900/p2.index
 220180 /technology/epp/packages/indigo/SR1/p2.index
 219370 /technology/epp/packages/indigo/R/p2.index
 215248 /releases/indigo/201106220900/p2.index
 185022 /eclipse/updates/3.7/R-3.7-201106131736/p2.index
 182851 /eclipse/updates/3.7/R-3.7.1-201109091335/p2.index
 180052 /eclipse/updates/3.7/R-3.7.2-201202080800/p2.index
 179895 /eclipse/updates/3.7/p2.index
 129375 /mylyn/drops/3.6.5/v20120215-0100/p2.index
 110441 /releases/helios/p2.index
 104294 /eclipse/updates/3.7/categories/p2.index
 102986 /releases/helios/201006230900/p2.index
 102398 /technology/epp/packages/helios/p2.index
 101479 /releases/helios/201009240900/p2.index
 101315 /technology/epp/packages/helios/R/p2.index
 101074 /technology/epp/packages/helios/SR1/p2.index
 100671 /releases/helios/201102250900/p2.index

... and the list goes on, to total 5,407,368 404's looking for various p2.index files.  For yesterday.
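
A tally like the one above could be produced with a short script along these lines. This is only a sketch: `count_missing_index` is a hypothetical helper, and the regular expression assumes each log line contains a quoted request followed by the status code, as in the excerpts quoted in comment 23; a real server log has more fields and may need a different pattern.

```python
import re
from collections import Counter

# Matches the quoted request and the status code that follows it.
REQUEST = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+" (\d{3})')


def count_missing_index(lines):
    """Count 404 responses for p2.index, grouped by request path."""
    hits = Counter()
    for line in lines:
        match = REQUEST.search(line)
        if not match:
            continue  # not a request line we understand
        path, status = match.groups()
        if status == "404" and path.endswith("/p2.index"):
            hits[path] += 1
    return hits


sample = [
    '"GET /releases/juno/p2.index HTTP/1.1" 404 13 "-" "Jakarta Commons-HttpClient/3.1"',
    '"HEAD /releases/juno/compositeContent.jar HTTP/1.1" 200 - "-" "Jakarta Commons-HttpClient/3.1"',
    '"GET /releases/juno/p2.index HTTP/1.1" 404 13 "-" "Jakarta Commons-HttpClient/3.1"',
]
for path, hits in count_missing_index(sample).most_common():
    print(f"{hits:7d} {path}")
```

Running the same count before and after p2.index files are published would give the before/after comparison asked for in comment 21.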
Comment 23 Denis Roy CLA 2012-03-01 13:42:36 EST
(In reply to comment #21)
> P.S. After going to all this trouble ... I hope someone has some profound
> before/after measurements :) [I know, just kidding, I think we've already
> established "wouldn't be profound" ... but, if anyone can, please do the
> performance tests so maybe would motivate others?

A standard "404" error for a non-existent p2.index file is at least 43 bytes of totally useless data, so providing 119 bytes of useful (to p2) data is already a step in the right direction.

In terms of the release train repos, does putting a p2.index file up save us from some of this?

 "GET /eclipse/updates/4.2/p2.index HTTP/1.1" 404 13 "-" "Jakarta Commons-HttpClient/3.1"
 "HEAD /eclipse/updates/4.2/content.jar HTTP/1.1" 200 - "-" "Jakarta Commons-HttpClient/3.1"
 "GET /releases/juno/p2.index HTTP/1.1" 404 13 "-" "Jakarta Commons-HttpClient/3.1"
 "HEAD /releases/juno/compositeContent.jar HTTP/1.1" 200 - "-" "Jakarta Commons-HttpClient/3.1"
 "GET /releases/juno/201202030900/p2.index HTTP/1.1" 404 13 "-" "Jakarta Commons-HttpClient/3.1"
 "HEAD /releases/juno/201202030900/content.jar HTTP/1.1" 200 - "-" "Jakarta Commons-HttpClient/3.1"
 "GET /releases/juno/201112160900/p2.index HTTP/1.1" 404 13 "-" "Jakarta Commons-HttpClient/3.1"
 "HEAD /releases/juno/201112160900/content.jar HTTP/1.1" 200 - "-" "Jakarta Commons-HttpClient/3.1"
 "GET /technology/epp/packages/juno/p2.index HTTP/1.1" 404 13 "-" "Jakarta Commons-HttpClient/3.1"
 "HEAD /technology/epp/packages/juno/compositeContent.jar HTTP/1.1" 200 - "-" "Jakarta Commons-HttpClient/3.1"
 "GET /technology/epp/packages/juno/M5.180/p2.index HTTP/1.1" 404 13 "-" "Jakarta Commons-HttpClient/3.1"
 "HEAD /technology/epp/packages/juno/M5.180/content.jar HTTP/1.1" 200 - "-" "Jakarta Commons-HttpClient/3.1"
 "GET /technology/epp/packages/juno/M4.174/p2.index HTTP/1.1" 404 13 "-" "Jakarta Commons-HttpClient/3.1"
 "HEAD /technology/epp/packages/juno/M4.174/content.jar HTTP/1.1" 200 - "-" "Jakarta Commons-HttpClient/3.1"
 "GET /technology/epp/packages/juno/M3.116/p2.index HTTP/1.1" 404 13 "-" "Jakarta Commons-HttpClient/3.1"
 "HEAD /technology/epp/packages/juno/M3.116/content.jar HTTP/1.1" 200 - "-" "Jakarta Commons-HttpClient/3.1"
 "GET /technology/epp/packages/juno/M2.53/p2.index HTTP/1.1" 404 13 "-" "Jakarta Commons-HttpClient/3.1"
 "HEAD /technology/epp/packages/juno/M2.53/content.jar HTTP/1.1" 200 - "-" "Jakarta Commons-HttpClient/3.1"
 "GET /technology/epp/packages/juno/M1.9/p2.index HTTP/1.1" 404 13 "-" "Jakarta Commons-HttpClient/3.1"
 "HEAD /technology/epp/packages/juno/M1.9/content.jar HTTP/1.1" 200 - "-" "Jakarta Commons-HttpClient/3.1"
 "GET /e4/updates/0.12/p2.index HTTP/1.1" 404 13 "-" "Jakarta Commons-HttpClient/3.1"
 "HEAD /e4/updates/0.12/content.jar HTTP/1.1" 200 - "-" "Jakarta Commons-HttpClient/3.1"


BTW -- for how long does p2 cache these results?  I ran a Check for Software updates 3 hours ago, and again just now, and it returned to the server for all the goodness above.
Comment 24 Pascal Rapicault CLA 2012-03-01 13:48:36 EST
> BTW -- for how long does p2 cache these results?  I ran a Check for Software
> updates 3 hours ago, and again just now, and it returned to the server for all
> the goodness above.

The expected behaviour is for p2 to cache the various content.jar/xml as well as artifacts.jar, and to only check for the file timestamp. The p2.index file is not cached.

Does that match what you see?
Comment 25 Denis Roy CLA 2012-03-01 13:51:03 EST
> Does that match what you see?

Yes, thanks
Comment 26 Pascal Rapicault CLA 2012-03-01 14:33:26 EST
David, I think we should definitely go ahead with the addition of the p2.index. The "profound measurement" will be obtained by looking at the http logs.

In terms of additional improvements, I think that exposing one content.xml over multiple would also make for a big improvement since the user would not download a lot of duplicated metadata when contacting repos like indigo. But this is the topic of another bug if you think it worth it (I think it does) :).
Comment 27 David Williams CLA 2012-03-01 14:49:35 EST
I have, just now, 2:30 Eastern, 3/1/2012, put p2.index files in the following directories. 

This of course reduces the 404s for those files, and might reduce 404's for "content.xml" from the main (top level) sites (for first time installers, since no longer will look for them at all on that URL in that first-time update (pre-cached) case ... but doubt that number is too large anyway). 

I'll change title and leave open a bit to change focus to "everyone, especially with composite repos, should use p2.index".  

= = = = = 

.../releases/helios/
.../releases/helios/201006230900/
.../releases/helios/201009240900/
.../releases/helios/201102250900/

.../releases/indigo/
.../releases/indigo/201106220900/
.../releases/indigo/201109230900/
.../releases/indigo/201202240900/

.../releases/juno/
.../releases/juno/201112160900/
.../releases/juno/201202030900/
Comment 28 David Williams CLA 2012-03-01 15:03:29 EST
(In reply to comment #26)
> David, I think we should definitely go ahead with the addition of the p2.index.
> The "profound measurement" will be obtained by looking at the http logs.

Well, getting 404s was designed into p2 on purpose, apparently, as part of their "look for all these files" logic ... instead of having one and only one file that contained everything needed (as the magic of XML should easily allow) ... so I think "number of 404s" doesn't mean much ... I was hoping for more quantitative "round trip" numbers as given in bug 347403. (But, no big deal, mostly my way of saying I've spent 8 hours on this! and think those that made the design decisions could have done more to better document what's needed ... such as so far, no one from p2 team has commented on comment 13 which I'm pretty sure is wrong, and if not wrong, it'd be nice to have someone explain). [And, I'm just grousing about the lost opportunity of using one XML file with self-described-data ... don't mean to flame anyone ... I'm sure the decisions seemed right at the time.] 

> In terms of additional improvements, I think that exposing one content.xml over
> multiple would also make for a big improvement since the user would not
> download a lot of duplicated metadata when contacting repos like indigo. But
> this is the topic of another bug if you think it worth it (I think it does) :).

We, at common repo, do only have one ... one per release (SR0, SR1, SR2). If you mean have one, period, that's changed for SR1 and SR2 ... then ... volunteers welcomed! :)
Comment 29 David Williams CLA 2012-03-01 15:08:25 EST
Changing to help "finish off" this bug as being about documenting and encouraging use by everyone with a p2 repo. 

If any of you on Denis' "hot list" have questions or "special repo shapes" that are not covered by the instructions in 

http://wiki.eclipse.org/Equinox/p2/p2_index 

Then please ask, and will try to help figure out and document special cases.
Comment 30 David Williams CLA 2012-03-01 15:15:30 EST
(In reply to comment #6)
> (In reply to comment #5)

> So after the first search, we hit the right file every time (except for the
> extra lookkup of the p2.index file itself). Maybe we could also remember when a
> repository doesn't have an index file... I'll open a separate bug for that.

To cross reference, there are two related bugs open in p2-land ... 

bug 310546 2010-04-26  Add caching support to the p2.index file reader
bug 302909 2010-02-15  [publisher] Generate the p2.index
Comment 31 David Williams CLA 2012-03-01 15:34:36 EST
I've added a blurb to the IT Infrastructure document at 

http://wiki.eclipse.org/IT_Infrastructure_Doc#Include_a_p2.index_file_at_p2_repository_site.3F

and a little "clarification" clause in Sim. Rel. Document, just to help spread the word. 

I'll send note to cross-project list about this and the p2.mirrorsURL "requirement".
Comment 32 David Williams CLA 2012-03-01 15:40:19 EST
And, I don't say thanks enough ... I know, I must always come across as complaining :) ... but in this case I do want to say a special thanks to Thomas Hallgren. He (or team) added the function to publish these p2.index files in the b3 aggregator -- and they are even commented! -- and without that, I would not have been able to figure out and feel confident about what was needed here. 

Thanks Thomas.
Comment 33 David Williams CLA 2012-03-02 01:22:35 EST
I've updated all the docs I can think of, and sent note to cross-project list, so will close this particular bug as "fixed". 

But, if anyone sees any especially bad problem cases, I suggest bugs be opened for those specific projects. 

If anyone ever encounters oddities or has "how to" questions, feel free to ask on the cross-project list (or p2-dev, if it seems really p2-specific ... but cross-project if it appears to be something that would affect several projects).