| Summary: | Reduce memory footprint of Composite Repositories | ||
|---|---|---|---|
| Product: | [Eclipse Project] Equinox | Reporter: | Dean Roberts <dean.t.roberts> |
| Component: | p2 | Assignee: | Dean Roberts <dean.t.roberts> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | normal | ||
| Priority: | P3 | CC: | dj.houghton, irbull, jeffmcaffer, john.arthorne, pascal, thomas |
| Version: | unspecified | Keywords: | performance |
| Target Milestone: | 3.7 M5 | ||
| Hardware: | All | ||
| OS: | All | ||
| Whiteboard: | |||
| Bug Depends on: | 324873, 329385, 329386, 330463, 331762 | ||
| Bug Blocks: | 333894 | ||
|
Description
Dean Roberts
There is the babel repo: www.eclipse.org/babel/downloads.php
DJ, John and I had a discussion about the compatibility impact of the proposed changes to address this defect.
There will be three patches associated with this change.
Patch 1) Change code such that manifest data is not persisted to repositories
during a build and increment the repository version so that
unpatched products don't fail during install with a missing manifest
error.
Patch 2) Modify install code such that manifest data is not required during
install and increase the required version range so that patched
products can read the new repository version.
Patch 3) Modify code so that manifest data contained in repositories created
by unpatched products is not persisted in memory.
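The interplay between the repository version bump (Patch 1) and the widened version range (Patch 2) can be sketched as follows. This is only an illustration: the class names and the version numbers 1.1 and 1.2 are assumptions made for the example, not the actual p2 API or the values used in the patches.

```java
// Hypothetical sketch: class names and version numbers are illustrative, not the p2 API.
final class RepoVersionSketch {

    /** Minimal major.minor version; enough to show the ordering. */
    static final class Ver implements Comparable<Ver> {
        final int major, minor;
        Ver(int major, int minor) { this.major = major; this.minor = minor; }
        @Override public int compareTo(Ver o) {
            return major != o.major ? Integer.compare(major, o.major)
                                    : Integer.compare(minor, o.minor);
        }
    }

    /** Half-open range [min, max) of repository versions a reader tolerates. */
    static final class Range {
        final Ver min, max;
        Range(Ver min, Ver max) { this.min = min; this.max = max; }
        boolean includes(Ver v) { return v.compareTo(min) >= 0 && v.compareTo(max) < 0; }
    }

    // Assumed numbers: repositories written before Patch 1 are 1.1, after it 1.2.
    static final Ver OLD_REPO = new Ver(1, 1);
    static final Ver NEW_REPO = new Ver(1, 2);
    // An unpatched reader stops short of 1.2; Patch 2 widens the upper bound.
    static final Range UNPATCHED_READER = new Range(new Ver(1, 0), new Ver(1, 2));
    static final Range PATCHED_READER = new Range(new Ver(1, 0), new Ver(2, 0));

    public static void main(String[] args) {
        System.out.println("unpatched reads new repo: " + UNPATCHED_READER.includes(NEW_REPO));
        System.out.println("patched reads new repo:   " + PATCHED_READER.includes(NEW_REPO));
        System.out.println("patched reads old repo:   " + PATCHED_READER.includes(OLD_REPO));
    }
}
```

Under these assumptions an unpatched product rejects the new repository at the version check, instead of failing later during install on the missing manifest, while a patched product accepts both old and new repositories.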
During the discussion there was an argument made for omitting patch 3. Without patch 3 the memory savings will only be enjoyed by new Eclipse versions reading repositories created by new Eclipse versions.
The primary motivation for omitting patch 3 was a concern over repositories containing IUs with the same ID and version but different physical representations on disk. This happens, for example, when a patched Eclipse mirrors a repository with manifest data in it. The repository is read but the manifest data is ignored. When the in-memory IU is persisted, it is persisted without the manifest data.
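A minimal sketch of the mirroring scenario, assuming touchpoint data can be modelled as a simple key/value map and that the manifest is stored under a key named "manifest". Both are assumptions made for illustration; the class and method names are hypothetical and not part of p2.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: models touchpoint data as a plain map; not the p2 API.
final class ManifestDropSketch {
    static final String MANIFEST_KEY = "manifest"; // assumed key name

    /** Copies persisted touchpoint data into memory, skipping the manifest entry. */
    static Map<String, String> readIgnoringManifest(Map<String, String> persisted) {
        Map<String, String> inMemory = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : persisted.entrySet()) {
            if (!MANIFEST_KEY.equals(e.getKey())) {
                inMemory.put(e.getKey(), e.getValue());
            }
        }
        return inMemory;
    }

    public static void main(String[] args) {
        // Touchpoint data as written by an unpatched build (with manifest).
        Map<String, String> onDisk = new LinkedHashMap<>();
        onDisk.put("zipped", "true");
        onDisk.put(MANIFEST_KEY, "Bundle-SymbolicName: example.bundle");

        // A patched mirror reads it, then persists what it holds in memory.
        Map<String, String> mirrored = readIgnoringManifest(onDisk);

        System.out.println("same bytes on disk:   " + onDisk.equals(mirrored));
        System.out.println("same in-memory model: "
                + readIgnoringManifest(onDisk).equals(readIgnoringManifest(mirrored)));
    }
}
```

The original and the mirrored copy differ on disk, yet both load to the same in-memory model, which is the trade-off under discussion.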
Personally I believe we should include patch 3 but welcome input from the community. My arguments for inclusion are:
1) The change is minor (1 line)
2) What is important is the model representation of the IU. With the patches
in place the manifest data is ignored. Thus the IU is conceptually
identical regardless of whether it was read from bytes that contained or
did not contain manifest information.
3) I believe a new Eclipse reading older repositories will not be a rare
occurrence and the memory savings should be as widely available as possible.
Assuming we proceed, the following staging is proposed.
1) Release patches 1, 2 and 3 to 3.7 HEAD
2) Back port patch 2 to older streams. 3.6 for sure ... how far back do
we want to go? Suggestions please.
Patches 1 and 3 could never be released to older Eclipse streams since we would not want to end up in a situation where repositories built by a 3.6.x Eclipse would be unable to update a 3.6.(x - n) product. Presumably particular customers could take these patches if they were required and the ramifications understood.
Once I add repository version number changes and checking and get some feedback from this comment I will attach the three patches discussed.
To be clear, the changes that Dean proposes in Patch 3 would only benefit 3.7 clients who are reading pre-3.7 repositories.
Pros: Allows 3.7 clients to read older repositories. Could be important to products built on Eclipse.
Cons: Questions about bundle uniqueness. When doing mirroring, etc., you can end up in a state with 2 IUs with the same id and version but different touchpoint data. Is this invalid?
Comments from people about their use in delivering repositories would be helpful in deciding whether or not to release this part of the code.
Pasted Comment #2 and Comment #3 into correct defect: https://bugs.eclipse.org/bugs/show_bug.cgi?id=329386
Posted the following question to the p2-dev mailing list about the use of uncompressed metadata repositories and the impact of license text on their size.
=====
Hi folks,
I am trying to get a feel for how widely used uncompressed metadata repositories are.
A typical content.xml file contains many copies of identical license text. Memory use is not an issue since the implementation uses a StringPool to intern string references.
Compressed repositories do not present an issue for disk footprint or network traffic as the content.xml compresses extremely well, typically 95% or more.
However, uncompressed metadata repositories may pose a significant concern here if they are widely used.
So does anybody have an opinion on how widely used uncompressed repositories are?
Thanks
I think this can be closed now since all dependent bug reports have been closed. If there are remaining outstanding issues then please re-open or open a new report. |