Bug 255685 - Mirroring to use existing repository as a baseline
Summary: Mirroring to use existing repository as a baseline
Status: RESOLVED FIXED
Alias: None
Product: Equinox
Classification: Eclipse Project
Component: p2
Version: unspecified
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: 3.5 M4
Assignee: Andrew Cattle CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-11-18 12:40 EST by Andrew Niefer CLA
Modified: 2008-12-05 17:08 EST (History)

See Also:


Attachments
Adds baseline mirroring functionality (12.04 KB, patch)
2008-12-01 08:21 EST, Andrew Cattle CLA
dj.houghton: iplog+

Description Andrew Niefer CLA 2008-11-18 12:40:33 EST
See also bug 255678.

Consider Repository A containing a bundle org.foo_1.0.0.v1 from build #1

Repository B is created by build #2, with a new org.foo_1.0.0.v1, which may be different from build #1's version.

We want to publish Repository B for the world to consume, but keep it separate from Repository A.  Mirroring could support creating a public B' using repo A as a baseline.  Artifacts are mirrored from B to B', with the exception that any artifact already existing in repository A is taken from there instead of from B.

This should also support the same comparator as in 255678 and raise warnings/errors.
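The selection rule described above can be sketched roughly as follows. This is only an illustration in Python, not the p2 implementation: a repository is modeled as a plain dictionary from (id, version) to artifact bytes, and `mirror_with_baseline` and `same_content` are hypothetical names standing in for the mirroring application and the comparator from bug 255678.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical model: a repository maps an artifact key (id, version) to its bytes.
ArtifactKey = Tuple[str, str]
Repository = Dict[ArtifactKey, bytes]


def mirror_with_baseline(
    source: Repository,
    baseline: Repository,
    same_content: Callable[[bytes, bytes], bool],
) -> Tuple[Repository, List[str]]:
    """Mirror `source` into a new repository, preferring artifacts from `baseline`."""
    destination: Repository = {}
    warnings: List[str] = []
    for key, source_bytes in source.items():
        if key in baseline:
            # The baseline copy wins, so every public copy of the artifact is identical.
            baseline_bytes = baseline[key]
            destination[key] = baseline_bytes
            # The comparator only reports differences; it does not change the choice.
            if not same_content(baseline_bytes, source_bytes):
                warnings.append(f"{key[0]}_{key[1]}: source differs from baseline")
        else:
            destination[key] = source_bytes
    return destination, warnings


# Repository A (build #1) and Repository B (build #2) from the description.
repo_a = {("org.foo", "1.0.0.v1"): b"build-1"}
repo_b = {("org.foo", "1.0.0.v1"): b"build-2", ("org.bar", "2.0.0.v1"): b"new"}

repo_b_prime, warnings = mirror_with_baseline(repo_b, repo_a, lambda x, y: x == y)
print(repo_b_prime)
print(warnings)
```

Here B' receives org.foo from A and org.bar from B, and the byte-for-byte comparator passed in as a placeholder reports one warning for org.foo.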
Comment 1 DJ Houghton CLA 2008-11-19 16:46:16 EST
We talked a bit about this and here are a few use-cases which I am interested in seeing covered:

- I have 5 repositories that I have already built and I want to create a new composite repository which has these 5 as sub-repos.
- I am a product developer and I want to create a new composite repository which is a collection of 5 remote (e.g. http) repositories.

In these cases I believe it would be OK to fail the "add child/create repo/whatever" operation if we found two JARs with the same id/version but different contents.
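A minimal sketch of that failure rule, assuming each child repository can be modeled as a dictionary from (id, version) to artifact bytes. The names `find_conflicts` and `create_composite` are hypothetical and are not p2 API.

```python
from typing import Dict, List, Tuple

ArtifactKey = Tuple[str, str]
Repository = Dict[ArtifactKey, bytes]


def find_conflicts(children: List[Repository]) -> List[ArtifactKey]:
    """Return keys that appear in more than one child with different contents."""
    first_seen: Dict[ArtifactKey, bytes] = {}
    conflicts: List[ArtifactKey] = []
    for child in children:
        for key, content in child.items():
            if key not in first_seen:
                first_seen[key] = content
            elif first_seen[key] != content and key not in conflicts:
                conflicts.append(key)
    return conflicts


def create_composite(children: List[Repository]) -> List[Repository]:
    """Fail the whole operation if any id/version pair has differing contents."""
    conflicts = find_conflicts(children)
    if conflicts:
        names = ", ".join(f"{name}_{version}" for name, version in conflicts)
        raise ValueError(f"conflicting artifacts: {names}")
    return list(children)


child_1 = {("org.foo", "1.0.0.v1"): b"aaa"}
child_2 = {("org.foo", "1.0.0.v1"): b"aaa", ("org.bar", "1.0.0.v1"): b"bbb"}
child_3 = {("org.bar", "1.0.0.v1"): b"ccc"}

composite = create_composite([child_1, child_2])   # identical duplicates are fine
conflicting = find_conflicts([child_1, child_2, child_3])
```

Identical duplicates across children are accepted; only a same-key, different-content pair causes the operation to fail.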

Comment 2 Jeff McAffer CLA 2008-11-25 09:32:10 EST
Just to clarify the use case in comment 0: did builds 1 and 2 really create the same artifact with the same version number yet different content?  That seems like a problem in itself.
Comment 3 Kim Moir CLA 2008-11-25 09:50:41 EST
Jeff, the packing process changes timestamps within the jar, so the MD5 checksums of the two jars with the same qualifier will be different.  Similarly, when we switch to a new compiler, this may change the binary content of the same bundle that was built the day before with an older compiler.  These issues have existed forever, but were only exposed once we started checking MD5 checksums at runtime.
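The timestamp effect is easy to reproduce outside the build. The following Python sketch (not part of the Eclipse build; the entry name and bytes are made up) writes the same entry into two archives that differ only in the entry's timestamp and compares the MD5 of the resulting files.

```python
import hashlib
import io
import zipfile


def build_jar(timestamp):
    """Build an in-memory jar with one entry stamped with `timestamp`."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as jar:
        info = zipfile.ZipInfo("org/foo/Foo.class", date_time=timestamp)
        jar.writestr(info, b"\xca\xfe\xba\xbe identical bytecode")
    return buffer.getvalue()


# Same entry content, timestamps one day apart.
jar_build_1 = build_jar((2008, 11, 24, 12, 0, 0))
jar_build_2 = build_jar((2008, 11, 25, 12, 0, 0))

md5_build_1 = hashlib.md5(jar_build_1).hexdigest()
md5_build_2 = hashlib.md5(jar_build_2).hexdigest()

print(md5_build_1)
print(md5_build_2)
print("checksums match:", md5_build_1 == md5_build_2)
```

The checksums do not match, because the timestamp is stored in the archive headers even though the entry bytes are identical.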
Comment 4 Andrew Niefer CLA 2008-11-25 10:28:39 EST
(In reply to comment #2)
> just to clarify the usecase in comment 0.  did build 1 and 2 really create the
> same artifact with the same version number yet the content is different?  That
> seems like a problem in itself.
> 

Yes, the MD5 will be different.  This may or may not be a problem; telling which requires actually comparing the contents of the jars.

The problem is essentially that we version our source, but the resulting binaries depend on other things, like the compiler and which other binary bundles the source was built against.  A recompile of the same source a week later against different prerequisites may result in binary changes that warrant a version increment.

This bug is about ensuring that all public copies of that jar are the same.  Bug 255678 is a step towards comparing the old and new jars to see if the changes are significant enough to warrant a retag and rebuild.
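As a rough illustration of what "comparing the contents of the jar" could mean, the sketch below compares two archives entry by entry and ignores archive metadata such as timestamps. It is an assumption about one possible comparison, not the comparator from bug 255678; `jar_entries` and `same_jar_content` are hypothetical names.

```python
import io
import zipfile


def build_jar(timestamp, payload):
    """Build an in-memory jar with a single entry (test helper for this sketch)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as jar:
        info = zipfile.ZipInfo("org/foo/Foo.class", date_time=timestamp)
        jar.writestr(info, payload)
    return buffer.getvalue()


def jar_entries(jar_bytes):
    """Map each entry name to its uncompressed bytes, dropping all metadata."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        return {name: jar.read(name) for name in jar.namelist()}


def same_jar_content(left, right):
    """True when both jars hold the same entries with the same bytes."""
    return jar_entries(left) == jar_entries(right)


old_jar = build_jar((2008, 11, 24, 12, 0, 0), b"bytecode v1")
repacked_jar = build_jar((2008, 11, 25, 12, 0, 0), b"bytecode v1")
recompiled_jar = build_jar((2008, 11, 25, 12, 0, 0), b"bytecode v2")

print(same_jar_content(old_jar, repacked_jar))
print(same_jar_content(old_jar, recompiled_jar))
```

A repack that only touches timestamps compares as equal, while a recompile that changes the entry bytes does not. Deciding whether such a change is significant enough for a retag is a separate question.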
Comment 5 Jeff McAffer CLA 2008-11-25 23:10:52 EST
This is a bit of a bummer.  One of the common MD5 use cases is for the hash to be stored at a trusted location.  Then you can get the content from anywhere and, if the hash matches, you are happy.  Having a repo supply the hash for its own content really only helps with download integrity (a good thing but not stellar).  Signing of the bundles really establishes their correctness and is resilient to things like timestamp changes etc.

So, I wonder if the hassle involved here with the MD5 hashing is, in the end, worthwhile.  If the CONTENT really is different (as in the bytecodes are different) then sure we should detect that and cause version numbers to be incremented.  Leaving the MD5 stuff to fail when people happen to recompile will result in people using -Declipse.p2.MD5Check=false.  
Comment 6 Andrew Cattle CLA 2008-11-27 09:04:09 EST
I have a basic version working but I still need to write some tests.

Also it relies on code I just submitted as part of my patch for Bug 255678
Comment 7 Andrew Cattle CLA 2008-12-01 08:21:16 EST
Created attachment 119138
Adds baseline mirroring functionality

The baseline is set using the "-compareAgainst <baseline location>" argument.

When mirroring, the application checks whether the current descriptor in the source is also in the baseline. If it is, the copy from the baseline is mirrored and a compare is performed on the artifacts.

Returns a multistatus containing both the status resulting from the compare and the status resulting from the transfer.
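The shape of that result can be sketched as below. This is a Python stand-in loosely modeled on the Eclipse status idea of a parent status whose severity is the worst of its children; the classes and the `mirror_artifact` function are hypothetical and do not reproduce the patch.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List


class Severity(IntEnum):
    OK = 0
    INFO = 1
    WARNING = 2
    ERROR = 4


@dataclass
class Status:
    severity: Severity
    message: str


@dataclass
class MultiStatus:
    message: str
    children: List[Status] = field(default_factory=list)

    def add(self, status: Status) -> None:
        self.children.append(status)

    @property
    def severity(self) -> Severity:
        # The combined severity is the worst severity among the children.
        return max((c.severity for c in self.children), default=Severity.OK)


def mirror_artifact(contents_match: bool, transfer_succeeded: bool) -> MultiStatus:
    """Combine the compare result and the transfer result for one artifact."""
    result = MultiStatus("mirror artifact")
    if contents_match:
        result.add(Status(Severity.OK, "baseline and source match"))
    else:
        result.add(Status(Severity.WARNING, "baseline and source differ"))
    if transfer_succeeded:
        result.add(Status(Severity.OK, "transfer complete"))
    else:
        result.add(Status(Severity.ERROR, "transfer failed"))
    return result


differing = mirror_artifact(contents_match=False, transfer_succeeded=True)
print(differing.severity.name, [c.message for c in differing.children])
```

Keeping both child statuses lets a caller distinguish "mirrored fine, but the source differs from the baseline" from a failed transfer.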

I've also included a test case.
Comment 8 DJ Houghton CLA 2008-12-05 16:42:26 EST
Thanks Andrew. 
Released with minor modifications.
Closing.
Comment 9 DJ Houghton CLA 2008-12-05 17:08:38 EST
Created bug 257783 to address the scenario in comment #1.