Bug 142294 - Update site optimization using Pack200
Summary: Update site optimization using Pack200
Status: RESOLVED FIXED
Alias: None
Product: GEF
Classification: Tools
Component: Misc
Version: 3.2
Hardware: PC Windows XP
Importance: P2 enhancement
Target Milestone: 3.4.0 (Ganymede) M6
Assignee: Anthony Hunter CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-05-17 13:39 EDT by Steven R. Shaw CLA
Modified: 2008-05-26 13:48 EDT

See Also:


Description Steven R. Shaw CLA 2006-05-17 13:39:30 EDT
From David Williams:
We still follow the principle that if projects have optimized their own update sites, then we will make use of that to optimize the Callisto site.
In theory, we would literally (still) just mirror exactly what's on the projects' sites (both jars and .gz files), but due to a limitation with the mirror command, we will re-generate the .gz files if the jar file has been "conditioned" for it. This is important even in your own project builds/sites (even if Callisto were not in the picture), since if a jar is available at all (via zip file, update manager, or pack200 .gz file), it should be the exact same jar file in the end. So it's important that projects learn about "conditioning" jars, use those in their own zips, use those on their own update sites, and use those as input to create their own .gz files on their own update sites. Then, the Callisto process will simply "duplicate" what you've done.

See http://wiki.eclipse.org/index.php/Update_Site_Optimization 
and/or ask questions here, if you have trouble doing this. 

I do plan to create the site digest for the Callisto update site this week,
and I will run the pack200 part of site optimization, assuming I can get Java 5 to run on our PPC machine and don't encounter any other bugs or problems.

The optimize/pack part, though, is specifically written to operate only on jars that have been "conditioned" for it (there's a special file written to the META-INF directory, indicating it's been conditioned). This is to avoid "changing" a jar without the owner intending it to be "changed". It's my understanding that the current "conditioning" step doesn't really change anything, just decompresses the jar if it's compressed ... but technically, there are things that it can theoretically do with some run-length encoding of dates, etc. (but, as far as I know, these are not currently done).
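
As a rough illustration of the marker check described above, here is a minimal Python sketch that tests whether a jar has been conditioned. The marker name `META-INF/eclipse.inf` and the key `pack200.conditioned` are assumptions (the comment only says "a special file"), and `is_conditioned` and `make_jar` are hypothetical helper names, not part of any Eclipse tool.

```python
import io
import zipfile

# Assumption: the marker is META-INF/eclipse.inf with the line
# "pack200.conditioned = true". The comment above only mentions
# "a special file" in the META-INF directory.
MARKER_NAME = "META-INF/eclipse.inf"
MARKER_KEY = "pack200.conditioned"


def is_conditioned(jar_file):
    """Return True if the jar carries the (assumed) conditioning marker."""
    with zipfile.ZipFile(jar_file) as jar:
        try:
            text = jar.read(MARKER_NAME).decode("utf-8", errors="replace")
        except KeyError:
            return False  # no marker file at all
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == MARKER_KEY:
            return value.strip().lower() == "true"
    return False


def make_jar(entries):
    """Build an in-memory jar from a {name: text} mapping (demo helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as jar:
        for name, text in entries.items():
            jar.writestr(name, text)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    plain = make_jar({"a/B.class": "x"})
    marked = make_jar({MARKER_NAME: "pack200.conditioned = true\n"})
    print(is_conditioned(plain), is_conditioned(marked))
```

A pack step could call such a check first and skip any jar that fails it, which matches the stated goal of never changing a jar the owner did not intend to change.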
Comment 1 Steven R. Shaw CLA 2006-06-07 09:12:59 EDT
Some more info from David Williams:

I've computed some numbers and the savings are pretty sizable, 30% currently, and this could be expected to go up to 60% if everyone packed (conditioned) their update site jars. I've put some of these numbers in "info" bug 145685, and
pasted the text below. Feel free to comment there (or here) or add your own numbers!

If it helps anyone transition, I've put part of our WTP build script right in 
our wiki, at 

http://wiki.eclipse.org/index.php/Callisto_Build_and_Update_Tips_and_Tricks 

that documents how we do it in WTP. It may not apply directly, depending on 
how you do your builds, but I've broken it up into 6 "utility steps", and even 
annotated the ant script! -- I'm sure you'll enjoy reading it, even if you don't implement it :) [And, thanks go to Andrew, Richard, Sonia, Kim and Naci for doing all the hard stuff and setting good examples ... I just pulled things into one spot and wrote the comments :) ] 

So, be sure to ask if you run into trouble packing up stuff ... I'm sure there are readers of this list who would be very happy to help.


= = = = = 
https://bugs.eclipse.org/bugs/show_bug.cgi?id=145685 
= = = = = 

This is an "info" bugzilla only, to document the size of Callisto (as of RC4)
and the effects of using Pack200. I will attach an Excel spreadsheet that
contains the "raw data" that led to this summary, in case anyone else wants to
play with them, or do a more detailed analysis.

There are 645 jars "available" for download (some are platform specific, so no
one would ever download them all -- though I'll act like they would for some of
the following summaries). 

The total size of all those 645 jars is 194 MBytes. (So, imagine, that's what
would be downloaded if pack200 were not available and someone needed them all.)

I think only the Eclipse Project and the WTP project have conditioned their
jars (as of RC4) which leads to about half of those jars having packed (.gz) 
versions available. 

The total size of all the packed (.gz) files is 47 MBytes. 
The total size of the jars that correspond to these gz files is 107 MBytes. 

This means that "packing" in general, on average, produces .gz files that are
44% the size of their corresponding jars. 

If instead you take as the-total-to-include all the files that would need to be
downloaded, the packed version is 70% of that total (so, if most users have
'unpack' available, and if most users used update manager, you can thank the
WTP and Eclipse Projects for saving 30% bandwidth in the upcoming release :)
But if everyone did it, the saving would be about double that! Hmmm, what's
that translate into rented bandwidth costs? :) 
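
The percentages above can be reproduced from the three sizes quoted in this summary. The short Python sketch below redoes that arithmetic; the function and variable names are illustrative only, and the "everyone packs" projection assumes the remaining jars would compress at the same average ratio, which is an assumption rather than a measured figure.

```python
# Sizes quoted in the summary above, in MBytes.
ALL_JARS_MB = 194      # all 645 jars, unpacked
PACKED_GZ_MB = 47      # all packed (.gz) files
PACKED_JARS_MB = 107   # the jars that those .gz files correspond to


def pack_ratio():
    """Average size of a .gz file relative to its jar."""
    return PACKED_GZ_MB / PACKED_JARS_MB


def download_mb():
    """Total download when packed versions are used wherever they exist."""
    unpacked_only = ALL_JARS_MB - PACKED_JARS_MB
    return unpacked_only + PACKED_GZ_MB


def download_ratio():
    """Packed-where-available download relative to the all-jars download."""
    return download_mb() / ALL_JARS_MB


def projected_saving_if_all_packed():
    """Assumption: every jar compresses at the current average ratio."""
    return 1 - pack_ratio()


if __name__ == "__main__":
    print(f"pack ratio:       {pack_ratio():.1%}")
    print(f"download total:   {download_mb()} MBytes")
    print(f"download ratio:   {download_ratio():.1%}")
    print(f"projected saving: {projected_saving_if_all_packed():.1%}")
```

The pack ratio comes out at about 44%, and the mixed download is 134 MBytes, or roughly 69% of 194 MBytes, consistent with the rounded 70% (30% saving) quoted above. Under the same-ratio assumption the projected saving is about 56%, which is where "about double" comes from.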

It is very interesting that on a jar by jar basis, the reduced size varied
everywhere from 20% to 90% of the original jar (with the 44% being a
real-life-in-practice average). As would be expected, "doc" plugins did not
compress well, and very small code plugins did not compress well. 

Hope everyone finds these as interesting as I do!
Comment 2 Anthony Hunter CLA 2007-03-21 08:12:33 EDT
Requirements For Participation
Projects that are part of Europa agree to abide by the following requirements. 

3. Projects must optimize their update site using pack200 to reduce bandwidth utilization and provide a better update experience for users. Additionally, they should do site digesting. 
Comment 3 Anthony Hunter CLA 2008-05-26 13:48:47 EDT
Bug 233875 confirms that GEF has completed the Pack200 work in Ganymede. The work was completed by Nick Boldt when we moved to the common modeling build tools.