Bug 349822 - [metadata][repository] p2/Indigo update site still contains multiple redundant copies of EPL
Status: RESOLVED WONTFIX
Alias: None
Product: Equinox
Classification: Eclipse Project
Component: p2
Version: 3.7
Hardware: PC All
Importance: P3 normal
Target Milestone: ---
Assignee: P2 Inbox CLA
QA Contact:
URL: http://download.eclipse.org/eclipse/u...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-06-20 09:45 EDT by Alex Blewitt CLA
Modified: 2016-11-17 21:39 EST (History)
10 users (show)

See Also:


Attachments
First cut at a solution (7.31 KB, text/plain)
2012-03-11 23:53 EDT, Ian Bull CLA

Description Alex Blewitt CLA 2011-06-20 09:45:43 EDT
The content.jar contains 105 copies of the Eclipse public license verbatim. I thought that problem was supposed to be fixed?

http://download.eclipse.org/eclipse/updates/3.7-I-builds/I20110613-1736/content.jar
Comment 1 Alex Blewitt CLA 2011-06-20 09:52:19 EDT
See also bug 306818 which claimed to have fixed this problem for 3.7M4
Comment 2 Alex Blewitt CLA 2011-06-20 09:59:35 EDT
  <unit id="org.eclipse.sdk.tests.feature.group" >
    <properties size="9">
    <property name="df_LT.license" value="Eclipse Foundation Software User Agreement February 1, 2011 ..."/>
    ...
    </properties>
  </unit>
  <unit id="org.eclipse.releng.tools.feature.jar" >
    <properties size="8">
    <property name="df_LT.license" value="Eclipse Foundation Software User Agreement February 1, 2011 ..."/>
    ...
    </properties>
  </unit>
  <unit id="org.eclipse.pde.junit.runtime.addon.feature.group">
    <properties size="10">
    <property name="df_LT.license" value="Eclipse Foundation Software User Agreement February 1, 2011 ..."/>
    ...
    </properties>
  </unit>
Comment 3 Alex Blewitt CLA 2011-06-20 12:32:48 EDT
See also bug 325378, which was closed as a dupe of bug 306818 (which is in fact marked as fixed, but has never been promoted to closed).
Comment 4 Kim Moir CLA 2011-06-20 14:50:17 EDT
Yes, this is expected behaviour.  I just talked to Dean Roberts, the p2 committer who implemented this feature.

http://wiki.eclipse.org/Equinox/p2/License_Mechanism

The changes to how we include licenses were implemented to improve the situation for those authoring features, in that we can now point to a shared license feature.  They don't improve the situation for the metadata; there is still a duplication of text.  The reasoning is that content.jars are zipped and, licenses being text, they achieve a very high compression ratio. Also, in memory, p2 uses string tables. Multiple identical licenses should not grow the memory footprint beyond that needed to store string table references.

In short, yes it would be nice to only have one license text in the metadata, but they felt that addressing the authoring situation was the better issue to solve given limited resources.

In my blog, I didn't mention anything about less text in the metadata, but rather how to implement it as feature maintainer.

http://relengofthenerds.blogspot.com/2011/01/implementing-shared-licenses-with-37m5.html

Hope this helps :-)
Comment 5 Kim Moir CLA 2011-06-20 16:12:13 EDT
Closing since p2/pde build doesn't currently support this so I can't implement it.

And as Dean mentioned in the wiki article, the benefits are limited.
Comment 6 John Arthorne CLA 2011-06-20 16:53:27 EDT
There were quite significant improvements to p2 repository memory performance in 3.7. For a summary, see bug 314118 comment 8. However as Kim pointed out, the inclusion of license information with each unit does not impact disk performance, memory usage, or network performance. It has a small theoretical effect on read/write speed due to zip compression but the compression cost is dwarfed by the I/O cost. So, in short changing the file format to avoid the duplication is a premature optimization that wasn't worth doing.
Comment 7 Ian Bull CLA 2011-06-20 17:12:59 EDT
I just ran a few small tests:

Indigo content.jar (compressed) 2.2M

I unzipped it, and removed all the license text, and re-compressed: 1.5M. It is possible that I botched the sed script (which removed the license), but I don't think so.  Also, I just used linux zip, so maybe the jar compression is slightly different.

I do think this type of improvement is interesting, but I don't know if it's worth breaking existing clients for.  Remember, p2 needs to be forward compatible as well as backwards compatible.

Also, most people will only have to download the content.jar once. We have pretty good cache management in p2.
Comment 8 Alex Blewitt CLA 2011-06-20 17:31:38 EDT
(In reply to comment #6)
> However as Kim pointed out,
> the inclusion of license information with each unit does not impact disk
> performance, memory usage, or network performance. It has a small theoretical
> effect on read/write speed due to zip compression but the compression cost is
> dwarfed by the I/O cost. So, in short changing the file format to avoid the
> duplication is a premature optimization that wasn't worth doing.

Really? You've done tests on this, or you're just asserting that through belief?

I removed all bar one copy of the EPL from the content.xml and added it to the content.jar:

-rw-r--r--  1 alex  wheel   229K 20 Jun 22:28 content.jar
-rw-r--r--  1 alex  wheel   354K 20 Jun 22:29 content.jar.orig

That makes a significant difference for checking for updates from a file. Even if caching has been improved, this will still affect the first time load of the file, not to mention the mirrors that host it.
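
For anyone wanting to reproduce this kind of measurement without a real repository, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the pseudo-random "license" and per-unit filler text, the unit ids, and the unit count are synthetic, so the printed numbers say nothing about the real content.jar. It only shows the mechanism: jar files use deflate, whose back-references are limited to 258 bytes in length and 32 KiB in distance, so repeated copies of a long string still cost something after compression.

```python
import random
import zlib


def make_text(seed, n_words):
    """Deterministic pseudo-text; a stand-in for real license or metadata text."""
    rng = random.Random(seed)
    letters = "abcdefghijklmnopqrstuvwxyz"
    return " ".join(
        "".join(rng.choice(letters) for _ in range(rng.randint(2, 9)))
        for _ in range(n_words)
    )


def build_metadata(n_units, license_text, dedupe):
    """Build a toy content.xml with the license inline or behind one entity."""
    parts = []
    if dedupe:
        parts.append(f'<!DOCTYPE repository [<!ENTITY epl "{license_text}">]>\n')
    parts.append("<repository>\n")
    for i in range(n_units):
        value = "&epl;" if dedupe else license_text
        filler = make_text(1000 + i, 300)  # unit-specific metadata stand-in
        parts.append(
            f'<unit id="example.unit.{i}">'
            f'<property name="description" value="{filler}"/>'
            f'<property name="df_LT.license" value="{value}"/></unit>\n'
        )
    parts.append("</repository>\n")
    return "".join(parts).encode("utf-8")


license_text = make_text(0, 1500)  # roughly 10 KB of placeholder text
inline = build_metadata(100, license_text, dedupe=False)
shared = build_metadata(100, license_text, dedupe=True)

# zlib uses the same deflate algorithm as zip/jar entries.
inline_z = len(zlib.compress(inline, 9))
shared_z = len(zlib.compress(shared, 9))
print(f"raw:        {len(inline):>9} -> {len(shared):>9} bytes")
print(f"compressed: {inline_z:>9} -> {shared_z:>9} bytes")
```

Running the same comparison against an unpacked copy of the real content.xml would be the meaningful test; the synthetic data here only makes the sketch self-contained.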
Comment 9 Alex Blewitt CLA 2011-06-20 17:48:13 EDT
This is how it is possible to condense the text into one copy, without having to negatively affect downstream clients:


<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE repository [
<!ELEMENT repository ANY>
<!ENTITY epl "Eclipse Foundation Software User Agreement&#xA;February 1, 2011&#xA;&#xA;Usage Of Content&#xA;&#xA;THE ECLIPSE FOUNDATION MAKES AVAILABLE SOFTWARE, DOCUMENTATION, INFORMATION AND/OR&#xA;OTHER MATERIALS FOR OPEN SOURCE PROJECTS (COLLECTIVELY &quot;CONTENT&quot;).&#xA;USE OF THE CONTENT IS GOVERNED BY THE TERMS AND CONDITIONS OF THIS&#xA;AGREEMENT AND/OR THE TERMS AND CONDITIONS OF LICENSE AGREEMENTS OR&#xA;NOTICES INDICATED OR REFERENCED BELOW.  BY USING THE CONTENT, YOU&#xA;AGREE THAT YOUR USE OF THE CONTENT IS GOVERNED BY THIS AGREEMENT&#xA;AND/OR THE TERMS AND CONDITIONS OF ANY APPLICABLE LICENSE AGREEMENTS&#xA;OR NOTICES INDICATED OR REFERENCED BELOW.  IF YOU DO NOT AGREE TO THE&#xA;TERMS AND CONDITIONS OF THIS AGREEMENT AND THE TERMS AND CONDITIONS&#xA;OF ANY APPLICABLE LICENSE AGREEMENTS OR NOTICES INDICATED OR REFERENCED&#xA;BELOW, THEN YOU MAY NOT USE THE CONTENT.&#xA;&#xA;Applicable Licenses&#xA;&#xA;Unless otherwise indicated, all Content made available by the&#xA;Eclipse Foundation is provided to you under the terms and conditions of&#xA;the Eclipse Public License Version 1.0 (&quot;EPL&quot;). A copy of the EPL is&#xA;provided with this Content and is also available at http://www.eclipse.org/legal/epl-v10.html.&#xA;For purposes of the EPL, &quot;Program&quot; will mean the Content.&#xA;&#xA;Content includes, but is not limited to, source code, object code,&#xA;documentation and other files maintained in the Eclipse Foundation source code&#xA;repository (&quot;Repository&quot;) in software modules (&quot;Modules&quot;) and made available&#xA;as downloadable archives (&quot;Downloads&quot;).&#xA;&#xA;- Content may be structured and packaged into modules to facilitate delivering,&#xA;extending, and upgrading the Content. 
Typical modules may include plug-ins (&quot;Plug-ins&quot;),&#xA;plug-in fragments (&quot;Fragments&quot;), and features (&quot;Features&quot;).&#xA;- Each Plug-in or Fragment may be packaged as a sub-directory or JAR (Java(TM) ARchive)&#xA;in a directory named &quot;plugins&quot;.&#xA;- A Feature is a bundle of one or more Plug-ins and/or Fragments and associated material.&#xA;Each Feature may be packaged as a sub-directory in a directory named &quot;features&quot;.&#xA;Within a Feature, files named &quot;feature.xml&quot; may contain a list of the names and version&#xA;numbers of the Plug-ins and/or Fragments associated with that Feature.&#xA;- Features may also include other Features (&quot;Included Features&quot;). Within a Feature, files&#xA;named &quot;feature.xml&quot; may contain a list of the names and version numbers of Included Features.&#xA;&#xA;The terms and conditions governing Plug-ins and Fragments should be&#xA;contained in files named &quot;about.html&quot; (&quot;Abouts&quot;). The terms and&#xA;conditions governing Features and Included Features should be contained&#xA;in files named &quot;license.html&quot; (&quot;Feature Licenses&quot;). Abouts and Feature&#xA;Licenses may be located in any directory of a Download or Module&#xA;including, but not limited to the following locations:&#xA;&#xA;- The top-level (root) directory&#xA;- Plug-in and Fragment directories&#xA;- Inside Plug-ins and Fragments packaged as JARs&#xA;- Sub-directories of the directory named &quot;src&quot; of certain Plug-ins&#xA;- Feature directories&#xA;&#xA;Note: if a Feature made available by the Eclipse Foundation is installed using the&#xA;Provisioning Technology (as defined below), you must agree to a license (&quot;Feature &#xA;Update License&quot;) during the installation process. 
If the Feature contains&#xA;Included Features, the Feature Update License should either provide you&#xA;with the terms and conditions governing the Included Features or inform&#xA;you where you can locate them. Feature Update Licenses may be found in&#xA;the &quot;license&quot; property of files named &quot;feature.properties&quot; found within a Feature.&#xA;Such Abouts, Feature Licenses, and Feature Update Licenses contain the&#xA;terms and conditions (or references to such terms and conditions) that&#xA;govern your use of the associated Content in that directory.&#xA;&#xA;THE ABOUTS, FEATURE LICENSES, AND FEATURE UPDATE LICENSES MAY REFER&#xA;TO THE EPL OR OTHER LICENSE AGREEMENTS, NOTICES OR TERMS AND CONDITIONS.&#xA;SOME OF THESE OTHER LICENSE AGREEMENTS MAY INCLUDE (BUT ARE NOT LIMITED TO):&#xA;&#xA;- Eclipse Distribution License Version 1.0 (available at http://www.eclipse.org/licenses/edl-v1.0.html)&#xA;- Common Public License Version 1.0 (available at http://www.eclipse.org/legal/cpl-v10.html)&#xA;- Apache Software License 1.1 (available at http://www.apache.org/licenses/LICENSE)&#xA;- Apache Software License 2.0 (available at http://www.apache.org/licenses/LICENSE-2.0)&#xA;- Metro Link Public License 1.00 (available at http://www.opengroup.org/openmotif/supporters/metrolink/license.html)&#xA;- Mozilla Public License Version 1.1 (available at http://www.mozilla.org/MPL/MPL-1.1.html)&#xA;&#xA;IT IS YOUR OBLIGATION TO READ AND ACCEPT ALL SUCH TERMS AND CONDITIONS PRIOR&#xA;TO USE OF THE CONTENT. 
If no About, Feature License, or Feature Update License&#xA;is provided, please contact the Eclipse Foundation to determine what terms and conditions&#xA;govern that particular Content.&#xA;&#xA;&#xA;Use of Provisioning Technology&#xA;&#xA;The Eclipse Foundation makes available provisioning software, examples of which include,&#xA;but are not limited to, p2 and the Eclipse Update Manager (&quot;Provisioning Technology&quot;) for&#xA;the purpose of allowing users to install software, documentation, information and/or&#xA;other materials (collectively &quot;Installable Software&quot;). This capability is provided with&#xA;the intent of allowing such users to install, extend and update Eclipse-based products.&#xA;Information about packaging Installable Software is available at&#xA;http://eclipse.org/equinox/p2/repository_packaging.html (&quot;Specification&quot;).&#xA;&#xA;You may use Provisioning Technology to allow other parties to install Installable Software.&#xA;You shall be responsible for enabling the applicable license agreements relating to the&#xA;Installable Software to be presented to, and accepted by, the users of the Provisioning Technology&#xA;in accordance with the Specification. By using Provisioning Technology in such a manner and&#xA;making it available in accordance with the Specification, you further acknowledge your&#xA;agreement to, and the acquisition of all necessary rights to permit the following:&#xA;&#xA;1. A series of actions may occur (&quot;Provisioning Process&quot;) in which a user may execute&#xA;the Provisioning Technology on a machine (&quot;Target Machine&quot;) with the intent of installing,&#xA;extending or updating the functionality of an Eclipse-based product.&#xA;2. During the Provisioning Process, the Provisioning Technology may cause third party&#xA;Installable Software or a portion thereof to be accessed and copied to the Target Machine.&#xA;3. 
Pursuant to the Specification, you will provide to the user the terms and conditions that&#xA;govern the use of the Installable Software (&quot;Installable Software Agreement&quot;) and such&#xA;Installable Software Agreement shall be accessed from the Target Machine in accordance&#xA;with the Specification. Such Installable Software Agreement must inform the user of the&#xA;terms and conditions that govern the Installable Software and must solicit acceptance by&#xA;the end user in the manner prescribed in such Installable Software Agreement. Upon such&#xA;indication of agreement by the user, the provisioning Technology will complete installation&#xA;of the Installable Software.&#xA;&#xA;Cryptography&#xA;&#xA;Content may contain encryption software. The country in which you are&#xA;currently may have restrictions on the import, possession, and use,&#xA;and/or re-export to another country, of encryption software. BEFORE&#xA;using any encryption software, please check the country&apos;s laws,&#xA;regulations and policies concerning the import, possession, or use, and&#xA;re-export of encryption software, to see if this is permitted.&#xA;&#xA;Java and all Java-based trademarks are trademarks of Oracle Corporation in the United States, other countries, or both.">
]>

...

        <property name='df_LT.license' value='&epl;'/>

        <property name='df_LT.license' value='&epl;'/>

        <property name='df_LT.license' value='&epl;'/>

        <property name='df_LT.license' value='&epl;'/>

        <property name='df_LT.license' value='&epl;'/>


As long as the XML parser is a proper XML parser, it will define &epl; to be the content of the text defined at the top (and note, I'm defining 'repository' to be ANY to avoid having to define the content and/or kick off validation). The parser will then substitute the value of the entity with the content of the license.

Having just tested this on a local instance, it appears to work, but someone else might want to verify. The size savings of doing this are 362064 -> 234876 bytes, a saving of about 35% (a little over 1/3).

If you strip out the unnecessary whitespace as well, it goes down to 229291 bytes (and that's without stripping newlines as well).
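
A quick way to check the entity behaviour outside Eclipse is sketched below in Python. This is not the parser p2 uses; it only illustrates that a conforming XML parser (here the expat-based `xml.etree.ElementTree` from the standard library) hands the application the expanded text. The entity body and the unit ids are short placeholders, not the real license or real IUs.

```python
import xml.etree.ElementTree as ET

# One entity declared in the internal DTD subset, referenced from three units.
DOC = """<!DOCTYPE repository [
<!ELEMENT repository ANY>
<!ENTITY epl "Example User Agreement (&quot;placeholder&quot;), not the real license text.">
]>
<repository>
  <unit id='example.feature.a'><property name='df_LT.license' value='&epl;'/></unit>
  <unit id='example.feature.b'><property name='df_LT.license' value='&epl;'/></unit>
  <unit id='example.feature.c'><property name='df_LT.license' value='&epl;'/></unit>
</repository>
"""

root = ET.fromstring(DOC)

# By the time the application sees the attribute, the reference is already expanded.
licenses = {
    unit.get("id"): unit.find("property").get("value")
    for unit in root.iter("unit")
}
for unit_id, text in licenses.items():
    print(unit_id, "->", text)
```

The placeholder deliberately contains no line breaks. How whitespace inside an entity survives attribute-value normalization is a separate question that this sketch does not exercise.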
Comment 10 Alex Blewitt CLA 2011-06-22 06:44:16 EDT
(In reply to comment #5)
> Closing since p2/pde build doesn't currently support this so I can't implement
> it.
> 
> And as Dean mentioned in the wiki article, the benefits are limited.

This can be implemented post Buckminster aggregation, it hasn't got anything to do with PDE (see above).

Also, the benefits are not limited - you can save a significant chunk of this file, which is requested by millions of clients. I'm sure the webmaster would be interested in reducing the bandwidth requirements, even if each individual client then caches it for further use.

This issue is about the generated P2 file, not about PDE. It's a file that can be cleaned up post generation.
Comment 11 Alex Blewitt CLA 2011-06-22 12:17:47 EDT
The data below was for the I build.

Using a P2 mirror for the content.jar of Indigo, there are 1711 copies of the EPL in a 2.5Mb Jar file. Reducing to just one nets a saving down to 1.7Mb, so still a noticeable difference.
Comment 12 Alex Blewitt CLA 2011-06-22 12:53:20 EDT
NB you can verify this behaviour by applying the same change to a local copy of content.jar from the Eclipse platform build, and verifying that the licenses are still shown correctly (which they are in 3.6 and 3.7), when running the update manager inside Eclipse.
Comment 13 Denis Roy CLA 2011-06-23 13:56:21 EDT
> Reducing to just one nets a saving down to 1.7Mb, so
> still a noticeable difference.

Thanks for opening this.

These are the most-fetched files from download.eclipse.org. Download count, file, size.  Cutting each of these by 1/3 would be significant.

26672;/mylyn/drops/3.6.0/v20110608-1400/content.jar: 71478 bytes
18392;/mylyn/drops/3.6.0/v20110608-1400/artifacts.jar: 12170 bytes
15977;/releases/indigo/201106220900/content.jar: 2286915 bytes
13625;/technology/epp/packages/indigo/R/content.jar: 87194 bytes
12775;/eclipse/updates/3.7/R-3.7-201106131736/artifacts.jar: 53966 bytes
12743;/eclipse/updates/3.7/R-3.7-201106131736/content.jar: 362064 bytes
11621;/releases/helios/201102250900/content.jar: 2167853 bytes
11556;/releases/indigo/201106220900/aggregate/artifacts.jar: 318526 bytes
10690;/technology/epp/packages/helios/SR2/content.jar: 101691 bytes
10422;/favicon.ico: 10134 bytes
10379;/releases/helios/201006230900/content.jar: 2085384 bytes
10282;/releases/helios/201009240900/content.jar: 2117274 bytes
9518;/technology/epp/packages/helios/SR1/content.jar: 72498 bytes
9441;/technology/epp/packages/helios/R/content.jar: 77246 bytes
9144;/eclipse/updates/3.6/R-3.6.2-201102101200/artifacts.jar: 50206 bytes
8957;/releases/helios/201102250900/aggregate/artifacts.jar: 250875 bytes
8284;/eclipse/updates/3.6/R-3.6.1-201009090800/artifacts.jar: 48871 bytes
8176;/eclipse/updates/3.6/R-3.6-201006080911/artifacts.jar: 48438 bytes
8085;/releases/helios/201009240900/aggregate/artifacts.jar: 251430 bytes
7992;/releases/helios/201006230900/aggregate/artifacts.jar: 258726 bytes
7804;/eclipse/updates/3.6/R-3.6.2-201102101200/content.jar: 382297 bytes
6846;/eclipse/updates/3.6/R-3.6-201006080911/content.jar: 352376 bytes
6815;/eclipse/updates/3.6/R-3.6.1-201009090800/content.jar: 366489 bytes
6234;/eclipse/updates/3.5/R-3.5-200906111540/artifacts.jar: 32462 bytes
6225;/eclipse/updates/3.5/R-3.5.2-201002111343/artifacts.jar: 33301 bytes
6218;/eclipse/updates/3.5/R-3.5.1-200909170800/artifacts.jar: 32734 bytes
5954;/releases/galileo/200909241140/aggregate/artifacts.jar: 103928 bytes
5944;/releases/galileo/201002260900/aggregate/artifacts.jar: 105468 bytes
4929;/releases/ganymede/content.jar: 1129071 bytes
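
To put rough numbers on that, the Python sketch below multiplies download count by file size for three of the content.jar rows above. The 1/3 reduction is the figure quoted in this comment, applied uniformly as an assumption; the period the download counts cover is not stated here, so the totals are per that unknown period.

```python
# (download count, path, size in bytes), copied from the list above
rows = [
    (15977, "/releases/indigo/201106220900/content.jar", 2286915),
    (12743, "/eclipse/updates/3.7/R-3.7-201106131736/content.jar", 362064),
    (11621, "/releases/helios/201102250900/content.jar", 2167853),
]
SAVING = 1 / 3  # assumed uniform reduction, taken from the discussion

total = sum(count * size for count, _, size in rows)
saved = total * SAVING

for count, path, size in rows:
    print(f"{path}: {count * size / 1e9:.2f} GB served")
print(f"total {total / 1e9:.2f} GB, about {saved / 1e9:.2f} GB less at a 1/3 reduction")
```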
Comment 14 Ian Bull CLA 2011-06-23 15:15:36 EDT
Someone's ears must be burning... Pascal and I just had a conversation and this topic came up.

I think we can all agree that making changes to these files at this point is not really a good idea. The current version has had years of testing, and making a change the day of a release is likely a recipe for disaster.

While a post-processing step may work, p2 has made the 'explicit' decision that an Installable Unit (IU) is immutable, and I can only imagine the blog posts (from the very vocal members of the Eclipse community) if we purposely added a post-processing step that manually mutated a structure that we designed to be 'immutable'.

I'm sure the bug report would read something like: "Eclipse cannot even follow their own best practices".

Having said all that, I think this is something we should address.  However, this should likely happen as part of the repository writing.  Since this is not a releng bug, I'm going to move this to the p2 bucket.  The bigger question is how can we detect that a variable substitution can be used, and who can / is willing to do this work.
Comment 15 Pascal Rapicault CLA 2011-06-23 18:42:30 EDT
Before we go down this path, the other thing we should revise is whether a stable URL provided by the OSI or SPDX would suffice to cover for the legal needs (see bug #349822).
Comment 16 Alex Blewitt CLA 2011-06-23 19:29:23 EDT
Note also that the solution mentioned previously is not mutating the IU. The entity is replaced as part of the XML parsing and so will have a similar effect to e.g. sending the file in UTF-8 or UTF-16 - there should be no difference as far as the data processed by P2 is concerned. There are other things P2 could do better in the future, sure - but this is a drop-in fix for now and can be applied to the Indigo repository without any clients needing to be changed.
Comment 17 Ian Bull CLA 2011-06-24 11:46:56 EDT
(In reply to comment #16)
> Note also the solution mentioned previously is not mutating the IU. The entity
> is replaced as part of the XML parsing and so will have a similar effect to eg
> sending the file in UTF8 or UTF16 - there should be no difference as far as the
> data processed by P2 is concerned. There are other things P2 could do better
> in the future, sure - but this is a drop in fix for now and can be applied to
> the Indigo repository without any clients needing to be changed.

Assuming everyone reading the XML is using a proper XML parser that will perform the variable substitution correctly.  While I would like to think that's the case, if software development has taught me anything, it's to expect the unexpected.
Comment 18 Ian Bull CLA 2011-06-24 13:54:38 EDT
And just to be very clear Alex, I'm not disagreeing with the problem, nor your solution.  

I'm worried about the risk of making this type of change at this stage.  Up to now the entire contents of an IU are contained within the <iu> </iu> tags, and someone may be depending on that. This was never considered API -- and those depending on this behaviour should be slapped -- but I don't know if we should break them.  Also, as I mentioned, I don't know if all XML parsers will handle the variable substitution properly.

Now, if you are adamant that we should make this change to our existing repositories then this issue should be raised with David Williams, the cross-product mailing list and the AC.  I would personally vote -1 for this as I don't think the risks are worth it, but since I'm not on the AC my vote is not relevant.
Comment 19 Kim Moir CLA 2011-06-24 14:17:02 EDT
I would also vote -1 to this change.  Today, the repository is stable and people are happily downloading Indigo without issue.  I think it is too risky to make this sort of change at this point.  

As was stated earlier in the bug, the authoring support for the licenses was changed in 3.7, but the actual representation of the license in the metadata was not.  Why? Committer resources are limited.  This item was not seen as high enough priority to defer other plan items and bugs we had to address for Indigo. In an ideal world we would have unlimited resources but the reality is that we do not.  In an ideal world, we would have legions of people happy to submit patches and rigorously test that this all works perfectly for current and previous releases.  The reality is that this isn't usually the case.
Comment 20 Alex Blewitt CLA 2011-06-24 15:12:14 EDT
(In reply to comment #18)
> I'm worried about the risk of making this type of change at this stage.  Up to
> now the entire contents of an IU are contained within the <iu> </iu> tags

The content of the IU tags hasn't been changed. When parsed as an XML file, entity substitution is transparent.

> Also, as I mentioned, I don't know if all XML parsers will handle
> the variable substitution properly.

They aren't variables, they are XML entities, like &nbsp; and &rarr; in HTML. It's a fundamental part of how XML works.

http://www.w3.org/TR/xml/#sec-internal-ent

This provably happens at the moment, since the legal text has &#xA; for its carriage returns. If an XML parser wasn't treating entities appropriately, then they would not be handling the very body of text it is designed to replace.

> Now, if you are adamant that we should make this change to our existing
> repositories then this issue should be raised with David Williams, the
> cross-product mailing list and the AC.  I would personally vote -1 for this as
> I don't think the risks are worth it, but since I'm not on the AC my vote is
> not relevant.

(In reply to comment #19)
> I would also vote -1 to this change.  Today, the repository is stable and
> people are happily downloading Indigo without issue.  I think it is too risky
> to make this sort of change at this point.  
> 
> As was stated earlier in the bug...

I raised bug 325378 which was closed as a duplicate, seemingly incorrectly in this case. The issue that that (and this) bug raised wasn't "Make it easier to ensure PDE build includes the same copy of text in every bundle" but rather "Don't duplicate the license in the content file". Whether or not additional features were or weren't implemented is not relevant to either this bug or bug 325378.

> In an ideal world we would have unlimited resources but the reality is
> that we do not.  In an ideal world, we would have legions of people happy to
> submit patches and rigorously test that this all works perfectly for current
> and previous releases.  The reality is that this isn't usually the case.

True, and we all have to accept that. My point is not that we didn't improve things, but that there's still more work to do. And there's no real reason why various dry runs couldn't be done against a clone of the Indigo bits to test this out, with a phased approach if necessary.

However, unlike other bugs, the ramifications of this one have a direct financial impact on the Eclipse Foundation through the bandwidth costs of serving this file. Even reformatting the XML file to avoid unnecessary white space (instead of the pretty-printed view as it is at the moment) would help reduce the size.

Finally, regardless of whether the indigo p2 bits are left as is or improved in some way, the problem still remains and the bug should be left open to track a solution for the future.

Denis, should this issue be escalated to the cross-project list, or should we leave the Indigo bits as is and look forward for a resolution for (say) SR1 or Juno? It would seem to affect you and your budget most of all.
Comment 21 Alex Blewitt CLA 2011-09-25 17:48:38 EDT
The bug is still present in the 3.7.1 update sites.
Comment 22 Ian Bull CLA 2012-03-11 23:53:20 EDT
Created attachment 212442 [details]
First cut at a solution

This is a pretty rough cut, but it's a start.  I've focused on two solutions here:

1. Removing the indentation
2. Pulling out strings and creating variables.  

Because I don't want to change the XML Writer too much, and I want to maintain a single pass, this pulls out all strings longer than 100 chars, and creates a variable for them.  Ideally, we'd only pull out strings if they are duplicated in the file (but that will require a multi-pass writer).  

Here are a few numbers on this approach.

All IUs in Indigo (SR0, SR1, SR2) == 5.5M
Collapsing strings (longer than 100 chars) == 3.7M
Remove Indentation (and collapse strings) == 3.2M

That's a pretty good saving (from 5.5M to 3.2M).  I will continue to push forward with this as I think this approach shows promise.

Alex, thanks for your suggestions, especially the one about using XML entities.  This ensures we maintain backwards compatibility while enjoying the space savings.
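
The attached patch is not reproduced in this report, so the following is only a hypothetical Python sketch of the idea described above, not the p2 writer. It makes one pass over a toy set of units, moves every string longer than 100 characters into an entity (reusing the entity when the same string appears again), and buffers the body so the DOCTYPE can be written first. The names `write_repository`, `entity_literal` and the `s0`, `s1`, ... entity names are invented for illustration. Line breaks are written as `&#38;#10;` because, as I read the attribute-value normalization rules in the XML spec, a literal line feed in an entity's replacement text would otherwise come back as a space.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

THRESHOLD = 100  # from the comment: strings longer than 100 chars are pulled out

# Escapes for text placed inside <!ENTITY name "...">. Whitespace uses a
# double-escaped character reference so it survives attribute normalization.
_ENTITY_ESCAPES = {
    "&": "&amp;", "<": "&lt;", '"': "&quot;", "%": "&#37;",
    "\n": "&#38;#10;", "\r": "&#38;#13;", "\t": "&#38;#9;",
}


def entity_literal(value):
    """Escape a string so it round-trips when referenced from an attribute."""
    return "".join(_ENTITY_ESCAPES.get(ch, ch) for ch in value)


def write_repository(units):
    """units maps a unit id to its license text (a toy model of content.xml)."""
    entities = {}  # long string -> entity name
    body = []
    for unit_id, text in units.items():
        if len(text) > THRESHOLD:
            name = entities.setdefault(text, f"s{len(entities)}")
            value = f'"&{name};"'
        else:
            value = quoteattr(text)
        body.append(
            f"<unit id={quoteattr(unit_id)}>"
            f'<property name="df_LT.license" value={value}/></unit>'
        )
    decls = "".join(
        f'<!ENTITY {name} "{entity_literal(text)}">\n'
        for text, name in entities.items()
    )
    return (
        f"<!DOCTYPE repository [\n{decls}]>\n<repository>\n"
        + "\n".join(body)
        + "\n</repository>\n"
    )


# Usage: two units share one long placeholder string, one has a short string.
long_text = 'Line one of a "long" license & notice <placeholder> 100% sample.\n' * 3
units = {"a": long_text, "b": long_text, "c": "short"}
xml_text = write_repository(units)
parsed = {
    u.get("id"): u.find("property").get("value")
    for u in ET.fromstring(xml_text).iter("unit")
}
print(parsed == units)
```

Buffering the body is the simplest way to keep a single pass over the units; a streaming writer would need either a second pass or a threshold rule like the one described above to decide what to declare up front.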
Comment 23 Pascal Rapicault CLA 2016-11-17 21:39:10 EST
This has been fixed for newer releases of Eclipse.