Bug 175714 - Investigate if version suffixes can be shortened
Summary: Investigate if version suffixes can be shortened
Status: RESOLVED WONTFIX
Alias: None
Product: PDE
Classification: Eclipse Project
Component: Build
Version: 3.3
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: ---
Assignee: pde-build-inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Duplicates: 132073 171482 298116
Depends on:
Blocks:
 
Reported: 2007-02-27 10:54 EST by Andrew Niefer CLA
Modified: 2018-12-03 09:12 EST
CC List: 10 users

See Also:


Attachments
proposed patch (11.67 KB, patch)
2007-04-18 15:55 EDT, Andrew Niefer CLA
updated (12.18 KB, patch)
2007-04-26 17:21 EDT, Andrew Niefer CLA

Description Andrew Niefer CLA 2007-02-27 10:54:58 EST
The generated feature version suffixes are getting too long; the tests have started exceeding the path length limit on Windows:
[mkdir] Created dir: C:\buildtest\N20070227-0010\eclipse-testing\test-eclipse
     [exec] checkdir warning:  path too long; truncating
     [exec]                    eclipse/plugins/org.eclipse.platform.source_3.3.0.N20070227-001-9b9BEgJO3Y5Usn7Qnp12mEgnqgJJz-I/src/org.eclipse.core.runtime.compatibility.registry_3.2.100.N20070227-0010/runtime_registry_compatibilitysrc.zip
     [exec]                 -> C:/buildtest/N20070227-0010/eclipse-testing/test-eclipse/eclipse/plugins/org.eclipse.platform.source_3.3.0.N20070227-001-9b9BEgJO3Y5Usn7Qnp12mEgnqgJJz-I/src/org.eclipse.core.runtime.compatibility.registry_3.2.100.N20070227-0010/runtime_registry_compatibilitys

org.eclipse.rcp	     001-8h8RDoFW9kz0tNfmpfIIyH
org.eclipse.platform 001-9b9BEgJO3Y5Usn7Qnp12mEgnqgJJz-I
org.eclipse.sdk	     001-7L7J-1cIlbQhSNE_7DAIHP8z0ufvtEupmHYfPsY97lohorhKKz00J

We should consider reintroducing default values for generatedVersionLength or significantVersionDigits if we can't think of another way to shorten these suffixes.
Comment 1 David Olsen CLA 2007-02-27 19:01:46 EST
I could argue that the right fix is to shorten the length of qualifiers and to specify generatedVersionLength where necessary.  But I don't think that will fly.

I knew that umbrella features (features that include other features) would have longer qualifier suffixes.  But I didn't think the effect would be this great.

So I'm fine changing the default.  I think generatedVersionLength should have a default rather than significantVersionDigits.  Pick a value that you think is best (as large as possible while minimizing the chance of path length overflow).  It should be a one word change in the code, so Andrew can make the actual change.
Comment 2 Andrew Niefer CLA 2007-03-01 12:18:20 EST
I have introduced a default value of 28 for generatedVersionLength.
The rcp suffix was 26 characters, which suggests that many features containing only plugins should fit within 28.  The platform suffix was 35 characters and we ended up 6 characters over the path limit, so this path should now (barely) be short enough.

I'm keeping this bug open to investigate whether or not we can do anything to the algorithm itself to shorten the qualifiers without losing information.
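
The lengths quoted in this comment can be checked against the suffixes listed in the description. The sketch below assumes that the limit simply caps the suffix at 28 characters; the constant and helper names are illustrative and are not taken from PDE Build.

```python
# Suffixes copied from the bug description.
suffixes = {
    "org.eclipse.rcp": "001-8h8RDoFW9kz0tNfmpfIIyH",
    "org.eclipse.platform": "001-9b9BEgJO3Y5Usn7Qnp12mEgnqgJJz-I",
}

DEFAULT_LENGTH = 28  # default value introduced in this comment
OVERRUN = 6          # characters by which the platform path was too long


def chars_saved(suffix, limit=DEFAULT_LENGTH):
    """Characters removed if the suffix is capped at `limit` (assumed behaviour)."""
    return max(0, len(suffix) - limit)


lengths = {name: len(value) for name, value in suffixes.items()}
platform_saved = chars_saved(suffixes["org.eclipse.platform"])
slack = platform_saved - OVERRUN  # characters to spare after capping

print(lengths, platform_saved, slack)
```

Under that assumption the platform path ends up with a single character to spare, which matches the "(barely)" above.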
Comment 3 Kim Moir CLA 2007-03-28 10:39:00 EDT
It seems the length of 28 is still too long; it caused problems in N20070328-0010.  Can the PDE Build team provide any guidance on this problem?  Can we set a shorter length without losing information?
Comment 4 John Arthorne CLA 2007-03-28 11:02:05 EDT
I think you can go *much* shorter on these suffixes.  If each character is in the range a-z, A-Z, 0-9, that's over 60 character possibilities per position.  With only six characters you have a range of 56 billion possible feature suffixes, which should be plenty. With 28 characters I think you're in the "number of grains of sand on earth" range.
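
The size of the suffix space can be computed exactly. This is only a sketch of the arithmetic, assuming an alphabet of exactly a-z, A-Z and 0-9 (62 characters); the variable names are illustrative.

```python
# Alphabet assumed for the estimate: a-z, A-Z, 0-9.
ALPHABET_SIZE = 26 + 26 + 10

six_char_suffixes = ALPHABET_SIZE ** 6
digits_for_28 = len(str(ALPHABET_SIZE ** 28))  # decimal digits in 62**28

print(ALPHABET_SIZE)       # 62
print(six_char_suffixes)   # 56800235584, i.e. about 56.8 billion
print(digits_for_28)       # 51
```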
Comment 5 Andrew Niefer CLA 2007-03-28 11:41:40 EDT
60^28 possibilities should be more than enough for anyone.
The problem is that in general there are an infinite number of qualifier possibilities, and we are considering a build in isolation without reference to earlier versions.  

I think what we need is a way for the builder to provide us with more information that we could use to shorten things up.
For example, if we knew that qualifiers were always of the form 
c<8 digit date stamp>xyz123 and that no date was earlier than say 2000, then there would be some savings that could be made there.
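
To illustrate the kind of saving described here, the sketch below packs an 8-digit date stamp into three characters by counting days since 2000-01-01. The 64-character alphabet, the epoch, the fixed width and the function names are all assumptions made for the example; nothing here is taken from PDE Build.

```python
from datetime import date, timedelta

# Assumed alphabet, listed in ASCII order so that fixed-width codes
# compare (as strings) in the same order as the dates they encode.
ALPHABET = "-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz"
EPOCH = date(2000, 1, 1)  # assumed: no date stamp is earlier than this
WIDTH = 3                 # 64**3 days covers roughly 700 years


def encode_date(stamp):
    """Encode a 'yyyymmdd' string as WIDTH characters."""
    day = date(int(stamp[:4]), int(stamp[4:6]), int(stamp[6:8]))
    n = (day - EPOCH).days
    if not 0 <= n < len(ALPHABET) ** WIDTH:
        raise ValueError("date outside the encodable range")
    chars = []
    for _ in range(WIDTH):
        n, remainder = divmod(n, len(ALPHABET))
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def decode_date(code):
    """Inverse of encode_date."""
    n = 0
    for ch in code:
        n = n * len(ALPHABET) + ALPHABET.index(ch)
    return (EPOCH + timedelta(days=n)).strftime("%Y%m%d")


code = encode_date("20070227")
print(len(code), decode_date(code))
```

The saving here is five characters per date stamp, and it only works because the builder has promised something about the form of the qualifier.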

Comment 6 Markus Knauer CLA 2007-03-28 12:01:58 EDT
I don't know if this is the right place to discuss the general length, but from time to time we have had user reports about ZIP files which could not be extracted by the default Windows ZIP or other unzip tools. (I am not talking about I or N builds; this happens with milestone and release builds.)

When we asked these users, in most cases the problem was the install directory of Eclipse. You probably know what happens if a user with a very long user name extracts the Eclipse ZIP into their 'desktop' folder: the unzip fails because of some very lengthy pathnames, usually from documentation plug-ins. From my point of view: keep these qualifiers as short as possible.
Comment 7 John Arthorne CLA 2007-03-28 12:11:45 EDT
Just for fun (and my math could be a bit rusty): 60^28 = (10^1.78)^28 ≈ 10^49.8 ≈ 6 × 10^49.  According to the link below, this is slightly less than the number of atoms on planet Earth (8.87 × 10^49)  :)

http://pages.prodigy.net/jhonig/bignum/qaearth.html
Comment 8 Kim Moir CLA 2007-03-28 15:55:22 EDT
Regarding comment #5, the qualifier for our builds will always be
cyyyymmdd-hhmm (N builds) or the CVS tag (I builds), plus the generated part.
The CVS tag depends on the convention of the contributing team.

I don't think we can make any assumptions about the qualifiers used by other teams.

Comment 9 Jeff McAffer CLA 2007-03-29 12:25:09 EDT
It's not the number of possible values, it is the distribution that counts.  There are infinitely many seconds, but if I truncate time to HHMM then quite a lot of seconds will be counted as the same.  This feels like a classic hashing-related problem.
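
The truncation example can be made concrete. The sketch below covers a single day only and uses an illustrative helper name; it shows how many distinct seconds collapse onto each HHMM value.

```python
from collections import Counter


def truncate_to_minute(second_of_day):
    """Format a second of the day (0..86399) as HHMM, dropping the seconds."""
    hours, remainder = divmod(second_of_day, 3600)
    minutes = remainder // 60
    return f"{hours:02d}{minutes:02d}"


# Count how many seconds of one day land on each truncated value.
buckets = Counter(truncate_to_minute(s) for s in range(24 * 60 * 60))

print(len(buckets))           # 1440 distinct HHMM values
print(set(buckets.values()))  # {60}: every value is shared by 60 seconds
```

Shortening a qualifier is the same kind of many-to-one mapping, so what matters is whether two qualifiers that actually occur can end up on the same shortened value.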
Comment 10 DJ Houghton CLA 2007-03-29 14:28:32 EDT
*** Bug 171482 has been marked as a duplicate of this bug. ***
Comment 11 Andrew Niefer CLA 2007-03-29 18:19:14 EDT
Jeff is right that it is about the distribution; this is what I was getting at in comment #5: we need more information to narrow the scope. 

I think the best chance of doing something right now is around the nesting of features.  Nested features are resulting in longer suffixes than we originally expected, and there should be savings to be had there.  (The idea would be to recursively visit all included features and look at their plugins instead of just taking the feature's qualifier.)

As well, the inclusion of the major.minor.service versions is responsible for around 5 characters.  Perhaps there is a better way to do this, or at least an option to turn it off if the feature maintainer is already incrementing the feature's version in response to such changes.
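
The recursive visit mentioned above could look roughly like the following. This is a sketch under stated assumptions: the Feature class, its fields, the feature and plugin names, and the version strings are all hypothetical and do not correspond to PDE Build's own data model.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """Hypothetical stand-in for a feature: its own plugins plus nested features."""
    name: str
    plugins: dict = field(default_factory=dict)   # plugin id -> version string
    includes: list = field(default_factory=list)  # nested Feature objects


def gather_plugins(feature, seen=None):
    """Collect plugins from a feature and, recursively, from everything it includes."""
    seen = set() if seen is None else seen
    if feature.name in seen:  # a feature reachable twice is visited once
        return {}
    seen.add(feature.name)
    result = dict(feature.plugins)
    for child in feature.includes:
        result.update(gather_plugins(child, seen))
    return result


# Made-up data: sdk includes platform and rcp, platform includes rcp.
rcp = Feature("feature.rcp", {"plugin.a": "1.0.0.q1"})
platform = Feature("feature.platform", {"plugin.b": "1.0.0.q2"}, [rcp])
sdk = Feature("feature.sdk", {"plugin.c": "1.0.0.q3"}, [platform, rcp])

all_plugins = gather_plugins(sdk)
print(sorted(all_plugins))
```

With the plugins flattened like this, the suffix would be computed from plugin qualifiers only, rather than from suffixes that already encode other suffixes.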
Comment 12 Pascal Rapicault CLA 2007-03-30 09:25:23 EDT
*** Bug 132073 has been marked as a duplicate of this bug. ***
Comment 13 Andrew Niefer CLA 2007-04-18 15:55:12 EDT
Created attachment 64225 [details]
proposed patch

The attached patch makes the proposed change to recurse and gather plugins instead of using the feature suffix.  This patch also partially addresses bug 162022 by including the context portion of feature qualifiers.  It also fixes a previously unnoticed problem where plugins that weren't included in a config that was being built were not considered for the suffix.
The changes to the suffixes are as follows:
org.eclipse.rcp
       old: 001-8h8RDoFW9kz0tNfmpfIIyH
       new: 8-7rDo FPl2C-Kc5o7F6FH
org.eclipse.platform
       old: 001-9b9BEgJO3Y5Usn7Qnp12mEgnqgJJz-I
       new: _n_9Elxz6IITqlYr6CvSz0H3
org.eclipse.sdk
       old: 001-7L7J-1cIlbQhSNE_7DAIHP8z0ufvtEupmHYfPsY97lohorhKKz00J
       new: zlIP5TG7EGMTkYdL1g_KN4fjYBKQY1z0Aa1EXDz0i

Note that although this patch does not do it, a further 6-7 characters could be removed by an option to not include the major.minor.service portions of version numbers in the calculation.

We will run a test build before I release these changes.
Comment 14 Jeff McAffer CLA 2007-04-18 23:03:34 EDT
I have some concerns about the scalability of this approach since the length of the suffix seems to grow with the transitive number of children.  That means that some feature sitting on top of a few thousand bundles may have a really long suffix. 

More generally it seems that the approach being used only really works if all the constituents progress.  Since basically we are adding base 65 numbers, adding a bundle and removing a bundle could cancel each other out resulting in a lower suffix.  In a conversation with Andrew he pointed out that the only real way of addressing this is to know about the previous versions in the stream.

In short, unless we know that we are getting significant and scalable reduction in the suffix length we should exercise caution in putting this in and risking other interesting behaviour.
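
The cancellation concern can be shown with a deliberately simplified model. The sketch below reduces each qualifier to a plain integer and takes the "suffix" to be their sum; the real algorithm is more involved, and the bundle names and values are invented for the example.

```python
def toy_suffix(qualifiers):
    """Toy model only: the suffix is the sum of the per-bundle qualifier values."""
    return sum(qualifiers.values())


before = {"bundle.a": 100, "bundle.b": 200, "bundle.c": 50}

# Next build: bundle.c is removed and bundle.d with the same value is added.
swapped = {"bundle.a": 100, "bundle.b": 200, "bundle.d": 50}

# Another build: bundle.a moves forward, but bundle.c is replaced by a
# bundle with a smaller value.
moved_on = {"bundle.a": 101, "bundle.b": 200, "bundle.d": 10}

suffix_before = toy_suffix(before)
suffix_swapped = toy_suffix(swapped)
suffix_moved_on = toy_suffix(moved_on)

print(suffix_before, suffix_swapped, suffix_moved_on)  # 350 350 311
```

In this model the swapped build is indistinguishable from the original, and the later build gets a lower value than the earlier one, which is the behaviour that only knowledge of previous versions in the stream could rule out.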
Comment 15 Andrew Niefer CLA 2007-04-19 12:04:04 EDT
I ran this on a large product where there were 2718 features and plugins considered in the transitive closure of versions.

Gd9EaoM_ZlEYw-i3F0WQlcWms2rTEMG6bqSrC4z-Jt92bLI0  (product - 2718 items)
BuB3F-28kIP5TG7EGMTkYdL1ruKJt5OM92TA7R62VsjS      (sdk - 141 items)
_n_9Elxz6IITqlYr6CvSz0H3                          (platform - 92 items)
8-7rDo7FPl2C-Kc5o7F6FH                            (rcp - 25 items)
7k7cE9y7Cq6ba7T8iB_NMTK                           (jdt - 17 items)
7N7M-6NUEF6EZNl6CyDC                              (pde - 8 items)
7C79_71CI99g_LAQ                                  (cvs - 5 items)

This was run with a small change to the algorithm that is not yet included in the patch, as well as with a change to PDE Build for bug 183207.

I think the algorithm may be logarithmic with the number of items, but somewhat linear with the actual qualifiers of the items involved.  The features nested in the large product were all using 3.2 style suffixes so there was not the same kind of nesting of suffixes that the SDK had.
Comment 16 Andrew Niefer CLA 2007-04-26 17:21:07 EDT
Created attachment 65116 [details]
updated
Comment 17 Andrew Niefer CLA 2007-04-30 17:38:02 EDT
Comment on attachment 65116 [details]
updated

We decided not to release the patch attached here.  It would actually make the case of removing plugins from a feature (bug 162022) worse.
Instead, I fixed an encoding problem that led to repeating 'z's in the version number.  This also fixed a particular case that made it possible for the suffix to drop.  I also fixed the problem with unresolved plugins not being accounted for.

We fixed bug 176947 so that it is easier to set the significantVersionDigits and generatedVersionLength.
Comment 18 Andrew Niefer CLA 2007-05-01 08:53:58 EDT
Not doing anything more for 3.3
Comment 19 Pascal Rapicault CLA 2009-12-21 12:18:39 EST
*** Bug 298116 has been marked as a duplicate of this bug. ***
Comment 20 Lars Vogel CLA 2018-12-03 09:11:06 EST
Currently we are not actively enhancing PDE build anymore. Therefore, I close this bug as WONTFIX. 

Please reopen, if you plan to provide a fix.
Comment 21 Lars Vogel CLA 2018-12-03 09:12:57 EST
Currently we are not actively enhancing PDE build anymore. Therefore, I close this bug as WONTFIX. 

Please reopen, if you plan to provide a fix.