Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 570221

Summary: Use datatypes suitable to express the requirements of an api
Product: [Eclipse Project] JDT Reporter: Carsten Hammer <carsten.hammer>
Component: UIAssignee: Carsten Hammer <carsten.hammer>
Status: CLOSED WONTFIX QA Contact: Fabrice Tiercelin <fabrice.tiercelin>
Severity: normal    
Priority: P3 CC: fabrice.tiercelin, jjohnstn
Version: 4.19   
Target Milestone: ---   
Hardware: All   
OS: All   
See Also: https://git.eclipse.org/r/c/jdt/eclipse.jdt.ui/+/172988
https://git.eclipse.org/r/c/jdt/eclipse.jdt.ui/+/172260
https://git.eclipse.org/c/jdt/eclipse.jdt.ui.git/commit/?id=71c57cdeedcac3266f60e455d05d89bc1b5be564
https://git.eclipse.org/r/c/jdt/eclipse.jdt.ui/+/174577
https://git.eclipse.org/c/jdt/eclipse.jdt.ui.git/commit/?id=45c8db5b5398dcccef411a6137c6db4028a33119
Whiteboard:

Description Carsten Hammer CLA 2021-01-10 05:46:44 EST
In a number of cases jdt.ui uses a datatype on method parameters or internally in classes that is not aligned to the exact requirements.

What the datatype expresses is e.g.:

array[] - ordered, duplicate objects expected or allowed, no multithreading addition and deletion supported by the datatype

List - ordered, duplicate objects expected or allowed, can support multithreading using specific implementation, unfortunately there is no specific interface definition to express that (design flaw in the architecture of java collections). However you can fallback to use a specific multithreading capable implementation instead of List.

Set - not ordered (can support ordered set by specific implementation like LinkedHashSet), no duplicate objects expected or allowed (hashcode and equals implementation of objects used need to be correct - that is not always the case), can support multithreading using specific implementation, similar problem as List

HashTable - ordered, duplicate objects expected or allowed, multithreading supported (but slower than unsynchronized collection) 

and so on...

Because of using most of the time simple arrays instead of using the right datatypes jdt.ui needs a lot of boilerplate code to take into account the possible duplications and multithreading issues that arise from the fact that it is using a too generic approach datatype wise. Simplifying maintainability is another advantage of using datatypes where specific requirements are an internal implementation detail instead of generic datatypes where you have to consider on every use your requirements. 

Of course the reason for having this is clearly that at the time of the introduction of these code parts there had been no alternatives. Many useful collections and apis have been introduced later.

And the second important aspect is that in a number of situations certain processing on simple arrays still can be the fastest possible way. So it could be that using "the right datatype" in some situations is not worth the performance penalty that you pay for it.

I suggest to check on a case by case base the code to introduce better 
alternatives.
Comment 1 Jeff Johnston CLA 2021-01-10 18:45:42 EST
One other thing to consider is that git history gets replaced and this has bearing in future maintenance.  If the changes correct errors, enable some feature, or have noticeable performance benefits, then by all means they are worth doing.  Otherwise, if they are just a rewrite of code, especially if the code has been stable for some time, I am not sure of the benefit.
Comment 2 Carsten Hammer CLA 2021-01-11 01:44:35 EST
(In reply to Jeff Johnston from comment #1)
> One other thing to consider is that git history gets replaced and this has
> bearing in future maintenance.  If the changes correct errors, enable some
> feature, or have noticeable performance benefits, then by all means they are
> worth doing.  Otherwise, if they are just a rewrite of code, especially if
> the code has been stable for some time, I am not sure of the benefit.

True, would it be OK to check one or both of the attached gerrits for being acceptable under this consideration or not or should I close this buzilla and abandon its gerrits directly?
Comment 4 Jeff Johnston CLA 2021-01-12 21:54:44 EST
(In reply to Carsten Hammer from comment #2)
> (In reply to Jeff Johnston from comment #1)
> > One other thing to consider is that git history gets replaced and this has
> > bearing in future maintenance.  If the changes correct errors, enable some
> > feature, or have noticeable performance benefits, then by all means they are
> > worth doing.  Otherwise, if they are just a rewrite of code, especially if
> > the code has been stable for some time, I am not sure of the benefit.
> 
> True, would it be OK to check one or both of the attached gerrits for being
> acceptable under this consideration or not or should I close this buzilla
> and abandon its gerrits directly?

Regarding the first gerrit, I would recommend dropping it as one change is quite large and others are on tests which unless they create noticeable performance gains, probably don't justify the loss of annotation history.

Regarding the gerrit that was just merged below which only had a small amount of chnages: can you guarantee that the fSettings will never be accessed via multiple threads and therefore the code should use the original Hashtable?  If not, then it should likely be reverted, otherwise, I am ok as it touches few lines.
Comment 5 Carsten Hammer CLA 2021-01-13 00:38:39 EST
(In reply to Jeff Johnston from comment #4)
> (In reply to Carsten Hammer from comment #2)
> > (In reply to Jeff Johnston from comment #1)
> > > One other thing to consider is that git history gets replaced and this has
> > > bearing in future maintenance.  If the changes correct errors, enable some
> > > feature, or have noticeable performance benefits, then by all means they are
> > > worth doing.  Otherwise, if they are just a rewrite of code, especially if
> > > the code has been stable for some time, I am not sure of the benefit.
> > 
> > True, would it be OK to check one or both of the attached gerrits for being
> > acceptable under this consideration or not or should I close this buzilla
> > and abandon its gerrits directly?
> 
> Regarding the first gerrit, I would recommend dropping it as one change is
> quite large and others are on tests which unless they create noticeable
> performance gains, probably don't justify the loss of annotation history.
> 
> Regarding the gerrit that was just merged below which only had a small
> amount of chnages: can you guarantee that the fSettings will never be
> accessed via multiple threads and therefore the code should use the original
> Hashtable?  If not, then it should likely be reverted, otherwise, I am ok as
> it touches few lines.

Hashtable is not only a question of using a synchronized collection. This type as is is deprecated and there are replacements that offer synchronization for a reason. We can change to a synchronized collection without making use of Hashtable.

Ok, I thought that the design changes and its benefits for new possibilities regarding parallel stream processing, type safety, interface design and future maintenance are so obvious that it outweighs arguments to „never change a running system“. That’s obviously not the case. Then let‘s revert and after that close as „wontfix“..

Maybe we can pick this up at a later point in time.
Comment 6 Eclipse Genie CLA 2021-01-13 00:51:07 EST
New Gerrit change created: https://git.eclipse.org/r/c/jdt/eclipse.jdt.ui/+/174577
Comment 8 Jeff Johnston CLA 2021-01-13 15:50:04 EST
(In reply to Carsten Hammer from comment #5)
> (In reply to Jeff Johnston from comment #4)
> > (In reply to Carsten Hammer from comment #2)
> > > (In reply to Jeff Johnston from comment #1)
> > > > One other thing to consider is that git history gets replaced and this has
> > > > bearing in future maintenance.  If the changes correct errors, enable some
> > > > feature, or have noticeable performance benefits, then by all means they are
> > > > worth doing.  Otherwise, if they are just a rewrite of code, especially if
> > > > the code has been stable for some time, I am not sure of the benefit.
> > > 
> > > True, would it be OK to check one or both of the attached gerrits for being
> > > acceptable under this consideration or not or should I close this buzilla
> > > and abandon its gerrits directly?
> > 
> > Regarding the first gerrit, I would recommend dropping it as one change is
> > quite large and others are on tests which unless they create noticeable
> > performance gains, probably don't justify the loss of annotation history.
> > 
> > Regarding the gerrit that was just merged below which only had a small
> > amount of chnages: can you guarantee that the fSettings will never be
> > accessed via multiple threads and therefore the code should use the original
> > Hashtable?  If not, then it should likely be reverted, otherwise, I am ok as
> > it touches few lines.
> 
> Hashtable is not only a question of using a synchronized collection. This
> type as is is deprecated and there are replacements that offer
> synchronization for a reason. We can change to a synchronized collection
> without making use of Hashtable.
> 
> Ok, I thought that the design changes and its benefits for new possibilities
> regarding parallel stream processing, type safety, interface design and
> future maintenance are so obvious that it outweighs arguments to „never
> change a running system“. That’s obviously not the case. Then let‘s revert
> and after that close as „wontfix“..
> 
> Maybe we can pick this up at a later point in time.

Thanks Carsten.