Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 335854

Summary: Performance: ServiceRegistry: array copy
Product: [Eclipse Project] Equinox Reporter: Radoslav Ivanov <radoslav.ivanov>
Component: FrameworkAssignee: equinox.framework-inbox <equinox.framework-inbox>
Status: RESOLVED INVALID QA Contact:
Severity: minor    
Priority: P3 CC: l.kirchev
Version: unspecified   
Target Milestone: ---   
Hardware: PC   
OS: Windows 7   
Whiteboard:
Attachments:
Description Flags
Self Bytes Allocation none

Description Radoslav Ivanov CLA 2011-01-31 10:23:22 EST
Build Identifier: 3.7

Hello

I am writing with regards to method “org.eclipse.osgi.internal.serviceregistry.ServiceRegistry.lookupServiceRegistrations(String,Filter), which creates new ArrayList object from found result (row 962). 

IMHO, that operation causes copy of array which afterwards is iterrated and some of its elements are being removed. I would suggest to improve that by creating an emtpy array list instead and adding only the result stuff in itterator. As a result we will have no array copy along with no removals from iterator.

Reproducible: Always
Comment 1 BJ Hargrave CLA 2011-01-31 16:34:16 EST
Is this a real performance issue? How much time is spent here? How much time does the suggested change make? Please provide some performance data.

The array copy must be made in the synchronized region. This where the array list is created. This is why the code is the way it is. If there is a demonstrated and non-trivial performance issue, we can investigate a change.
Comment 2 Radoslav Ivanov CLA 2011-02-07 06:37:55 EST
Created attachment 188427 [details]
Self Bytes Allocation
Comment 3 Radoslav Ivanov CLA 2011-02-07 06:55:00 EST
Hello

Sorry for late reply but I have been busy previous week. 

Of course it is not a big issue but it might be improved. I already attached some data and, you see, the allocated self bytes are consumed because of array copy. Moreover, I would reduce the scope of synchronization by locking only on object it is read from.

I suggest method's code like this:
{ if (clazz == null) {
 synchronized (allPublishedServices) {
   return buildResult(allPublishedServices, filter);
 }
} else {
 synchronized (publishedServicesByClass) {
   return buildResult(publishedServicesByClass.get(clazz), filter);
 }
}}

private List<ServiceRegistrationImpl<?>> buildResult(List result, Filter filter) {
 if ((result == null) || result.isEmpty()) {
   List<ServiceRegistrationImpl<?>> empty = Collections.EMPTY_LIST;
   return empty;
 }
 if (filter == null){
  List<ServiceRegistrationImpl<?>> aResult = new ArrayList<ServiceRegistrationImpl<?>>(result);
  return aResult;
 }
 List<ServiceRegistrationImpl<?>> aResult = new ArrayList<ServiceRegistrationImpl<?>>();
 //iterrate result object, add only necessary stuff and then return
 for (Iterator<ServiceRegistrationImpl<?>> iter = result.iterator(); iter.hasNext();) {
   ServiceRegistrationImpl<?> registration = iter.next();
   ServiceReferenceImpl<?> reference;
   try {
     reference = registration.getReferenceImpl();
   } catch (IllegalStateException e) {
     continue;
   }
   if (filter.match(reference)) {
	aResult.add(reference);
   }
 }
 return aResult;
}


Best regards
Radoslav Ivanov
Comment 4 BJ Hargrave CLA 2011-02-07 08:45:58 EST
(In reply to comment #2)
> Created attachment 188427 [details]
> Self Bytes Allocation

I am not sure what this tells me other than memory was allocated which is necessary no matter what. That picture refers to ArrayList.<init> not ArrayList.remove which is where I thought you claimed there is a problem.

(In reply to comment #3)
> Of course it is not a big issue but it might be improved. 

Since it a not a big issue, we should do nothing. Attempting to optimize for no significant reason is a recipe for introducing errors.

> Moreover, I would reduce the scope of synchronization by locking only on
> object it is read from.

Did you not see the comments for the allPublishedServices and publishedServicesByClass fields? /* @GuardedBy("this") */

You cannot change the scope of the synchronization in this method without also changing *all* of the other code which accesses (reads and writes) theses fields. Your change would destroy the thread safe access to the data across the service registry.

> 
> I suggest method's code like this:
> { if (clazz == null) {
>  synchronized (allPublishedServices) {
>    return buildResult(allPublishedServices, filter);
>  }
> } else {
>  synchronized (publishedServicesByClass) {
>    return buildResult(publishedServicesByClass.get(clazz), filter);
>  }
> }}
> 
> private List<ServiceRegistrationImpl<?>> buildResult(List result, Filter
> filter) {
>  if ((result == null) || result.isEmpty()) {
>    List<ServiceRegistrationImpl<?>> empty = Collections.EMPTY_LIST;
>    return empty;
>  }
>  if (filter == null){
>   List<ServiceRegistrationImpl<?>> aResult = new
> ArrayList<ServiceRegistrationImpl<?>>(result);
>   return aResult;
>  }
>  List<ServiceRegistrationImpl<?>> aResult = new
> ArrayList<ServiceRegistrationImpl<?>>();
>  //iterrate result object, add only necessary stuff and then return
>  for (Iterator<ServiceRegistrationImpl<?>> iter = result.iterator();
> iter.hasNext();) {
>    ServiceRegistrationImpl<?> registration = iter.next();
>    ServiceReferenceImpl<?> reference;
>    try {
>      reference = registration.getReferenceImpl();
>    } catch (IllegalStateException e) {
>      continue;
>    }
>    if (filter.match(reference)) {

The above is another mistake. You are now calling user code (since Filter.match will call <init>(String), compareTo and equals on whatever objects are in the service properties) while holding a framework lock. The service registry code is written *very* carefully to avoid calling user code while holding framework locks since doing so exposes the system to deadlocks. 

This is why the current code copies the data while holding the proper lock and then calls user code after releasing the lock.

>     aResult.add(reference);
>    }
>  }
>  return aResult;
> }
> 

Again, I don't see any reason to make any change in this area without proof of a problem and proof the proposed solution improves things.

Perhaps you can measure the performance change in time and memory of changing the concrete list type from ArrayList to LinkedList:

  result = new LinkedList<ServiceRegistrationImpl<?>>(result);

since LinkedList will not reallocate an internal array on object removal. But you would need to prove it did not make things worse on average.
Comment 5 BJ Hargrave CLA 2011-02-07 08:53:57 EST
(In reply to comment #4)
> since LinkedList will not reallocate an internal array on object removal. But
> you would need to prove it did not make things worse on average.

Actually ArrayList.remove does not reallocate the internal array. It just copies elements to the left. So the worst case, memory wise, is that the internal array is bigger than necessary.

Your proposed change allocates an ArrayList of default size and then adds elements which may result in a number of internal array allocations and element copies to grow the list as elements are added. Each time the ArrayList must reallocate the internal array, it allocates it with headroom for future growth. So your proposed change may be in fact be worse in than the current code.

This is why we need real performance data to evaluate proposed performance changes.
Comment 6 Radoslav Ivanov CLA 2011-02-07 09:22:43 EST
Hi again

Please ingore the proposal for synchronization, it was just a suggestion. 

>not a big issue, we should do nothing
Indeed, if the driving motivation is that above, I would like you to close the bug and ignore my next words.

Obviously (attached image), copying of array causes allocation of memory - there is nothing to do with ArrayList.remove (it just proves the waste of copying). 

Still I belive that this allocates less (even included in the synchronized block) then the current code with removing does:
List<ServiceRegistrationImpl<?>> aResult = new LinkedList<ServiceRegistrationImpl<?>>();
//iterrate result object, add only necessary stuff and then return
for (Iterator<ServiceRegistrationImpl<?>> iter = result.iterator();
iter.hasNext();) {
   ServiceRegistrationImpl<?> registration = iter.next();
   ServiceReferenceImpl<?> reference;
   try {
     reference = registration.getReferenceImpl();
   } catch (IllegalStateException e) {
     continue;
   }
   if (filter.match(reference)) {
    aResult.add(reference);
   }
}
return aResult;
Comment 7 BJ Hargrave CLA 2011-02-07 10:24:06 EST
(In reply to comment #6)
> >not a big issue, we should do nothing
> Indeed, if the driving motivation is that above, I would like you to close the
> bug and ignore my next words.
> 

I think I will close this. Please reopen if and when you have performance data indicating a real problem and how that problem is improved by a proposed solution.

> Obviously (attached image), copying of array causes allocation of memory -
> there is nothing to do with ArrayList.remove (it just proves the waste of
> copying). 

A new list needs to be created in any case. The issue is how much storage is required for the copy. For an ArrayList, there is an internal array whose size is set at the size of the source list. So if the size of the source list is L, then the size of the internal array in the copy list is always L even when the number of elements, N, become less than L when some elements are removed after the copy.

With the change to a LinkedList, the size of each entry increases from 1 to 3 (element reference and forward and back references). So with a LinkedList of N elements, the storage size is 3N. So unless you remove at least 2/3 of the elements, N < (2/3)L, the storage size of the LinkedList of N elements will always exceed the storage size of the ArrayList of N elements having an internal array of size L.

Furthermore, the ArrayList does a single allocation upon construction. A LinkedList will do N allocations, one for each add.

> 
> Still I belive that this allocates less (even included in the synchronized
> block) then the current code with removing does:

This is why *real* data is important. What we may believe may not actually be the case.

> List<ServiceRegistrationImpl<?>> aResult = new
> LinkedList<ServiceRegistrationImpl<?>>();
> //iterrate result object, add only necessary stuff and then return
> for (Iterator<ServiceRegistrationImpl<?>> iter = result.iterator();
> iter.hasNext();) {
>    ServiceRegistrationImpl<?> registration = iter.next();
>    ServiceReferenceImpl<?> reference;
>    try {
>      reference = registration.getReferenceImpl();
>    } catch (IllegalStateException e) {
>      continue;
>    }
>    if (filter.match(reference)) {
>     aResult.add(reference);
>    }
> }
> return aResult;