Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 338027

Summary: Grouping by class loader in the Dominator Tree query is incorrect
Product: [Tools] MAT Reporter: Kevin Grigorenko <kevin.grigorenko>
Component: CoreAssignee: Andrew Johnson <andrew_johnson>
Status: RESOLVED FIXED QA Contact:
Severity: major    
Priority: P3 CC: andrew_johnson, krum.tsvetkov
Version: unspecified   
Target Milestone: ---   
Hardware: All   
OS: All   
Whiteboard:
Attachments:
Description Flags
Heapdump demonstrating the problem
andrew_johnson: iplog-
Java program demonstrating the problem
andrew_johnson: iplog-
Proposed patch andrew_johnson: iplog+

Description Kevin Grigorenko CLA 2011-02-23 17:02:31 EST
Build Identifier: Trunk

Hi there. I believe there is a bug in the classloader grouping of the DominatorTree because it always calls snapshot.getClassOf on the dominated object, even if it is a class, which will return java.lang.Class whose classloader is the system class loader.

The bug is in org/eclipse/mat/internal/snapshot/inspections/DominatorQuery$ClassLoaderTree.prepareSet

int clId = snapshot.getClassOf(dominatedId).getClassLoaderId();

It should be changed to:

int clId = -1;
if (snapshot.isClass(dominatedId)) {
    IClass cl = (IClass)snapshot.getObject(dominatedId);
    clId = cl.getClassLoaderId();
} else {
    clId = snapshot.getClassOf(dominatedId).getClassLoaderId();
}

Attached is TestClassloaderBug.java which demonstrates the problem. Below is the output of the Dominator Tree query, grouped by ClassLoader, for both IBM and Oracle JVMs. Notice that in both cases, with the existing code, all the retained heap is under the system class loader. With the patch, the byte arrays are correctly retained under the AppClassLoader:

************************************

IBM JVM without patch:

Class Loader Name                                  | Objects | Shallow Heap | Retained Heap | Percentage
---------------------------------------------------------------------------------------------------------
com.ibm.oti.vm.BootstrapClassLoader @ 0x7ff7e88a968|   1,066 |      476,371 |     3,053,931 |    100.00%
---------------------------------------------------------------------------------------------------------

************************************

IBM JVM with patch:

Class Loader Name                                  | Objects | Shallow Heap | Retained Heap | Percentage
---------------------------------------------------------------------------------------------------------
sun.misc.Launcher$AppClassLoader @ 0x7ff7e8aa148   |       1 |          208 |     2,097,704 |     68.69%
com.ibm.oti.vm.BootstrapClassLoader @ 0x7ff7e88a968|   1,065 |      476,162 |       956,226 |     31.31%
---------------------------------------------------------------------------------------------------------

************************************

Oracle JVM without patch:

Class Loader Name    | Objects | Shallow Heap | Retained Heap | Percentage
---------------------------------------------------------------------------
<system class loader>|     670 |       16,143 |     2,290,015 |    100.00%
---------------------------------------------------------------------------

************************************

Oracle JVM with patch:

Class Loader Name                            | Objects | Shallow Heap | Retained Heap | Percentage
---------------------------------------------------------------------------------------------------
sun.misc.Launcher$AppClassLoader @ 0x50000e02|       1 |           15 |     2,097,215 |     91.58%
<system class loader>                        |     669 |       16,127 |       192,799 |      8.42%
---------------------------------------------------------------------------------------------------

************************************

Also attaching the hprof dump. If you want the IBM system dump, I can upload that too.

Thanks,
-- Kevin Grigorenko


Reproducible: Always
Comment 1 Kevin Grigorenko CLA 2011-02-23 17:04:03 EST
Created attachment 189651 [details]
Heapdump demonstrating the problem
Comment 2 Kevin Grigorenko CLA 2011-02-23 17:04:33 EST
Created attachment 189652 [details]
Java program demonstrating the problem
Comment 3 Kevin Grigorenko CLA 2011-02-23 17:04:53 EST
Created attachment 189653 [details]
Proposed patch
Comment 4 Andrew Johnson CLA 2011-02-25 02:35:36 EST
Thank you for the report and for the patch.

I see the problem with the first of our test dumps I tried: dumps/sun_jdk6_18_x32.hprof

and the patch fixes it.

For the record, please could you confirm for the patch:
1.that you wrote 100% of the code
2.that you are authorized to contribute it
3.that is is submitted under the EPL.
This is needed so we can use the patch.

I'll write a test case for this and also check if I can see a similar error elsewhere (e.g. histogram grouping or group by value or top components).
Comment 5 Andrew Johnson CLA 2011-02-27 12:02:06 EST
I think we should also count class loaders themselves in the grouped by class loader.

This would give:

Class Loader Name                                  | Objects | Shallow Heap | Retained Heap | Percentage
---------------------------------------------------------------------------------------------------------
sun.misc.Launcher$AppClassLoader @ 0x24139558      |       4 |           71 |         2,519 |      0.46%
|- sun.misc.Launcher$AppClassLoader                |       1 |           72 |         2,520 |      0.46%
|  '- sun.misc.Launcher$AppClassLoader @ 0x24139558|         |           72 |         2,520 |      0.46%
|- java.lang.Class                                 |       3 |            0 |             0 |      0.00%
|  |- class ClassA @ 0x27c44d48                    |         |            0 |             0 |      0.00%
|  |- class ClassB @ 0x27c45098                    |         |            0 |             0 |      0.00%
|  |- class ClassC @ 0x27c453e8                    |         |            0 |             0 |      0.00%
|  '- Total: 3 entries                             |         |              |               |           
---------------------------------------------------------------------------------------------------------

We should then fix the top components query to count in the same way as the dominator tree grouped by class loader when selecting objects for the component report.

Should we change the top consumers report in the same way?
This has:
Biggest Objects
Biggest Top-Level Dominator Classes
Biggest Top-Level Dominator Class Loaders

I am not so sure about this one. If we did then the top components report would only have one class loader in each Biggest Top-Level Dominator Class Loaders section for that class loader.

What about Group by Value?

How useful is dominator tree group by class loader if it expands from class loader to classes to objects, but no further into the actual dominator tree. Similarly how useful is the group by package if it just goes down the to class level, but no further?
Comment 6 Andrew Johnson CLA 2011-04-28 05:27:29 EDT
I'll change the dominator tree, the selection of objects for the top components report and the group by value (by class loader).

I've also fixed the group by value by class loader to get the correct class loader when there are multiple objects of one value all with the same class loader - before the change then multiple values were marked as having a class loader <none>.

I haven't changed the top consumers report as this uses Histogram.getClassLoaderHistogramRecords() and I'm not yet clear what the right behavior is for this.
Comment 7 Andrew Johnson CLA 2011-04-28 09:43:55 EDT
I have also changed the top consumers html report Biggest Top-Level Dominator Class Loaders.

There can be more than one class loader in this report.

Consider top level objects objects a, b of types A, B loaded by loader X, which both point to object c of type C loaded by loader Y.
Object c doesn't have a single parent, so will be in the top level of the dominator tree and will be included in the component report for Y.

The component report for X is based on the retained set for all top level objects associated with a loader, so for loader X is the retained set of a,b. The retained set of a,b includes c and will be in the component report for X. The top consumers report will look for the getTopAncestorsInDominatorTree to find the biggests object and will find a,b and c. The class loader associated with c is Y.

c is actually double counted as it is in two component reports. Is this a problem?
Comment 8 Kevin Grigorenko CLA 2011-04-28 09:54:13 EDT
Hey Andrew,

I think that as long as it's clear that the component reports are independent (i.e. from the perspective of that particular component, and not to be aggregated), and that this is somehow noted in a comment in the report or in the help documentation, then I think it's okay. I've been confused by tools in the past when they double counted in a meaningful way, but didn't let me know, because that can create weird scenarios if the user tries to naively compare certain data.

~Kevin
Comment 9 Krum Tsvetkov CLA 2011-04-29 07:12:27 EDT
Andrew, I am currently looking at the project IP log which we need submit for Indigo (deadline is 18 May). I haven't really followed the discussions/commits here. Shall we mark the proposed patch already with +iplog, or you haven't subitted it yet?

Kevin, sorry for asking again, but could you answer the questions from comment 4? I am not 100% sure if the process requires that we ask this explicitly, but to be on the safe side...

Thanks for giving us feedback and the patch!
Comment 10 Kevin Grigorenko CLA 2011-04-29 08:43:32 EDT
> Kevin, sorry for asking again, but could you answer the questions from comment
> 4? I am not 100% sure if the process requires that we ask this explicitly, but
> to be on the safe side...
> 

Hi Krum, no problem; Yes, I can answer in the affirmative for all the questions.
Comment 11 Andrew Johnson CLA 2011-04-30 11:22:29 EDT
Comment on attachment 189652 [details]
Java program demonstrating the problem

Test dump generation code not used.
Comment 12 Andrew Johnson CLA 2011-04-30 11:23:01 EDT
Comment on attachment 189651 [details]
Heapdump demonstrating the problem

Test dump not used - existing dumps also show the problem
Comment 13 Andrew Johnson CLA 2011-04-30 11:23:23 EDT
Comment on attachment 189653 [details]
Proposed patch

Patch used as basis of fix
Comment 14 Krum Tsvetkov CLA 2011-05-10 03:44:34 EDT
Andrew, just figured out that the last change to TopConsumers2Query.java (revision 1104: "338027 Fix grouping by class loader for top consumers html for classes and class loaders themselves.") had some side effects.
More precisely - making the objects parameter mandatory is breaking the leak suspect report. Well, not really breaking, but the TopConsumers section is not there, as it throws an exception:
!ENTRY org.eclipse.mat.report 4 0 2011-05-10 09:29:51.926
!MESSAGE Ignoring result of 'Top Consumers' due to Missing required parameter: objects
!STACK 0
org.eclipse.mat.SnapshotException: Missing required parameter: objects
	at org.eclipse.mat.query.registry.ArgumentSet.execute(ArgumentSet.java:63)
	at org.eclipse.mat.query.registry.CommandLine.execute(CommandLine.java:93)
	at org.eclipse.mat.report.internal.QueryPart.execute(QueryPart.java:96)
	at org.eclipse.mat.report.internal.SectionPart.execute(SectionPart.java:61)
	at org.eclipse.mat.report.internal.SectionPart.execute(SectionPart.java:61)
	at org.eclipse.mat.report.TestSuite.execute(TestSuite.java:131)
	at org.eclipse.mat.report.internal.RunRegisterdReport.execute(RunRegisterdReport.java:50)
	at org.eclipse.mat.query.registry.ArgumentSet.execute(ArgumentSet.java:129)
	at org.eclipse.mat.ui.QueryExecution$ExecutionJob.run(QueryExecution.java:174)
	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:54)
	
Modifying the overview.xml to execute the command like <command>top_consumers_html .*</command> works, but before fixing there I wanted to ask if there is some specific reason to make the parameter mandatory?
Comment 15 Andrew Johnson CLA 2011-05-10 04:25:53 EDT
No, there was no reason the objects argument was mandatory - I think it was just a mistake when I was testing something else.
Comment 16 Krum Tsvetkov CLA 2011-05-10 05:03:29 EDT
I'll change it back then.
Comment 17 Krum Tsvetkov CLA 2011-05-11 07:37:17 EDT
Are there further open problems here or can we close the bug?
Comment 18 Andrew Johnson CLA 2011-05-11 08:19:23 EDT
Only thing is whether to document how the grouping is done for dominator tree and top components, and comment 7.
Comment 19 Andrew Johnson CLA 2011-05-11 10:52:53 EDT
I've added some help to the dominator tree and top components query.