Bug 370182 - [compiler] investigate caching compiler options
Summary: [compiler] investigate caching compiler options
Status: VERIFIED WONTFIX
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core
Version: 3.8
Hardware: All / OS: All
Importance: P3 normal
Target Milestone: 3.8 M6
Assignee: Ayushman Jain
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-01-31 00:23 EST by Ayushman Jain
Modified: 2012-03-12 16:41 EDT
CC: 2 users

See Also:


Attachments
proposed fix (2.62 KB, patch)
2012-02-07 00:46 EST, Ayushman Jain
no flags

Description Ayushman Jain 2012-01-31 00:23:37 EST
3.8M5

See bug 369487#c0 point (6)

>(6) Looking at SingleNameReference.variableBinding
>and QualifiedNameReference.variableBinding:
>Should we cache the compiler options somewhere? We know it is expensive
>to look this up from a deeply nested context, and to do this for every SNR
>and QNR in the program is hugely wasteful.
Comment 1 Ayushman Jain 2012-02-07 00:46:26 EST
Created attachment 210628
proposed fix

Srikanth, can you check if the patch does what was intended? Thanks!
Comment 2 Srikanth Sankaran 2012-02-08 06:08:24 EST
(In reply to comment #1)
> Created attachment 210628
> proposed fix
> 
> Srikanth, can you check if the patch does what was intended? Thanks!

Yes, this looks good. Since Scope.compilerOptions() is declared final,
all 163 or so call sites should be able to derive the benefits of
inlining and materialize the options with just a field lookup.
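
For illustration, here is a minimal standalone sketch of the caching pattern under discussion (hypothetical names and structure, not the actual patch in attachment 210628): the first call climbs to the root scope, and later calls are just a field read, which the final method lets call sites inline.

// Sketch only: caching the options reference on a scope (hypothetical names).
class CompilerOptions {
    boolean includeFieldsInNullAnalysis;   // example option flag
}

class Scope {
    final Scope parent;
    private CompilerOptions cachedOptions;        // the extra reference under debate
    private final CompilerOptions rootOptions;    // only non-null on the root (CU) scope

    Scope(Scope parent, CompilerOptions rootOptions) {
        this.parent = parent;
        this.rootOptions = rootOptions;
    }

    // Declared final (as the real method is), so call sites can be inlined.
    public final CompilerOptions compilerOptions() {
        if (cachedOptions == null) {
            Scope s = this;
            while (s.parent != null)     // without the cache: climb to the root on every call
                s = s.parent;
            cachedOptions = s.rootOptions;
        }
        return cachedOptions;
    }
}

After the first call on a given scope, scope.compilerOptions() costs a null check plus a field read instead of a walk up the parent chain for every SNR/QNR.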
Comment 3 Satyam Kandula 2012-02-09 03:14:53 EST
An important thing to note is that all the scopes stay alive as long as the other AST nodes do. They may remain in memory even when users create ASTs using the DOM/AST API. We should add fields only when they are necessary. Do you think adding this field will really help that much?
Comment 4 Srikanth Sankaran 2012-02-09 03:33:07 EST
(In reply to comment #3)
> An important thing to note is that all the scopes stay alive as long as the
> other AST nodes do. They may remain in memory even when users create ASTs
> using the DOM/AST API. We should add fields only when they are necessary. Do
> you think adding this field will really help that much?

Can you suggest an alternative, other than not doing this at all?
Is it possible to come up with some numbers, using say MAT or some
other tool, while compiling the JDT/Core projects, to see what the
footprint impact is? It is just a reference per scope, i.e. there
is no deep copy or cloning here.

Basically for every QNR and SNR, we are going to be calling
org.eclipse.jdt.internal.compiler.lookup.Scope.compilerOptions() which
loops all the way to CUScope to retrieve the options.

More than the cost of this retrieval I am worried about the perceived
cost of retrieval i.e could developers avoid checking for options before
invoking optional code just because the perceived cost is high ?
It has been cited as a reason in some cases.
Comment 5 Ayushman Jain 2012-02-09 04:44:18 EST
(In reply to comment #3)
> An important thing to note is that all the scopes stay alive as long as the
> other AST nodes do. They may remain in memory even when users create ASTs
> using the DOM/AST API. We should add fields only when they are necessary. Do
> you think adding this field will really help that much?
We're basically only creating new references here, and not duplicating the Map of compiler options. So we should be good, IMHO.
Comment 6 Satyam Kandula 2012-02-09 04:45:07 EST
(In reply to comment #4)
> Can you suggest an alternative, other than not doing this at all?
I don't have a generic alternative as such. If we need this during analysis,
we could have this field in the flow context. I believe FlowContext is thrown
away after the analysis.

My actual concern is that we are being liberal in adding fields. I just want to
make sure that we understand the cost whenever we add a field. 

> Is it possible to come up with some numbers, using say MAT or some
> other tool, while compiling the JDT/Core projects, to see what the
> footprint impact is?
I think this is a good idea. I will try to get some numbers using MAT. 


> It is just a reference per scope, i.e. there
> is no deep copy or cloning here.
I agree that this will not be as bad as for many other kinds of nodes. However,
I believe there will be a substantial number of scope nodes per CU, which will
add up for a batch build.

Maybe the right thing would be to have something like the flow context, which
could be thrown away at the end of code generation for the CU.

> 
> Basically for every QNR and SNR, we are going to be calling
> org.eclipse.jdt.internal.compiler.lookup.Scope.compilerOptions() which
> loops all the way to CUScope to retrieve the options.
Will this be costly? How far will a CUScope be - I think it will generally be 4
or 5 away?

> More than the cost of this retrieval I am worried about the perceived
> cost of retrieval i.e could developers avoid checking for options before
> invoking optional code just because the perceived cost is high ?
> It has been cited as a reason in some cases.
Is the concern 'performance' or is that the checks will be intrusive? I
personally think it could be the latter.
Comment 7 Satyam Kandula 2012-02-09 04:54:56 EST
(In reply to comment #5)
> We're basically only creating new references here, and not duplicating the Map
> of compiler options. So we should be good, IMHO.
I agree, but every field, and how long it stays live, counts.
Comment 8 Srikanth Sankaran 2012-02-09 05:18:56 EST
(In reply to comment #6)
> (In reply to comment #4)

> > Is it possible to come up with some numbers, using say MAT or some
> > other tool, while compiling the JDT/Core projects, to see what the
> > footprint impact is?
> I think this is a good idea. I will try to get some numbers using MAT. 

If this is easier, we can actually count the number of scopes created
during a full build of JDT/Core dev workspace. I am thinking, even if
we create a million scopes in this process, it is only 8MB or so.
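
(For reference, the back-of-the-envelope arithmetic behind that figure, assuming one extra object reference per scope at 8 bytes per reference on a 64-bit JVM without compressed pointers: 1,000,000 scopes x 8 bytes = 8,000,000 bytes, i.e. roughly 8 MB.)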

> > Basically for every QNR and SNR, we are going to be calling
> > org.eclipse.jdt.internal.compiler.lookup.Scope.compilerOptions() which
> > loops all the way to CUScope to retrieve the options.
> Will this be costly? How far will a CUScope be - I think it will generally be 4
> or 5 away?

I think it is gross to reach for the CUScope each time. I don't know if
this makes a blip on the radar, though.

> Is the concern 'performance' or is that the checks will be intrusive? I
> personally think it could be the latter.

It is human psychology that is the concern ;-)
Comment 9 Srikanth Sankaran 2012-02-09 05:22:05 EST
(In reply to comment #8)

> I think it is gross to reach for the CUScope each time.

i.e. plant a breakpoint on org.eclipse.jdt.internal.compiler.lookup.Scope.compilerOptions()
and see how often it triggers, all the while retrieving the same
invariant result.
Comment 10 Satyam Kandula 2012-02-09 06:13:59 EST
(In reply to comment #8)
> If this is easier, we can actually count the number of scopes created
> during a full build of JDT/Core dev workspace. I am thinking, even if
> we create a million scopes in this process, it is only 8MB or so.
We compile only 2000 files at once during a batch build, and hence the number of scopes will probably not reach a million. However, I think we should be careful about the way memory adds up.

> 
> > > Basically for every QNR and SNR, we are going to be calling
> > > org.eclipse.jdt.internal.compiler.lookup.Scope.compilerOptions() which
> > > loops all the way to CUScope to retrieve the options.
> > Will this be costly? How far will a CUScope be - I think it will generally be 4
> > or 5 away?
> 
> I think it is gross to reach for the CUScope each time. I don't know if
> this makes a blip on the radar, though.
I understand things look gross, especially while debugging, but I don't think that should be the reason. There are other fields in CompilationUnitScope, e.g. the environment itself. Do we want to cache that in each scope as well? I believe not, so why does this need special treatment? :)

> 
> > Is the concern 'performance' or is that the checks will be intrusive? I
> > personally think it could be the latter.
> 
> It is human psychology that is the concern ;-)

I am trying to understand the code in question in comment 0. Firstly, this is being called only during code analysis, and I think that is OK.
Next, I don't think variableBinding() should really do the check for this option. I think it should return the binding as it is, and the consumer should use it as necessary. Imagine someone else needs the bindings for the fields and the variables; they would have to write another method. That is probably not bad, but the name variableBinding() is bad. Now, assuming that variableBinding() always returns the fields, the caller would probably have to do many checks. If more options get added, the code becomes uglier. Here, I would be concerned with the code getting ugly rather than with the performance.
Comment 11 Ayushman Jain 2012-02-09 06:27:00 EST
(In reply to comment #10)
> I am trying to understand the code in question in comment 0. Firstly, this is
> being called only during code analysis, and I think that is OK.
> Next, I don't think variableBinding() should really do the check for this
> option. I think it should return the binding as it is, and the consumer should
> use it as necessary. Imagine someone else needs the bindings for the fields
> and the variables; they would have to write another method. That is probably
> not bad, but the name variableBinding() is bad. Now, assuming that
> variableBinding() always returns the fields, the caller would probably have to
> do many checks. If more options get added, the code becomes uglier. Here, I
> would be concerned with the code getting ugly rather than with the performance.
Note that we're renaming the method variableBinding in bug 369487. It's actually meant to return the binding only for the special case of null analysis for fields, and the name is currently misleading. We wanted the implementation to have (almost) zero side effects when the option is turned off, hence we safeguarded the method's evaluation with a check for the option.
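
As a rough illustration of that guard pattern (hypothetical names, not the actual JDT method), the optional field-null-analysis work runs only when the corresponding compiler option is on, so with the option off the method is (almost) a no-op:

// Sketch only: guarding optional work behind a compiler option (hypothetical names).
class NullAnalysisGuardSketch {
    static class Options { boolean includeFieldsInNullAnalysis; }
    static class Binding { }

    Binding variableBindingForFieldNullAnalysis(Options options) {
        if (!options.includeFieldsInNullAnalysis) {
            return null;                  // option off: (almost) zero side effects
        }
        return resolveFieldBinding();     // option on: do the optional resolution work
    }

    private Binding resolveFieldBinding() {
        return new Binding();             // stands in for the real lookup
    }
}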
Comment 12 Srikanth Sankaran 2012-02-09 06:32:09 EST
(In reply to comment #10)

> I am trying to understand the code in question in comment 0. Firstly, this is
> being called only during code analysis, and I think that is OK.
> Next, I don't think variableBinding() should really do the check for this
> option. I think it should return the binding as it is, and the consumer should
> use it as necessary. Imagine someone else needs the bindings for the fields
> and the variables; they would have to write another method. That is probably
> not bad, but the name variableBinding() is bad.

This last point is already being addressed, having been raised in review.
The patch is ready and is waiting for some clarification from Stephan on
an unrelated matter.

I agree with most of the points raised about how it may look more gross in
the debugger than it is in reality, etc. But the fact of the matter is that
materializing a bit that gets looked up for every field reference should not
require us to jump through hoops like it does now.

For a full build, I would think these are retained in memory only on a
per-project compilation basis, and I don't think that is very bad.

If you think this is a must-fix, I am open to alternate proposals though.
Ayush, how about the flow context/flow info? I don't think we should cache
the options there, but we could cache includeFieldsInNullAnalysis there,
which seems to fit eminently.
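
A rough sketch of that alternative (hypothetical names; the real FlowContext has many other responsibilities): the single boolean is read from the options once when the root flow context for a unit is created and inherited by nested contexts, and since flow contexts are discarded after analysis, no long-lived field is added to the scopes.

// Sketch only: caching just one boolean on a short-lived flow context.
class FlowContextSketch {
    final boolean includeFieldsInNullAnalysis;   // snapshot of the single option

    // Root context for a compilation unit: read the option once.
    FlowContextSketch(boolean includeFieldsInNullAnalysis) {
        this.includeFieldsInNullAnalysis = includeFieldsInNullAnalysis;
    }

    // Nested contexts inherit the snapshot from their parent.
    FlowContextSketch(FlowContextSketch parent) {
        this.includeFieldsInNullAnalysis = parent.includeFieldsInNullAnalysis;
    }
}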
Comment 13 Ayushman Jain 2012-02-09 06:42:53 EST
(In reply to comment #12)
> If you think this is a must-fix, I am open to alternate proposals though.
> Ayush, how about the flow context/flow info? I don't think we should cache
> the options there, but we could cache includeFieldsInNullAnalysis there,
> which seems to fit eminently.
Yes, we can do so for this option, but I thought we were tackling a more general problem here. Even now there are a host of other places where we call scope.compilerOptions(), including in SNR and QNR for purposes other than null analysis. Will caching just the includeFieldsInNullAnalysis really help then?
Comment 14 Srikanth Sankaran 2012-02-09 06:58:37 EST
(In reply to comment #13)

> analysis. Will caching just the includeFieldsInNullAnalysis really help then?

No, of course not. Here is what I think we should do. We should study the
lifetime of these objects and if they are indeed discarded after a project
build (when clean building a large workspace with several projects),
then we can go forward with the current fix.

The point about being careful while adding fields is well taken, but my
gut feeling in the current instance is that we are worrying too much.

Alternatively, Satyam, could you collect MAT profiles against a
large workspace with and without this change and show the impact?
Comment 15 Satyam Kandula 2012-02-09 07:49:19 EST
(In reply to comment #14)
> Alternatively, Satyam, could you collect MAT profiles against a
> large workspace with and without this change and show the impact?
I want to collect some memory profiles for JDT/Core in general; I need some time for this. At the same time, I agree that adding this field will obviously not make too much of a difference, but I want to convey that this is a bad precedent.
Comment 16 Srikanth Sankaran 2012-02-09 08:15:11 EST
(In reply to comment #15)
> (In reply to comment #14)

> will obviously not make too much of a difference, but I want to convey that
> this is a bad precedent.

OK, not an unfair viewpoint. Since the current code is not known to
suffer from a performance problem on account of retrieving options,
I am OK with this being closed as WONTFIX. If, in future, data emerges
that shows otherwise, we will reconsider.

Thanks for weighing in on this, Satyam.
Comment 17 Stephan Herrmann 2012-03-12 16:41:10 EDT
Verified for 3.8 M6.

For the record: bug 370639 is an example of a potential client of the change discussed here.
That bug has been implemented using the existing methods, and still no performance degradation has been observed.