| Summary: | Optimize CDORevisionCache implementations | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | [Modeling] EMF | Reporter: | Eike Stepper <stepper> | ||||||||
| Component: | cdo.core | Assignee: | Eike Stepper <stepper> | ||||||||
| Status: | CLOSED FIXED | QA Contact: | Eike Stepper <stepper> | ||||||||
| Severity: | enhancement | ||||||||||
| Priority: | P3 | CC: | caspar_d, vroldanbet | ||||||||
| Version: | 4.0 | Flags: | stepper:
review?
(vroldanbet) stepper: review? (caspar_d) |
||||||||
| Target Milestone: | --- | ||||||||||
| Hardware: | All | ||||||||||
| OS: | All | ||||||||||
| Whiteboard: | |||||||||||
| Attachments: |
|
||||||||||
|
Description
Eike Stepper
Created attachment 195000 [details]
Patch v1
Vik, please report how this patch impacts time and space efficiency. Caspar, there is one test failure in non-auditing: CDOTransactionImpl.removeCrossReferences(). The new non-auditing cache returns null for a clean revision. Do you have an idea onhow to cope with this situation? Or do you think it's simply a bug in the new cache? Patch can't be applied. It contains a hunk that changes (not: creates) CDORevisionCacheAuditing.java, but this file doesn't exist yet. Please post a patch including the creation of that file, and I'll take a look. Sure, I'll give it a look. I performed some tests with a hacked CDORevisionCacheImpl without the typeMap, and it was slightly faster and considerably more space efficient (I could run our app with 256mb heap without OOME. I'll check the patch and compare :) Eike, correct me if wrong, but... aren't InternalCDORevisions in CDORevisionCacheNonAuditing being hard-referenced? Is there something I'm missing here? Furthermore, there are some version and branch checks in that implementation. Why is that necessary in a NonAudit scenario? (In reply to comment #4) > Patch can't be applied. It contains a hunk that changes (not: creates) > CDORevisionCacheAuditing.java, but this file doesn't exist yet. No clue how I got into this state. Trying to fix... (In reply to comment #6) > correct me if wrong, but... aren't InternalCDORevisions in > CDORevisionCacheNonAuditing being hard-referenced? *Strongly* referenced, yeah. My bad, I'll come up with a new patch today... Created attachment 195180 [details]
Patch v2
Caspar, this patch should address the SVN hunk issue.
Vik, the non-auditing issue is not yet addressed. I'll look at that a little later today...
Eike, What goes wrong here is that the the tx expects it can ask the revManager for a specific version. But your non-auditing cache implicitly always evicts any other version in favor of the latest. I'm not sure if non-auditing (which to me means that the *repo* will not retain anything other than the latest version) necessarily implies that the client-side cache refuse to hold any historic versions... But regardless of your/my opinion on that, the whole issue can be avoided by fetching the clean revision from the tx's own cleanRevisions list, instead of from the revManager, as follows: InternalCDORevision cleanRevision = cleanRevisions.get(referencer); This will make the assertion pass always. Created attachment 195461 [details]
Patch v3
I've fixed the bug in CDOTransactionImpl.removeCrossReferences() as Caspar suggested.
And the NonAuditing cache now uses string or soft references according to disableGC.
All tests are passing. Going to commit...
Vik, please have a look again and tell us how the footprint is impacted.
Committed revision 7686: - trunk/plugins/org.eclipse.emf.cdo - trunk/plugins/org.eclipse.emf.cdo.common - trunk/plugins/org.eclipse.emf.cdo.net4j - trunk/plugins/org.eclipse.emf.cdo.server - trunk/plugins/org.eclipse.emf.cdo.tests Resolved Hi Eike, it appears to me (using VisualVM) that soft references are not being evicted. Our tests failed for simple 40 minute import test with 256 mb heap. It did succeed flawlessly with the previous CDORevisionCacheImpl hacked with your email suggestions. I'll keep making some tests to determine what's going on. Correction, what is not being evicted are AbstractCDORevisionCache.CacheSoftReference instances. I revisions are being removed correctly, but not CacheSoftReference instances, which according to VisualVM, are being strongly referenced by CDORevisionCacheNonAuditing.revisions HashMap. The problem relies on: org.eclipse.emf.cdo.internal.common.revision.CDORevisionCacheNonAuditing.removeRevision(CDOID, CDOBranchVersion) which has following condition: if (revision != null && ObjectUtil.equals(revision.getBranch(), branchVersion.getBranch()) && revision.getVersion() == branchVersion.getVersion()) if the revision has been already evicted, ref.get() will be null, and the "then block" code won't be executed, which means that the entry on "revisions" hashMap won't be removed. I suggest we extract revisions.remove(id) from the "then block" right above, so it is executed anyway. reopening for the identified bug on CDORevisionCacheNonAuditing I confirm that, after my proposed fix, everything works as expected :) Computationally it behaves as good as the previous cache version, but in terms of memory consumption, I'd say the cache consumes 50% less, which gives 50% more room for revisions. Thanks for the excellent enhacement, Eike :) (In reply to comment #16) > I suggest we extract revisions.remove(id) from the "then block" right above, so > it is executed anyway. Isn't that what I already see in my code? Reference<InternalCDORevision> ref = revisions.get(id); if (ref != null) { InternalCDORevision revision = ref.get(); if (revision != null && ObjectUtil.equals(revision.getBranch(), branchVersion.getBranch()) && revision.getVersion() == branchVersion.getVersion()) { revisions.remove(id); return revision; } } else { revisions.remove(id); } No, revisions.remove(id) should be right under ref.get() and above the if clause. Oh, I see you point. Yeah, but that else block is only executed if ref is null (the outer if clause). In the case of the inner If clause, there is no else block, so revision is not removed from the map in case revision == null. (In reply to comment #21) > Oh, I see you point. Yeah, but that else block is only executed if ref is null > (the outer if clause). In the case of the inner If clause, there is no else > block, so revision is not removed from the map in case revision == null. But that is the purpose. It's supposed to be a safety check so that only the version is evicted that is the one that matches the key in the CacheSoftReference. Am I missing something? Can you paste your full removeRevision() method? public InternalCDORevision removeRevision(CDOID id, CDOBranchVersion branchVersion)
{
synchronized (revisions)
{
Reference<InternalCDORevision> ref = revisions.get(id);
if (ref != null)
{
InternalCDORevision revision = ref.get();
revisions.remove(id);
if (revision != null && ObjectUtil.equals(revision.getBranch(), branchVersion.getBranch())
&& revision.getVersion() == branchVersion.getVersion())
{
return revision;
}
}
else
{
revisions.remove(id);
}
}
return null;
}
Lets see if I can explain better. This method is called "removeRevision". It means that a revision is going to be removed from the cache. However, this method doesn't seem to consider that the soft reference CDORevision could be already GC'ed. That implies ref.get will return null, but ref won't be null, because the actual CacheSoftReference is still strongly referenced by the map. That's were the ReferenecQueue mechanism kicks in (its polled every 60 seconds) and checks whether there are CacheSoftReference for which the referenced revision has been already evicted, with the purpose of removing it from the map. Considering such scenario, removeRevision will be invoked by org.eclipse.emf.cdo.internal.common.revision.AbstractCDORevisionCache.work(Reference<? extends InternalCDORevision>). So lets walktrhough the code: first we execute revision.get(id). That returns an instance of CacheSoftReference. so ref != null, and it executes the block inside. However reg.get == null, because the revision has been evicted. The next if block does not succeed, so "revisions.remove(id)" is skipped and finally null is returned. That leaves a CacheSoftReference instance in the map, referencing null, and occupying useful heap memory. Is this more clear now? That doesn't seem right to me. In your case we wouldn't need the whole if statement (the return value is rather unimportant here). The purpose of the if was to protect against this problem: The actual eviction happens in a background thread, i.e. AbstractCDORevisionCache.work() --> CDORevisionCacheNonAuditing.removeRevision() The "main" thread could add a newer revision to the cache while the eviction thread still thinks it has to evict the older one. I do agree, though, that the branch check does not make much sense in non-auditing. I also understand that you seem to have an actual problem with the current implementation and maybe that your suggested solution seems to solve it. I just can't understand it... Mid-air collision! My comment #25 replies to your comment #23 Eike, im sorry but I still don't understand what is a useless CacheSoftReference (referencing null) doing in memory, nor I understand it's purpose. I got it!!! What about this:
public InternalCDORevision removeRevision(CDOID id, CDOBranchVersion branchVersion)
{
synchronized (revisions)
{
Reference<InternalCDORevision> ref = revisions.get(id);
if (ref != null)
{
InternalCDORevision revision = ref.get();
if (revision != null)
{
if (revision.getVersion() == branchVersion.getVersion())
{
revisions.remove(id);
return revision;
}
}
else
{
revisions.remove(id);
}
}
}
return null;
}
ah, finally :D yeah, thats actually better than what I have. Less branch checks (there are no branches in no audit), better performant cache. I'll test the code. Is still the version check needed? I just need 40 minutes for the smallest stress test, ok? I'll ping you back when checked it (In reply to comment #27) > Eike, im sorry but I still don't understand what is a useless > CacheSoftReference (referencing null) doing in memory, nor I understand it's > purpose. The most impoirtant aspect about CacheSoftReference is that it extends SoftReference and its ctor registers it with a ReferenceQueue. And then it adds key information so that the queue worker can know what the key of the already evicted revision is (used for map cleanup): private static final class CacheSoftReference extends SoftReference<InternalCDORevision> implements CDORevisionKey (In reply to comment #29) > ah, finally :D yeah, thats actually better than what I have. Less branch checks > (there are no branches in no audit), better performant cache. I'll test the > code. > > Is still the version check needed? > > I just need 40 minutes for the smallest stress test, ok? I'll ping you back > when checked it Sure! I'll commit it anyway as soon as my tests have finished and passed ;-) Committed revision 7697 Added a !isEmpty() check in CDORevisionCacheAuditing.getObjectType() to prevent NoSuchElementExceptions:
if (revisionList != null && !revisionList.isEmpty())
{
Reference<InternalCDORevision> ref = revisionList.getFirst();
...
I fear that, in addition, we will have to add general synchronization between
a) the eviction thread and
b) the group of possible app threads
Committed revision 7705: - trunk/plugins/org.eclipse.emf.cdo.common > I fear that, in addition, we will have to add general synchronization between
> a) the eviction thread and
> b) the group of possible app threads
That was nonsense. Everything is properly synchronized already!
Committed revision 7708 Available in R20110608-1407 |