| Summary: | Inconsistency in DSF ACPM cancel behavior | | |
|---|---|---|---|
| Product: | [Tools] CDT | Reporter: | John Cortell <john.cortell> |
| Component: | cdt-debug-dsf | Assignee: | Project Inbox <cdt-debug-dsf-inbox> |
| Status: | RESOLVED FIXED | QA Contact: | Pawel Piech <pawel.1.piech> |
| Severity: | normal | | |
| Priority: | P3 | CC: | cdtdoug |
| Version: | 8.0 | Flags: | pawel.1.piech: review+ |
| Target Milestone: | --- | | |
| Hardware: | PC | | |
| OS: | Windows XP | | |
| Whiteboard: | | | |
| Attachments: | | | |

Description

John Cortell
Description
John Cortell
On 11/04/2010 02:03 PM, John Cortell wrote:
> The expectation is clear in some respects: if multiple clients are waiting for a
> cache update to complete and at least one of them is *not* canceled, then the
> cache update proceeds to completion. That is straightforward enough. The
> problems come into play when all the clients (could be just a pool of one)
> cancel their requests. When that happens, the cache update request is canceled,
> and the fallout from that is very murky to me.
To be honest, I'm not really sure what the most correct behavior is; I was just trying to do what's most logical. We actually have some bugs in DSF where we were burned by incorrect handling of canceled requests.

On the question in this bug, what I implemented is to cancel the request for data. If the data request comes back with a CANCEL status, then the returned data is ignored. If the request comes back OK, then the data is still used to populate the cache. I'm not completely sure this is correct, but it's the closest to the ACPM behavior that Eugene implemented.
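The behavior described above can be modeled with a minimal sketch. The names here (`SimpleCache`, `requestCompleted`) are illustrative, not the actual DSF `RequestCache`/`AbstractCache` API, and the status is reduced to a boolean for brevity:

```java
// Toy model of the pre-patch cancel handling: a CANCEL status from the data
// source means the returned data is discarded, while any other completion
// still populates the cache and marks it valid.
class SimpleCache<T> {
    private boolean valid;
    private T data;

    /** Called when the data-source request completes. */
    void requestCompleted(T result, boolean statusIsCancel) {
        if (statusIsCancel) {
            return;            // CANCEL status: ignore the returned data
        }
        data = result;         // non-cancel status: populate the cache anyway
        valid = true;
    }

    boolean isValid() { return valid; }
    T getData() { return data; }
}
```

The murkiness discussed in this bug is precisely what should happen when the request monitor itself is canceled but the source completes without explicitly setting a CANCEL status.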
(In reply to comment #1)
> To be honest I'm not really sure what is the most correct behavior, I was just
> trying to do what's most logical. We actually have some bugs in DSF where we
> were burned by incorrect handling of canceled requests.
>
> On the question in this bug, what I implemented is to cancel the request for
> data. If the data request comes back with a CANCEL status, then the return
> data is ignored. If the return request is OK, then the data is still used to
> populate the cache. I'm not completely sure if this is correct, but it's the
> closest to the ACPM behavior that Eugene implemented.

My point is that the implementation of RM is such that when the RM is canceled, the callee need not put a CANCEL status in the RM; the RM implicitly gets a CANCEL status. ACPM transactions work with RMs, and the behavior for those RMs should not be different, otherwise we've created a pretty significant pitfall.

Created attachment 182740 [details]
Patch to prevent cache from going valid on cancel.

(In reply to bug 310345 comment #50)
> I think it's more than that. The real problem is that the object is flipped
> into the valid state. I.e., if we remove the cancel aspect of the current
> behavior, we end up with a valid cache object, and that's not good either,
> since it hasn't really been updated. I think the solution would be to leave the
> cache object as is but naturally its RM would have to move into the cancel
> state. Maybe that's what you're suggesting.

Maybe we should keep the cancellation discussion in this bug; the other bug is going to be too long anyway. In any case, this patch would prevent the cache object from becoming valid if the request was canceled, as you suggested also. Logically, if the data source request is canceled then there should be no clients waiting for the result. Also, if the data source returns a CANCEL status, then the RM should be canceled, but I wasn't sure whether to make this an assertion or to guard for it as I did.
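The implicit-CANCEL behavior John describes, which is also restated later in this thread as "a canceled object's getStatus() will always return CANCELED regardless of what status was actually set into it", can be sketched with a toy class. This is an illustrative model, not the real `org.eclipse.cdt.dsf.concurrent.RequestMonitor`:

```java
// Toy request monitor: once canceled, getStatus() reports CANCEL no matter
// what status the callee explicitly set (or failed to set).
class ToyRequestMonitor {
    static final int OK = 0, CANCEL = 1;

    private int status = OK;
    private volatile boolean canceled;

    void cancel()            { canceled = true; }
    boolean isCanceled()     { return canceled; }
    void setStatus(int s)    { status = s; }

    /** A canceled RM implicitly reports CANCEL, even if the callee set OK. */
    int getStatus()          { return canceled ? CANCEL : status; }
}
```

Under this model, a callee that aborts on cancel without touching the status does no harm, which is exactly why requiring an explicit CANCEL status elsewhere would be the "significant pitfall" John warns about.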
(In reply to comment #0)
> I've added two additional test methods to CacheTests with documentation. They
> describe and prove the inconsistency. One of the tests fail. They should both
> succeed.

I don't see these, did you mean to attach them?

(In reply to comment #2)
> My point is that the implementation of RM is such that when the RM is canceled,
> the callee need not put a CANCEL status in the RM. The RM implicitly get a
> CANCEL status. ACPM transactions work with RMs, and the behavior for those RMs
> should not be different, otherwise we've created a pretty significant pitfall.

Good point. This implementation was added to address some of the bugs we were seeing in the DSF UI w.r.t. requests being canceled but stale data still being returned to the view.

In the RequestCache implementation I changed the data source RM so that it preserves the data and status set by the data source into the cache. If the source didn't set a cancel status, then the result is written to the cache and the cache becomes valid. I'm not really sure if this is the right thing to do though, as it did get us into trouble in the past (as mentioned above).

Created attachment 182747 [details]
Additional tests
(In reply to comment #4)
> I don't see these, did you mean to attach them?

Yes. Sorry. I've attached them now. However, they need to be adjusted to expect the cache to remain as-is. I can do that, or if you want to... either way.

> (In reply to comment #2)
> Good point. This implementation was added to address some of the bugs we were
> seeing in the DSF UI wrt requests being canceled but stale data still being
> returned to the view.
>
> In RequestCache implementation I changed the data source RM so that it will
> preserve the data and status set by the data source into the cache. If the
> source didn't set a cancel status then the result is written to cache and cache
> becomes valid. I'm not really sure if this is the right thing to do though, as
> it did get us into trouble in the past (as mentioned above).

Hm. Maybe I'm not reading your fix correctly, but it seems to me the

    || !isCanceled()

check covers the case where the source neglects to explicitly set a CANCEL status in the RM. I.e., if the RM was canceled, then we decline an update to the cache data/status regardless of the status the source is trying to set. This to me seems consistent with the RM behavior that a canceled object's getStatus() will always return CANCELED regardless of what status was actually set into it.

(In reply to comment #6)
> Yes. Sorry. I've attached them now. However, they need to be adjusted to expect
> the cache to remain as-is. I can do that, or if you want to...either way.

Thanks. I think we need to decide on the correct behavior collectively here and move on. The use cases for cancel that we have now are ones where a viewer cancels a request for data because it receives an event that the data has changed. For these, it would be better to prevent a canceled request from writing data to the cache. However, there are other possible use cases where it would be better to allow the data to be written.

For example, consider a table that is scrolled very fast and cancels requests for data that was scrolled out of the viewport before it was fetched from the data source. In that case, allowing data to populate the cache after the request was canceled would make the user experience better. But I have to admit that the flexible hierarchy tree viewer we have is not that smart.

So, I'm not really sure which is better. I would lean towards preserving the current behavior, simply because it's already there and it should be more efficient. If we run into problems we could change it then?

> Hm. Maybe I'm not reading your fix correctly, but it seems to me the
>
>     || !isCanceled()
>
> check covers the case where the source neglects to explicitly set a CANCEL
> status in the RM. I.e., if the RM was canceled, then we decline an update to
> the cache data/status regardless of the status the source is trying to set.

The "!isCanceled()" part is meant to guard against the case where the data source returns a CANCEL status, but the cache was not actually canceled. In this case, there may be clients waiting for the cache to become valid, therefore set() has to be called to complete their requests. However, this behavior is specifically prohibited by the API and there's an assertion against it in RequestMonitor, but I wasn't sure whether this logic should also use an assertion or be tolerant of it using this test. I didn't intend, as you described, to prevent the cache from being set if the cache is canceled but the data source returned a non-cancel value. That's why the two tests are connected by a logical OR.

(In reply to comment #6)
I took a fresh look at it, and indeed I was confused. That check is a bit tricky to grasp.
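The check being debated can be paraphrased as a small truth function. This is a sketch with illustrative names; the actual patch operates on an `IStatus` and the cache's request monitor, not booleans:

```java
// Paraphrase of the guard under discussion: the source's result is written
// to the cache when either
//  - the returned status is not CANCEL (clients may be waiting on the cache,
//    so set() must run to complete their requests), or
//  - the cache's own RM was never actually canceled.
class CacheGuard {
    static boolean shouldSet(boolean statusIsCancel, boolean rmCanceled) {
        return !statusIsCancel || !rmCanceled;
    }
}
```

Note that `shouldSet` is false only in the one case where the source reported CANCEL *and* the RM itself was canceled, which is why John initially misread which case the `!isCanceled()` term was guarding.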
Basically, it says: always accept the source's data+status if the RM wasn't canceled. If the RM *was* canceled, then accept the data+status only if the given status is not CANCEL. And indeed this puts an obligation on the source to be diligent and explicitly set a CANCEL status if it aborts because the RM was canceled. As such, this is inconsistent with the behavior that you gave RM to solve your issue with stale data in the GUI on cancelations.

It seems to me you want this inconsistency in order to allow a cache source's data to count if it comes in despite the RM having been canceled. I would do away with that. If the RM has been canceled, it's been canceled. Too bad if the source completes the request despite the cancel request. Ideally, I would have liked the basic RM behavior not to have gone that way. I.e., I would have preferred to see service implementations get fixed and be made to explicitly return a CANCEL status when they abort because of a cancel. But given that wasn't done, I think we now need to live with that decision and try to be consistent.

Looks like we both were updating at the same time...

(In reply to comment #7)
> So, I'm not really sure which is better. I would lean towards preserve the
> current behavior, simply because it's already there and it should be more
> efficient. If we run into problems we could change it then?

I'd rather agree on a direction and stick to it. Changing behavioral contracts is never a good thing. It's worse than physically changing interfaces, where you can at least count on the compiler to tell you something has changed and where exactly you need to adjust for it.

I tracked down the reason for the RM behavior (I had to look in DD CVS for it :-)). See bug 233338 and bug 232573. The reason for this behavior has more to do with the structure of the RM API than with any bad cache behavior. That said, changing the RM behavior may be rather destabilizing. So in that sense I agree with you: it's better to make the cache behavior consistent here. Do you want to go ahead and make the changes?

(In reply to comment #11)
> I tracked down the reason for the RM behavior (I had to look in DD CVS for it
> :-)). See bug 233338 and bug 232573. The reason for this behavior is more to
> do with the structure of RM API rather than any bad cache behavior. That said,
> changing the RM behavior may be rather destabilizing. So in that sense I agree
> with you, it's better to make the cache behavior consistent here.
>
> Do you want to go ahead make the changes?

Hm. I can't say I understand what you're getting at in https://bugs.eclipse.org/bugs/show_bug.cgi?id=232573#c6 but I guess that's irrelevant at this point. If we're in agreement about what to do now, then there's no need to understand the past :-)

OK. I'll apply your patch locally and then adjust it so that we always disregard the data+status from the source if the RM has been canceled.

(In reply to comment #12)
> Hm. I can't say I understand what you're getting at in
> https://bugs.eclipse.org/bugs/show_bug.cgi?id=232573#c6 but I guess that's
> irrelevant at this point. If we're in agreement about what to do now, then
> there's no need to understand the past :-)

What happened is that in various places in the code we made the assumption that if RM.isSuccess() == false, then getStatus().isOK() == false. However, in the case where an RM was canceled, isSuccess() would return false, but getStatus() could still return Status.OK. What happened next is that the OK status was saved to the cache but the data was not, which eventually resulted in an NPE. At the time, I think I could have gone back and tried to fix the places where we made this assumption, but it was already RC1. Also, someone else would eventually make the same mistake...

(In reply to comment #13)
> However in the case where an RM was canceled, isSuccess() would return false,
> but getStatus() could still return Status.OK.

OK. That makes more sense. The sentence "could store a null value for data even if getStatus() returned IStatus.OK" threw me off, because the *else* path for 'if (isSuccess())' doesn't store data. But I see what you meant now. That code could leave the RM's data untouched and give it an OK status; the default data for an RM is null.

Yeah. It seems to me the ideal solution would have been to transfer both the data and the status regardless of whether the requestor canceled the RM. I.e., isSuccess() should not have taken the cancel state into account; it should merely reflect whether the request was completed. But that also would require the callee to be diligent about setting the status object in the event of a cancelation. Indeed, the change you went with was the safest given the timeline.

Created attachment 182785 [details]
Patch to prevent cache from going valid on cancel
This takes Pawel's patch and adjusts the behavior to always ignore the retrieval result once the cache object's RM has been canceled. Pawel, please review.
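The behavior the thread converges on can be sketched as follows. This is an illustrative model, not the committed `AbstractCache` code; names and the boolean status are simplifications:

```java
// Final agreed behavior: once the cache's request monitor has been canceled,
// the retrieval result is discarded entirely, whatever status the data source
// set. This also sidesteps the earlier pitfall where a canceled RM could
// report isSuccess() == false while its status object was still OK and its
// data was still null, leading to an OK status cached alongside missing data.
class ToyCacheUpdate {
    Object data;
    boolean valid;

    void retrievalCompleted(boolean rmCanceled, Object result, boolean statusOk) {
        if (rmCanceled) {
            return;            // canceled means canceled: drop data and status
        }
        if (statusOk) {
            data = result;     // only a successful, un-canceled request
            valid = true;      // makes the cache valid
        }
    }
}
```

Compared with the earlier `!statusIsCancel || !rmCanceled` guard, this removes the source's obligation to explicitly set a CANCEL status, making the cache consistent with the RM's implicit-cancel contract.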
Looks good.

Committed to HEAD.

*** cdt cvs genie on behalf of jcortell ***
Bug 329481: Inconsistency in DSF ACPM cancel behavior.

[*] AbstractCache.java 1.6
    http://dev.eclipse.org/viewcvs/index.cgi/org.eclipse.cdt/dsf/org.eclipse.cdt.dsf/src/org/eclipse/cdt/dsf/concurrent/AbstractCache.java?root=Tools_Project&r1=1.5&r2=1.6
[*] RequestCache.java 1.5
    http://dev.eclipse.org/viewcvs/index.cgi/org.eclipse.cdt/dsf/org.eclipse.cdt.dsf/src/org/eclipse/cdt/dsf/concurrent/RequestCache.java?root=Tools_Project&r1=1.4&r2=1.5
[*] CacheTests.java 1.6
    http://dev.eclipse.org/viewcvs/index.cgi/org.eclipse.cdt/dsf/org.eclipse.cdt.tests.dsf/src/org/eclipse/cdt/tests/dsf/concurrent/CacheTests.java?root=Tools_Project&r1=1.5&r2=1.6