Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.
Bug 335830 - RecoveringCDOSession attempts recovery regardless of exception type
Summary: RecoveringCDOSession attempts recovery regardless of exception type
Status: CLOSED FIXED
Alias: None
Product: EMF
Classification: Modeling
Component: cdo.core (show other bugs)
Version: 4.0   Edit
Hardware: All All
: P3 normal (vote)
Target Milestone: ---   Edit
Assignee: Caspar D. CLA
QA Contact: Eike Stepper CLA
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-01-31 05:56 EST by Caspar D. CLA
Modified: 2011-06-23 03:38 EDT (History)
2 users (show)

See Also:
stepper: review+


Attachments
Patch v1 (987 bytes, patch)
2011-02-04 01:17 EST, Caspar D. CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Caspar D. CLA 2011-01-31 05:56:27 EST
See summary.

Solution is to inspect the exception type and determine whether
it's really indicative of a network failure. If so, execute
existing logic; if not, rethrow.
Comment 1 Eike Stepper CLA 2011-02-03 08:41:07 EST
Side note: In bug 329254 I've just changed the commit behaviour on the server so that it should never never propagate any exception instances, but rather always use the regular commit protocol (response.result.rollbackMessage)
Comment 2 Caspar D. CLA 2011-02-04 00:02:40 EST
Note:

The recovery mechanism deals with 2 possible problems: one is deactivation
of the sessionProtocol (intercepted with
RecoveringCDOSessionImpl.sessionProtocolDeactivated), the other is the
occurrence of TransportExceptions during execution of methods on the
CDOSessionProtocol (intercepted with
RecoveringExceptionHandler.handleException). 

(Which is more likely to occur is platform dependent: a deactivation is
unlikely on Linux because the OS won't invalidate sockets even if the
physical interface fails.)

While the latter problem typically occurs in the application thread, the
former does not. Therefore, when deactivation occurs and is being handled,
it's possible for NPE's to occur in the application thread when it makes
calls on the protocol, because it's possibly calling on the deactivated
protocol which hasn't yet been replaced with a new one.

The recovery mechanism itself won't do anything to prevent this. It's up to
the user app to deal with this. To this end, it can listen to
CDOSessionRecoveryEvents coming from the session. As long as no 'FINISHED'
event has been received (call CDOSessionRecoveryEvent.getType()), it is
unsafe to make any calls on the session.
Comment 3 Caspar D. CLA 2011-02-04 01:17:31 EST
Created attachment 188290 [details]
Patch v1

It's a big one!
Comment 4 Eike Stepper CLA 2011-02-04 03:06:59 EST
Strange, I can not apply your patch. But https://bugs.eclipse.org/bugs/attachment.cgi?id=188290&action=diff shows your idea. I'm not sure but  would it be good to have IOException and IORuntimeException on that list, too?

BTW. I feel that we should come up with a general exception handling strategy that enables us and our consumers to more easily and more specifically react to exceptional situations. Something in net4j.util could provide a useful and *semantic* categorization similar to what the SpringFramework does with its abstraction layer over SQL vendor specific error codes and SQLExceptions. Careful consideration of checked exception versus unchecked exception should also be part of that strategy.

That would be something with a high long term effect and it would be hard to change every now and then. Hence it should be discussed at least by the whole team. I've filed bug 336318 for this discussion...
Comment 5 Caspar D. CLA 2011-02-07 01:45:22 EST
(In reply to comment #4)
> Strange, I can not apply your patch. But
> https://bugs.eclipse.org/bugs/attachment.cgi?id=188290&action=diff shows your
> idea. I'm not sure but  would it be good to have IOException and
> IORuntimeException on that list, too?

I doubt it. What if the protocol does some IO that has nothing to do
with the transport layer?

The way I see it, if an IOException occurs in the transport layer, and
bubbles up to this level without being wrapped in a TransportException,
then apparently our existing exception-wrapping strategy in inconsistent,
and we'd better fix that instead of adding various exception types here.
Comment 6 Eike Stepper CLA 2011-02-07 03:51:05 EST
(In reply to comment #5)
> I doubt it. What if the protocol does some IO that has nothing to do
> with the transport layer?

Granted.

> The way I see it, if an IOException occurs in the transport layer, and
> bubbles up to this level without being wrapped in a TransportException,
> then apparently our existing exception-wrapping strategy in inconsistent,
> and we'd better fix that instead of adding various exception types here.

I agree. Note that TransportException is a CDO concept. Doesn't that mean that Net4j can not propagate transport IOExceptions separately from protocol IOExceptions?
Comment 7 Eike Stepper CLA 2011-02-10 10:34:08 EST
Seems I forgot to approve the patch...
Comment 8 Caspar D. CLA 2011-02-11 00:11:10 EST
Committed to trunk, rev. 7057
Comment 9 Caspar D. CLA 2011-02-11 00:14:22 EST
(In reply to comment #6)
> Note that TransportException is a CDO concept. Doesn't that mean that
> Net4j can not propagate transport IOExceptions separately from protocol
> IOExceptions?

I'm not sure I understand the question.

But it does seem that it would make more sense for
TransportException to be a Net4J type.
Comment 10 Eike Stepper CLA 2011-06-23 03:38:31 EDT
Available in R20110608-1407