Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 332589

Summary: specify concurrency model for EMF data model
Product: z_Archived Reporter: Steffen Pingel <steffen.pingel>
Component: MylynAssignee: Project Inbox <mylyn-triaged>
Status: RESOLVED FIXED QA Contact:
Severity: enhancement    
Priority: P3 CC: alvaro.sanchez-leon, Ed.Merks, milesparker, sebastien.dubois, stepper
Version: unspecified   
Target Milestone: ---   
Hardware: All   
OS: All   
Whiteboard:
Bug Depends on:    
Bug Blocks: 400171    

Description Steffen Pingel CLA 2010-12-14 18:36:24 EST
The review data model will be accessed concurrently from different thread, e.g. when an update to a Gerrit review is received while review results are being displayed. Since EMF does not have any built-in mechanism to handle concurrent access the framework will need to specify a model, e.g. through EMF transactions, locking or a similar mechanism.
Comment 1 Miles Parker CLA 2012-09-18 15:59:42 EDT
First, I had been going on assumption that Gerrit Task submission model follows standard Mylyn Task model of essentailly optomistic record updates, with a global pull to get current data into local store, followed by numerous client updates, and then a global push back for the review. Is this not the case?  

For the stated Use Case itself, it seems that R4E team has already implemented a lot of this!

What would be remaining to do here would be to adapt their approach to a more generic EMF Command framework based-solution which we may want/need to do anyway. I think one strategy that might work would be to build a custom command stack implementation that essentially advised any relevant operations n the model such that the locking mechanism would be invoked.

This would have the very significant side benefit of getting rid of a lot of code maintenance for R4E WRT record locking, remove some UI/logic dependencies, and solve the mismatch between r4E locking mechanism and a generic Mylyn oriented. The costs would be significant however -- it would mean wholesale adoption of the EMF command framework on the R4E side. So may not be something for the short term.

What do people think about the general idea?
Comment 2 Steffen Pingel CLA 2012-09-18 21:56:41 EDT
At the moment the EMF reviews model is only used in the UI and accessed from a single thread (mostly UI thread) at a time without any explicit locking place.  My sense is that CDO would be the only viable approach to support proper concurrent access to models. I'm worried that a simple locking mechanism could result in a less responsive UI once synchronization with review repositories comes into play.
Comment 3 Sebastien Dubois CLA 2012-09-19 16:31:08 EDT
Actually if the data is pulled out of the global (shared) data and used to rebase a local copy (locally, so there is no concurrency issues), then the local copy will contain the up to date data.  In this case it is just a matter of replacing completely the global data with the rebased local data and the synchronization mechanism could be rather trivial no?  Just a thought.
Comment 4 Steffen Pingel CLA 2012-09-19 18:02:19 EDT
(In reply to comment #3)
> Actually if the data is pulled out of the global (shared) data and used to
> rebase a local copy (locally, so there is no concurrency issues), then the local
> copy will contain the up to date data.  

Just to clarify, this bug isn't about conflicts resulting out of differences in local and remote data but about the technical concerns when handling EMF models on different Java threads. If the updated model is retrieved from the server and then merged into the locally cached model that is visible in the UI you need some sort of locking mechanism to ensure integrity.
Comment 5 Miles Parker CLA 2012-09-19 18:15:55 EDT
(In reply to comment #4)
> Just to clarify, this bug isn't about conflicts resulting out of differences in
> local and remote data but about the technical concerns when handling EMF models
> on different Java threads. If the updated model is retrieved from the server and
> then merged into the locally cached model that is visible in the UI you need
> some sort of locking mechanism to ensure integrity.

Ah, I had misinterpreted. This part is actually really simple. The only option is just to block on the UI thread when the actual update needs to occur, making that as low-imapct as possible obviously, by doing the actual IO and everything else you can in a separate job and just doing the actual reconciliation in UI. It's obviously not ideal, but given that any arbitrary updates can come off of the UI thread, it's the only safe way to do it.

I think Sebastien and I were both thinking of the case where multiple clients (i.e. Eclipse instances) try to go against the same data store. I agree with Steffen that CDO -- or perhaps some simpler approaches that would work for this case -- is really the best ultimate solution. I'm pointing out that the r4E approach could be supported here as well and that a generic EMF based approach would allow us to swap in whatever sort of implenentation we wanted there.
Comment 6 Steffen Pingel CLA 2012-09-19 19:44:34 EDT
The UI locking approach is problematic when you need access to parts of the model on the thread that is handling communication with the repository, e.g. to lookup properties of existing elements. You end up cloning model elements and the concurrency handling code ends up all over the code base. I tried that in the builds framework  and I consider that attempt failed even though "it works". I don't think it's a concern for reviews at the moment since there isn't any persistence or model instance management in the review framework but I would suggest to explore alternatives such as CDO if that changed.
Comment 7 Miles Parker CLA 2012-09-25 16:15:34 EDT
Note that you generally don't need to block on reads in practice and they are simple Java API operations. Command stack execution is that part that need to happen within UI context.

CDO may be heavy-weight for this case. We ought to be able to do most of this with EMF transaction support.
Comment 8 Steffen Pingel CLA 2012-09-25 22:34:50 EDT
I looked at EMF transactions and I was not convinced by the programming patterns or locking paradigm. My understanding was that you end up with one big lock which isn't much better than doing all operations on the UI thread.
Comment 9 Miles Parker CLA 2012-09-25 23:09:13 EDT
(In reply to comment #8)
> I looked at EMF transactions and I was not convinced by the programming
> patterns or locking paradigm. My understanding was that you end up with one
> big lock which isn't much better than doing all operations on the UI thread.

Yeah, the one thing it does buy you though is the ability to control the atomicity of the transaction and handle rollbacks etc.

In practice though I haven't needed to use it. The single UI thread model is perfectly perfromant for anything I've thrown at it. Do we have an explicit case that doesn't work well? If so, I could take a look and see if there aren't some non-obvious solutions.

I've built EMF models with dozens of threads pushing to a model with half a dozen active UI views coming off of it. The key thing there was to make the read stuff fail-safe -- for example if you get a CME, you just back off and try again later -- and that may not be an option here.
Comment 10 Steffen Pingel CLA 2012-09-25 23:22:03 EDT
(In reply to comment #9)
> I've built EMF models with dozens of threads pushing to a model with half a
> dozen active UI views coming off of it. The key thing there was to make the read
> stuff fail-safe -- for example if you get a CME, you just back off and try again
> later -- and that may not be an option here.

It's definitely not an option. The concurrency model needs to ensure data consistency.
Comment 11 Miles Parker CLA 2012-11-17 15:20:56 EST
Okay, there are two ways of doing this and both involve some extra implementation bits.

First to Steffen's point, I think the issue of ending up with one big lock is essentially unavoidable. See bug 337382 comment 6. Note that that is really for a deep reason, it isn't some kind of EMF API hang-up. It's really no different from *any* data structure that allows itself to be accessed from UI. The reason we see it as a special problem with models is simply that models tend to do a lot more!

That naturally implies that you have to lock the model against the UI thread. Of course, you can implement more fine-grained locking, but the basic issue remains. Again, in practice, I've never seen this as a concern as long as you keep any writes to shared EMF model as discrete, atomic, synchronous and quick. This means for example in there case of grabbing data from the Gerrit backend that we should grab what we need and once we're sure we've got it all, *then* write it back to the model.

Note that it is perfectly ok for example to build up a Review for example outside of UI thread, and then simply add the review to the model that is used in the UI thread once you've built it. All that is to say that there is no general solution and it is up to the implementor to think through specific cases. As long as the changes you are making are isolated form the rest of the model, you're good, and I think that's a key principle to keep in mind when designing for use with a high-latency interaction. And I think the current Gerrit implementation does a good job of that by simply keeping all of the ReviewItemSets in entirely different spaces. But as we integrate things (bug 394020) more, we need to keep these issues in mind.

That doesn't mean we shouldn't have a locking mechanism. However, all of the locking mechanisms involve use of an Editing Domain, because you want a command stack to run everything through. I like an approach that simply synchronizes against the command stack for writes. Note that doesn't handle the issue that read access which are technically not thread safe either! Therefore we need to be careful about reads that involve lists; basic getter and setter calls are not generally an issue (except for the obvious problem of contention) but enumerations against lists should be taking place against the UI thread.

The other downside of editing domain approach is that you have to use the command API which some people find too fidly. I really don't care, but using the transaction API allows people to use more natural Java style semantics for getting and setting values and prevents people from making an end-run around them. I'm not exactly sure what performance hit is for other transaction features that we don't need, but there is certainly a bit more housekeeping involved. On the other hand, Transaction does provide some nice features like supporting a discoverable common editing domain.
Comment 12 Eike Stepper CLA 2012-11-18 01:53:06 EST
(In reply to comment #11)

Please note that even reading EMF models is inherently unsafe for multiple threads. I think you've already mentioned the access to many-valued features (lazy list instantiation). Also the resolution of proxies is problematic.
Comment 13 Miles Parker CLA 2012-11-18 14:42:36 EST
(In reply to comment #12)
> (In reply to comment #11)
> 
> Please note that even reading EMF models is inherently unsafe for multiple
> threads. I think you've already mentioned the access to many-valued features
> (lazy list instantiation). Also the resolution of proxies is problematic.

Yep. But I'm not really familiar w/ the proxy resolution issue -- that's potentially a big deal because the existing R4E models and a number of other potential scenarios make heavy use of proxies. Do you have any references or background for that?

I don't think our existing solution is mature/complete enough to throw CDO into the mix yet, but we'd like to lay good groundwork for that and supporting as flexible back-end support as we can. One of the things I'm wondering about is whether to adopt the command pattern for all shared model edits. Personally, I really like the flexibility and generality that comes from being able to treat features as first class objects, but again I think people can find the semantics off-putting. What's your favorite pattern in CDO usage?

Any other insights you can share about working on a purely local, mostly UI-centric model would be most appreciated. :)
Comment 14 Miles Parker CLA 2013-03-08 20:01:31 EST
See bug 400270 comment 9 and https://git.eclipse.org/r/#/c/11012/ -- I'll be experimenting with concurrency implementations here. Right now we're using a Transaciton Domain. I don't actually anticipate any issues w/ an OOB implementation based on the remote API we're developing -- which clearly isolates UI calls form remote calls, but we'll certainly want to test that theory!
Comment 15 Miles Parker CLA 2013-04-11 23:45:25 EDT
Without using Transactions or any other more complex mechanisms, I believe the current implementation demonstrates that best solution is simply ensuring that model updates occur where they should -- in the UI thread -- and *all* other interactions such as remote API calls occur outside of the UI thread.

See https://git.eclipse.org/r/#/c/11012 for latest implementation of Remote API. The Remote API provides a number of features that encourage/force the user to update EMF models properly. Users simply need to implement AbstractRemoteEmfFactory and make appropriate requests to the RemoteEmfConsumer to ensure proper concurrency support.

Ed, I've cc'd you because I'd appreciate any comments you might have on the overall approach.
Comment 16 Eike Stepper CLA 2013-04-12 03:14:53 EDT
(In reply to comment #15)
> [...] I believe
> the current implementation demonstrates that best solution is simply
> ensuring that model updates occur where they should -- in the UI thread --

That sounds totally wrong to me. I think the UI thread is called UI thread because UI updates *must* occur there and all other logic *should* be executed in other threads or, at leat, be kept to a minimum. If other logic is absolutely needed in the UI thread it should be fully known, known to be fast and to be non-blocking. Model updates, though, may (and typically will) involve synchrounous adapter notification and the execution of arbitrary unknown code.
Comment 17 Miles Parker CLA 2013-04-12 13:47:27 EDT
(In reply to comment #16)
> (In reply to comment #15)
> > [...] I believe
> > the current implementation demonstrates that best solution is simply
> > ensuring that model updates occur where they should -- in the UI thread --
> 
> That sounds totally wrong to me. I think the UI thread is called UI thread
> because UI updates *must* occur there and all other logic *should* be executed
> in other threads or, at leat, be kept to a minimum.

Perhaps we have a confusion in terms, perhaps because of history of MVC etc... I thought we had established that for non-CDO case, model changes must occur in UI (or at least EMF specific) thread.

At some point, absent a framework like CDO, you're going to have to do the update that actually drives a change in the UI, and you're going to need to ensure that all model changes happen on one and only one thread.  Model changes are going to end up as UI calls at some point. So you need a firewall somewhere so that long-running/blocking operations do not end up there, and the question is just where to put that. You could have a separate EMF model update thread, and not touch the EMF model from anywhere else, but that would be extremely tedious, especially given that almost all of your interactions with the model will be happening from UI thread. That would also mean that all of your notifications would have to create UI async executions.

W/ the remote model API as designed you do all of the "real" work, including complex logic, blocking calls, etc.. in the pull API and then the updateModel calls are entirely copy operations. (See e.g. contract in https://git.eclipse.org/r/#/c/11012/15/org.eclipse.mylyn.reviews.core/src/org/eclipse/mylyn/reviews/core/spi/remote/emf/AbstractRemoteEmfFactory.java L323 L 337)

> If other logic is absolutely
> needed in the UI thread it should be fully known, known to be fast and to be
> non-blocking. Model updates, though, may (and typically will) involve
> synchrounous adapter notification and the execution of arbitrary unknown code.

Well, the notification consumer must ultimately be responsible for ensuring that they are handling the notifications efficiently and correctly, and it's a good point that you can't control for that, but that's a common issue for all EMF implementors. The remote API actually isolates this so that consumers receive higher level notifications. Again, for ease of use, these are assumed to be taking place on UI thread, because in all common cases, that's exactly where they should be used and it would be really awkward if they weren't. But it occurs to me that it would be nice to be able to specify when adding an observer whether the notification should occur on a non-user thread...
Comment 18 Miles Parker CLA 2013-04-12 20:33:52 EDT
I should have noted that whether the model updates take place on the UI thread is simply a policy, not a requirement of the framework. At either the AbstractRemoteService or AbstractRemoteFactoryProvider level you can implement implement any flavor of update. We should be able to evolve our approach overtime without affecting the key consumers APIs. For example, I've experimented with extending the factory provider to use a TransactionalEditingDomain instead, and that works fine.

There are two aspects of the design that make it possible. The AbstractRemoteEmfFactory provides a clean way for implementors to separate the remote/async and local/sync behavior. The AbstractRemoteConsumer is basically a proxy for a single model<->remote pair. The model object doesn't actually even have to refer to a stable instance, so for example you could execute the model thread on a dedicated thread, and then swap objects when it it updated. But of course you'd lose finer grained emf notifcations in that case, so I don't think it would be warranted. But, if we wanted to support background updating for non active editors and other review updates, those could all occur off the UI threads.

In any case as I say, in the real-world the updates even for very large patch sets are not even detectable.
Comment 19 Eike Stepper CLA 2013-04-14 02:27:07 EDT
(In reply to comment #17)
> At some point, absent a framework like CDO, [...]

So that sounds as if you're suggesting complex and possibly bad policies/mechansims just to avoid the use of an existing technology that has been designed to solve all these problems (and more) in a nice way.

> Model
> changes are going to end up as UI calls at some point. So you need a
> firewall somewhere so that long-running/blocking operations do not end up
> there, [...]

Yes. There are existing technologies such as EMF data binding to solve this problem in a nice way.

> [...] especially given that almost all of your
> interactions with the model will be happening from UI thread. 

While this sounds right at a first glance this argument is misleading. As you can normally not determine from a genral piece of runnable code how long it will actually take to run you will will normally run the code in a background job anyway so that the UI is not frozen during this time. One could argue that model updates always happen on a non-UI thread.
Comment 20 Miles Parker CLA 2013-04-15 13:36:03 EDT
Hi Eike,

First, I'm worried this is turing into a CDO vs. Plain Old EMF discussion. :) I think we all see the value of CDO and I'm really looking forward to getting deeply into it -- it's especially suited to issues bound to come up the M4 work -- but I don't think CDO or not is actually the issue here. We don't have to worry about a lot of issues that CDO and even Data Binding are trying to solve. They are really potentially complimentary technologies rather than competitors.

To understand why I think that, I need to make our use case and challenges a lot clearer.  As you know, when you have a simple use case, it can be better to adopt a more simple solution to address that issue, as it avoids exposing the code-base to deep issues that might not even be operative. At the same time, there are issues that are actually more generic / less-EMF specific. Here's what we're aiming at..

1. UI focus: The driver for everything we're doing is to make the remote sourced information available to the user against a common shared-model. In that sense, the EMF model is really closer to the role of a JFace table or something than a distributed storage system itself. We're basically only surfacing. (As a thought experiment, you could even imagine a CDO model serving as a "remote" to this system.) So that's where the focus on UI interaction comes from.

2. Remote Single System of Record: We are not responsible for or even capable of managing multiple changes to the tasks. The back end remote is god. So pair-wise resolution is not an issue.

3. Large change granularity: At the same time, if you look at how Mylyn interacts with a back end data source, the granularity of interaction is actually quite large. Typically, an entire task is updated all at once. We don't need -- or want, really -- attribute level update notification.

4. Low computation: There isn't much logic on the model side at all, and what there is is typically handled as part of the remote interaction.

5. Offline Access: A key aspect for Mylyn is the offline case. We need to allow the user to make changes and then later submit those as part of a single (supervised) operation. Contention for a single task is very rare, and is typically handled optimistically, with worse case literally throwing away local changes on user approval. (I said it was simpler. :) For the current case, we don't even have to worry about modifications!)

6. Simple API for Heterogenous back-ends: The most important thing we need to do is to provide a very high-level API for implementors of other review and task platforms to connect to the remote API and turn it into an EMF model in a consistent way, and to handle notifications and so on in a way that fits into the existing Mylyn APIs. This is actually the driver for all of the Remote API we're providing and is the most important aspect of the API to get right.

As I said, even if we were to use CDO or other approaches (and we still can!), we would still want to provide most of the same high-level API. The bottom-line is that you need some very clean high-level mechanism to a) "force" (to whatever extent possible!) implementors to separate the remote/long-running stuff from the local/fast stuff, b) provide "record" level notification of changes, update status, etc.. and c) protect the model in the process and that's all we're doing here.

(In reply to comment #19)
> (In reply to comment #17)
> > At some point, absent a framework like CDO, [...]
> 
> So that sounds as if you're suggesting complex and possibly bad
> policies/mechansims

I'm not sure if you mean "possibly bad" in the sense that it "is possible for something to be bad" or if some specific thing is possibly bad. ;) If the latter, I want to know so we can fix it. :)

> > Model
> > changes are going to end up as UI calls at some point. So you need a
> > firewall somewhere so that long-running/blocking operations do not end up
> > there, [...]
> 
> Yes. There are existing technologies such as EMF data binding to solve this
> problem in a nice way.

I don't know if you've seen this document: https://docs.google.com/document/d/1QmH4ZPTqXd8k6ammw9nJBVqF-oVELqisHySPpf6bFi8/edit ;-)

"Realms were introduced with the intent of protecting users from themselves in multi-threaded programming.  This feature has turned out to be a huge mistake.."

"Poor lifecycle management of observables" "IObservable.dispose() sucks. Disposed Observables are unusable forever. This would be uneccsary if we had an API to tell an observable to detach all listeners."

(In fact, the Reviews Remote supports exactly that.)

Really, I think EMF data-binding is great, but it's solving a different issue then we have. Again, we'd need to provide the higher level API in any case.

> > [...] especially given that almost all of your
> > interactions with the model will be happening from UI thread. 
> 
> While this sounds right at a first glance this argument is misleading. As
> you can normally not determine from a genral piece of runnable code how long
> it will actually take to run you will will normally run the code in a
> background job anyway so that the UI is not frozen during this time. One
> could argue that model updates always happen on a non-UI thread.

Okay, take the case where you are surfacing the _entire data model_ to the user _all at once_. And you have a remote model that can change _its enire contents_ arbitrarily at record-level granularity, but you don't know the delta except by touring the whole model. You could try to avoid that by say first doing a compare off the UI thread between the model and the remote API object, and then within the UI thread just update those features that have actually changed on the UI "original" of the model. But to do that, you first would have to ensure that you have a complete up to date copy of the UI model, and you'd need to do that in the UI thread... :)

I keep coming back to the argument that to the extent the model is exposed to the user then it becomes in that sense a user model. At that point, it's a data structure that should be exposed directly to the UI thread, in the same way you'd expose any other UI surfacing data structure. Of course, complex logic and so on should be away from that, but then you need some mechanism to protect it and the only way to do that is to do the actual final changes to the model on the UI thread itself.
Comment 21 Eike Stepper CLA 2013-04-17 01:35:01 EDT
Hi Miles,

Don't worry, whatever architecture and design you choose will not impact our friendship. This is no angry dispute ;-)

I do have the impression, though, that there is misunderstanding regarding the architecture I have in mind when I put CDO on the table. So let me elaborate:

It's clear that the remote backend system is god and that offline access is a key aspect to Mylyn. I didn't want to suggest a remote CDO server but the usage of a locally embedded CDO repository, possibly with a light-weight embedded database such as the EPL-licensed H2. You'd need the same background logic for the remote synchronization as with any other design, but it would access the model end through a dedicated CDOTransaction. 

The UI, which you put so much emphasis on, would also operate on the model through a number of CDOViews and CDOTransactions which are (on a core level) fully separated from each other, but can still transparently communicate with each other through commits to the embedded repository.

With this architecture you would gain all the technical benefits that CDO brings for free: access to arbitrarily large models, thread-safety, fast queries of the entire model and many others. But IMHO the biggest advantage is that you can implement the business logic (whether that's located in your UI or not) in full separation, i.e. without even thinking (a lot) about all these technical issues, because they're fully transparent with CDO.


Next topic: UI affinity and your arguments against data binding

I must admit that I don't understand why you're putting so much focus on UIs. IMHO all UIs should always be treated as consumers of a strong core layer, just like other non-UI consumers/clients. You never know how/where others may want to (re)use your functionality, so it's wise to factor most of it into components with minimum dependencies, especially no specific UI dependencies. If Eclipse data binding has too many drawbacks (which I wasn't aware of) then there may be other technologies to bind the UI to the core model without breaking the more fundamental separation between those two layers.

I'm not a heavy user of data binding because I don't develop many UIs, but the document you're referring to could not convince me of anything. Mostly because it's totally missing to deliver reasons for a broad range of assertions it stipulates. Well, I do understand some of the complaints, but especially the assertion "realms are a huge mistake" is missing thourough explanations of why that is, as well as how alternatives could work. No doubt, from an application developer's point of view it would be nicer not to think about the darned single UI thread, but nobody complains about that beast :P
Comment 22 Miles Parker CLA 2013-04-17 14:49:09 EDT
(In reply to comment #21)
> Don't worry, whatever architecture and design you choose will not impact our
> friendship. This is no angry dispute ;-)

Heh. :D I wasn't worried about that, but we do get attached to our own solutions, and I'm always trying to fight that tendency in myself, which is the reason I'm so appreciative that you're taking so much time on this!

> It's clear that the remote backend system is god and that offline access is a
> key aspect to Mylyn. I didn't want to suggest a remote CDO server but the usage
> of a locally embedded CDO repository, possibly with a light-weight embedded
> database such as the EPL-licensed H2. You'd need the same background logic for
> the remote synchronization as with any other design, but it would access the
> model end through a dedicated CDOTransaction.

Yes, that's all inline w/ how we were thinking of employing CDO. It does seem like it would in principle be possible to simply have arbitrary writers to that database and forget supporting any higher-level API, and that is certainly tempting ;D, but I still think that isolating things to a higher level API for connector developers should be part of this.

> The UI, which you put so much emphasis on, would also operate on the model
> through a number of CDOViews and CDOTransactions which are (on a core level)
> fully separated from each other, but can still transparently communicate with
> each other through commits to the embedded repository.

Sounds complicated. ;D Seriously for our consumers, I think we do need a path that isolates from CDO API concerns. Maybe I'm misunderstanding what you're suggesting or being too conservative about this.

> With this architecture you would gain all the technical benefits that CDO brings
> for free: access to arbitrarily large models, thread-safety, fast queries of the
> entire model and many others. But IMHO the biggest advantage is that you can
> implement the business logic (whether that's located in your UI or not) in full
> separation, i.e. without even thinking (a lot) about all these technical issues,
> because they're fully transparent with CDO.

I'm very certain that we will want to get into CDO implementations, as being able to query across model, distribute model and so on will be very important for M4. Those are really the key drivers for adoption of CDO -- CDO capabilities are much more valuable to us then the CDO API per se, and I don't think run against what we offer to the consumers through a remoting API. But given the time we have left for implementation I don't think we'll be able to tackle CDO implementation right now. That doesn't mean this is moot! All of the discussion we're having here will totally impact what we need to do going forward, so it is extremely valuable.

The point of what we're doing now is to design a high-level API for reuse that we can plug into an arbitrary EMF back end implementation, while providing implementors with a stable API. My view is that no matter what we use we want to enforce a separation between a) asynchronous, failure prone remote calls, and b) model changes. For one thing it provides a more disciplined approach to what we've been doing.

The design goals we're addressing for Reviews 2.0 are actually pretty straightforward:

1) first you obtain an object representing remote data, then you apply it to the model. 
2) Given that the remote interaction has different semantics from EMF changes, provide those higher-level notifications from the remote API.

In this sense, for the 2.0 release there is one other piece that we absolutely have to get right:

3) Make sure we don't deadlock the API. :D Or more seriously, have no appreciable performance impact for workbench users. 

One big question for you is whether there is anything about CDO interaction that would drive how we're defining the Remote API now. In other words is there anything that would prevent us from simply handing off say a parent object to a consumer and allowing them to do whatever they want w/ it using normal EMF API. I think the answer to taht is no, in which case these current decisions shouldn't impact anything down the line. 

> Next topic: UI affinity and your arguments against data binding
> 
> I must admit that I don't understand why you're putting so much focus on UIs.
> IMHO all UIs should always be treated as consumers of a strong core layer, just
> like other non-UI consumers/clients. You never know how/where others may want to

Agreed in principle, but see below..

> (re)use your functionality, so it's wise to factor most of it into components
> with minimum dependencies, especially no specific UI dependencies. If Eclipse
> data binding has too many drawbacks (which I wasn't aware of) then there may be
> other technologies to bind the UI to the core model without breaking the more
> fundamental separation between those two layers.

Please don't confuse the Use Cases for API! :O If you have the time, I'd suggest you fetch https://git.eclipse.org/r/#/c/11012/16 and take a look at the code as I think you're making some assumptions about the design. Of course I understand the need to separate UI from model dependencies -- that was the key point of the entire design; to fix what was broken with the current design. Because I guess in part to the way JFace and forms API are typically employed, data and logic end up all over in Mylyn and other Eclipse UI elements and that's one of the things I'd really like to address. In the Remote design the UI concern is entirely pluggable, with one line of code. And the design enabled us to push *all* of the model and remote interaction logic that was previously in the UI editor code down to core and in the process put it under testing. I'm really happy about that. :D

All I'm trying to point out is that in the end what is driving a lot of this is the value that cleanly dealing with issue 1) provides. I can see that CDO might in theory allow us to ignore that issues altogether, but even so, I'm not convinced we'd want to because not only are UI and model separate concerns, but IMO so are model and remote. Perhaps to a lesser extent, but the issue comes up again and again w/ Mylyn connector implementations.

> I'm not a heavy user of data binding because I don't develop many UIs, but the
> document you're referring to could not convince me of anything.

I just came across it myself, and just include it to point out that dat binding is not the be-all end-all. As I say, I don't think it's that relvant to the granularity we want here either.

> No doubt, from an application
> developer's point of view it would be nicer not to think about the darned single
> UI thread, but nobody complains about that beast :P

That's the point in the end I think. You can't avoid dealing w/ the single thread at some point, and developers need a simple way to interact with that.

So to put it concretely, let's say the user wants to check to see if a change has been made in a review. At some point they need to be able to request a refresh, and then they need to be notified of that refresh for the model in a way that they can safely use. That's all were really providing at this point, and I'm hoping to get the background plumbing working well for the base case.

Again, I really appreciate your engagement on this. It's going to make a big difference for the overall architecture as we move forward.
Comment 23 Eike Stepper CLA 2013-04-30 07:23:08 EDT
Sorry for the delay, your last comment ended up in my spam folder ;-(

What I would like to suggest regarding your high-level API is probably something that would also allow for a smooth migration towards using CDO internally: A storage-agnostic model facade like this:

http://git.eclipse.org/c/cdo/cdo.git/tree/plugins/org.gastro.rcp/src/org/gastro/rcp/IModel.java

http://git.eclipse.org/c/cdo/cdo.git/tree/plugins/org.gastro.rcp/src/org/gastro/internal/rcp/Model.java

The key concepts are ITransactionalOperation and IModel.modify(T, ITransactionalOperation<T>).

What's not totally clear to me is how, without CDO, you would protect against read access while ITransactionalOperations are running.
Comment 24 Miles Parker CLA 2013-04-30 12:44:30 EDT
(In reply to comment #23)
> Sorry for the delay, your last comment ended up in my spam folder ;-(

Heh.

> What I would like to suggest regarding your high-level API is probably something
> that would also allow for a smooth migration towards using CDO internally: A
> storage-agnostic model facade like this:
> 
> http://git.eclipse.org/c/cdo/cdo.git/tree/plugins/org.gastro.rcp/src/org/gastro/rcp/IModel.java
> 
> http://git.eclipse.org/c/cdo/cdo.git/tree/plugins/org.gastro.rcp/src/org/gastro/internal/rcp/Model.java
> 
> The key concepts are ITransactionalOperation and IModel.modify(T,
> ITransactionalOperation<T>).

Yeah, that's effectively what we're doing, but in a more managed / high-level way. Or maybe it's more like a "model-agnostic storage facade".. ;)

You define model interactions in a factory implementation, e.g.:

http://git.eclipse.org/c/mylyn/org.eclipse.mylyn.reviews.git/tree/org.eclipse.mylyn.gerrit.core/src/org/eclipse/mylyn/internal/gerrit/core/remote/GerritReviewRemoteFactory.java


And execution context can be specified by a service and also by cusotm EditFactoryProviders (but it could also just as easily be plugged into a CDO implementation.

http://git.eclipse.org/c/mylyn/org.eclipse.mylyn.reviews.git/tree/org.eclipse.mylyn.reviews.ui/src/org/eclipse/mylyn/reviews/ui/spi/remote/RemoteUiService.java
https://git.eclipse.org/r/#/c/12229/3/org.eclipse.mylyn.reviews.edit/src/org/eclipse/mylyn/reviews/edit/remote/AbstractRemoteEditFactoryProvider.java   <--WIP

This way, you don't have to reimplement any of the actual domain logic, which is the part that I think should really be isolated for maintainability.


> What's not totally clear to me is how, without CDO, you would protect against
> read access while ITransactionalOperations are running.

You can't really. That's why the whole discussion about reads only happening in context of single (UI) thread. But let's not go back there again. ;D But suffice it to say that if you then do chose to use a CDO context, and thus not have to worry about that anymore, you could reuse all of the model remote factory code without having to modify any of it. :) You could (I think) create a EditFacotry provider that wrapped all of the model calls in a CDOTransaction.
Comment 25 Miles Parker CLA 2013-06-19 20:27:48 EDT
The outlined concurrency approach seems to be working pretty well, albiet with a few growing pains.
Comment 26 Steffen Pingel CLA 2013-07-19 17:43:59 EDT
Sorry, but the current implementation that delegates to the UI thread every once in a while causing dead locks, NPEs or data inconsistencies isn't a satisfactory solution for managing concurrent access to the model. I'm reopening the bug since there is still to discuss further.
Comment 27 Steffen Pingel CLA 2013-07-19 17:46:20 EDT
Reopening as per comment 26.
Comment 28 Miles Parker CLA 2013-07-29 14:50:22 EDT
(In reply to comment #26)
> Sorry, but the current implementation that delegates to the UI thread every once
> in a while causing dead locks, NPEs or data inconsistencies isn't a satisfactory
> solution for managing concurrent access to the model. I'm reopening the bug
> since there is still to discuss further.

+1. I think you're right. I believe it would have worked fine for a case where we could closely manage the life-cycle of notifications, but I really didn't appreciate the complexities of integrating with the Mylyn life-cycle, in particular how the background update process interacts so closely with the editor life-cycle.

My suggestion at this point is to acknowledge this experiment at a relatively simple solution as a failure and try out an implementation of CDO as Eike and others have suggested all along. :) The good thing is that this would be a *relatively* isolated change.

We do need to decide how far to go w/ maintaining the current UI-based syn mode in the meantime.
Comment 29 Miles Parker CLA 2014-05-29 19:17:28 EDT
I believe this issue has been resolved (and in that sense the experiment succeeded) in that it has been almost a year now in release and we haven't had any further reports of model/UI related dead-locks or data inconsistencies.

So all of the issues seem to have turned out to be isolated defects and not a fault in the overall design and we can call this experiment a qualified success. ;)