Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 490149

Summary: [Databinding] Proposal for more efficient change processing in databinding
Product: [Eclipse Project] Platform Reporter: Stefan Xenos <sxenos>
Component: UIAssignee: Simon Scholz <simon.scholz>
Status: CLOSED WONTFIX QA Contact:
Severity: normal    
Priority: P3 CC: jens, loskutov
Version: 4.6   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
Whiteboard: stalebug
Bug Depends on: 490139    
Bug Blocks:    

Description Stefan Xenos CLA 2016-03-21 22:48:32 EDT
This is a proposal for a new change processing algorithm for Eclipse Databinding.

Problems in Eclipse Databinding:

- It is currently susceptible to performance degradation when there are multiple paths to the same observable. One representative pattern where performance degrades is the repeated diamond shape, where performance degrades to O(2^r) for r repetitions.

- It performs a lot of memory allocation during change processing.

- Some observables fire change events even though they haven't changed.

- Some observables recompute the observable eagerly in order to suppress redundant change notifications, but as a consequence recompute the observable once every time an input changes, which much more often than their dependencies would care about such changes.

- Calling getters or processing change events hits a lot of memory barriers and performs unnecessary thread synchronization due to heavy use of thread locals.

- All Observables currently leak memory unless they are explicitly disposed. However, internal nodes - like ComputedValue - could support disposal as an optional operation and could clean themselves up automatically once there are no further incoming references. This would simplify the programming model.


This proposal should address all these issues.

How it works:

All the ThreadLocals in ObservableTracker will become private attributes of Realm. Each Realm is associated with a thread, so the end effect is the same (actually, the functionality of the existing threadlocals will be preserved but they will be represented differently, as described below.)

Each Realm will get a DependencyGraph. DependencyGraph looks like this:

class DependencyGraph {
  long tickCounter;
  WeakKeyMap<Object, DataBindingNode> mapObservableOrSideEffectOntoDataBindingNode;
}

class DataBindingNode {
  // Time at which the node was last visited to determine if it changed.
  long tickLastVisited = -1;
  // Time at which the node was last known to be changed.
  long tickLastChanged = -1;
  // True iff the node is scheduled to be re-evaluated on the next tick.
  boolean dirty = true;
  // True iff the node will never fire another change event
  boolean isInvariant = false;
  // True iff this node is currently being recomputed
  boolean isComputing = false;
  // True iff this node needs to be visited on the next update pass
  boolean scheduledForUpdate = false;
  // List of nodes which depend upon this node (weak references)
  WeakSet<DataBindingNode> dependents;
  // Keys are nodes which this node depends on (strong references). Values have two states: 
  // "Used" or "Unused"
  HashMap<DataBindingNode, NodeState> dependencies
}


Operations on DataBindingNode:

getTickOfLastChange() {
  Recomputes the node if necessary then return tickLastChanged.
}

getTickOfLastChangedDependency() {
  find the maximum of the getTickOfLastChanged among all dependencies
}

recomputeIfNecessary() {
  Throws an exception if isComputing is true when invoked (this represents a circular dependency)
  if (dirty or graph.tickCounter > tickLastVisited) {
    dirty = false
    
    if (dirty || getTickOfLastChangedDependency() > tickLastVisited) {
      set isComputing to true, so we can detect recursion
      set all dependencies to the "Unused" state
      boolean changed = recompute()
      search through all dependencies and delete any that are still in the "Unused" state. Remove
      "this" as a dependent at the same time. Use copy-on-write to avoid creating a CME.
      set isComputing back to false
    }

    if (changed) {
      tickLastChanged = graph.tickCounter;
    }
  }
  tickLastVisited = graph.tickCounter
}

markDirty() {
  dirty = true;
  do nothing if this node is already scheduled for validation
  this.scheduledForUpdate = true
  
  schedule this node for processing on the next update if it hasn't already been done. Then recursively iterate over all of its dependents and schedule them, too.

  if graph.tickCounter == this.tickLastVisited then increment graph.tickCounter
  
  add this node (but not its dependents) to a queue in the dependency graph for validation on the next
  update pass.

  The dependency graph should schedule its own update() method to run asynchronously at the earliest opportunity.
}

// This is invoked asynchronously during an update pass on all nodes that were explicitly marked as dirty
update() {
  if (this node has public listeners attached or is an ISideEffect) {
    recomputeIfNecessary();
  }
  scheduledForUpdate = false;
  recursively call update() on all dependents
}



There would be different recompute() methods for different observable types. For SettableValue, it would look like this:

bool WritableValueNode.recompute() {
  Object oldValue = currentValue;
  this.currentValue = newValue;

  return (!Objects.equal(oldValue, newValue);
}

WritableValue.getValue() would look something like this:

getValue() {
  getterCalled();
  myDataBindingNode.recomputeIfNecessary();
  return currentValue;
}

...and WritableValue.setValue(...) would look like:

setValue(T newValue) {
  this.newValue = newValue;
  myDataBindingNode.markDirty();
}

There would be three types of ReactiveNodes -- for WritableValue, ComputedValue, and SideEffect. All other observable types would be reimplemented behind the scenes using some combination of the above, and none of them would attach listeners to one another.


The dependency graph itself has an update() method. It is scheduled to run asynchronously whenever anything in the graph is marked as dirty. It looks like this:

update() {
  for every node in update queue (the queue that was filled in by markDirty), call update() on that node.
  clear the update queue
}

The implementation of runAndMonitor would change. Rather than remembering a bunch of threadlocals, it only needs to remember the current DataBindingNode. When someone invokes getterCalled, it finds the matching DataBindingNode and connects the two nodes as a dependent/dependency. Dependencies are a map. The newly-discovered dependency is set always set to the "Used" state.
Comment 1 Stefan Xenos CLA 2016-03-21 23:00:08 EDT
Simon, this is a first stab at documenting the pseudocode for my algorithm. If implemented correctly, it should be possible to modify any combination of nodes and process the resulting update with the following properties:

1. The only memory allocated should for:
- Firing change events to any user-attached listeners (but NOT for firing events between observables).
- Attaching any newly-created edges in the dependency graph.
- Growing the ArrayList (if necessary) when inserting a node marked as dirty into the update queue.
However, if there are no user-attached listeners and the graph hasn't changed shape, the only memory allocation during an update should be for the update queue.

2. No observable will be recomputed more than once per update.

3. No observable will be recomputed unless it either has a user-created listener attached to it or a resumed side-effect depends on it directly or indirectly.

4. If an observable has a user-created listener attached to it, it won't be garbage collected until the listener is removed or it is disposed. Otherwise, the observable will be garbage collected when it is no longer referenced by user code. Side-effects always need to be explicitly disposed.

I didn't want to get into too much detail so I omitted the fine details of how ComputedValue and SideEffect are connected to the graph... but if it's not obvious, I can elaborate.
Comment 2 Stefan Xenos CLA 2016-03-22 18:36:59 EDT
There's a problem in comment 0:

   dirty = false

That should show up inside the last check for if (dirty), not before it.
Comment 3 Andrey Loskutov CLA 2017-03-29 13:38:00 EDT
Simon, do you plan to work on this?
Comment 4 Eclipse Genie CLA 2021-04-24 03:37:33 EDT
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're closing this bug.

If you have further information on the current state of the bug, please add it and reopen this bug. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.