Community
Participate
Working Groups
The task list spents a lot of CPU time in AbstractTaskContainter.contains() on refresh. AbstractTaskContainter.containsHelper() and AbstractTaskContainter.getChildren() need to be optimized.
I see four callers of AbstractTaskContainer.contains(), can you tell me the main caller causing this? Perhaps the check for no cycles is performed more than once per iteration?
Created attachment 81457 [details] Applying brains to problem: 3 out of 4 possible steps implemented, please test this.
Created attachment 81458 [details] mylyn/context/zip
BTW, I think contains() methods should not use getChildren() call, which will create a temporary collection that is thrown away after getting its size.
Maarten, could you please outline how your patch addresses the problem mentioned in the description?
(In reply to comment #5) > Maarten, could you please outline how your patch addresses the problem mentioned > in the description? The helper contains superfluous tests and recurses only to return again. I've tried to remove those, but I don't have access to a profiler: bug 68111 :-(. (In reply to comment #4) > BTW, I think contains() methods should not use getChildren() call, which will > create a temporary collection that is thrown away after getting its size. I'll look at getChildren() next, it's called twice now: once to see its not null, once to find the size. I cache the intermediate result to remve one call. After this I need to compare the performance with the new method countChildren, that does not allocate and dispose an intermediate result HashSet on every invocation, but just counts. Is a unit test already available for this as basis for a performance test??
Created attachment 81482 [details] patch with one call to getChildren removed. Thank you Eugene and Steffen for feedback
Maarten, it would help the review process and the project most if open questions were discussed first before patches are attached to an issue. The review of a complete and polished patch will require much less time than reviewing multiple incomplete patches and increase the chances a patch to get merged. I am not sure if these methods can be optimized without looking at the detailed profiling results. Please also note Eugene's comment on bug 194430 .
Yesterday I created a junit test for this bug to test different implementations of the containsHelper() function. caching the result of getChildren gives a 50% improvement but still gives long times for large (dipe/wide) trees. This morning I woke up with the solution: this function must recurse when it hasChildren(), the count is irrelevant! And hasChildren() is cheap, it can return as soon as there is a single child!! private boolean containsHelperB(String handle, int depth) { if (depth >= MAX_SUBTASK_DEPTH) { return false; } else { depth++; for (AbstractTask child : children) { String childLabel = child.getHandleIdentifier(); if (handle.equals(childLabel)) { return true; } else if ( child.hasChildren()) { if (((AbstractTaskContainer)child).containsHelperB(handle, depth)) { return true; } } } return false; } } public boolean hasChildren() { for (AbstractTask child : children) { return true; } return false; } When describing the problem properly, the method name reveals itself! The unit test gives the following results for a reasonably deep and wide test case: Time spent with current contains() : 54297 ms Time spent with caching containsA() : 27835 ms Time spent with hasChildren(() containsB() : 642 ms How common are cyclic structures, do they need testing as well?
Maarten, thanks for investigating that. It looks like in the current implementation the list of children is indeed build up three times. It should be trivial to make that significantly faster with a simple fix.
Created attachment 81609 [details] test case with cycle Mik, it turns out that the code is also not working as expected. Please see the attached test case which has a cycle in the subtask hierarchy. It throws a StackOverflowError.
Created attachment 81610 [details] mylyn/context/zip
(In reply to comment #11) > Mik, it turns out that the code is also not working as expected. Please see the > attached test case which has a cycle in the subtask hierarchy. It throws a > StackOverflowError. No chocking with stack overflow on my modified code using hasChildren()...
(In reply to comment #11) > It throws a StackOverflowError. Well getChildren() calls contains() inside a loop AND contains() calls containsHelper() which calls getChildren() twice in the if test and calls itself recursively once.
Created attachment 81699 [details] recursion, why doesn't Eclipse warn against this? The source of the performance problems...
getChildren() does NOT recurse and return a set with all children, grandChildren, and so on. Yet this is what the javadoc comment leads you to believe... :-( New hypothesis: This may very well be the cause of many bugs with improper display of Grouped Subtasks like - bug 205202: subtasks causing false guaranteed visibility for their parent tasks - bug 204699: Group subtasks hides scheduled tasks in Focus on Workweek tree/graph querying and navigation is a lot more complex than simple list navigation. The valid desire to group subtasks and show dependencies creates increased complexity and cost (CPU) problems Solution: The API should differentiate between looking for direct children and a deeper search: getChildren() is not the same as getDescendants()
Mik, please clarify what you want getChildren() to do, it has over a 100 callers so it must be something important! the contains() optimization is almost done, using a cheaper hasChildren() method that can be used in more places where ...getChildren.size() > 0 is or ...getChildren().isEmpty() is used now. I think in some of the callers are interested in the children only, while others want to see all descendants with, cycles are a problem only to the recursive implementation. test case: task1 -> task2 -> task3 -> (task1) task1.getChildren() returns [task2] task1.hasChildren() returns true task1.countChildren() returns 1 task1.getDescendants() returns [task2, task3] More insidious is: test case: task1 -> (task1) task1.getChildren() returns [] OR [task1] task1.hasChildren() returns false OR true task1.countChildren() returns 0 OR 1 task1.getDescendants() returns [] OR [task1] Personally I think direct self reference should be prevented at creation or addition, throwing IllegalStateException
Mik, I think you (sometimes) want getChildren to do this: /** * @return an expanded set of all descendants, excluding itself; */ public Set<AbstractTask> getDescendants() { Set<AbstractTask> childrenWithoutCycles = new HashSet<AbstractTask>(); this.getDescendantsHelper(childrenWithoutCycles, this); return Collections.unmodifiableSet(childrenWithoutCycles); } protected void getDescendantsHelper(Set<AbstractTask> visited, AbstractTaskContainer root) { for (AbstractTask child : children) { if (!visited.contains(child) && child != root) { visited.add(child); child.getDescendantsHelper(visited, root); } } }
Created attachment 81838 [details] optimized contains() Less expensive contains with less stack overflow problems. Also added hasChildren() to replace ...getChildren().size() > 0 and ...getChildren.isEmpty(); Also added getDescendants() to get a set of all descendants in a container. Also added performance Unit Test to do some load testing. Parameters to be set.
Created attachment 81839 [details] mylyn/context/zip
I fully agree that we need to optimize this and some related parts of the code. As with testing, I want us to follow Platform's example and ensure that important optimizations of this sort are captured by repeatable performance tests. These will not only help us get rid of some of the hotspots, but will also provide long-term value to the project and help avoid future performance regressions. Maarten: it's great to see your interest in this. Would you like to contribute a performance tests that highlights the underlying problem? There's a great tutorial on how to do this linked from bug 116487.
OK, I installed the test.performance stuff, wrote some test, and ran them. Quite easy actually. I've described prerequisites in bug 116487. Output of current version of containsHelper() Scenario 'org.eclipse.mylyn.tasks.tests.performance.Bug207659Test#testContainsPerformanceMany()' (average over 2000 samples): System Time: 1ms (95% in [1ms, 1ms]) Measurable effect: 0ms (0.1 SDs) Used Java Heap: -195 (95% in [-162.39K, 162.01K]) Measurable effect: 327.01K (0.1 SDs) (required sample size for an effect of 5% of stdev: 6400) Scenario 'org.eclipse.mylyn.tasks.tests.performance.Bug207659Test#testContainsPerformanceFew()' (average over 2000 samples): System Time: 24ms (95% in [24ms, 24ms]) Measurable effect: 0ms (0.1 SDs) Used Java Heap: 2.44K (95% in [-299.32K, 304.2K]) Measurable effect: 608.39K (0.1 SDs) (required sample size for an effect of 5% of stdev: 6401) Version with optimized containsHelper() Scenario 'org.eclipse.mylyn.tasks.tests.performance.Bug207659Test#testContainsPerformanceMany()' (average over 2000 samples): System Time: 0ms (95% in [0ms, 0ms]) Measurable effect: 0ms (0.1 SDs) (required sample size for an effect of 5% of stdev: 6401) Used Java Heap: 1.51K (95% in [-13.67K, 16.69K]) Measurable effect: 30.6K (0.1 SDs) (required sample size for an effect of 5% of stdev: 6401) Scenario 'org.eclipse.mylyn.tasks.tests.performance.Bug207659Test#testContainsPerformanceFew()' (average over 2000 samples): System Time: 0ms (95% in [0ms, 0ms]) Measurable effect: 0ms (0.1 SDs) (required sample size for an effect of 5% of stdev: 6400) Used Java Heap: -4.96K (95% in [-26.32K, 16.41K]) Measurable effect: 43.07K (0.1 SDs) (required sample size for an effect of 5% of stdev: 6400) I think the biggest problem in the current code is the risk of stack overflow and excessive heap usage. I propose attached patch that does not change functionality be applied asap. So its more a functionality issue/breaker than a performance issue. (In reply to comment #11) > Mik, it turns out that the code is also not working as expected. Please see the > attached test case which has a cycle in the subtask hierarchy. It throws a > StackOverflowError. Please specify the desired behaviour in the javadoc comment, especially with regard to cyclic structures. I can then work on both functional and performance tests.
Created attachment 82532 [details] Resolving performance issue in contains() Functionality still to be resolved. Change is to build up the set only once instead of three times.
Created attachment 82533 [details] mylyn/context/zip
Maarten, could you attach the tests you created?
Created attachment 82572 [details] Changed AbstractTaskContainer plus test I've got a 15 trial of the YourKit profiler :-) The main caller of getChildren is CustomTaskListDecorator.hasIncoming() taht may also be optimized...
Created attachment 82573 [details] mylyn/context/zip
Steffen: assigning to you for review.
(In reply to comment #26) > The main caller of getChildren is CustomTaskListDecorator.hasIncoming() that may > also be optimized... Indeed hasIncoming is building the getChildren list three times as well, where once would suffice... How serious are these performance problems and do we need a fix for 2.2? Then this is an easy improvement: private boolean hasIncoming(AbstractTaskContainer container) { for (AbstractTask task : container.getChildren()) { if (task != null) { AbstractTask containedRepositoryTask = task; if (containedRepositoryTask.getSynchronizationState() == RepositoryTaskSyncState.INCOMING) { return true; // } else if (task.getChildren() != null && task.getChildren().size() > 0 && hasIncoming(task)) { } else if (task.hasChildren() && hasIncoming(task)) { return true; } } } return false; } I suspect adding grouped tasks presentation with subtasks beneath tasks introduced recursion performance problems in many places. Maybe we should introduce the optimized getDescendants(), see comment 18, and use that wherepossible in place of the current getChildren in another recursion.
Created attachment 82946 [details] patch in case changes need to be reverted
Created attachment 82947 [details] mylyn/context/zip
Maarten, I have committed changes that are along the lines of your patches. I will iterate more on this and plan to setup the performance test suite within the next few weeks so we can collect a few numbers.
(In reply to comment #32) OK I see you've merged my ideas well. Now that this more hierarchical TaskList is here to stay, I think that AbstractRepositoryContainer should allow the visitor pattern for various checks. So it should have an accept() method, with a visitor that includes a Object placeholder in the arguments for collecting a subset. like children or incoming, or whatever. The visitor visit() method would be a more general case of the containsHelper(). What do you think of this?
(In reply to comment #17) > I think in some of the callers are interested in the children only, while others > want to see all descendants with, cycles are a problem only to the recursive > implementation. > > test case: task1 -> task2 -> task3 -> (task1) > > task1.getChildren() returns [task2] > task1.hasChildren() returns true > task1.countChildren() returns 1 > task1.getDescendants() returns [task2, task3] > > More insidious is: > test case: task1 -> (task1) > > task1.getChildren() returns [] OR [task1] > task1.hasChildren() returns false OR true > task1.countChildren() returns 0 OR 1 > task1.getDescendants() returns [] OR [task1] Still to do: What do we want to happen when we call getChildren on a cyclic structure? I think it should return all the children, but only once.
We will review these APIs for Mylyn 3.0 when we will be able to make larger changes to the task list model. Using a visitor or iterator for traversing the task list graph is certainly a possibility that will be considered. For the record here is the stack trace I encountered when I had a cycle in the task list: -- Error Log -- Date: Thu Nov 15 02:47:00 GMT 2007 Message: Unhandled event loop exception Severity: Error Plugin ID: org.eclipse.ui Stack Trace: org.eclipse.swt.SWTException: Failed to execute runnable (java.lang.StackOverflowError) at org.eclipse.swt.SWT.error(SWT.java:3706) at org.eclipse.swt.SWT.error(SWT.java:3624) at org.eclipse.swt.widgets.Synchronizer.runAsyncMessages(Synchronizer.java:133) at org.eclipse.swt.widgets.Display.runAsyncMessages(Display.java:3312) at org.eclipse.swt.widgets.Display.readAndDispatch(Display.java:2985) at org.eclipse.ui.internal.Workbench.runEventLoop(Workbench.java:2395) at org.eclipse.ui.internal.Workbench.runUI(Workbench.java:2359) at org.eclipse.ui.internal.Workbench.access$4(Workbench.java:2225) at org.eclipse.ui.internal.Workbench$4.run(Workbench.java:468) at org.eclipse.core.databinding.observable.Realm.runWithDefault(Realm.java:288) at org.eclipse.ui.internal.Workbench.createAndRunWorkbench(Workbench.java:463) at org.eclipse.ui.PlatformUI.createAndRunWorkbench(PlatformUI.java:149) at org.eclipse.ui.internal.ide.application.IDEApplication.start(IDEApplication.java:106) at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:193) at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:106) at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:76) at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:362) at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:175) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:515) at org.eclipse.equinox.launcher.Main.basicRun(Main.java:455) at org.eclipse.equinox.launcher.Main.run(Main.java:1193) at org.eclipse.equinox.launcher.Main.main(Main.java:1169) Caused by: java.lang.StackOverflowError at org.eclipse.mylyn.tasks.core.AbstractTaskContainer.containsHelper(AbstractTaskContainer.java:91) at org.eclipse.mylyn.tasks.core.AbstractTaskContainer.contains(AbstractTaskContainer.java:87) at org.eclipse.mylyn.tasks.core.AbstractTaskContainer.getChildren(AbstractTaskContainer.java:74) at org.eclipse.mylyn.tasks.core.AbstractTaskContainer.containsHelper(AbstractTaskContainer.java:98) at org.eclipse.mylyn.tasks.core.AbstractTaskContainer.contains(AbstractTaskContainer.java:87) at org.eclipse.mylyn.tasks.core.AbstractTaskContainer.getChildren(AbstractTaskContainer.java:74) [...]
Steffen: in case you haven't already, please make sure that the end result of this chnage doesn't blow up the UI when there is a cycle. It's not common, but possible to get into the scenario where a partially synched set of tasks forms a cycle. Until we make this safe there has to be a safeguard somewhere.
Sorry, I should have pointed that out more clearly. The fix I made was to gracefully handle cycles that I had in my tasklist which could cause StackOverflowExceptions before (stacktrace on bug 35). I have added a test case to verify that this works now.
Fyi, committed a change to not look down tree viewer in order to determine when to draw the incoming indicator on queries and categories.