Bug 466576

Summary: [server] Consider 'projects to blame' to respect Display.runDeferred and Display.runAsync
Product: [Technology] EPP
Reporter: Marcel Bruch <marcel.bruch>
Component: Automated Error Reporting Client (AERI)
Assignee: EPP Error Reports <error-reports-inbox>
Status: CLOSED FIXED
QA Contact:
Severity: normal
Priority: P3
CC: Ed.Merks, ed
Version: unspecified
Target Milestone: later
Hardware: PC
OS: Mac OS X
See Also: https://bugs.eclipse.org/bugs/show_bug.cgi?id=451359
Whiteboard:

Description Marcel Bruch CLA 2015-05-06 08:46:01 EDT
The list of identified projects is sometimes imprecise: it blames projects that are on the stack trace but do not necessarily cause the issue. Bug 451359 gives an example of such a problematic case.

There are, however, sometimes (even often) traces where a project schedules an async runnable that causes an exception. Cutting off the trace after events like Display.runDeferred or Display.runAsync will blame projects other than those that actually scheduled the job.

It needs some investigation to make sure we do not make things worse than they are today.
Comment 1 Ed Merks CLA 2015-05-06 09:55:35 EDT
Note that I'm suggesting cutting the stack off specifically at Display.readAndDispatch() because this is part of the "standard dispatch loop". The stack above that point will be the interesting part that causes problems, while the part below that point is a dispatch loop and cannot be blamed for problems resulting from arbitrary dispatched events.
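
A minimal sketch of the cut-off suggested here, assuming the trace is available as a StackTraceElement[] with the most recent frame first. The class and method names DispatchLoopCutoff and framesAboveDispatchLoop are hypothetical and not taken from the AERI code base; only org.eclipse.swt.widgets.Display.readAndDispatch is a real SWT name.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only.
final class DispatchLoopCutoff {
    static final String LOOP_CLASS = "org.eclipse.swt.widgets.Display";
    static final String LOOP_METHOD = "readAndDispatch";

    /**
     * Returns the frames above (more recent than) the first
     * Display.readAndDispatch() frame. If no such frame exists,
     * the whole trace is returned unchanged.
     */
    static List<StackTraceElement> framesAboveDispatchLoop(StackTraceElement[] trace) {
        for (int i = 0; i < trace.length; i++) {
            StackTraceElement frame = trace[i];
            if (LOOP_CLASS.equals(frame.getClassName())
                    && LOOP_METHOD.equals(frame.getMethodName())) {
                // Index 0 is the top of the stack, so keep [0, i).
                return Arrays.asList(trace).subList(0, i);
            }
        }
        return Arrays.asList(trace);
    }
}
```

Only the frames returned by this method would then be mapped to projects; everything from the dispatch loop downwards is ignored.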
Comment 2 Marcel Bruch CLA 2015-05-06 11:59:00 EDT
[1] is an example where strictly cutting off after the first Display.readAndDispatch() won't lead to reasonable results (no matter whether you start from the caused-by exception or just go top-down). Is there a specific pattern/call sequence for how SWT schedules UI Runnables? If there were a pattern that allowed me to (more) reliably detect which projects are responsible, I'd be happy to implement that. At the moment I can only think of a heuristic that step-by-step enlarges the currently investigated stack trace until there are 'more projects than platform' in it. Not very satisfying but probably better than what we have right now...


[1] https://dev.eclipse.org/recommenders/committers/confess/#/problems/54c4eeb6bee810030da05f61/details
Comment 3 Ed Willink CLA 2015-05-06 12:34:41 EDT
In the absence of a perfect algorithm, perhaps some form of depth metric such as the SimRel +1/+2/+3 build date might help. Better a MANIFEST.MF dependency depth.

Better to blame the most dependent (e.g. a +3) plugin than the least since the least dependent (e.g. a +0) gets overloaded. The +3 plugin can always triage it to the +0 if that is appropriate.
Comment 4 Marcel Bruch CLA 2015-05-06 12:52:06 EDT
(In reply to Ed Merks from comment #1)

A few examples where this approach would work and where not:

Works:

https://dev.eclipse.org/recommenders/committers/confess/#/problems/54c4eec4bee810030da06093
https://dev.eclipse.org/recommenders/committers/confess/#/problems/54c4eeb8bee810030da05f8f
https://dev.eclipse.org/recommenders/committers/confess/#/problems/54c4eeb9bee810030da05fad
// written before Ed W.'s comment but submitted afterwards


Does not work:

https://dev.eclipse.org/recommenders/committers/confess/#/problems/54c4ef00bee810030da065cd
https://dev.eclipse.org/recommenders/committers/confess/#/problems/54c4eeb6bee810030da05f61
https://dev.eclipse.org/recommenders/committers/confess/#/problems/54c4f12abee810030da08d87

Does not work but unfair to expect:
https://dev.eclipse.org/recommenders/committers/confess/#/problems/54c4eeb8bee810030da05f99

If there is no pattern in how SWT schedules UI jobs that are not related at all, two heuristics may be used:

1. Successively add new Display.readAndDispatch sections until a non-platform project appears on the list
2. Index dependencies between bundles to spot unlikely relationships like mpc --> papyrus. If one UI thread section says papyrus and the next mpc and there is no link from mpc to papyrus, mpc is likely not causing this issue.
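
Heuristic 1 could be sketched roughly as follows. This is an illustration under assumptions, not the server's implementation: the class SectionGrowingBlame, the projectOf mapping from class name to project, and the platformProjects set are all hypothetical. Heuristic 2 is not covered here.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper, for illustration only.
final class SectionGrowingBlame {
    static boolean isDispatchFrame(StackTraceElement f) {
        return "org.eclipse.swt.widgets.Display".equals(f.getClassName())
                && "readAndDispatch".equals(f.getMethodName());
    }

    /** Splits frames (top of stack first) at every readAndDispatch frame. */
    static List<List<StackTraceElement>> sections(List<StackTraceElement> frames) {
        List<List<StackTraceElement>> result = new ArrayList<>();
        List<StackTraceElement> current = new ArrayList<>();
        for (StackTraceElement frame : frames) {
            if (isDispatchFrame(frame)) {
                result.add(current);          // the dispatch frame closes a section
                current = new ArrayList<>();
            } else {
                current.add(frame);
            }
        }
        result.add(current);
        return result;
    }

    /** Adds sections one by one until a non-platform project shows up. */
    static Set<String> blame(List<StackTraceElement> frames,
            Function<String, String> projectOf, Set<String> platformProjects) {
        Set<String> projects = new LinkedHashSet<>();
        for (List<StackTraceElement> section : sections(frames)) {
            for (StackTraceElement frame : section) {
                String project = projectOf.apply(frame.getClassName());
                if (project != null) {
                    projects.add(project);
                }
            }
            if (!platformProjects.containsAll(projects)) {
                break;                        // found something other than platform
            }
        }
        return projects;
    }
}
```

If no section ever contains a non-platform project, the loop simply runs to the end and only platform projects are returned.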


But maybe someone else has an idea - no matter how complex.

(In reply to Ed Willink from comment #3)
> In the absence of a perfect algorithm, perhaps some form of depth metric
> such as the SimRel +1/+2/+3 build date might help.

+3 is somewhat overloaded but may be worth trying. I use something like "+0, +1": Platform and JDT will be removed if any other project is present. Xtext will be removed if some other Xtext clients are present, etc. Maybe taking this one step further can help. Can I get a (csv) list of projects and their offsets somewhere?
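
To make the "+0, +1" rule concrete, here is a rough sketch. The names BaseProjectFilter, baseProjects, and clientsOf are made up for this illustration, and the project sets in the example are assumptions rather than the real server configuration.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, for illustration only.
final class BaseProjectFilter {
    /**
     * @param blamed       projects found on the trace
     * @param baseProjects projects dropped whenever anything else remains (e.g. Platform, JDT)
     * @param clientsOf    framework -> known clients; the framework is dropped
     *                     if at least one of its clients is also blamed
     */
    static Set<String> filter(Set<String> blamed, Set<String> baseProjects,
            Map<String, Set<String>> clientsOf) {
        Set<String> result = new LinkedHashSet<>(blamed);
        for (Map.Entry<String, Set<String>> entry : clientsOf.entrySet()) {
            if (!Collections.disjoint(blamed, entry.getValue())) {
                result.remove(entry.getKey());
            }
        }
        Set<String> nonBase = new LinkedHashSet<>(result);
        nonBase.removeAll(baseProjects);
        // Keep the base projects only if nothing else is left to blame.
        return nonBase.isEmpty() ? result : nonBase;
    }
}
```

For example, with base projects {platform, jdt} and ocl registered as a client of xtext, a blamed set of {platform, jdt, xtext, ocl} would shrink to {ocl}, while {platform, jdt} alone would stay as it is.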

> Better a MANIFEST.MF
> dependency depth.

Yes. See above comment. Would probably require some detection work on the client side. For Eclipse it may work when indexing the simrel repo once.
Comment 5 Ed Willink CLA 2015-05-06 13:02:05 EDT
(In reply to Marcel Bruch from comment #4)
> (In reply to Ed Willink from comment #3)
> > In the absence of a perfect algorithm, perhaps some form of depth metric
> > such as the SimRel +1/+2/+3 build date might help.

Sorry. This was a silly idea. Xtext at +2 is disruptive. OCL at +1 depends on Xtext.

> > Better a MANIFEST.MF
> > dependency depth.
> 
> For Eclipse it may work when indexing the simrel repo once.

I think this is the way to go.
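
As a sketch of what a MANIFEST.MF dependency depth might look like once the SimRel repo has been indexed: the code below assumes the index is already available as a map from bundle name to its required bundles (e.g. derived from Require-Bundle headers). The class DependencyDepth and the map layout are assumptions for illustration; parsing the manifests is left out.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, for illustration only.
final class DependencyDepth {
    /** Length of the longest dependency chain below the given bundle. */
    static int depth(String bundle, Map<String, List<String>> requires,
            Map<String, Integer> memo, Set<String> visiting) {
        Integer cached = memo.get(bundle);
        if (cached != null) {
            return cached;
        }
        if (!visiting.add(bundle)) {
            return 0; // crude cycle guard: do not follow a bundle already on the path
        }
        int longest = 0;
        for (String dep : requires.getOrDefault(bundle, List.of())) {
            longest = Math.max(longest, 1 + depth(dep, requires, memo, visiting));
        }
        visiting.remove(bundle);
        memo.put(bundle, longest);
        return longest;
    }

    /** Picks the candidate with the greatest depth; the first one wins ties. */
    static String mostDependent(Collection<String> candidates,
            Map<String, List<String>> requires) {
        Map<String, Integer> memo = new HashMap<>();
        String best = null;
        int bestDepth = -1;
        for (String candidate : candidates) {
            int d = depth(candidate, requires, memo, new HashSet<>());
            if (d > bestDepth) {
                best = candidate;
                bestDepth = d;
            }
        }
        return best;
    }
}
```

With an index in which ocl requires xtext and xtext requires emf, the most dependent of those three is ocl, which matches the idea of blaming the most dependent plugin first and letting it triage downwards.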
Comment 6 Marcel Bruch CLA 2015-05-06 14:56:37 EDT
Okay, I spent the last hours digging through all the error reports and tested several measures. I implemented the cut-off logic for Display.readAndDispatch() and Workspace.build() and ran it experimentally on the current data. The results of that are live.

FWIW, this approach does classify some errors wrongly (i.e., I'd have manually assigned them to other projects). But much more often the classification removes superfluous elements than an important one.

Please find the new project guesses for Oomph here [1]. I don't know how many elements were in that list before. But the list is still long. If you have additional filter criteria, please let me know. Whatever helps to reduce the false assignments or helps to prioritize issues will be implemented. That list is indeed not manageable. Side note: we will hide problems that stayed quiet for some time from that list.


The new algorithm is not yet permanently integrated. It needs more testing but will hopefully go live by the end of this week.


[1] https://dev.eclipse.org/recommenders/committers/confess/#/problems/?projects=oomph&kinds=NORMAL&kinds=FREEZE&categories=UNCONFIRMED&page=0&size=100&sort=numberOfIncidents,desc
Comment 7 Ed Merks CLA 2015-05-06 23:51:09 EDT
The new approach removed more than 60 elements from the list.  The list is indeed still long, but I'll continue to triage it today to get an impression and provide additional feedback.

One reason the list is so long is that we (Oomph) were logging any failure in setup task performance as an error, and then folks would report that.  But such feedback is already in the setup dialog's progress page and generally is not a bug but rather an authoring error, or a network failure.  As such we've stopped doing that and have generally looked at cases where we are logging errors that really should never be reported as bugs, e.g., one can always expect a network failure so one doesn't want bugs reported for such cases.  So I hope in the future, the list will be more manageable.
Comment 8 Marcel Bruch CLA 2015-05-07 15:25:45 EDT
The change is live. All new incoming problem reports are now cut off after Display.readAndDispatch() or Workspace.build(), if present on the stack.

Please reopen if you have suggestions or find bugs.