We have custom logical structures in our product. Our logical structures use evaluations to populate the Variables view. When creating the children of the logical structures, we call JDIThread.queueRunnable(runnable) to queue our evaluation runnable. In the runnable, we call JDIThread.runEvaluation(...) to perform the evaluation. (A sketch of how we queue these evaluations follows below.)

Steps to reproduce the problem:
1. Turn on logical structures.
2. Start a testcase that enables our logical structure.
3. Step a few times in the testcase, and expand a few variables that use the logical structure.
4. Switch to the Breakpoints view.
5. Step to clear any possible caching, and wait a little for all jobs and things to settle down.
6. Bring the Variables view to the top.

We found that it takes a long time for the Variables view to populate if logical structures are enabled and our logical structure is at play. We profiled with YourKit and found the following:
1. Of all the time spent populating the variables, half is spent scheduling jobs. Furthermore, the majority of that time is in the startJob and endJob methods. <See attached screen capture>
2. The Worker.run method is invoked 19 times. However, JobManager.findBlockingJob is called 9,063 times and InternalJob.isConflicting is called 51,000 times!
3. I put breakpoints and tracing in the JobManager.findBlockingJob method. I found that when the JobManager is trying to find a blocking job, 90% of the time it is looking for a blocking job for the "Check Debug Command" job, which is the job used to enable/disable debug actions.

Are we leaking Check Debug Command jobs in the JobManager, and is that why we are trying to find conflicting jobs for it so many times? Could our extra evaluations have caused more Check Debug Command jobs to be run and scheduled? I didn't think that evaluations would cause debug actions to update.
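For reference, here is roughly how we queue an evaluation for one logical child. This is a minimal sketch: it assumes the 3.4-stream signatures of JDIThread.queueRunnable(Runnable) and runEvaluation(IEvaluationRunnable, IProgressMonitor, int, boolean), and the class/method names around them are placeholders, not our actual code.

    import org.eclipse.core.runtime.IProgressMonitor;
    import org.eclipse.debug.core.DebugEvent;
    import org.eclipse.debug.core.DebugException;
    import org.eclipse.jdt.debug.core.IEvaluationRunnable;
    import org.eclipse.jdt.debug.core.IJavaThread;
    import org.eclipse.jdt.internal.debug.core.model.JDIThread;

    public class LogicalChildEvaluator {
        // Queues one evaluation on the suspended thread. Each logical child
        // can trigger several of these, which is where the implicit
        // RESUME/SUSPEND event traffic described later in this bug comes from.
        public void queueChildEvaluation(final JDIThread thread) {
            thread.queueRunnable(new Runnable() {
                public void run() {
                    try {
                        thread.runEvaluation(new IEvaluationRunnable() {
                            public void run(IJavaThread t, IProgressMonitor monitor) throws DebugException {
                                // product-specific evaluation that computes the child value
                            }
                        }, null, DebugEvent.EVALUATION_IMPLICIT, false);
                    } catch (DebugException e) {
                        // skip this child if the evaluation fails
                    }
                }
            });
        }
    }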
Created attachment 111012 [details] screen cap
Samantha, can you turn on model proxy traces to see how often the Debug view content is refreshed?
I haven't had a chance to turn on the trace. However, is it possible to target this for 3.4.2?
From our discussion a week ago, I got the impression that the problem was that the evaluations resulted in lots of "CHANGED" debug events, which in turn resulted in deltas with the CONTENT flag. I think this performance degradation is a side effect of fixing bug 224477, which introduced updating the actions when a parent of the active context generates a CONTENT delta. Since it is probably not possible to avoid issuing the content delta (that would introduce other bugs), the best way to handle this is to coalesce the events somewhere in the chain between where the CHANGED debug event is issued and where the job is processed. The possible places where the coalescing logic could be added are:
1) Where the changed event is issued.
2) In the model proxy, where the changed event is converted into a delta.
3) In DebugCommandService, where the various command handlers are called as a result of the delta.
4) In the DebugCommand implementation itself.
At first glance, I think the easiest place to implement this would be in the command service. There the requests to the command handlers are created but never tracked in any way. It shouldn't be hard to add a map of outstanding requests (keyed by action), and with every new request to update a given action, cancel the previous request. (See the sketch after this comment.)
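A minimal sketch of what that tracking could look like: one outstanding request per action, with a new request canceling the superseded one. The class name and the keying by action id are illustrative, not the actual DebugCommandService code; IRequest is the platform's org.eclipse.debug.core.IRequest.

    import java.util.HashMap;
    import java.util.Map;
    import org.eclipse.debug.core.IRequest;

    public class OutstandingRequestTracker {
        // Latest update request per action; superseded requests are canceled.
        private final Map<String, IRequest> fOutstanding = new HashMap<String, IRequest>();

        public synchronized void newRequest(String actionId, IRequest request) {
            IRequest previous = fOutstanding.put(actionId, request);
            if (previous != null) {
                previous.cancel(); // superseded before its handler ran
            }
        }

        public synchronized void requestDone(String actionId, IRequest request) {
            // Only clear the entry if it still refers to this request.
            if (fOutstanding.get(actionId) == request) {
                fOutstanding.remove(actionId);
            }
        }
    }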
Created attachment 114381 [details] Patch for DebugContextService.
Here's a quick-and-dirty patch that does coalescing in DebugContextService. It's actually cancellation, not coalescing, but hopefully handling the canceled jobs will be a lot more efficient than scheduling them.
Thanks a lot Pawel! :) I will give this patch a try and see if it helps. Samantha
I tried out the patch, but unfortunately I don't think it has helped. :(

The bottleneck is the number of jobs being scheduled to update the debug commands. The majority of the time is spent by the job manager figuring out whether certain jobs conflict with each other. The more jobs that are scheduled, the longer it takes the job manager to work out what's conflicting.

The patch cancels the command update request. However, canceling the command update request does not cancel the job that is going to run: the request's isCanceled() is only checked inside the job that gets scheduled to update the commands. I was looking at the job being scheduled in the DebugCommand.canExecute method. This job gets scheduled regardless of whether the request is canceled, and I think that by the time the request is canceled, the job is already scheduled and we don't really have a handle to cancel it.

I have tried to figure out what's causing all the commands to update and all the jobs to be scheduled. We have a variable with two logical children. To populate the two logical children, we run about 15-20 evaluations (i.e., JDIThread.runEvaluation gets called 20 times). Every time an evaluation is run, JDT fires a RESUME event with detail EVALUATION_IMPLICIT, which causes the debug commands to be updated; for every resume event, 9 jobs get scheduled to update the debug actions. Every time an evaluation completes, JDT fires a SUSPEND event with detail EVALUATION_IMPLICIT, which also causes the debug commands to be updated; again, I found 9 jobs being scheduled.

I haven't figured out how to fix this yet. Maybe we can do a better job in our logical structures and not run 20 evaluations. Or perhaps we can group all debug command updates into one job, rather than having a job for each type of command update (a rough sketch of that idea follows). Not sure yet.
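To illustrate the grouping idea: instead of scheduling nine jobs per debug event, the individual command updates could be queued and drained by a single job. A rough sketch only, with all names hypothetical:

    import java.util.ArrayList;
    import java.util.List;
    import org.eclipse.core.runtime.IProgressMonitor;
    import org.eclipse.core.runtime.IStatus;
    import org.eclipse.core.runtime.Status;
    import org.eclipse.core.runtime.jobs.Job;

    public class BatchedCommandUpdater {
        private final List<Runnable> fPending = new ArrayList<Runnable>();
        private boolean fScheduled = false;

        private final Job fJob = new Job("Update Debug Commands") {
            protected IStatus run(IProgressMonitor monitor) {
                List<Runnable> batch;
                synchronized (BatchedCommandUpdater.this) {
                    batch = new ArrayList<Runnable>(fPending);
                    fPending.clear();
                    fScheduled = false;
                }
                for (Runnable update : batch) {
                    update.run(); // each runnable updates one command's enablement
                }
                return Status.OK_STATUS;
            }
        };

        public synchronized void queueUpdate(Runnable update) {
            fPending.add(update);
            if (!fScheduled) {
                fScheduled = true;
                fJob.schedule(); // one job drains the whole burst
            }
        }
    }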
(In reply to comment #7)
> ...
> The patch cancels the command update request. However, canceling the command
> update request does not cancel the job that is going to run. [...] This job
> gets scheduled regardless of whether the request is canceled, and I think
> that by the time the request is canceled, the job is already scheduled and
> we don't really have a handle to cancel it.
This is not very surprising. I implemented the shouldRun() method to check the request's cancel status, hoping that the job manager would be designed to take advantage of it before scheduling the job. A better approach would have been to have a listener for the request cancellation and, in turn, cancel the job directly. Canceling a job removes it from the waiting queue and would hopefully avoid the performance overhead that you're seeing. Maybe we could revisit my request for a cancel listener: bug 179308. Absent that, I agree that the next best chance is to implement the coalescing logic inside the DebugCommand class. The command implementation could simply save a reference to the last job it submitted, and actively cancel it if its corresponding request is canceled. Or it could re-target that job to service the new request; additional synchronization may be needed for the latter. (A sketch of the shouldRun() variant follows.)
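To make the distinction concrete: a shouldRun() check only takes effect when the job manager finally asks whether the job should run, so the job still sits in the waiting queue and is still examined for conflicts, whereas job.cancel() removes it from the queue outright. A sketch of the shouldRun() variant, with the request/job wiring illustrative rather than taken from the patch:

    import org.eclipse.core.runtime.IProgressMonitor;
    import org.eclipse.core.runtime.IStatus;
    import org.eclipse.core.runtime.Status;
    import org.eclipse.core.runtime.jobs.Job;
    import org.eclipse.debug.core.IRequest;

    public class CommandUpdateJob extends Job {
        private final IRequest fRequest;

        public CommandUpdateJob(IRequest request) {
            super("Check Debug Command"); //$NON-NLS-1$
            fRequest = request;
        }

        // The job stays in the waiting queue (and is still checked for
        // conflicts) but is skipped if its request was already canceled.
        // Direct cancellation -- job.cancel() -- would remove it from the
        // queue instead.
        public boolean shouldRun() {
            return !fRequest.isCanceled();
        }

        protected IStatus run(IProgressMonitor monitor) {
            // ... perform the command enablement update ...
            fRequest.done();
            return Status.OK_STATUS;
        }
    }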
investigate for 3.4.2
Created attachment 120733 [details] patch
This patch cancels associated jobs when new jobs are scheduled. I believe it works as is, but we should probably be careful to uphold the contract of calling cancel()/done() on each request even if it is canceled before its update job is run. We could do this with a job listener. (A sketch of the overall shape follows.)
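For the record, the shape of the fix as I read the patch: cancel the previously scheduled job, and use a job listener to honor the cancel()/done() contract when a job is discarded before running. A sketch only; the names are illustrative, and whether a per-job listener fires for a job canceled while still waiting is exactly the open question below.

    import org.eclipse.core.runtime.IStatus;
    import org.eclipse.core.runtime.jobs.IJobChangeEvent;
    import org.eclipse.core.runtime.jobs.Job;
    import org.eclipse.core.runtime.jobs.JobChangeAdapter;
    import org.eclipse.debug.core.IRequest;

    public class UpdateJobScheduler {
        private Job fLastJob;

        public synchronized void schedule(Job newJob, final IRequest request) {
            // Cancel the previous update job; if it was still waiting, the
            // job manager drops it from the queue, so it no longer feeds the
            // findBlockingJob/isConflicting churn seen in the profile.
            if (fLastJob != null) {
                fLastJob.cancel();
            }
            fLastJob = newJob;
            // Uphold the request contract even if the job never runs.
            newJob.addJobChangeListener(new JobChangeAdapter() {
                public void done(IJobChangeEvent event) {
                    if (event.getResult() != null
                            && event.getResult().getSeverity() == IStatus.CANCEL) {
                        request.cancel();
                        request.done();
                    }
                }
            });
            newJob.schedule();
        }
    }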
Well, that's an interesting way to cancel the job. I never would have thought of looking at what JobManager has to offer. I guess the listener will need to be on the job manager, because looking at the implementation it doesn't seem that a listener on the job itself will be called.
Created attachment 120738 [details] patch
This adds cancel/done handling for canceled jobs.
Created attachment 121543 [details] copyright update
Updated patch with 2009 copyrights.
Released fix to HEAD, requesting approval for 3.4.2.
+1 for 3.4.2
Released to 3.4.2
Please verify, Sam. (Will be in Jan 7 M-build).
I reran the scenario with YourKit and found that the majority of the time is now spent waiting for responses from the JVM. The amount of time spent figuring out whether a job should be run has been significantly reduced. There is still a lot of flashing on the screen as a result of the actions being updated, but that's another problem that needs to be looked at. Marking this as verified.