
Bug 368581

Summary: JobManager does not wake up a scheduled job
Product: [Eclipse Project] Platform
Reporter: John Arthorne <john.arthorne>
Component: Runtime
Assignee: John Arthorne <john.arthorne>
Status: RESOLVED FIXED
QA Contact:
Severity: major
Priority: P3
CC: ivan.motsch, john.arthorne, loskutov, mober.at+eclipse, pwebster, remy.suen, sptaszkiewicz, stephan.leichtvogt, wbprio, yevshif
Version: 3.7.1
Flags: dj.houghton: review+
Target Milestone: 3.7.2   
Hardware: PC   
OS: Windows 7   
Whiteboard:
Bug Depends on: 366170    
Bug Blocks:    
Attachments:
Description Flags
Patch none

Description John Arthorne CLA 2012-01-13 14:17:04 EST
+++ This bug was initially created as a clone of Bug #366170 +++

We have an issue that blocks processing of an application.

It turns out that the job manager does not activate (run) a scheduled job after the specified time; in fact, it never starts it at all.

We traced it down to the job scheduling framework in JobManager.
That made it possible to create a hello world example.
Just open it in a workspace and launch the products/hello.product

The hello world is attached and does the following:
Main: schedule job A
Main: schedule job C
A: schedule job B and join on it
B: do sleep until a stop flag is set to true.
C: set the stop flag
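
The shape of this scenario can be sketched in plain Java. Note that this is only an illustration: it uses java.util.concurrent instead of the Eclipse Jobs API, so it does not reproduce the bug; it shows the behaviour one would expect from a scheduler with enough worker threads. The class name ScenarioSketch, the pool size of 3, the 10 ms polling interval and the 200 ms delay for C are all assumptions, not taken from the attached example.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the hello world; not the Eclipse Jobs API.
final class ScenarioSketch {

    /** Runs the A/B/C scenario once; returns true if A finished within the timeout. */
    static boolean runScenario(long timeoutMillis) {
        // Assumption: three workers, so A, B and C can all be active at once.
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(3);
        AtomicBoolean stop = new AtomicBoolean(false);
        try {
            // B: sleep until the stop flag is set to true.
            Runnable b = () -> {
                while (!stop.get()) {
                    try {
                        Thread.sleep(10);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            };
            // A: schedule job B and join on it.
            Runnable a = () -> {
                Future<?> fb = pool.schedule(b, 0, TimeUnit.MILLISECONDS);
                try {
                    fb.get();
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException(e);
                }
            };
            // C: set the stop flag.
            Runnable c = () -> stop.set(true);

            // Main: schedule job A, then job C (the delay for C is an assumed value).
            Future<?> fa = pool.schedule(a, 0, TimeUnit.MILLISECONDS);
            pool.schedule(c, 200, TimeUnit.MILLISECONDS);

            // Main: do not wait forever for A.
            try {
                fa.get(timeoutMillis, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                return false;
            } catch (InterruptedException | ExecutionException e) {
                return false;
            }
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("A finished in time: " + runScenario(2000));
    }
}
```

In this sketch C is picked up while A is blocked on B, which is the behaviour the report expects from the job manager as well.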

Moreover, all jobs scheduled after A has started running are never woken up and run. Therefore I added some info jobs that just say hello.

In order not to wait forever, I added a sleep 2000, after which the main thread gives up waiting.
When calling IJobManager.resume at that point, it magically awakens all ready-to-run jobs and runs them. But the Javadoc says that calling resume() does nothing when the job manager is active.

Expected behaviour:
I expect that whenever a Job is scheduled, say with schedule(200), it runs after 200 ms +/- some acceptable delay. Never running it breaks the concept of having parallel jobs in a system.
Comment 1 John Arthorne CLA 2012-01-13 14:17:37 EST
Backporting fix to 3.7.2 stream.
Comment 2 John Arthorne CLA 2012-01-13 14:50:56 EST
Created attachment 209474 [details]
Patch
Comment 3 John Arthorne CLA 2012-01-13 14:52:29 EST
The old code was:

	if (manager.sleepHint() <= 0)
		jobQueued();

This means: if there is a job ready to run right now, ensure there is a worker thread available

The new code is:

	if (manager.sleepHint() < InternalJob.T_INFINITE)
		jobQueued();

This means: if there is any job ready to run now or at any time in the future, ensure there is a worker thread available
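
The difference between the two conditions can be illustrated with a small standalone model. This is a sketch, not the actual JobManager code: the class WakeCheckSketch and its methods are hypothetical, and T_INFINITE is modelled here as Long.MAX_VALUE, meaning "no job is waiting at all", which is an assumption about InternalJob.T_INFINITE.

```java
// Hypothetical model of the two checks; not the real Eclipse classes.
final class WakeCheckSketch {

    // Assumption: stand-in for InternalJob.T_INFINITE ("nothing is waiting").
    static final long T_INFINITE = Long.MAX_VALUE;

    /** Old check: only true when a job is ready to run right now. */
    static boolean oldCheck(long sleepHint) {
        return sleepHint <= 0;
    }

    /** New check: true when a job is ready now or at any time in the future. */
    static boolean newCheck(long sleepHint) {
        return sleepHint < T_INFINITE;
    }

    public static void main(String[] args) {
        // 0: job ready now; 200: job due in 200 ms; T_INFINITE: nothing waiting.
        long[] hints = {0L, 200L, T_INFINITE};
        for (long hint : hints) {
            System.out.println("sleepHint=" + hint
                    + " old=" + oldCheck(hint)
                    + " new=" + newCheck(hint));
        }
    }
}
```

The two checks differ only for a positive, finite sleep hint, such as a job scheduled with a delay: the old check skips jobQueued() in that case, the new one calls it.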
Comment 5 Andrey Loskutov CLA 2012-04-16 10:51:36 EDT
Hi John,

we are observing strange JUnit test failures which I think *could* be related to the job manager and this bug in particular. Unfortunately it is hard to provide a stripped-down test case.

What we do is test an internal message dispatching job. The test generates events in a loop and checks each time whether the messages were dispatched properly. Dispatching should be done in a job. This works in 99.9% of the test runs, but sometimes it *randomly* fails.

I took VM snapshots on failure and found that the job which was re-scheduled during the test was always in the SLEEPING state.

We have this scenario:

Main (Eclipse test framework):
    workbench.getProgressService().busyCursorWhile ()
    schedule(0) job A and join on it

A (Test job):
    re-schedules (with timeout of 10ms) the job B in the loop
    do Thread.sleep(few seconds) each time
    and fail if job B is NOT done after each sleep

B:
    dispatch events until the internal message queue is empty

Observation: sometimes job B is never executed, staying in SLEEPING state.

We see this with Eclipse versions 3.5.1, 3.7.1 and 3.7.2 on RHEL 5.3 Linux/Sun 1.6 VM 64 bit. As the scenario we have is similar (join + sleep + schedule(time)), I expected to see the problem resolved in 3.7.2, but this is not the case.


Could it be that there are still some special cases left where the job manager does not activate (run) scheduled jobs?