Bug 363423

Summary: Must acquire job lock during clean-up
Product: z_Archived Reporter: Gunnar Wagenknecht <gunnar>
Component: gyrex Assignee: Gunnar Wagenknecht <gunnar>
Status: RESOLVED FIXED QA Contact:
Severity: major    
Priority: P3    
Version: unspecified   
Target Milestone: ---   
Hardware: All   
OS: All   
Whiteboard:

Description Gunnar Wagenknecht CLA 2011-11-10 02:14:02 EST
The cleanup of old jobs currently happens outside the job lock. The assumption was that the lock is not necessary there. However, in a distributed environment another node may modify the state of a job while the cleanup is running. That other node will then get an exception.

Example Stack-Trace:

..e = NoNode for /gyrex/prefs/cloud/..jobs/jobs/myjobid/history/1320761969322
at ..cloud.internal.preferences.ZooKeeperBasedPreferences.createBackingStoreException(ZooKeeperBasedPreferences.java:262)
at ..cloud.internal.preferences.ZooKeeperBasedPreferences.flush(ZooKeeperBasedPreferences.java:481)
at ..jobs.internal.manager.JobManagerImpl.setJobState(JobManagerImpl.java:718)
at ..jobs.internal.manager.JobManagerImpl.queueJob(JobManagerImpl.java:455)
at ..jobs.internal.scheduler.SchedulingJob.execute(SchedulingJob.java:131)
at org.quartz.core.JobRunShell.run(JobRunShell.java:216)

Caused by: BackingStoreException: Error flushing node (node /cloud/..jobs/jobs/9myjobid/history). Error flushing node (node /cloud/..jobs/jobs/myjobid/history/1320761969322). KeeperErrorCode = NoNode for /gyrex/prefs/cloud/..jobs/jobs/myjobid/history/1320761969322
at ..cloud.internal.preferences.ZooKeeperBasedPreferences.createBackingStoreException(ZooKeeperBasedPreferences.java:262)
at ..cloud.internal.preferences.ZooKeeperBasedPreferences.flush(ZooKeeperBasedPreferences.java:481)
at ..cloud.internal.preferences.ZooKeeperBasedPreferences.saveChildren(ZooKeeperBasedPreferences.java:1191)
at ..cloud.internal.preferences.ZooKeeperBasedPreferences.flush(ZooKeeperBasedPreferences.java:478)
at ..jobs.internal.manager.JobManagerImpl.setJobState(JobManagerImpl.java:718)
at ..jobs.internal.manager.JobManagerImpl.queueJob(JobManagerImpl.java:455)

Caused by: BackingStoreException: Error flushing node (node /cloud/..jobs/jobs/9myjobid/history/1320761969322). KeeperErrorCode = NoNode for /gyrex/prefs/cloud/..jobs/jobs/myjobid/history/1320761969322
at ..cloud.internal.preferences.ZooKeeperBasedPreferences.createBackingStoreException(ZooKeeperBasedPreferences.java:262)
at ..cloud.internal.preferences.ZooKeeperBasedPreferences.flush(ZooKeeperBasedPreferences.java:481)
at ..cloud.internal.preferences.ZooKeeperBasedPreferences.saveChildren(ZooKeeperBasedPreferences.java:1191)
at ..cloud.internal.preferences.ZooKeeperBasedPreferences.flush(ZooKeeperBasedPreferences.java:478)
at ..cloud.internal.preferences.ZooKeeperBasedPreferences.saveChildren(ZooKeeperBasedPreferences.java:1191)
at ..cloud.internal.preferences.ZooKeeperBasedPreferences.flush(ZooKeeperBasedPreferences.java:478)

Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /gyrex/prefs/cloud/..jobs/jobs/myjobid/history/1320761969322
at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1038)
at ..cloud.internal.preferences.ZooKeeperPreferencesService$WriteProperties.call(ZooKeeperPreferencesService.java:647)
at ..cloud.internal.preferences.ZooKeeperPreferencesService$WriteProperties.call(ZooKeeperPreferencesService.java:1)
at ..cloud.internal.zk.ZooKeeperBasedService$ZooKeeperCallable.call(ZooKeeperBasedService.java:37)

Comment 1 Gunnar Wagenknecht CLA 2011-11-10 03:25:13 EST
I added the lock and also moved the history out of the jobs node. This should reduce the conflicts during flush.
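
For illustration only, here is a minimal sketch of the locking idea, not the actual Gyrex change. It assumes an in-process ReentrantLock per job as a stand-in for the distributed job lock, and a plain map as a stand-in for the history preference nodes. The class JobCleanupSketch and its methods recordHistory, cleanUp and historyOf are hypothetical names. The point is that the writer and the cleanup take the same per-job lock, so the cleanup can no longer remove an entry while another caller is in the middle of writing it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

public class JobCleanupSketch {

    // Hypothetical in-process stand-in for the distributed job lock, one per job id.
    private final Map<String, ReentrantLock> jobLocks = new ConcurrentHashMap<>();

    // Hypothetical stand-in for the history nodes: job id -> entry timestamps.
    private final Map<String, List<Long>> history = new ConcurrentHashMap<>();

    private ReentrantLock lockFor(String jobId) {
        return jobLocks.computeIfAbsent(jobId, id -> new ReentrantLock());
    }

    /** Records a history entry while holding the job lock. */
    public void recordHistory(String jobId, long timestamp) {
        ReentrantLock lock = lockFor(jobId);
        lock.lock();
        try {
            history.computeIfAbsent(jobId, id -> new ArrayList<>()).add(timestamp);
        } finally {
            lock.unlock();
        }
    }

    /** Removes entries older than the cutoff under the same job lock; returns the number removed. */
    public int cleanUp(String jobId, long cutoff) {
        ReentrantLock lock = lockFor(jobId);
        lock.lock();
        try {
            List<Long> entries = history.get(jobId);
            if (entries == null) {
                return 0;
            }
            int before = entries.size();
            entries.removeIf(ts -> ts < cutoff);
            return before - entries.size();
        } finally {
            lock.unlock();
        }
    }

    /** Returns a copy of the history, also read under the job lock. */
    public List<Long> historyOf(String jobId) {
        ReentrantLock lock = lockFor(jobId);
        lock.lock();
        try {
            List<Long> entries = history.get(jobId);
            return entries == null ? Collections.<Long>emptyList() : new ArrayList<>(entries);
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        JobCleanupSketch manager = new JobCleanupSketch();
        manager.recordHistory("myjobid", 100L);
        manager.recordHistory("myjobid", 200L);
        int removed = manager.cleanUp("myjobid", 150L);
        System.out.println(removed + " removed, remaining " + manager.historyOf("myjobid"));
    }
}
```

In a real cluster the lock would have to be a distributed one shared by all nodes; the in-process lock here only shows where the acquire and release belong relative to the cleanup.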