Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 333726

Summary: [prefs] EclipsePreferences#flush synchronization problems
Product: [Eclipse Project] Equinox Reporter: Krum Bakalsky <bakalsky>
Component: CompendiumAssignee: DJ Houghton <dj.houghton>
Status: RESOLVED FIXED QA Contact:
Severity: major    
Priority: P3 CC: brian.vosburgh, dj.houghton, felix.klar, john.arthorne, kaloyan, karenfbutzke, loskutov, mober.at+eclipse, neil.hauge, remy.suen, stefan.dimov, tjwatson
Version: unspecifiedFlags: john.arthorne: review+
Target Milestone: Juno M3   
Hardware: All   
OS: All   
Whiteboard:
Bug Depends on:    
Bug Blocks: 351231, 359698, 359851, 376141    
Attachments:
Description Flags
fix dj.houghton: iplog+

Description Krum Bakalsky CLA 2011-01-07 03:37:33 EST
Hi,

I am using 64-bit Windows 7 Ultimate, with Sun Java 1.6.0_20 JDK installed. I have installed the latest "Eclipse IDE for Java EE developers" for "Windows 64 Bit" platform, it is Helios.
I have installed the latest JPA Diagram Editor from its dedicated Eclipse page: http://wiki.eclipse.org/JPA_Diagram_Editor_Project.


Scenario to reproduce:
0. Create some project with JPA 1.0 facet.
1. Open a clear fresh diagram, and create some entities.
2. Delete all entities as Java sources, remove them from the diagram as well, remove them from the persistence.xml descriptor as well, just leave the <persistence-unit name="TestJPAEditor"> 
</persistence-unit> tag there.
3. Open a clear fresh diagram and create a new Inherited Entity.
4. The entity java source is OK, but according to it, the entity does not inherit the created mapped superclass.

Hint: This behaviour, e.g. deleting everything and starting from the beginning, seems to cause quite strange things to happen.
Comment 1 Stefan Dimov CLA 2011-01-18 07:48:27 EST
Krum, there is a limitation that when you use the diagram editor the entities should not be registered in perisistence.xml. It's related to this bug:

https://bugs.eclipse.org/bugs/show_bug.cgi?id=257530

So, you should use the editor in a project in which the annotated classes are discovered automatically. I've tried your scenario and it seems that when you create a new JPA in which the classes are being discoverd automatically when you open a diagram and create a few entities they still are being registered in the persistence.xml, which is the actual bug here.
Comment 2 Stefan Dimov CLA 2011-01-18 09:58:13 EST
Seems like a bug in Dali framework. I was able to reproduce the bug without involving diagram editor.

1. Create new JPA 1.0 project (choose 'Discover annotated classes automatically')
2. Create a new entity in it (with the wizard - leave 'Add to entity mappings in xml' unchecked)

Result: The new entity is being registered in the persistence.xml. Seems that

JptCorePlugin.discoverAnnotatedClasses(IProject) returns 'false'.

The problem is not always reproducible. I'm assigning this ticket to the 'Framework' component
Comment 3 Neil Hauge CLA 2011-02-01 17:29:43 EST
I can't seem to reproduce the "root" bug here.  When I create an Entity with the New Entity... wizard and the project is set to discover entities, the persistence.xml is not modified.  If there is a bug here it would be have to be in the New Entity wizard code, which I suppose is possible.

Note that the "Add to entity mappings in xml" setting refers to the orm.xml file and is not related to the persistence.xml file.
Comment 4 Stefan Dimov CLA 2011-02-22 09:58:12 EST
I'll debug it ...
Comment 5 Stefan Dimov CLA 2011-02-24 09:02:39 EST
Ok, seems that this is a complicated synchronizational problem. Here is another scenario to reproduce it:

1. Open package org.eclipse.core.internal.preferences.EclipsePreferences
2. Goto line 258 (The start of the 'for' cycle in the convertToProperties(...) method)
3. Put on it a conditional breakpoint with condtion:
        (temp.size() == 1) && (keys[0].equals("org.eclipse.jpt.core.platform"))
4. Open runtime in debug mode and create a few JPA projects with default settings, except for 'Discover annotated classes automatically', which should be set to true. If the debugging doesn't stop at the breakpoint. Delete all the projects and start again creating them. Then it most certainly will get to the breakpoint.
5. When stopped at the breakpoint check the values of 'temp' and 'properties'. You will see that they are pointing at the different objects with different contents (key/value properties). The 'properties' variable should contain something like:

----------------------------
org.eclipse.jpt.core.platform -> generic
org.eclipse.jpt.jpa.core.discoverAnnotatedClasses -> true
----------------------------

while the 'temp' should contain only:

----------------------------
org.eclipse.jpt.core.platform -> generic
----------------------------

6. Let debugging go on
7. After creating the project open the file: 
<Project>/.settings/org.eclipse.jpt.core.prefs
It's supposed to contain:

------------------------------
'org.eclipse.jpt.jpa.core.discoverAnnotatedClasses=true'
------------------------------

but it doesn't.

I'm still figuring out what's going on, but it's hardly debuggable since there are many different threads involved. It would be good if someone else (more familiar) with that part of code also have a glance.
Comment 6 Stefan Dimov CLA 2011-02-24 09:03:42 EST
'1. Open package org.eclipse.core.internal.preferences.EclipsePreferences' should be read as:

'1. Open the class org.eclipse.core.internal.preferences.EclipsePreferences'
Comment 7 Stefan Dimov CLA 2011-05-09 09:54:56 EDT
Created attachment 195072 [details]
fix
Comment 8 Stefan Dimov CLA 2011-05-09 09:56:51 EDT
    After some more research I think that there is a synchronization glitch in the org.eclipse.core.internal.preferences.EclipsePreferences class. The fix is simple - EclipsePreferences.flush() should be synchronized. 

    I'm proposing a patch and moving this bug to the equinox product. The patch definitely fixes the problem, but I'm not sure if it hasn't some unwanted side effect. (I didn't see any, but it's a very widely used core class, so I haven't tested the fix it in all possible scenarios).

    If (someone from) the equinox team doesn't agree with me, it means that we somehow misuse the preferences, so please give as some clue where to look at.
Comment 9 Thomas Watson CLA 2011-05-09 11:36:22 EDT
(In reply to comment #8)
>     After some more research I think that there is a synchronization glitch in
> the org.eclipse.core.internal.preferences.EclipsePreferences class. The fix is
> simple - EclipsePreferences.flush() should be synchronized. 
> 

It would help if you described what the glitch is and how synchronizing flush helps.
Comment 10 Stefan Dimov CLA 2011-05-09 12:28:45 EDT
As far as I was able to figure out JptJpaCorePlugin sets subsequently some project preferences and invokes flushing in another thread. There is a new thread for every flush (JptJpaCorePlugin&PreferencesFlushJob class) and since the flushing is not synchronized some of the newly set prefs are never flushed. That's why synchronizing the flush() method helps.
Comment 11 Stefan Dimov CLA 2011-06-27 08:51:47 EDT
Now, that Indigo is release, would you, please, take a look at this one?
Comment 12 DJ Houghton CLA 2011-07-05 15:34:53 EDT
Thanks for the analysis. 
Patch released.
Comment 13 Stefan Dimov CLA 2011-07-11 03:23:28 EDT
Would you mind submitting the patch in the maintenance branch too?
Comment 14 DJ Houghton CLA 2011-07-11 06:32:52 EDT
See bug 351231.
Comment 15 Stefan Dimov CLA 2011-07-11 06:41:38 EDT
Ooops ... I missed that. Thanks!
Comment 16 DJ Houghton CLA 2011-10-03 14:06:47 EDT
This fix has caused a deadlock so we need to revert and come up with a new solution. See Bug 359485 and Bug 359698.
Comment 17 Stefan Dimov CLA 2011-10-04 03:53:17 EDT
Ok, I'm open for suggestions ...
Comment 18 Stefan Dimov CLA 2011-10-04 03:56:07 EDT
I will investigate further, but right now I have no idea how to fix the original problem ...
Comment 19 DJ Houghton CLA 2011-10-04 10:10:00 EDT
I'm able to reproduce the problem locally and I think I have lowered it down to a problem in the EclipsePreferences.convertToProperties method. 

If you change the breakpoint condition to be: 
   temp.size() > 0 && keys[0].equals("org.eclipse.jpt.core.platform")
then at one point it will pause where there are 2 threads trying to flush the same preference node. It is exactly the same preference node (same object id) which is good - this was my big worry. If you inspect the nodes you will see the properties tables are the same object and contain 2 keys. That is ok too. The problem is that the "temp" var in one case contains one key and in the other case it contains 2.

I think this is what is happening:

- thread 1 calls flush, dirty is set to false, we proceed to #convertToProperties, we create our copy of "temp" with one key/value pair. Then wait.
- thread 2 adds a new property to the table which marks the node as dirty, then flush is called, dirty is set to false, we proceed to #convertToProperties, we create our copy of "temp" with 2 k/v pairs and continue and save the file.
- thread 1 resumes and overwrites the file with the one with 1 k/v pair

So it is just a matter of which #flush finishes last.
Comment 20 DJ Houghton CLA 2011-10-05 15:31:51 EDT
John, please review my patches. I created a branch called bug333726 in both the equinox.bundles repo and the platform.resources repo. These are the commits:

http://git.eclipse.org/c/equinox/rt.equinox.bundles.git/commit/?h=bug333726&id=87de8bd7f8c08c359e7278811db55f9a95c0d1cc

http://git.eclipse.org/c/platform/eclipse.platform.resources.git/commit/?h=bug333726&id=2ad68847ef3ed1b0ab2e593fe2d162f4985f6fec
Comment 21 DJ Houghton CLA 2011-10-06 09:55:03 EDT
Where can I get the latest code? I am trying to install JPA from:
   http://download.eclipse.org/rt/eclipselink/nightly-updates
(got that from the web page)

but it doesn't appear to be on that update site. I'd like to install the latest so I can test out the patch. Thanks.
Comment 22 DJ Houghton CLA 2011-10-06 09:57:15 EDT
Nevermind, just found it at the bottom of the wiki page. (I was looking for it on the project web page instead)
Comment 23 Neil Hauge CLA 2011-10-06 11:12:55 EDT
(In reply to comment #22)
> Nevermind, just found it at the bottom of the wiki page. (I was looking for it
> on the project web page instead)

Are you looking for JPA Tooling or a JPA implementation?  If you are looking in RT, you are getting the implementation, but not tooling.  Just making sure you are getting what you expect.  JPA Tooling is found in the Web Tools repo.
Comment 24 DJ Houghton CLA 2011-10-06 11:28:41 EDT
I was looking for whatever it takes for me to follow the steps in comment 2. :-)  I got it though, thanks.
Comment 25 John Arthorne CLA 2011-10-06 15:37:07 EDT
Fix looks good. The main problem is that convertToProperties creates a copy of the properties outside of any lock. In the resources implementation we then acquire the appropriate scheduling rule and modify the file. We need to be making that copy inside the scheduling rule block (operation) to avoid the case where we write stale information.
Comment 26 DJ Houghton CLA 2011-10-06 16:32:34 EDT
Ok, I'm happy with the fix and have released it. I've run all the tests along with testing the scenario in comment 2 (getting it to fail and then replacing the JARs and it passes), and also confirmed that we won't get the deadlock reported in bug 359485. (again, got the deadlock then replaced the JARs and tested to ensure we didn't get it.
Comment 27 Andrey Loskutov CLA 2012-03-30 11:10:02 EDT
Hi,

I've got time to analyze the problem in bug 375601.
I hope that this could drive the fix forward.

Let's take a look at the stack from the bug 375601.

Thread A
CCoreInternals.savePreferences() tries to save CDT project preferences
!   Thus InstancePreferences lock is taken via synchronized call to EclipsePreferences.flush()
    EclipsePreferences.convertToProperties() is then going through all the children of InstancePreferences
    and asking them recursively to convert to properties too.
    One child of InstancePreferences is instance of ProjectPreferences
    ProjectPreferences calls PreferencesService.shareStrings() from inside the EclipsePreferences.convertToProperties() call
    PreferencesService.shareStrings() calls EclipsePreferences.getChildren() back
    and EclipsePreferences.getChildren() calls EclipsePreferences.getChild()
!   which tries to acquire lock on "this" which is ProjectPreferences instance

Also in "Thread A" InstancePreferences lock taken -> ProjectPreferences lock awaiting

Thread B
PDOMSetupJob.run() tries to create indexer for the project
!   PDOMManager.createIndexer() takes a lock on ProjectPreferences (as workaround for bug 359485)
    and inside the locked code calls IndexerPreferences.getProperties()
    which asks EclipsePreferences(from LocalProjectScope) to get indexer preferences via IndexerPreferences.getLocalPreferences()
    EclipsePreferences (instance of InstancePreferences) then looks for the child node
!   and tries to acquire lock on "this" which is InstancePreferences

Also in "Thread B" ProjectPreferences lock taken -> InstancePreferences lock awaiting

Threads A<->B deadlocked...

IMHO it looks like the change made in bug 333726 to synchronize EclipsePreferences.flush() and following convertToProperties() opened a can of worms. Now flush() is called with a lock held on "this", BUT each child node can also acquire any number of *extra* locks during flush().

I think one part of the problem here is that EclipsePreferences previously used synchronized "this" locks to guard "children" field access only. But unfortunately due the bug 333726 changes the *same* lock is now used for flush() method which does lot of open calls.

So now read/write access to "children" AND flush() with open calls are synchronized with the *same*, *publicly accessible* lock.
This is the root cause of the bug 375601.

Interestingly, internalChildNames() is the only method allowed to access "children" field without holding any lock, (thus internalChildNames() may read *out-of-date* value of "children") which is another potential leak IMHO.

Anyway, what can we do?

One possible solution would be to create a dedicated *private* lock object for
synchronizing EclipsePreferences internally.
Then convert all synchronized methods and all synchronized(this) blocks to use this lock object.
The benefit is that the EclipsePreferences instance lock is not exposed to the client code anymore (CDT) and so there could be less possibilities for 3rd party to deadlock the EclipsePreferences.

Second possibility would be to fix the CDT issue ONLY.
As far as I can see the extra lock on ProjectPreferences inside the PDOMManager code was added to fix the bug 359485.
I don't think this is appropriate fix, as locking on external object you do not own (ProjectPreferences node) is always a bad idea. So we could fix it by simply moving the IndexerPreferences.getProperties() call outside the ProjectPreferences lock held by PDOMManager.createIndexer().
This call is protected by lock on ProjectPreferences ONLY inside the createIndexer() code, but NOT inside the changeIndexer(). So moving it outside the lock would just fix current inconsistency.
Additionally, if there is a need to synchronize both method calls, they use an extra *internal* lock.

However, this would fix ONLY the CDT part of the problem. All other EclipsePreferences clients will still be "open for deadlocks".

What do you think?
Comment 28 John Arthorne CLA 2012-04-05 13:16:57 EDT
(In reply to comment #27)
> One possible solution would be to create a dedicated *private* lock object for
> synchronizing EclipsePreferences internally.
> Then convert all synchronized methods and all synchronized(this) blocks to use
> this lock object.
> The benefit is that the EclipsePreferences instance lock is not exposed to the
> client code anymore (CDT) and so there could be less possibilities for 3rd
> party to deadlock the EclipsePreferences.

This makes sense to me. I have opened equinox bug 376206 for this.

> Second possibility would be to fix the CDT issue ONLY.
> As far as I can see the extra lock on ProjectPreferences inside the PDOMManager
> code was added to fix the bug 359485.
> I don't think this is appropriate fix, as locking on external object you do not own (ProjectPreferences node) is always a bad idea. 

I agree with this too. Client code that is synchronizing on the preference object is a risky pattern too. Likely they are making assumptions about what that object is guarding which could easily change. Is there a CDT bug open for this?
Comment 29 Andrey Loskutov CLA 2012-04-05 15:09:22 EDT
(In reply to comment #28)
> I agree with this too. Client code that is synchronizing on the preference
> object is a risky pattern too. Likely they are making assumptions about what
> that object is guarding which could easily change. Is there a CDT bug open for
> this?
Should be bug 375601, as the plan is to remove the CDT workaround for 3.8.
Comment 30 Martin Oberhuber CLA 2012-04-06 06:30:08 EDT
(In reply to comment #28)

I agree that synchronizing on public objects (or even objects owned by somebody else) is usually a bad idea, so making the lock objects private is good through bug 376206.

Though in my experience, Open Calls from inside a synchronized block are even more dangerous (and Joshua Bloch in "Effective Java" devotes a whole chapter to this - item 49 at pp196ff). Since Open Calls pose risk of deadlock with clients looking totally innocent. They may not lock or synchronize anything by themselves .. they just happen to cause a Thread Switch from inside their code. And, Open Calls are equally dangerous regardless of whether the lock object is public or private.

So just to be sure, have all Open Calls from inside the Preferences synchronized blocks completely been removed by now, or was bug 359698 just the tip of the iceberg ?
Comment 31 John Arthorne CLA 2012-04-09 11:37:28 EDT
(In reply to comment #30)
> So just to be sure, have all Open Calls from inside the Preferences
> synchronized blocks completely been removed by now, or was bug 359698 just the
> tip of the iceberg ?

Well, ProjectPreferences has always obtained a lock when modifying the preferences file in the workspace, and this hasn't changed. Resource change events are broadcast while that lock is held. To get rid of that we would need to stop supporting synchronous resource change events. But from what I can see the EclipsePreferences object lock is no longer held while saving/loading preferences.

Also NodeChangeListeners can still be called during the synchronized flush operation. We can try changing that but it would alter the timing of when these events are broadcast.
Comment 32 John Arthorne CLA 2012-04-09 12:57:55 EDT
(In reply to comment #31)
> Also NodeChangeListeners can still be called during the synchronized flush
> operation. We can try changing that but it would alter the timing of when these
> events are broadcast.

Actually I'm not sure this is possible. EclipsePreferences#internalFlush has this code:

			String[] childrenNames = childrenNames();
			for (int i = 0; i < childrenNames.length; i++)
				node(childrenNames[i]).flush();


and the method node(String) can call preference node listeners if it causes a node to be created, but this shouldn't happen in this case because we are just iterating over existing children.