Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 338393

Summary: IndirectSet.clear() violates Set contract which may lead to data inconsistencies
Product: z_Archived Reporter: Patric Rufflar <patric>
Component: EclipselinkAssignee: Nobody - feel free to take it <nobody>
Status: CLOSED FIXED QA Contact:
Severity: critical    
Priority: P3 CC: andrei.ilitchev, tom.ware
Version: unspecified   
Target Milestone: ---   
Hardware: PC   
OS: Windows XP   
Whiteboard:
Attachments:
Description Flags
JavaSE Testcase demonstrating issue
none
Suggested patch - always clear the delegate set none

Description Patric Rufflar CLA 2011-02-28 06:07:20 EST
Build Identifier: EclipseLink 2.1.2

IndirectSet violates the Set.clear() contract:
"Removes all of the elements from this set (optional operation). This set will be empty after this call returns (unless it throws an exception)."

But this happen if the hashCode of an added entity changes. 
Reason: IndirectSet.clear() relies on the Iterator.remove() and not on the delegate's clear() method.

As an result, entities may not be deleted from the database.

Reproducible: Always

Steps to Reproduce:
1. Consider two entites Child and Parent which define a 1:n relation (OneToMany mapping). Parent's Set defines a CascadeType.PERSIST
2. Add Child to the Parent, commit transaction
3. Retrieve child from Parent's Set
4. Change child in a way that its hashCode changes
5. call EntityManager.remove(child)
6. invoke Parent's Set.clear() -> entities are not removed
7. commit transaction -> child will be undeleted and hence not removed
Comment 1 Tom Ware CLA 2011-03-18 09:34:52 EDT
Setting target and priority.  See the following page for the meanings of these fields:

http://wiki.eclipse.org/EclipseLink/Development/Bugs/Guidelines

Community: Please vote for this bug if it is important to you.  Votes are one of the main criteria we use to determine which bugs to fix next.
Comment 2 Patric Rufflar CLA 2012-01-28 11:47:07 EST
Any chance to get this fixed in 2.3.3 (or at least 2.4.0?).
It's almost a year this critical bug is open, and AFAIK I think the fix will be a one-liner...
Comment 3 Tom Ware CLA 2012-01-30 15:40:00 EST
We took another look.  Unfortunately, this is not as small a fix as it appears to be from the outside.  We have some change-tracking code that has to be notified when the list is cleared -> that is the reason for the use of the iterator to remove instead of the delegate's clear method.

We will consider targetting for either 2.3.3 or 2.4.0.
Comment 4 David Minsky CLA 2012-01-30 15:49:30 EST
Created attachment 210282 [details]
JavaSE Testcase demonstrating issue

Note that in order to reproduce issue, Child.hashCode and Child.equals have been implemented. Issue did not appear to reproduce without these modifications.
Comment 5 Tom Ware CLA 2012-01-30 15:54:46 EST
Is your underlying collection a HashSet?  If so, changing the hash code of an object after it is in the set is likely to cause problems even with the java implementation - you'll change the bucket the object is in after it has been put in the set.
Comment 6 Tom Ware CLA 2012-01-30 16:07:51 EST
The following illustrates the peril of changing the hashcode of an object already in a hashset:

public class Data {

    public String name;
    
    public boolean equals(Object data){
        if (data == null){
            return false;
        }
        if (!(data instanceof Data)){
            return false;
        }
        return ((Data)data).name.equals(this.name);
    }
    
    public int hashCode(){
        return name.hashCode();
    }
    
}

public class HashSetTest {

    
    public static void main(String[] args){
        Data data = new Data();
        data.name = "1";
        
        HashSet<Data> set = new HashSet<Data>();
        set.add(data);
        
        data.name = "2";
        
        set.remove(data);
        
        System.out.println(set.size());
    }
}

Remove fails here and the object remains in the set -> output is 1
Comment 7 Patric Rufflar CLA 2012-01-30 16:33:50 EST
Hi Tom,

you're a right, this issue only appears when using a HashSet and if the
underlying hashCode() in the object changes.
Please keep in mind that some people use equals()/hashCode() on the business id
or primary key which may change during a transaction.
( http://stackoverflow.com/questions/5031614/the-jpa-hashcode-equals-dilemma ) 

Nevertheless, this does not violate the Object hashCode() contract.

But in contrast to Set.clear() - which always has to remove all elements.
Comment 8 Tom Ware CLA 2012-01-31 08:32:21 EST
In general, the primary key of an object should not change (except when it is first assigned with sequencing)  As such, it is a good candidate for use in the Hashcode function.  Using other parts of an object as part of the hashcode is more risky as they are more prone to change.

Is your issue with a hashcode function that uses only elements of the primary key?  It is not evident from the description above if so.  If it is, please provide more details about the Entities and mappings involved.
Comment 9 Patric Rufflar CLA 2012-01-31 09:17:29 EST
>In general, the primary key of an object should not change (except when it is
>first assigned with sequencing) 
Indeed. But this is exactly the (most important) case.

Imagine a hashCode() implementation based on the PK, and you put such an entity into an IndirectSet before persisting it (beside the fact if this is good style or not - it may happen) - a subsequent clear() will not only keep the element in the Set but even more fatal it puts the EntityManager in an inconsistent state.

The inconsistency causes that even if you call em.remove() on that entity within the same transaction - the entity will still be persisted, hence commiting wrong data.
Comment 10 Tom Ware CLA 2012-01-31 09:26:10 EST
Is that the issue you are seeing?
Comment 11 Patric Rufflar CLA 2012-01-31 10:01:46 EST
Yes (see also initial bug description).
Comment 12 Tom Ware CLA 2012-01-31 10:06:44 EST
Can you please provide some example code including the relevant mappings on your entities and some pseudo code for what you are doing.
Comment 13 Patric Rufflar CLA 2012-01-31 10:26:27 EST
I cannot provide "working" source code right now at a short term,
but you just need to do the following:

1. Create/use a persistence unit which contains an entity ParentEntity and an entity ChildEntity. The ChildEntity defines a hashCode()/equals() method only based on its primary key, a Long.

2. The ParentEntity defines a OneToMany ChildEntity. This mapping is realized using a Set:

  @OneToMany(mappedBy = "parent_", cascade = { CascadeType.PERSIST,
      CascadeType.MERGE })
  private Set<ChildEntity>           childSet_;

  public Set<ChildEntity> getChildSet() {return childSet_;}


3. Create an EntityManager of this persistence unit, begin a transaction.
4. Create a new instance of ChildEntity, child.
5. Create a new instance of ParentEntity, parent.
6. Add the child to the parent's Set, parent.getChildSet().add(child)
7. persist the child, em.persist(child)
8. call clear() on the set, em.getChildSet().clear()
9. remove the child from the persistence context, em.remove(child)
10. commit the transaction. The child will be inserted to the database anyhow
-> wrong data is written to the database
Comment 14 Patric Rufflar CLA 2012-01-31 10:30:51 EST
Due to that editing of comments is not possible, I've got to correct myself:

Maybe step 5. needs to be corrected to this:

5. Fetch an existing instance of ParentEntity using em.find(), store the reference in the variable parent.
Comment 15 Tom Ware CLA 2012-02-01 09:36:17 EST
Targetting for investigation in 2.3.3
Comment 16 Andrei Ilitchev CLA 2012-05-03 16:40:02 EDT
Created attachment 215026 [details]
Suggested patch - always clear the delegate set
Comment 17 Andrei Ilitchev CLA 2012-05-03 16:55:14 EDT
Checked the fix into both trunk (2.4) and 2.3.3.
Comment 18 Eclipse Webmaster CLA 2022-06-09 10:28:02 EDT
The Eclipselink project has moved to Github: https://github.com/eclipse-ee4j/eclipselink