Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 251253

Summary: FacetProjectWorkingCopy creating many threads.....
Product: [WebTools] WTP Common Tools Reporter: Chuck Bridgham <cbridgha>
Component: Faceted Project FrameworkAssignee: Konstantin Komissarchik <konstantin>
Status: RESOLVED FIXED QA Contact:
Severity: normal    
Priority: P3 CC: ccc, deboer, mbaker
Version: 3.0.3   
Target Milestone: 3.1 M6   
Hardware: PC   
OS: Windows XP   
Whiteboard:
Attachments:
Description Flags
possible patch
none
Stack dump none

Description Chuck Bridgham CLA 2008-10-17 12:02:11 EDT
Looking through a deadlock scenario = I noticed a new thread is being created from a listener to refresh the runtimes list...  shouldn't this be re-using the same thread for this refresh?


Thread [Thread-465] (Suspended)	
	FacetedProjectWorkingCopy.suspendEventNotification() line: 1646	
	FacetedProjectWorkingCopy.access$1(FacetedProjectWorkingCopy) line: 1644	
	FacetedProjectWorkingCopy$1$1.run() line: 209	
Thread [Thread-466] (Suspended)	
	FacetedProjectWorkingCopy.resumeEventNotification() line: 1656	
	FacetedProjectWorkingCopy.refreshTargetableRuntimes() line: 1164	
	FacetedProjectWorkingCopy$1$1.run() line: 219	
Thread [Thread-468] (Suspended)	
	FacetedProjectWorkingCopy.resumeEventNotification() line: 1656	
	FacetedProjectWorkingCopy.refreshTargetableRuntimes() line: 1164	
	FacetedProjectWorkingCopy$1$1.run() line: 219	
Thread [Thread-469] (Suspended)	
	FacetedProjectWorkingCopy.suspendEventNotification() line: 1646	
	FacetedProjectWorkingCopy.access$1(FacetedProjectWorkingCopy) line: 1644	
	FacetedProjectWorkingCopy$1$1.run() line: 209
Comment 1 Chuck Bridgham CLA 2008-10-17 13:27:45 EDT
Created attachment 115431 [details]
possible patch
Comment 2 Chuck Bridgham CLA 2008-10-17 13:28:48 EDT
Using a single "Job" instance for the refresh, I no longer run into this issue...
Is this a possible solution?
Comment 3 Chuck Bridgham CLA 2008-10-20 10:03:40 EDT
Raising this as a hotbug request, we need resolution on this quickly
Comment 4 Konstantin Komissarchik CLA 2008-10-20 10:54:54 EDT
I do not understand the problem scenario here. Do you have a repro? You mention a deadlock, but you didn't attach any thread dumps.

I haven't looked at a patch, but using a Job is not necessarily better than using separate threads. It all depends on specifics. Specifics are also necessary to justify the severity and hotbug request.
Comment 5 Chuck Bridgham CLA 2008-10-20 14:49:01 EDT
Created attachment 115592 [details]
Stack dump
Comment 6 Chuck Bridgham CLA 2008-10-20 15:00:41 EDT
Hi - Sorry, I didn't go into more detail.... 

We have an adopter migration routine, that will detect, and install addition facets to projects if needed.

What we are seeing, is the creation of additional Threads over time, while the migration routine is executing, and they block each other. (looks like the RuntimeManager listener is being called several times a second, creating a new thread each time...)

My suggestion for a "single" Job instance was only to prevent the creation of many threads, and the convenience of the Job api queuing up execution requests on the single instance.

A single Thread instance could be used to do the same thing, although would need to explicitly call join on the running thread instance.
Comment 7 Konstantin Komissarchik CLA 2008-10-20 18:15:09 EDT
Ok. I think I understand the scenario, but I think the true cause of the problem you are seeing is elsewhere. Making threading changes to FacetedProjectWorkingCopy would just mask the problem. 

The key question is what's going on that causes this high rate of events to be generated in the first place. The RuntimeManager only generates these events if it believes that the list of runtimes has changed. Looking at the thread dump, dozens of these events have been generated while the working copy was locked. The question is why? I would guess that this is caused by the runtime bridge. I recommend putting a breakpoint at RuntimeManagerImpl.bridge() method as a way to determine why it thinks the list of runtimes is changing. 
Comment 8 Chuck Bridgham CLA 2008-10-21 10:44:11 EDT
Thanks, I agree some thread contention is going on here besides... and we are pursuing this, but I still think a "single" thread could avoid any out of memory issue.  I'll remove the hotbug request... but leave as major for 303.
Comment 9 Konstantin Komissarchik CLA 2008-10-21 17:34:21 EDT
Ok. Let me know what you find out. Changing threading behavior in a service release is rather risky, so I want to make sure that we understand the problem completely before attempting any changes.
Comment 10 Chuck Bridgham CLA 2008-10-22 11:19:29 EDT
ok we have solved some related issues, but come back to this....

Thread [Thread-9] (Suspended)	
	FacetedProjectWorkingCopy$1.handle() line: 228	
	RuntimeManagerImpl.notifyRuntimeListeners() line: 283	
	RuntimeManagerImpl.bridge() line: 437	
	RuntimeManagerImpl.getRuntimes() line: 156 [local variables unavailable]	
	RuntimeManager.getRuntimes() line: 80 [local variables unavailable]	
	RuntimeManagerImpl$RuntimeValidationThread.run() line: 1189	


Here is an example thread that spawns a new thread because when it goes through the bridge() call, the flag is changed to modified = true because of these blocks:

// Remove the absolete entries.
            
            for( BridgedRuntime r : existing.values() )
            {
                if( ! exported.contains( r.getNativeRuntimeId() ) )
                {
                    runtimes.delete( r.getName() );
                    modified = true;
                }
            }

and

 // Create the new entries.
            
            for( String id : exported )
            {
                if( ! existing.containsKey( id ) )
                {
                    try
                    {
                        final IRuntimeBridge.IStub stub = br.bridge( id );
                        
                        final BridgedRuntime r 
                            = new BridgedRuntime( brid, id, stub );
                        
                        r.setName( createUniqueRuntimeName( id ) );
                        
                        runtimes.add( r.getName(), r );
                        modified = true;
                    }
                    catch( CoreException e )
                    {
                        FacetCorePlugin.log( e );
                    }
                }
            }

It falls into either of these blocks, and generates another thread because of the notification. 

Any idea whats going on here?
Comment 11 Konstantin Komissarchik CLA 2008-10-22 11:32:23 EDT
That code should only be setting the modified flag if the list of runtimes has changed. Is there anything changing the runtimes in your scenario? If not, do you mind stepping into those blocks of code to figure out why the modified flag is being set?
Comment 12 Chuck Bridgham CLA 2008-10-22 17:17:20 EDT
We verified adopter code was adding a server vm, but then had problems during install, and then removed....

All of the fiddling with the vm list caused the notifications....

This is no longer a stop-ship for us, but I still changing this design for a single threaded request queue would be best in the long run...
Comment 13 Konstantin Komissarchik CLA 2008-10-22 17:21:38 EDT
Ok. Lowering the severity and adding to the 3.1 queue.
Comment 14 Konstantin Komissarchik CLA 2008-10-22 17:23:14 EDT
According to my understanding of the problem, I don't believe there is any relationship to Bug 225207 so removing the dependency. Please correct me if I missed something.
Comment 15 Matt Baker CLA 2009-02-17 15:28:29 EST
There is a bug in this code that I believe causes all the threads to be created.  
In this method the bit that looks like

runtimes.contains( name) 

should instead be

runtimes.containsKey( name )

RuntimeManagerImpl

    private static String createUniqueRuntimeName( final String suggestion )
    {
        String name = suggestion;
        
        for( int i = 1; runtimes.contains( name ); i++ )
        {
            name = suggestion + " (" + i + ")"; //$NON-NLS-1$ //$NON-NLS-2$
        }
        
        return name;
    }

When the bridge code posted above gets called it tries to generate a unique name for a runtime returned by a bridge.  When the bridge returns a name that already exists and the createUniqueRuntimeName() method is called it ALWAYS fails and returns the same name.

After the bridged runtimes are added, this whole thing gets called once more since the modified flag is set and fires the refresh event, but then fails again because this line in the bridge() method:

 if( ! existing.containsKey( id ) ) 

fails because unique names are not unique due to the failure of createUniqueName()

This causes the the call back into the bridge to fetch the bridged runtimes again which then once again resets the modified flag which fires the event which refreshes the runtimes....

I could be wrong in my analysis.  But the above method is broken either way.
Comment 16 Konstantin Komissarchik CLA 2009-02-18 11:14:07 EST
Thanks, Matt. I don't know if the problem that you describe in Comment #15 is related to the original problem reported here, but that code is clearly wrong.

I released a fix to correct the problem that Matt has identified and changed the available runtimes refresh thread to a Job like Chuck has suggested.

Changed released to 3.1 M6 code stream.