Bug 358301 - [DSTORE] Hang during debug source look up
Summary: [DSTORE] Hang during debug source look up
Status: REOPENED
Alias: None
Product: Target Management
Classification: Tools
Component: RSE
Version: unspecified
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: 3.4
Assignee: David McKnight CLA
QA Contact: Martin Oberhuber CLA
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 361000
Reported: 2011-09-20 16:19 EDT by Samuel Wu CLA
Modified: 2012-05-23 19:52 EDT (History)

See Also:


Attachments
Stack trace during when UI was locked (33.08 KB, text/plain)
2011-09-20 16:24 EDT, Samuel Wu CLA
patch to synchronize on maps (8.54 KB, patch)
2011-09-22 10:57 EDT, David McKnight CLA
Stacktrace (23.22 KB, application/octet-stream)
2011-09-22 16:21 EDT, Samuel Wu CLA
patch to check for outofmemory errors (6.21 KB, patch)
2011-10-14 11:50 EDT, David McKnight CLA
additional patch to deal with other out of memory cases (10.43 KB, patch)
2011-10-19 14:31 EDT, David McKnight CLA
a couple more cases (3.21 KB, patch)
2011-11-07 10:49 EST, David McKnight CLA

Description Samuel Wu CLA 2011-09-20 16:19:07 EDT
Build Identifier: RSE 3.2 maintenance (RSE-runtime-M20110601-1650.zip)

When the source file was not found for a debug session, the user ran the Change Text File action to find the file on a remote host, and the UI locked up.

Reproducible: Sometimes

Steps to Reproduce:
1. Set the source look up to Default and start a debug session
2. In the Source_Not_Found debug editor, run the action Change Text File 
3. Select a remote host and expand its Root directory
4. The UI locks up.
I will attach the stack trace.
Comment 1 Samuel Wu CLA 2011-09-20 16:24:36 EDT
Created attachment 203711 [details]
Stack trace during when UI was locked

A few traces were taken and they all contain the following.

Thread[ModalContext,RUNNABLE,118]
		java.util.HashMap.findNonNullKeyEntry(HashMap.java:525)
		java.util.HashMap.putImpl(HashMap.java:622)
		java.util.HashMap.put(HashMap.java:605)
		org.eclipse.rse.internal.services.dstore.files.DStoreFileService.convertToHostFile(DStoreFileService.java:1375)
		org.eclipse.rse.internal.services.dstore.files.DStoreFileService.convertToHostFiles(DStoreFileService.java:1401)
		org.eclipse.rse.internal.services.dstore.files.DStoreFileService.fetch(DStoreFileService.java:2170)

It looks like the call never returned and the UI was waiting for it.
Comment 2 Samuel Wu CLA 2011-09-20 16:28:04 EDT
The following trace is from another case that also never returned. Since it was on a non-GUI thread, the UI was not blocked.

Thread[Worker-8,RUNNABLE,40]
		java.util.HashMap.findNonNullKeyEntry(Unknown Source)
		java.util.HashMap.getEntry(Unknown Source)
		java.util.HashMap.containsKey(Unknown Source)
		org.eclipse.rse.subsystems.files.core.subsystems.RemoteFileSubSystem.cacheRemoteFile(RemoteFileSubSystem.java:1275)
		org.eclipse.rse.subsystems.files.core.subsystems.RemoteFileSubSystem.cacheRemoteFile(RemoteFileSubSystem.java:1313)
		org.eclipse.rse.internal.subsystems.files.local.model.LocalFileAdapter.convertToRemoteFiles(LocalFileAdapter.java:59)
		org.eclipse.rse.subsystems.files.core.servicesubsystem.FileServiceSubSystem.list(FileServiceSubSystem.java:578)
		org.eclipse.rse.subsystems.files.core.subsystems.RemoteFileSubSystem.list(RemoteFileSubSystem.java:976)
Comment 3 Samuel Wu CLA 2011-09-20 16:33:12 EDT
Similar problem.

Thread[Worker-0,RUNNABLE,18]
		java.util.HashMap.findNonNullKeyEntry(HashMap.java:526)
		java.util.HashMap.putImpl(HashMap.java:622)
		java.util.HashMap.put(HashMap.java:605)
		org.eclipse.rse.internal.services.dstore.files.DStoreFileService.convertToHostFile(DStoreFileService.java:1375)
		org.eclipse.rse.internal.services.dstore.files.DStoreFileService.convertToHostFiles(DStoreFileService.java:1401)
		org.eclipse.rse.internal.services.dstore.files.DStoreFileService.fetch(DStoreFileService.java:2170)
		org.eclipse.rse.internal.services.dstore.files.DStoreFileService.list(DStoreFileService.java:2030)
		org.eclipse.rse.subsystems.files.core.servicesubsystem.FileServiceSubSystem.internalList(FileServiceSubSystem.java:379)
		org.eclipse.rse.subsystems.files.core.servicesubsystem.FileServiceSubSystem.list(FileServiceSubSystem.java:571)
		org.eclipse.rse.subsystems.files.core.subsystems.RemoteFileSubSystem.list(RemoteFileSubSystem.java:976)
Comment 4 Samuel Wu CLA 2011-09-20 16:49:35 EDT
The user who ran into this problem was previously on RSE-runtime-M20110316-2215.zip, and that build worked fine for him.
Comment 5 David McKnight CLA 2011-09-22 10:57:39 EDT
Created attachment 203846 [details]
patch to synchronize on maps

Can you see if this patch helps?
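
(Illustration only, not the attached patch.) The traces in comments 1 to 3 show threads that stay RUNNABLE inside HashMap.put and HashMap.containsKey, which is consistent with unsynchronized concurrent access to a java.util.HashMap. A minimal sketch of guarding such a map might look like the following. SharedFileCache and its methods are hypothetical names chosen for the example, not RSE classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a cache shared by the UI thread and worker jobs.
class SharedFileCache {
    // HashMap itself is not thread-safe, so every access takes the same lock.
    private final Map<String, String> cache = new HashMap<>();

    String put(String path, String hostFile) {
        synchronized (cache) {
            return cache.put(path, hostFile);
        }
    }

    boolean contains(String path) {
        synchronized (cache) {
            return cache.containsKey(path);
        }
    }

    int size() {
        synchronized (cache) {
            return cache.size();
        }
    }

    // Exercises the cache from several threads and returns the final size.
    static int demo(int threads, int filesPerThread) {
        SharedFileCache cache = new SharedFileCache();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < filesPerThread; i++) {
                    cache.put("/dir" + id + "/file" + i, "host-file");
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 1000));
    }
}
```

java.util.Collections.synchronizedMap or java.util.concurrent.ConcurrentHashMap are possible alternatives. Which one fits depends on whether compound check-then-put sequences need to be atomic.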
Comment 6 Samuel Wu CLA 2011-09-22 16:20:43 EDT
Thank you for the patch, Dave. I can't actually reproduce the original problem myself. I tried a source lookup in a directory that contains a lot of files, hit the following problem instead, and the source lookup didn't return.

Thread[Worker-11,TIMED_WAITING,47]
		java.lang.Object.wait(Native Method)
		java.lang.Object.wait(Object.java:196)
		org.eclipse.rse.services.dstore.util.DStoreStatusMonitor.waitForUpdate(DStoreStatusMonitor.java:372)
		org.eclipse.rse.services.dstore.util.DStoreStatusMonitor.waitForUpdate(DStoreStatusMonitor.java:288)
		org.eclipse.rse.services.dstore.util.DStoreStatusMonitor.waitForUpdate(DStoreStatusMonitor.java:236)
		org.eclipse.rse.services.dstore.AbstractDStoreService.dsQueryCommand(AbstractDStoreService.java:129)
		org.eclipse.rse.internal.services.dstore.files.DStoreFileService.getFile(DStoreFileService.java:1270)
		org.eclipse.rse.subsystems.files.core.servicesubsystem.FileServiceSubSystem.updateRemoteFile(FileServiceSubSystem.java:594)
		org.eclipse.rse.subsystems.files.core.servicesubsystem.FileServiceSubSystem.list(FileServiceSubSystem.java:575)
		org.eclipse.rse.subsystems.files.core.subsystems.RemoteFileSubSystem.list(RemoteFileSubSystem.java:977)

The connection was still active and I tried to expand a filter in RSE, but it did not return either.
Thread[Worker-13,TIMED_WAITING,90]
		java.lang.Object.wait(Native Method)
		java.lang.Object.wait(Object.java:196)
		org.eclipse.rse.services.dstore.util.DStoreStatusMonitor.waitForUpdate(DStoreStatusMonitor.java:372)
		org.eclipse.rse.services.dstore.util.DStoreStatusMonitor.waitForUpdate(DStoreStatusMonitor.java:288)
		org.eclipse.rse.services.dstore.util.DStoreStatusMonitor.waitForUpdate(DStoreStatusMonitor.java:236)
		org.eclipse.rse.services.dstore.AbstractDStoreService.dsQueryCommand(AbstractDStoreService.java:129)
		org.eclipse.rse.internal.services.dstore.files.DStoreFileService.fetch(DStoreFileService.java:2187)

I then tried to expand root and it didn't return.
		java.lang.Object.wait(Native Method)
		java.lang.Object.wait(Object.java:196)
		org.eclipse.rse.services.dstore.util.DStoreStatusMonitor.waitForUpdate(DStoreStatusMonitor.java:372)
		org.eclipse.rse.services.dstore.util.DStoreStatusMonitor.waitForUpdate(DStoreStatusMonitor.java:288)
		org.eclipse.rse.services.dstore.util.DStoreStatusMonitor.waitForUpdate(DStoreStatusMonitor.java:236)
		org.eclipse.rse.services.dstore.AbstractDStoreService.dsQueryCommand(AbstractDStoreService.java:129)
		org.eclipse.rse.services.dstore.AbstractDStoreService.dsQueryCommand(AbstractDStoreService.java:97)
		org.eclipse.rse.internal.services.dstore.files.DStoreFileService.getRoots(DStoreFileService.java:1986)
		org.eclipse.rse.subsystems.files.core.servicesubsystem.FileServiceSubSystem.getRoots(FileServiceSubSystem.java:389)

Something seems to be wrong with the server. 

I'll attach the stack trace
Comment 7 Samuel Wu CLA 2011-09-22 16:21:26 EDT
Created attachment 203865 [details]
Stacktrace
Comment 8 David McKnight CLA 2011-09-23 10:47:10 EDT
Samuel, do you see the stacks you're hitting as the same problem as the one your customer hit?  Is there a way to reproduce this from pure RSE (i.e. without your source lookup mechanism)?
Comment 9 Samuel Wu CLA 2011-09-23 12:17:42 EDT
Hi Dave,
When I tried the Remote Folder source lookup with the same directory, it simply returned quickly with the source not found message. But the source file was in a subdirectory of the remote folder. That's why the customer switched to our own source lookup.
I also did a file search for the same file in the same directory in RSE. It ended with a connection drop. I didn't see an out of memory message on the server side when I launched the server in the foreground.
Comment 10 David McKnight CLA 2011-09-26 13:41:07 EDT
(In reply to comment #9)
> Hi Dave,
> When I tried the Remote Folder source lookup with the same directory, it
> simply returned quickly with the source not found message. But the source file
> was in a subdirectory of the remote folder. That's why the customer switched
> to our own source lookup.
> I also did a file search for the same file in the same directory in RSE. It
> ended with a connection drop. I didn't see an out of memory message on the
> server side when I launched the server in the foreground.

Samuel, it looks like you've described a few different problems.  I'm not sure whether this bug is the place for each of these issues.  For whatever is reproducible via RSE, could you provide me with an environment that I could use to hit the problem?
Comment 11 Samuel Wu CLA 2011-10-14 11:23:34 EDT
A bit of further investigation shows that a possible cause of the problem is that the RSE server had run out of memory but the RSE connection didn't drop.  When the user tried to get anything from the GUI thread, it locked up the GUI.

We may want to terminate the dstore server once it runs out of memory.
Comment 12 David McKnight CLA 2011-10-14 11:50:27 EDT
Created attachment 205210 [details]
patch to check for outofmemory errors
Comment 13 David McKnight CLA 2011-10-14 11:52:25 EDT
The attached patch will detect out of memory errors and attempt to exit.  It's still possible that in some of those cases, an out of memory error will be hit during the exit.  I've committed the patch to the HEAD stream.  

Do you need this backported?
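
(Illustration only, not the committed patch.) The detect-and-exit idea could be sketched roughly as below. OomGuard and the injected exit action are hypothetical names introduced for the example. The exit action is passed in so the behaviour can be exercised without terminating the JVM; a server would presumably pass System::exit.

```java
import java.util.function.IntConsumer;

// Hypothetical wrapper that runs a unit of server work and exits on OOME.
class OomGuard {
    // Injected exit action; System::exit in a real process (assumption).
    private final IntConsumer exit;

    OomGuard(IntConsumer exit) {
        this.exit = exit;
    }

    // Runs the work and returns true if it completed normally.
    boolean run(Runnable work) {
        try {
            work.run();
            return true;
        } catch (OutOfMemoryError e) {
            // Keep this path as small as possible: any allocation here
            // may itself fail with another OutOfMemoryError.
            exit.accept(1);
            return false;
        }
    }

    public static void main(String[] args) {
        int[] exitCode = {-1};
        OomGuard guard = new OomGuard(code -> exitCode[0] = code);
        // The error is simulated; no real memory exhaustion takes place.
        guard.run(() -> { throw new OutOfMemoryError("simulated"); });
        System.out.println("exit requested with code " + exitCode[0]);
    }
}
```

System.exit runs shutdown hooks, which may allocate memory. Runtime.getRuntime().halt(int) skips them and may be worth considering when the heap is already exhausted.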
Comment 14 Samuel Wu CLA 2011-10-14 12:12:51 EDT
Bug 361000 was opened for backporting. Thanks.
Comment 15 David McKnight CLA 2011-10-19 14:31:42 EDT
Created attachment 205555 [details]
additional patch to deal with other out of memory cases
Comment 16 David McKnight CLA 2011-11-07 10:48:55 EST
There are a couple more cases that can be handled.
Comment 17 David McKnight CLA 2011-11-07 10:49:10 EST
Created attachment 206528 [details]
a couple more cases
Comment 18 David McKnight CLA 2011-11-07 10:50:21 EST
I committed the updated patch.
Comment 19 Martin Oberhuber CLA 2011-11-25 05:25:40 EST
Catching the OutOfMemoryError seems an odd way of handling this.

Here are a couple of thoughts:

1.) Has it ever been analyzed why the OOME occurs? The Eclipse Memory Analyzer
    makes it fairly easy to analyze a heap dump. On an Oracle VM, just launch
    with "-vmargs -XX:+HeapDumpOnOutOfMemoryError". For other VMs see
    http://wiki.eclipse.org/MemoryAnalyzer#Getting_a_Heap_Dump

2.) OutOfMemoryError is a subclass of "Error" for which the Java API Docs say:
       "An Error is a subclass of Throwable that indicates serious problems 
        that a reasonable application should not try to catch."
    My understanding is that an OOME should terminate the app automatically.
    Unless the dstore server catches Error or Throwable somewhere else?
    I suggest checking whether the dstore server swallows errors, since it
    shouldn't do that. Similar errors (e.g. ThreadDeath) would otherwise likely
    lead to the same problem we see here.

3.) Note that some VMs allow running a command on OutOfMemoryError. This could
    e.g. be used to restart the server; see
       -XX:OnError="<cmd args>;<cmd args>"
       -XX:OnOutOfMemoryError="<cmd args>;<cmd args>"
    here:
       http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html
    This seems to make more sense than just doing a hardcoded exit...
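
To illustrate point 2, here is a small sketch of the difference between a handler that swallows Errors and one that lets them propagate. ErrorPropagation and both method names are hypothetical, and the OutOfMemoryError is simulated. Note that an uncaught Error ends only the thread that threw it; the JVM keeps running while other non-daemon threads are alive, so a server can stay connected even without a broad catch.

```java
// Hypothetical helpers contrasting two catch styles around a task.
class ErrorPropagation {
    // Swallows everything, including Errors such as OutOfMemoryError.
    static boolean swallowing(Runnable task) {
        try {
            task.run();
            return true;
        } catch (Throwable t) {
            return false; // the Error disappears here and the caller carries on
        }
    }

    // Handles ordinary exceptions only; an Error is not an Exception,
    // so it escapes to the caller.
    static boolean propagating(Runnable task) {
        try {
            task.run();
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Runnable failing = () -> { throw new OutOfMemoryError("simulated"); };
        System.out.println("swallowing returned " + swallowing(failing));
        try {
            propagating(failing);
        } catch (OutOfMemoryError e) {
            System.out.println("propagating let the error through");
        }
    }
}
```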

I'm not going to enforce any of these suggestions (Wind River doesn't use dstore) but I'll reopen the bug for comments. Feel free to mark closed again if you think these comments are bogus.
Comment 20 Martin Oberhuber CLA 2012-05-23 19:52:19 EDT
REOPENED doesn't look like a proper state for this. Can you look at some of my thoughts and comment?