Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 301330

Summary: OSX Cocoa - Things get slower over time when running long sequence of tests
Product: [Eclipse Project] Platform Reporter: Francis Upton IV <francisu>
Component: SWTAssignee: Platform-SWT-Inbox <platform-swt-inbox>
Status: CLOSED WONTFIX QA Contact:
Severity: normal    
Priority: P3 CC: cocoakevin, lshanmug, sgandon, Silenio_Quarti, skovatch
Version: 4.3Keywords: triaged
Target Milestone: ---   
Hardware: PC   
OS: Mac OS X   
Whiteboard: stalebug
Attachments:
Description Flags
Several stack traces
none
More stack traces later in the same run
none
Test class that uses readAndDispatch()
none
Traces on 4.3M7
none
UI test program that shows the leak none

Description Francis Upton IV CLA 2010-01-30 03:57:58 EST
I have a test suite with a couple of thousand tests for my RCP application.  On Windows and Linux it runs in about an hour. On the Mac it takes over 6 hours.

In trying to track down the problem, it seems the tests get slower over time, like there is some kind of leak. I have not tried profiling yet, but I have captured a bunch of stack traces during the execution of the test to see if I can find a pattern; they are attached.

The way my tests work is (unlike the org.eclipse.ui tests) the window running the RCP app is normally not visible through most of the tests (to allow me to work on the machine when the tests are running). Not sure if this is a factor or not.

I'm happy to add any diagnostic code necessary to track down this problem.
Comment 1 Francis Upton IV CLA 2010-01-30 04:06:59 EST
Created attachment 157688 [details]
Several stack traces

Also, this is on a 10.6 system running with a 32-bit Java 1.5 VM which was taken from a 10.5 system.

The reason I do this that what my product uses Hibernate which I build from source and Hibernate does not build on JDK 1.6.

I saw the posting about the problems associated with using a 10.5 JRE 1.5 JVM on a 10.6 system and as far as I can tell they have to do with AWT use.  I ran into these when my RCP application would not even boot because it hung looking for the AWT libraries. I fixed the application to remove all use of AWT and I have not had any problems with the app at all, so I do not think this current problem is related to that issue.

The Eclipse version is 3.5.1.
Comment 2 Francis Upton IV CLA 2010-01-30 04:23:49 EST
Created attachment 157689 [details]
More stack traces later in the same run

They are starting to look more the same, and the run is getting even slower, tests cases taking about 5x as they would on the other platforms.
Comment 3 Kevin Barnes CLA 2010-02-02 10:10:47 EST
Francis, do you call Display#readAndDispatch during your test runs? We create and dispose NSAutoreleasePool objects in that function.
Comment 4 Francis Upton IV CLA 2010-02-02 11:42:57 EST
(In reply to comment #3)
> Francis, do you call Display#readAndDispatch during your test runs? We create
> and dispose NSAutoreleasePool objects in that function.

Yes I do, I use it when I want to make sure all outstanding events have been processed. I call it between test cases and at other times. Is it bad that I do that, should I be doing something else?
Comment 5 Kevin Barnes CLA 2010-02-02 13:20:04 EST
>  Is it bad that I do that, should I be doing something else?

No, that's good. I was going to suggest adding that code if you didn't have it. Cocoa has a ref counting system for memory management. When you create objects you have the option of putting them into an NSAutoreleasePool instead of managing the releasing of the object yourself. The unfortunate side affect of this is that you have to release your pools frequently, or the objects in them stick around. In SWT we manage the pools for you in read and dispatch, and Shell#setWindowVisible. 
We have seen the behavior you describe while running tests when either readAndDispatch isn't called, or it's only called on a modal event loop and shells are never opened.

Is your test code open source?
Comment 6 Francis Upton IV CLA 2010-02-02 13:25:21 EST
Created attachment 157935 [details]
Test class that uses readAndDispatch()

> Is your test code open source?

Well the one class that calls it is now open source (per the terms of the bugzilla agreement). Though it depends on a bunch of other junk. I have no objection to releasing it as open source, it's just a question of where and packaging it. Hopefully from this class though you can find what you need.
Comment 7 Francis Upton IV CLA 2010-02-02 13:28:55 EST
(In reply to comment #5)
> >  Is it bad that I do that, should I be doing something else?
 In SWT we manage the pools for you in read and dispatch, and
> Shell#setWindowVisible. 

This is likely the problem; with my tests hundreds of test cases are run and the window is never set to be visible, so this pool then is accumulating all of this stuff. Is there something I can call in my tests to clean this out? I would imagine this problem is going to show up in other situations similar to mine where testing is done in a "sort of headless" way. Or other situations where the heuristic that SWT uses to reset the pool does not match what the app is doing, so having this documented and having explicit control would be good. On the other hand, it's a platform dependent thing which is bad. Of course none of this is telling you anything you don't know so I will stop now. :)
Comment 8 Kevin Barnes CLA 2010-02-02 15:27:32 EST
From the stack traces, it looks like your tests are run in asyncExec. Do you have one asyncExec per test, or one that runs all of your tests? Can you try adding code like "shell.setVisible(true); shell.setVisible(false);" to your test code somewhere?
Comment 9 Francis Upton IV CLA 2010-02-02 16:48:57 EST
(In reply to comment #8)
> From the stack traces, it looks like your tests are run in asyncExec. Do you
> have one asyncExec per test, or one that runs all of your tests? Can you try
> adding code like "shell.setVisible(true); shell.setVisible(false);" to your
> test code somewhere?
I think the platform is started with one big asyncExec, and all of my tests run on the UI thread under that (I just do the same thing the Eclipse UI tests do, use the Junit plugin support to run the tests).

I tried adding a setVisible() as you suggested between each test case, and the problem still seems to happen, I'm getting the same stack traces as before, and things progressively slow down. And I can see that the setVisible()s are getting called because the shell becomes visible each time (which sort of defeats the purpose of having it be non-visible in the first place, so it's not really a workable solution for me). One alternative I can try (not sure if it would work) is to use the vfb stuff that Paul (and I use) on Linux. But I don't know if the Mac uses the X stuff or not.
Comment 10 Francis Upton IV CLA 2013-05-16 16:58:21 EDT
This happens on 3.6.1 as well, and it's getting important to me to fix it as I want to move my development to the Mac. I would be happy to put some counters or tracing in the code to try and track this down since I can easily make it happen.
Comment 11 Francis Upton IV CLA 2013-05-18 16:32:59 EDT
Created attachment 231185 [details]
Traces on 4.3M7

And here are some stack traces for 4.3M7, which are substantially the same as they were in 3.6.1. I just did a kill -3 a few times and it's always in msg_send, and the amount of non-Java memory that is consumes seems to grow quite a bit. It's running a suite of about 1400 test cases, and it seems to gradually slow down as it gets further along into the suite, tests that normally take a second to run are running in 3-4 seconds after the first couple of hundred tests and it gets worse.

I have the XCode instrumentation tools and have been profiling a little bit with those. One thing I noticed in my testing with 3.8.2 is that the number of SWTCanvasView objects (as shown by the OSX instruments) seemed to be about twice or more than the number of SWTCanvasView objects as shown in JProfiler memory view. The ones shown in JProfiler roughly corresponded to the Composite objects and seemed to be within normal limits. The ones showing in OSX instruments seemed to be growing much beyond the active Java objects. I'm going to do a complete run with the OSX instrumentation turned on to try and further characterize this.

I would be happy for any guidance about what I can look for using the Mac profiling tools which seem to be pretty good.
Comment 12 Francis Upton IV CLA 2013-05-18 16:40:21 EDT
Additional data: if I run one portion of the tests my themselves and do a bunch of kill -3 so get the tracing, it's often in the normal Java code (not in a msg send) which is what I would expect.
Comment 13 Francis Upton IV CLA 2013-05-23 12:38:16 EDT
More progress. I have been focused on SWTCanvasView which is clearly leaking when I compare the counts with JProfiler. I have since instrumented the SWT source to show the number of these objects which should be allocated (by having a static count of when they are allocated in Composite.createHandle() and released in Control.releaseHandle(). 

When I do this running a Snippet (like Snippet98) is behaves as expected when looking at the Allocations instrument, the retainCount field of the SWTCanvasView, and the static counter I added. No matter how many composites I create using that snippet, all of the counts line up.

However, when running in my tests, the retainCount and the count according to the Allocations instrument get larger over time which is a leak. The number of Composite objects are not getting larger, so this indicates that something is retaining the SWTCanvasView outside of the lifetime of the Composite object (I'm assuming the Composite goes away at releaseHandle(), I guess I can check this with the Allocations instrument). 

I also put instrumentation around the addPool/removePool and verified they were being called a lot. And I went further and made it only use one pool all the time (which was removed in removePool) to make sure this was not a pool leak issue. (I added a Shell.setVisible(false) and then true between each test case.

I'm now going to try to make a small test case (using my tests) that shows the problem in a short time to try and narrow it down.

Does anyone have an idea about why an SWTCanvasView would have a lifetime longer than its enclosing Composite?
Comment 14 Francis Upton IV CLA 2013-05-23 17:37:08 EDT
I have been able to reproduce the problem with a loop closing and recreating one of the EditorParts associated with my product. According to JProfiler, and my own instrumentation added to the SWT code, the Composite and SWTCanvasView are completely stable as editors are closed. However, in the allocations instrument, the SWTCanvasView is growing each time by a considerable amount, much higher than one object per editor (I have not calculated exactly).

When releasing the handle (the code in Control), I check the retainCount, and in in the loop creating the editors, this only is higher than 2 (it's a 3) consistently according to the stack trace below, this corresponds to releasing the main Composite associated with the EditorPart (this holds several controls that implement the editor).

So it's clear the Java objects Composite and SWTCanvasView are being correctly garbage collected and there is no leak there, but the corresponding native object to SWTCanvasView is leaking quite a lot in this situation.

Next step is to try and find out what things can use and retain a reference to the SWTCanvasView object. Any ideas or help is appreciated.


Daemon Thread [Thread-1] (Suspended (breakpoint at line 2847 in Control))	
	owns: RunnableLock  (id=212)	
	Composite(Control).releaseHandle() line: 2847	
	Composite(Scrollable).releaseHandle() line: 311	
	Composite(Control).destroyWidget() line: 1037	
	Composite(Widget).release(boolean) line: 1310	
	Composite(Control).release(boolean) line: 2836	
	Composite(Widget).dispose() line: 663	
	EditorPane(PartPane).dispose() line: 182	
	EditorReference(WorkbenchPartReference).dispose() line: 681	
	WorkbenchPage.disposePart(WorkbenchPartReference) line: 1810	
	WorkbenchPage.handleDeferredEvents() line: 1514	
	WorkbenchPage.deferUpdates(boolean) line: 1498	
	WorkbenchPage.closeEditors(IEditorReference[], boolean) line: 1472	
	WorkbenchPage.closeEditor(IEditorPart, boolean) line: 1536	
	Editor$14.run() line: 1724	
	UISynchronizer(Synchronizer).syncExec(Runnable) line: 180	
	UISynchronizer.syncExec(Runnable) line: 150	
	Display.syncExec(Runnable) line: 4607	
	Editor(AppEditorImpl).displaySyncExec(Runnable) line: 2100	
	Editor.closeMainNode(BaseMainNode, boolean) line: 1707	
	MemoryLeakTest.testSwtLeak() line: 36	
	NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]	
	NativeMethodAccessorImpl.invoke(Object, Object[]) line: 39	
	DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 25	
	Method.invoke(Object, Object...) line: 597	
	MemoryLeakTest(TestCase).runTest() line: 168	
	MemoryLeakTest(TransformTestBase).runTest() line: 1168	
	MemoryLeakTest(TestCase).runBare() line: 134	
	TestResult$1.protect() line: 110	
	TestResult.runProtected(Test, Protectable) line: 128	
	TestResult.run(TestCase) line: 113	
	MemoryLeakTest(TestCase).run(TestResult) line: 124	
	TestSuite.runTest(Test, TestResult) line: 243	
	TestSuite.run(TestResult) line: 238	
	SuiteMethod(JUnit38ClassRunner).run(RunNotifier) line: 83	
	JUnit4TestClassReference(JUnit4TestReference).run(TestExecution) line: 50	
	TestExecution.run(ITestReference[]) line: 38	
	RemotePluginTestRunner(RemoteTestRunner).runTests(String[], String, TestExecution) line: 467	
	RemotePluginTestRunner(RemoteTestRunner).runTests(TestExecution) line: 683	
	RemotePluginTestRunner(RemoteTestRunner).run() line: 390	
	RemotePluginTestRunner.main(String[]) line: 62	
	PlatformUITestHarness$1.run() line: 47	
	RunnableLock.run() line: 35	
	UISynchronizer(Synchronizer).runAsyncMessages(boolean) line: 135	
	Display.runAsyncMessages(boolean) line: 3946	
	Display.readAndDispatch() line: 3623	
	Workbench.runEventLoop(Window$IExceptionHandler, Display) line: 2701	
	Workbench.runUI() line: 2665	
	Workbench.access$4(Workbench) line: 2499	
	Workbench$7.run() line: 679	
	Realm.runWithDefault(Realm, Runnable) line: 332	
	Workbench.createAndRunWorkbench(Display, WorkbenchAdvisor) line: 668	
	PlatformUI.createAndRunWorkbench(Display, WorkbenchAdvisor) line: 149	
	IDEApplication.start(IApplicationContext) line: 124	
	UITestApplication(NonUIThreadTestApplication).runApp(Object, IApplicationContext, String[]) line: 54	
	UITestApplication.runApp(Object, IApplicationContext, String[]) line: 41	
	UITestApplication(NonUIThreadTestApplication).start(IApplicationContext) line: 48	
	EclipseAppHandle.run(Object) line: 196	
	EclipseAppLauncher.runApplication(Object) line: 110	
	EclipseAppLauncher.start(Object) line: 79	
	EclipseStarter.run(Object) line: 353	
	EclipseStarter.run(String[], Runnable) line: 180	
	NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]	
	NativeMethodAccessorImpl.invoke(Object, Object[]) line: 39	
	DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 25	
	Method.invoke(Object, Object...) line: 597	
	Main.invokeFramework(String[], URL[]) line: 629	
	Main.basicRun(String[]) line: 584	
	Main.run(String[]) line: 1438	
	Main.main(String[]) line: 1414
Comment 15 Francis Upton IV CLA 2013-05-24 18:07:29 EDT
Created attachment 231506 [details]
UI test program that shows the leak

Attached is a JUnit test case adapted from the Eclipse Platform UI test suite that demonstrates the problem:

1) Check out the platform.ui git repo and import the projects
2) Copy this file into the org.eclipse.ui.tests.api package in Eclipse UI Tests
3) Start the test with JProfiler (or the like), then start the Allocation instrument on the test

In your profiler you will see the number of SWT Composite objects stabilize at around 26, but when you look the Allocation instrument you will see the corresponding SWTCanvasView objects monotonically increasing for each editor created, showing the leak.
Comment 16 Francis Upton IV CLA 2013-05-25 04:01:25 EDT
So now I'm a bit stuck and need some help. My next thought is to put a breakpoint at retain() in SWTCanvasView. To do this, I would need to add the retain method in the Java and then generate the JNI, but in trying this, I can't find out where there is any JNI for this class. It does seem from the profiling that this class exists both in C and Java, but I can't figure out how it exists in C.

Can someone guide me a little?
Comment 17 Lakshmi P Shanmugam CLA 2013-06-04 07:52:33 EDT
(In reply to comment #16)
> So now I'm a bit stuck and need some help. My next thought is to put a
> breakpoint at retain() in SWTCanvasView. To do this, I would need to add the
> retain method in the Java and then generate the JNI, but in trying this, I
> can't find out where there is any JNI for this class. 

You can either copy or override the retain method from NSObject to SWTCanvasView.java. No need to generate any JNI. NSObject.retain() uses OS.objc_msgSend(this.id, OS.sel_retain) and the JNI for the method already exists.
Comment 18 Francis Upton IV CLA 2013-06-28 03:05:12 EDT
(In reply to comment #15)
> Created attachment 231506 [details]
> UI test program that shows the leak
> 
> Attached is a JUnit test case adapted from the Eclipse Platform UI test
> suite that demonstrates the problem:
> 
> 1) Check out the platform.ui git repo and import the projects
> 2) Copy this file into the org.eclipse.ui.tests.api package in Eclipse UI
> Tests
> 3) Start the test with JProfiler (or the like), then start the Allocation
> instrument on the test
> 
> In your profiler you will see the number of SWT Composite objects stabilize
> at around 26, but when you look the Allocation instrument you will see the
> corresponding SWTCanvasView objects monotonically increasing for each editor
> created, showing the leak.

Lakshmi, can you try this program to duplicate the problem? I'm still having it.
Comment 19 Lakshmi P Shanmugam CLA 2017-07-03 07:46:15 EDT
Bug triaged, visit https://wiki.eclipse.org/SWT/Devel/Triage for more
information.
Comment 20 Eclipse Genie CLA 2019-09-07 17:23:24 EDT
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet.

If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.