Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.
Bug 542923 - Initialization of Display and other classes derived from Device is broken
Summary: Initialization of Display and other classes derived from Device is broken
Status: CLOSED WONTFIX
Alias: None
Product: Platform
Classification: Eclipse Project
Component: SWT (show other bugs)
Version: 4.11   Edit
Hardware: PC All
: P3 major with 2 votes (vote)
Target Milestone: ---   Edit
Assignee: Platform-SWT-Inbox CLA
QA Contact:
URL:
Whiteboard: stalebug
Keywords:
Depends on:
Blocks:
 
Reported: 2018-12-19 11:14 EST by Gregor Schmid CLA
Modified: 2020-12-28 02:40 EST (History)
1 user (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Gregor Schmid CLA 2018-12-19 11:14:55 EST
I'm revisiting Bug 540762 that got a workaround for SWT 4.10.

Here's more background about what the root cause of that bug was and
why I think that it needs a proper fix. I'm sorry if this is all a bit
verbose, sounds uppity or preachy - I'm trying to help and to capture
the whole picture.

From Bug 540762

java.lang.NullPointerException
	at org.eclipse.swt.widgets.Display.getMessageCount(Display.java:2010)
	at org.eclipse.swt.widgets.Display.foregroundIdleProc(Display.java:1314)
	at org.eclipse.swt.internal.win32.OS.VtblCall(Native Method)
	at org.eclipse.swt.widgets.Display.init(Display.java:2688)
	at org.eclipse.swt.graphics.Device.<init>(Device.java:141)
	at org.eclipse.swt.widgets.Display.<init>(Display.java:469)
	at org.eclipse.swt.widgets.Display.<init>(Display.java:460)
	at org.eclipse.ui.internal.Workbench.createDisplay(Workbench.java:788)
        ...

I got that too. This started to happen after a Windows Insider Preview
update. I can reproduce the issue with SWT 4.8, 4.9, and 4.10 up to
Milestone 1, the workaround of Bug 540762 fixed it for me in 4.10 M3.

This means that SWT 4.8 and 4.9 are going to remain broken on Windows
Insider Previews and generally after the next major Windows update in
spring 2019 because this is an SWT issue, not a Windows bug.

So let's take a quick look at the code of Display.java and Device.java
as seen in the stacktrace above.

Device constructor:

public Device(DeviceData data) {
	synchronized (Device.class) {
	        ... other stuff ...
		init ();
	}
}

Device.init:

/**
 * Initializes any internal resources needed by the
 * device.
 * <p>
 * This method is called after <code>create</code>.
 * </p><p>
 * If subclasses reimplement this method, they must
 * call the <code>super</code> implementation.
 * </p>
 *
 * @see #create
 */
protected void init () {
	  ...
}

Display constructor (trivial):

public Display (DeviceData data) {
	super (data);
}


Display.init();

protected void init () {
	super.init ();

	... 

	[1]
	/* Create the idle hook */
	foregroundIdleCallback = new Callback (this, "foregroundIdleProc", 3); //$NON-NLS-1$
	foregroundIdleProc = foregroundIdleCallback.getAddress ();
	if (foregroundIdleProc == 0) error (SWT.ERROR_NO_MORE_CALLBACKS);
	idleHook = OS.SetWindowsHookEx (OS.WH_FOREGROUNDIDLE, foregroundIdleProc, 0, threadId);

        ... 

	[2]
	if (appName != null) {
		/* Delete any old jump list set for the ID */
		long /*int*/ [] ppv = new long /*int*/ [1];
		int hr = OS.CoCreateInstance (TaskBar.CLSID_DestinationList, 0, OS.CLSCTX_INPROC_SERVER, TaskBar.IID_ICustomDestinationList, ppv);
		if (hr == OS.S_OK) {
			/*ICustomDestinationList::DeleteList*/
			OS.VtblCall (10, ppv [0], appName);
			/*IUnknown::Release*/
			OS.VtblCall (2, ppv [0]);
		}
	}
       
        ...
}


The problem is that the init() method is called from the superclass
constructor, which is kind of an anti-pattern. When a Display Object
is created, the first thing that happens is that the Device
constructor is called. Contrary to expectations this is done _before_
the non-static member variables of Display get initialized! The Device
constructor calls the overloaded init method, so Display.init() is
called on a mostly uninitialized Display object. Specifically the
following non-static members of Display are still null/false at this
point:

	MSG hookMsg = new MSG ();
	boolean runDragDrop = true, dragCancelled = false;
	...
	Synchronizer synchronizer = new Synchronizer (this);
	Consumer<RuntimeException> runtimeExceptionHandler = DefaultExceptionHandler.RUNTIME_EXCEPTION_HANDLER;
	Consumer<Error> errorHandler = DefaultExceptionHandler.RUNTIME_ERROR_HANDLER;
	boolean runMessagesInIdle = false, runMessagesInMessageProc = true;
	...
	long /*int*/ nextTimerId = SETTINGS_ID + 1;
	...
	byte [] keyboard = new byte [256];
	...
	Cursor [] cursors = new Cursor [SWT.CURSOR_HAND + 1];
	...
	Widget [] skinList = new Widget [GROW_SIZE];

The reason for the exception above is, of course, that synchronizer is
still null. In the Display.init() code shown above the idleCallback is
registered first (c.f. [1]) and later an OS.VtblCall is made (c.f.
[2]). What changed in the Windows Insider Preview update is simply
that Windows now triggers an idle callback during the vtbl call where
it did not do so before. This is perfectly legal and will likely stay
that way.

So why are SWT 4.7 and older not directly affected?

Up to 4.7 the foregroundIdleProc method started with

		if (runMessages && getMessageCount () != 0) {
		   ...

where runMesages was another non-static member:

 	boolean runMessages = true, ...

So in the not-yet-initialized Display object runMessages was false
until after the init() call when the Display constructor has time to
finish. For some reason the runMessages member was removed in 4.8. It
may have been an earlier workaround for the exact same problem, so
removing it was a mistake, but the whole anti-pattern is so
non-obvious that it must have seemed natural.

I think that this is a time-bomb and that we are lucky things didn't
get worse. That SWT versions 4.8 and 4.9 on Windows will probably
remain broken (I don't know what the update policy is for this case)
is bad enough. A similar trivial change might break Display
initialization on any OS. Besides, users implementing derived classes
will inadvertently make the same mistakes.

To make a long story short - what should be done about it?

Given that the classes are public and derived classes might exist in
used code, the Device.init() pattern cannot be changed.

I propose to document the issue with a short comment in the affected
places of the code and move all non-static member initialization of
the affected classes to the beginning of their init methods.

I'll gladly assist but cannot lead the effort as I have no experience
yet with the SWT development process.

Besides, a code review for similar occurrences of this anti-pattern
might not be a bad idea...
Comment 1 Mickael Istria CLA 2018-12-20 04:53:30 EST
Would you be able to write a code snippet that reproduces the issue?
Comment 2 Gregor Schmid CLA 2018-12-20 06:16:08 EST
Code snippet: Yes and no. There is no easy way to crash this deliberately. To reproduce the referenced bug 540762 all you need to do is start Eclipse 4.8 on a current Windows insider build (my version is 18298) and it is very likely that you'll get the error described in 540762.

I already provided the analysis for the cause of the error. If you want some code to run, edit the init() method of Display.java and add a print statement like:

protected void init () {
	super.init ();
	System.err.println("synchronizer value in init: " + synchronizer);
	...

You will get null. You can do it for all other non-static members I already listed and get the same result. My point is that it is extremely risky to run the init() method with all its side-effects without initializing the members first. Bug 540762 is proof of that. Moving member initialization into the init method will prevent similar effects in the future.

Here's what it could look like:

Change Display member initialization from


	MSG hookMsg = new MSG ();
	boolean runDragDrop = true, dragCancelled = false;
	...
	Synchronizer synchronizer = new Synchronizer (this);
	...etc.


to

	/* Note: Non-static member initialization must happen in init(). */

	MSG hookMsg;
	boolean runDragDrop, dragCancelled;
	...
	Synchronizer synchronizer;
	...etc.

and extend the init() method:

protected void init () {
	super.init ();

	/* Non-static member initialization must happen here. */
	hookMsg = new MSG ();
	runDragDrop = true;
	dragCancelled = false;
	...
	synchronizer = new Synchronizer (this);
	...etc.


Does that help?
Comment 3 Gregor Schmid CLA 2019-01-07 09:40:01 EST
Good news: After the recent update to Windows Insider Preview 18309 I can no longer reproduce the issue with SWT 4.8 and 4.9. We're going to keep our eyes open and I'll add a note if the situation changes again.

Independent of that I still think it would be a good idea to fix the issue for future SWT versions so that Display initialization is safe independent of event timing and other non-forseeable changes at OS level.
Comment 4 Eclipse Genie CLA 2020-12-28 02:40:10 EST
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're closing this bug.

If you have further information on the current state of the bug, please add it and reopen this bug. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.