Community
Participate
Working Groups
I'm revisiting Bug 540762 that got a workaround for SWT 4.10. Here's more background about what the root cause of that bug was and why I think that it needs a proper fix. I'm sorry if this is all a bit verbose, sounds uppity or preachy - I'm trying to help and to capture the whole picture. From Bug 540762 java.lang.NullPointerException at org.eclipse.swt.widgets.Display.getMessageCount(Display.java:2010) at org.eclipse.swt.widgets.Display.foregroundIdleProc(Display.java:1314) at org.eclipse.swt.internal.win32.OS.VtblCall(Native Method) at org.eclipse.swt.widgets.Display.init(Display.java:2688) at org.eclipse.swt.graphics.Device.<init>(Device.java:141) at org.eclipse.swt.widgets.Display.<init>(Display.java:469) at org.eclipse.swt.widgets.Display.<init>(Display.java:460) at org.eclipse.ui.internal.Workbench.createDisplay(Workbench.java:788) ... I got that too. This started to happen after a Windows Insider Preview update. I can reproduce the issue with SWT 4.8, 4.9, and 4.10 up to Milestone 1, the workaround of Bug 540762 fixed it for me in 4.10 M3. This means that SWT 4.8 and 4.9 are going to remain broken on Windows Insider Previews and generally after the next major Windows update in spring 2019 because this is an SWT issue, not a Windows bug. So let's take a quick look at the code of Display.java and Device.java as seen in the stacktrace above. Device constructor: public Device(DeviceData data) { synchronized (Device.class) { ... other stuff ... init (); } } Device.init: /** * Initializes any internal resources needed by the * device. * <p> * This method is called after <code>create</code>. * </p><p> * If subclasses reimplement this method, they must * call the <code>super</code> implementation. * </p> * * @see #create */ protected void init () { ... } Display constructor (trivial): public Display (DeviceData data) { super (data); } Display.init(); protected void init () { super.init (); ... [1] /* Create the idle hook */ foregroundIdleCallback = new Callback (this, "foregroundIdleProc", 3); //$NON-NLS-1$ foregroundIdleProc = foregroundIdleCallback.getAddress (); if (foregroundIdleProc == 0) error (SWT.ERROR_NO_MORE_CALLBACKS); idleHook = OS.SetWindowsHookEx (OS.WH_FOREGROUNDIDLE, foregroundIdleProc, 0, threadId); ... [2] if (appName != null) { /* Delete any old jump list set for the ID */ long /*int*/ [] ppv = new long /*int*/ [1]; int hr = OS.CoCreateInstance (TaskBar.CLSID_DestinationList, 0, OS.CLSCTX_INPROC_SERVER, TaskBar.IID_ICustomDestinationList, ppv); if (hr == OS.S_OK) { /*ICustomDestinationList::DeleteList*/ OS.VtblCall (10, ppv [0], appName); /*IUnknown::Release*/ OS.VtblCall (2, ppv [0]); } } ... } The problem is that the init() method is called from the superclass constructor, which is kind of an anti-pattern. When a Display Object is created, the first thing that happens is that the Device constructor is called. Contrary to expectations this is done _before_ the non-static member variables of Display get initialized! The Device constructor calls the overloaded init method, so Display.init() is called on a mostly uninitialized Display object. Specifically the following non-static members of Display are still null/false at this point: MSG hookMsg = new MSG (); boolean runDragDrop = true, dragCancelled = false; ... Synchronizer synchronizer = new Synchronizer (this); Consumer<RuntimeException> runtimeExceptionHandler = DefaultExceptionHandler.RUNTIME_EXCEPTION_HANDLER; Consumer<Error> errorHandler = DefaultExceptionHandler.RUNTIME_ERROR_HANDLER; boolean runMessagesInIdle = false, runMessagesInMessageProc = true; ... long /*int*/ nextTimerId = SETTINGS_ID + 1; ... byte [] keyboard = new byte [256]; ... Cursor [] cursors = new Cursor [SWT.CURSOR_HAND + 1]; ... Widget [] skinList = new Widget [GROW_SIZE]; The reason for the exception above is, of course, that synchronizer is still null. In the Display.init() code shown above the idleCallback is registered first (c.f. [1]) and later an OS.VtblCall is made (c.f. [2]). What changed in the Windows Insider Preview update is simply that Windows now triggers an idle callback during the vtbl call where it did not do so before. This is perfectly legal and will likely stay that way. So why are SWT 4.7 and older not directly affected? Up to 4.7 the foregroundIdleProc method started with if (runMessages && getMessageCount () != 0) { ... where runMesages was another non-static member: boolean runMessages = true, ... So in the not-yet-initialized Display object runMessages was false until after the init() call when the Display constructor has time to finish. For some reason the runMessages member was removed in 4.8. It may have been an earlier workaround for the exact same problem, so removing it was a mistake, but the whole anti-pattern is so non-obvious that it must have seemed natural. I think that this is a time-bomb and that we are lucky things didn't get worse. That SWT versions 4.8 and 4.9 on Windows will probably remain broken (I don't know what the update policy is for this case) is bad enough. A similar trivial change might break Display initialization on any OS. Besides, users implementing derived classes will inadvertently make the same mistakes. To make a long story short - what should be done about it? Given that the classes are public and derived classes might exist in used code, the Device.init() pattern cannot be changed. I propose to document the issue with a short comment in the affected places of the code and move all non-static member initialization of the affected classes to the beginning of their init methods. I'll gladly assist but cannot lead the effort as I have no experience yet with the SWT development process. Besides, a code review for similar occurrences of this anti-pattern might not be a bad idea...
Would you be able to write a code snippet that reproduces the issue?
Code snippet: Yes and no. There is no easy way to crash this deliberately. To reproduce the referenced bug 540762 all you need to do is start Eclipse 4.8 on a current Windows insider build (my version is 18298) and it is very likely that you'll get the error described in 540762. I already provided the analysis for the cause of the error. If you want some code to run, edit the init() method of Display.java and add a print statement like: protected void init () { super.init (); System.err.println("synchronizer value in init: " + synchronizer); ... You will get null. You can do it for all other non-static members I already listed and get the same result. My point is that it is extremely risky to run the init() method with all its side-effects without initializing the members first. Bug 540762 is proof of that. Moving member initialization into the init method will prevent similar effects in the future. Here's what it could look like: Change Display member initialization from MSG hookMsg = new MSG (); boolean runDragDrop = true, dragCancelled = false; ... Synchronizer synchronizer = new Synchronizer (this); ...etc. to /* Note: Non-static member initialization must happen in init(). */ MSG hookMsg; boolean runDragDrop, dragCancelled; ... Synchronizer synchronizer; ...etc. and extend the init() method: protected void init () { super.init (); /* Non-static member initialization must happen here. */ hookMsg = new MSG (); runDragDrop = true; dragCancelled = false; ... synchronizer = new Synchronizer (this); ...etc. Does that help?
Good news: After the recent update to Windows Insider Preview 18309 I can no longer reproduce the issue with SWT 4.8 and 4.9. We're going to keep our eyes open and I'll add a note if the situation changes again. Independent of that I still think it would be a good idea to fix the issue for future SWT versions so that Display initialization is safe independent of event timing and other non-forseeable changes at OS level.
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're closing this bug. If you have further information on the current state of the bug, please add it and reopen this bug. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. -- The automated Eclipse Genie.