| Summary: | eclipse crashed when exiting with Help Context open | | |
|---|---|---|---|
| Product: | [Eclipse Project] Platform | Reporter: | Olaf Flebbe <of> |
| Component: | SWT | Assignee: | Grant Gayed <grant_gayed> |
| Status: | CLOSED DUPLICATE | QA Contact: | |
| Severity: | major | | |
| Priority: | P3 | CC: | ericwill, filip.hrbek, irbull, jacob_champlin, lasse.loevdahl, lufimtse, remy.suen, rsternberg, yahorr |
| Version: | 3.7 | | |
| Target Milestone: | --- | | |
| Hardware: | PC | | |
| OS: | Linux-GTK | | |
| Whiteboard: | | | |

Description
Olaf Flebbe
Created attachment 197600
backtrace

Likely a problem due to XULRunner 2.0.

Created attachment 197705
Next Crash

I found a way to reproduce the crash: Start Eclipse, Help/Help Contents, wait for Window to settle, Use File/Exit. If you close the help window first: No crash.

Olaf

I've tried the comment 4 steps on 64-bit Ubuntu 10 and 64-bit Fedora 15 but don't see the crash. Is it possible that there are more steps/context required to make the crash happen? Do the comment 4 steps crash for you if they're the first thing you do in a new workspace?

I double checked the steps and installed a fresh Fedora 15:

Installed a fresh Fedora 15 into a vmware, logged in as simple user
Unpacked sun jdk 1.6u24 64 Bit
unpacked eclipse-SDK-3.7RC3-linux-gtk-x86_64.tar.gz
Started eclipse with
    > PATH=/home/olaf/jdk1.6.0.24/bin:$PATH eclipse/eclipse
Help/Help Contents
switch back to Main Menu/File Exit -> CRASH.

------------
    #
    # A fatal error has been detected by the Java Runtime Environment:
    #
    # SIGSEGV (0xb) at pc=0x00007fb290130e1b, pid=2035, tid=140405641705216
    #
    # JRE version: 6.0_24-b07
    # Java VM: Java HotSpot(TM) 64-Bit Server VM (19.1-b02 mixed mode linux-amd64 compressed oops)
    # Problematic frame:
    # C [libwebkitgtk-1.0.so.0+0xb29e1b]
    #
    # An error report file with more information is saved as:
    # /home/olaf/hs_err_pid2035.log
    #
    # If you would like to submit a bug report, please visit:
    # http://java.sun.com/webapps/bugreport/crash.jsp
    #
----
See Attachment for Fedora Crashreport

Created attachment 197785
Fedora Crash Report

(In reply to comment #6)
> # C [libwebkitgtk-1.0.so.0+0xb29e1b]

What versions of WebKit GTK+ do you have installed?

> What versions of WebKit GTK+ do you have installed?

My Fedora 15 comes with:
[olaf@localhost ~]$ rpm -qf /usr/lib64/libwebkitgtk-1.0.so.0.7.0
webkitgtk-1.4.0-1.fc15.x86_64

(In reply to comment #8)
> (In reply to comment #6)
> > # C [libwebkitgtk-1.0.so.0+0xb29e1b]
>
> What versions of WebKit GTK+ do you have installed?

My SuSE 11.4 comes with libwebkitgtk-1_0-0-1.3.10-5.1.x86_64

Olaf

(In reply to comment #9)
> > What versions of WebKit GTK+ do you have installed?
>
> My Fedora 15 comes with:
>
> [olaf@localhost ~]$ rpm -qf /usr/lib64/libwebkitgtk-1.0.so.0.7.0
> webkitgtk-1.4.0-1.fc15.x86_64

Did a yum update, libwebkitgtk stays the same, eclipse still crashing.

Hi, Please tell me what I can do to fix this very annoying bug. AFAIK I provided all the info you requested. Can you reproduce it with the info I supplied?

I was away for a few days and can work a bit on this issue. I will try to install debuginfos in order to get further insight into the stacktrace. Or do you already have a clue what's going on?

Olaf

With the help of the debuginfo for libwebkitgtk I found what is IMHO the relevant data in the backtrace:

There seems to be a Problem with destructing the Timers in webkit. It may be related to this statement in webkit (Timer.cpp 53)
// Simple accessors to thread-specific data.
51 static Vector<TimerBase*>& timerHeap()
52 {
53 return threadGlobalData().threadTimers().timerHeap();
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This may break in destruction (atexit) phase of webkit.
If any of these statements returns NULL, it will result in this stacktrace.
54 }
55
56 // Class to represent elements in the heap when calling the standard library heap algorithms.
57 // Maintains the m_heapIndex value in the timers themselves, which allows us to do efficient
58 // modification of the heap.
59 class TimerHeapElement {
60 public:
61 explicit TimerHeapElement(int i)
62 : m_index(i)
63 , m_timer(timerHeap()[m_index])
This is the traceback line...
64 {
65 checkConsistency();
66 }
See for instance:
http://trac.webkit.org/browser/releases/WebKitGTK/webkit-1.3.3/WebCore/platform/Timer.cpp
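
To make the thread dependence of the accessor on line 53 concrete, here is a minimal sketch. It is not WebKit code: a `thread_local` vector stands in for `threadGlobalData().threadTimers().timerHeap()`, and every name in it (`Timer`, `enqueue`, `indexIsValidHere`, `indexIsValidOnOtherThread`) is hypothetical. It only shows that a heap index stored in a timer is meaningful on the thread that queued it and nowhere else.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for WebCore::TimerBase; not the real class.
struct Timer {
    int heapIndex = -1;  // position in the owning thread's heap, -1 if not queued
};

// Stand-in for threadGlobalData().threadTimers().timerHeap():
// each thread that calls this gets its own, initially empty, vector.
std::vector<Timer*>& timerHeap() {
    thread_local std::vector<Timer*> heap;
    return heap;
}

// Queue a timer on the calling thread's heap and remember its index.
void enqueue(Timer& t) {
    assert(t.heapIndex == -1 && "timer is already queued");
    timerHeap().push_back(&t);
    t.heapIndex = static_cast<int>(timerHeap().size()) - 1;
}

// True if the timer's stored index is valid in the calling thread's heap.
bool indexIsValidHere(const Timer& t) {
    const std::vector<Timer*>& heap = timerHeap();
    if (t.heapIndex < 0) return false;
    const std::size_t i = static_cast<std::size_t>(t.heapIndex);
    return i < heap.size() && heap[i] == &t;
}

// Run the same check on a freshly started thread and return its result.
bool indexIsValidOnOtherThread(const Timer& t) {
    bool result = true;
    std::thread worker([&] { result = indexIsValidHere(t); });
    worker.join();  // join() makes the write to result visible here
    return result;
}
```

The sketch performs a bounds check before indexing. The excerpt above shows no such check on line 63, where `timerHeap()[m_index]` is evaluated directly, which fits a crash at that line when the heap belongs to a different or already destroyed thread context.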
#7 0x00007fec64155637 in os::abort(bool) () from /usr/lib64/jvm/java-1.6.0-sun-1.6.0/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#8 0x00007fec642a8cf8 in VMError::report_and_die() () from /usr/lib64/jvm/java-1.6.0-sun-1.6.0/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#9 0x00007fec642a9871 in crash_handler(int, siginfo*, void*) () from /usr/lib64/jvm/java-1.6.0-sun-1.6.0/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#10 <signal handler called>
No symbol table info available.
#11 0x00007fec64152c06 in os::is_first_C_frame(frame*) () from /usr/lib64/jvm/java-1.6.0-sun-1.6.0/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#12 0x00007fec642a7a91 in VMError::report(outputStream*) () from /usr/lib64/jvm/java-1.6.0-sun-1.6.0/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#13 0x00007fec642a8bd5 in VMError::report_and_die() () from /usr/lib64/jvm/java-1.6.0-sun-1.6.0/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#14 0x00007fec6415bfe5 in JVM_handle_linux_signal () from /usr/lib64/jvm/java-1.6.0-sun-1.6.0/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#15 0x00007fec6415830e in signalHandler(int, siginfo*, void*) () from /usr/lib64/jvm/java-1.6.0-sun-1.6.0/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#16 <signal handler called>
No symbol table info available.
#17 TimerHeapElement (this=<value optimized out>) at Source/WebCore/platform/Timer.cpp:63
No locals.
#18 operator* (this=<value optimized out>) at Source/WebCore/platform/Timer.cpp:136
No locals.
#19 push_heap<WebCore::TimerHeapIterator> (this=<value optimized out>) at /usr/include/c++/4.5/bits/stl_heap.h:168
No locals.
#20 WebCore::TimerBase::heapDecreaseKey (this=<value optimized out>) at Source/WebCore/platform/Timer.cpp:228
No locals.
#21 0x00007fec514d73f9 in heapPop (this=0x7fec522ea0a0, newTime=<value optimized out>) at Source/WebCore/platform/Timer.cpp:268
fireTime = <value optimized out>
#22 heapDelete (this=0x7fec522ea0a0, newTime=<value optimized out>) at Source/WebCore/platform/Timer.cpp:235
No locals.
#23 WebCore::TimerBase::setNextFireTime (this=0x7fec522ea0a0, newTime=<value optimized out>) at Source/WebCore/platform/Timer.cpp:298
currentHeapInsertionOrder = 789
wasFirstTimerInHeap = true
isFirstTimerInHeap = <value optimized out>
oldTime = <value optimized out>
#24 0x00007fec6468a5a1 in __run_exit_handlers () from /lib64/libc.so.6
No symbol table info available.
#25 0x00007fec6468a5f5 in exit () from /lib64/libc.so.6
No symbol table info available.
#26 0x00007fec63f27819 in vm_direct_exit(int) () from /usr/lib64/jvm/java-1.6.0-sun-1.6.0/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#27 0x00007fec642b854f in VM_Exit::doit() () from /usr/lib64/jvm/java-1.6.0-sun-1.6.0/jre/lib/amd64/server/libjvm.so
I'm still not able to reproduce this. I'm on 64-bit Fedora 15, did a yum update, got the Sun 1.6u24 jre, and launched with a command line similar to the one in comment 6. The only difference is that I'm using the Eclipse 3.7RC4 release, but this should not matter since no SWT code changed between 3.7RC3 and 3.7RC4.

Downgrading report to major since this does not currently seem to affect multiple users. I'll continue to keep an eye out for this, and if others are seeing this then hopefully they can chime in with additional hints to make it happen (environment settings, etc.).

(In reply to comment #14)
> I'm still not able to reproduce this. I'm on 64-bit Fedora 15, did a yum
> update, got the Sun 1.6u24 jre, and launched with a command line similar to the
> one in comment 6. The only difference is that I'm using the Eclipse 3.7RC4
> release, but this should not matter since no SWT code changed between 3.7RC3
> and 3.7RC4.
>
> Downgrading report to major since this does not currently seem to affect
> multiple users. I'll continue to keep an eye out for this, and if others are
> seeing this then hopefully they can chime in with additional hints to make it
> happen (environment settings, etc.).

Very strange. I am running an unmodified, freshly installed Fedora 15 64 Bit, updated it (it comes with an updated libwebkitgtk 1.4.1), and tried eclipse classic 64 Bit from the official indigo release with openjdk, jre, sun jdk. All crash: Start eclipse, Help/Help Contents. Switch Back to Main Window. File/exit -> Crash.

    # A fatal error has been detected by the Java Runtime Environment:
    #
    # SIGSEGV (0xb) at pc=0x00007f8b5e53fdeb, pid=2769, tid=140237433657088
    #
    # JRE version: 6.0_26-b03
    # Java VM: Java HotSpot(TM) 64-Bit Server VM (20.1-b02 mixed mode linux-amd64 compressed oops)
    # Problematic frame:
    # C [libwebkitgtk-1.0.so.0+0xb2adeb] _NPN_ReleaseVariantValue+0x4d542b
    #
    # An error report file with more information is saved as:
    # /home/olaf/hs_err_pid2769.log
    #
    # If you would like to submit a bug report, please visit:
    # http://java.sun.com/webapps/bugreport/crash.jsp

Crashes even on opensuse 11.4.

Olaf

I can now confirm that I see a genuine SWT problem. I used the BrowserExample from SWT to generate browser.jar, and copied the swt implementation from Indigo. In order to reproduce a crash, I started a browser with

    java -cp org.eclipse.swt.gtk.linux.x86_64_3.7.0.v3735b.jar:browser.jar BrowserExample

and closed the window. -> crash.

Seems to be a little bit runtime dependent. Some colleagues of mine need two or three attempts before the browser crashes on suse 11.4. Can still reproduce the problem on FC15. Unpack BrowserExample.tgz, copy SWT to reproduce yourself.

---------------
    # A fatal error has been detected by the Java Runtime Environment:
    #
    # SIGSEGV (0xb) at pc=0x00007f8459ae018b, pid=28924, tid=140206724519680
    #
    # JRE version: 6.0_26-b03
    # Java VM: Java HotSpot(TM) 64-Bit Server VM (20.1-b02 mixed mode linux-amd64 compressed oops)
    # Problematic frame:
    # C [libwebkitgtk-1.0.so.0+0x9df18b] _NPN_ReleaseVariantValue+0x3f0a0b
    #
    # If you would like to submit a bug report, please visit:
    # http://java.sun.com/webapps/bugreport/crash.jsp

Created attachment 198521
Demo howto crash swt with libwebkitgtk

I debugged it further. It seems to be a bug in the way libwebkitgtk is used by SWT. In the demo case ThreadGlobalData::destroy() is called first, destroying the ThreadTimers without removing the Timers first. Later in the shutdown code the Timer of the main thread is stopped, which then references the invalid ThreadGlobalData. Why is this only the case for SWT? The GTK Demo seems to work fine. Is this the random order of static object destruction in C++?

IMHO I have understood the problem a bit and have some kind of workaround for libwebkitgtk. But the real problem is located in the SWT binding.

WebKit Timers are intermixed with Threads. It is not possible to stop() a Timer in a different thread than the one it was created in. Timers are thread specific by design. Thus Timers own an internal data structure referenced through TLS (thread-local storage), the ThreadSpecific<ThreadGlobalData> class, which organizes all the Timers of the current thread in order to reduce the OS overhead of timers. The code for Timers is in Source/WebCore/platform/Timer.cpp.

There is a statically allocated Timer used by the Cairo backend: PurgeScratchBufferTimer in Source/WebCore/platform/graphics/cairo/ContextShadowCairo.cpp. This PurgeScratchBufferTimer is constructed at the time libwebkitgtk is loaded into SWT and may be used later on.

What happens if an SWT example calls exit()? The libc runtime runs all the static destructors, including those of objects located in the shared libraries. And it runs these destructors within the thread calling exit()! But this exiting thread may not be the thread which constructed the C++ objects. At least for Java-SWT.

The "REAL PATCH" would be to ensure libwebkit objects are destructed in the same thread as they are constructed. AFAIK this is very difficult to ensure in Java.

The fake patch is to short-circuit Timer destruction for the case currentThread() != construction time thread. Unfortunately some of the elegance of the Timer class in WebKit breaks down (the need of the m_thread member).

Why did you not see the problem? You may not have used the cairo backend or shadows, or your machine is a lot slower (or faster) than mine, because the PurgeScratchBufferTimer timeout is 2 ms. If your PurgeScratchBufferTimer fires "faster" you will not see the problem.

BTW: I found the core reason for the Java crashes by building webkit without NDEBUG (i.e. asserts are active).

Do you see any chance to unload the SWT/JNI libwebkit DLLs in a controlled way (like http://codethesis.com/tutorial.php?id=1) in the same thread in which they were loaded?

With the attached patch (the same for upstream WebKit BTW) I do not see any crashes any more.

Created attachment 198664
patch relative to webkit 1.3 (same for current webkit)
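
The idea behind the workaround described in the comment above can be sketched as follows. This is not the attached patch. It only assumes, from the comment, that a timer records its construction-time thread in a member named `m_thread` and skips the heap manipulation when it is stopped from any other thread. The class `GuardedTimer` and the helper `makeTimerOnWorkerThread` are hypothetical names used for illustration.

```cpp
#include <cassert>
#include <memory>
#include <thread>

// Hypothetical timer that remembers which thread constructed it.
class GuardedTimer {
public:
    GuardedTimer() : m_thread(std::this_thread::get_id()) {}

    void start() { m_active = true; }

    // Returns true if the timer was really stopped, false if the call was
    // short-circuited because it came from a thread other than the owner.
    bool stop() {
        if (std::this_thread::get_id() != m_thread) {
            // The caller's thread-local timer data does not contain this
            // timer, so the sketch leaves everything untouched.
            return false;
        }
        m_active = false;
        return true;
    }

    bool isActive() const { return m_active; }

private:
    std::thread::id m_thread;  // construction-time thread
    bool m_active = false;
};

// Construct and start a timer on a worker thread, then hand it back,
// mimicking an object that outlives the thread which created it.
std::unique_ptr<GuardedTimer> makeTimerOnWorkerThread() {
    std::unique_ptr<GuardedTimer> timer;
    std::thread worker([&] {
        timer = std::make_unique<GuardedTimer>();
        timer->start();
    });
    worker.join();  // join() makes the worker's writes visible here
    assert(timer && "worker thread must have created the timer");
    return timer;
}
```

As the comments in this thread note, a guard of this kind only avoids touching another thread's timer data during shutdown. It does not make destruction happen on the constructing thread, so it is a workaround rather than a fix.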

Is there any progress? I keep getting the same error on openSUSE 11.4 with Eclipse Indigo. Sometimes it happens even after saving a java file in my workspace. It's very annoying to restart Eclipse every 10 minutes or so.

Created attachment 201875
Fixed patch (which also includes necessary Timer.h changes)
Eclipse appeared to be completely unusable in my environment. It was crashing too often after many operations, such as auto-completion or source code generation.
I decided to apply the suggested patch to webkit (which is still just a workaround, not the final solution). However, the first attempt failed because of the missing class member "m_thread". I am including another patch, which also contains the necessary change to the header file.
Finally I can work with Eclipse in my new environment!
For those who plan to use the same workaround - building webkit is a resource and time consuming task - it took several hours to build it on my machine, but finally it succeeded.

Unfortunately, although the workaround made Eclipse crash less likely, it did not solve the problem. Now it crashes with a slightly different report, but still in libwebkitgtk:

    # Problematic frame:
    # C [libwebkitgtk-1.0.so.0+0x1144185] WTF::OSAllocator::reserveAndCommit(unsigned long, WTF::OSAllocator::Usage, bool, bool)+0x45

This is a major issue for me and my team. I am concerned that it's gone down the path of this being a webkit problem. However, Eclipse 3.6.2 does not exhibit this issue running on the exact same system.

    > cat /etc/SuSE-release
    openSUSE 11.4 (x86_64)
    VERSION = 11.4
    CODENAME = Celadon
    > rpm -qa | grep libwebkit
    libwebkitgtk-1_0-0-32bit-1.3.10-5.1.x86_64
    libwebkitgtk-devel-1.3.10-5.1.x86_64
    libwebkitgtk-1_0-0-1.3.10-5.1.x86_64

re: comment 24

Eclipse 3.6.2 uses a mozilla-based renderer by default, not WebKitGTK, which is why you see the difference. If you want to force Eclipse 3.7.x to use the mozilla-based renderer as well, there was a settable property introduced in the Eclipse 3.7.1 stream, see https://bugs.eclipse.org/bugs/show_bug.cgi?id=349837#c12 onwards. Eclipse 3.7.1 will be released some time in September ( http://www.eclipse.org/eclipse/development/plans/freeze_plan_3_7_1.php ).

Can confirm that I get this issue too.

    >dpkg-query -W libwebkitgtk*
    libwebkitgtk-1.0-0 1.4.3-0ubuntu2
    libwebkitgtk-1.0-common 1.4.3-0ubuntu2
    libwebkitgtk-3.0-0 1.4.3-0ubuntu2
    libwebkitgtk-3.0-common 1.4.3-0ubuntu2
    >uname -smv
    Linux #18-Ubuntu SMP Tue Sep 13 23:38:01 UTC 2011 x86_64

    #
    # A fatal error has been detected by the Java Runtime Environment:
    #
    # SIGSEGV (0xb) at pc=0x00007fe9a6ba0450, pid=12006, tid=140642958526208
    #
    # JRE version: 6.0_22-b04
    # Java VM: Java HotSpot(TM) 64-Bit Server VM (17.1-b03 mixed mode linux-amd64 )
    # Problematic frame:
    # C [libwebkitgtk-1.0.so.0+0x4c2450]
    #
    # If you would like to submit a bug report, please visit:
    # http://java.sun.com/webapps/bugreport/crash.jsp
    # The crash happened outside the Java Virtual Machine in native code.
    # See problematic frame for where to report the bug.
    #

Downgraded to Helios and the issue seems to have gone.

I get this error with the 'WebKit in a browser' snippet [1]
It fails almost always if I close the window while the web page is being loaded.

    >dpkg-query -W libwebkitgtk*
    libwebkitgtk-1.0-0 1.4.3-0ubuntu4
    libwebkitgtk-1.0-common 1.4.3-0ubuntu4
    libwebkitgtk-3.0-0 1.4.3-0ubuntu4
    libwebkitgtk-3.0-common 1.4.3-0ubuntu4
    >uname -smv
    Linux #29-Ubuntu SMP Tue Feb 14 12:48:51 UTC 2012 x86_64

    #
    # A fatal error has been detected by the Java Runtime Environment:
    #
    # SIGSEGV (0xb) at pc=0x00007f79d3ba430b, pid=5059, tid=140161563236128
    #
    # JRE version: 6.0_23-b23
    # Java VM: OpenJDK 64-Bit Server VM (20.0-b11 mixed mode linux-amd64 compressed oops)
    # Derivative: IcedTea6 1.11pre
    # Distribution: Ubuntu 11.10, package 6b23~pre11-0ubuntu1.11.10.2
    # Problematic frame:
    # C [libwebkitgtk-1.0.so.0+0xa4430b] _NPN_ReleaseVariantValue+0x42b5eb
    #
    # An error report file with more information is saved as:
    # /home/test/jboss/svn/browsersim/org.jboss.tools.vpe.browsersim/hs_err_pid5059.log
    #
    # If you would like to submit a bug report, please include
    # instructions how to reproduce the bug and visit:
    # https://bugs.launchpad.net/ubuntu/+source/openjdk-6/
    #

1. http://git.eclipse.org/c/platform/eclipse.platform.swt.git/tree/examples/org.eclipse.swt.snippets/src/org/eclipse/swt/snippets/Snippet351.java

(In reply to Yahor Radtsevich from comment #27)
> I get this error with the 'WebKit in a browser' snippet [1]
> It fails almost always if I close the window while the web page is being
> loaded.

I can't reproduce this bug, or the original bug from comment 0. Due to the age of this bug I'm going to close it. Please feel free to re-file a new bug with OS + GTK info and a snippet reproducer.

Adding Leo to CC since he's been working on webkit.

*** This bug has been marked as a duplicate of bug 509658 ***