Bug 248210 - [Net] Consistent crashes from UnixProxyProvider
Summary: [Net] Consistent crashes from UnixProxyProvider
Status: VERIFIED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Team
Version: 3.4
Hardware: PC Windows XP
Importance: P3 major
Target Milestone: 3.5 M3
Assignee: Pawel Pogorzelski CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Duplicates: 248166
Depends on:
Blocks: 248166
Reported: 2008-09-22 18:22 EDT by David Williams CLA
Modified: 2009-06-02 07:17 EDT
CC List: 3 users

See Also:


Attachments
Console screenshot (176.41 KB, image/jpeg)
2008-09-25 08:35 EDT, Pawel Pogorzelski CLA
no flags
Patch_v01 (1.73 KB, patch)
2008-10-17 10:26 EDT, Pawel Pogorzelski CLA
pawel.pogorzelski1: iplog+
Rebuilt proxygnome (84.20 KB, application/octet-stream)
2008-10-17 10:36 EDT, Pawel Pogorzelski CLA
no flags

Description David Williams CLA 2008-09-22 18:22:24 EDT
At least, I think that's what the crashes are due to. 

Could this be related to bug 245850? 

As documented in bug 248166, we in WTP started seeing these crashes with the I20080909-1121 build ... and now see them with the M2 build too.

Only for Linux, i386, gtk.
Comment 1 Pawel Pogorzelski CLA 2008-09-23 05:29:09 EDT
Dave,
the crash is connected to bug 232495. Please have a look at the discussion held there, since it explains possible causes of the crash.

One more thing, Dave: is it possible to access the machine directly? I don't know whether the crash is related to the JVM/OS configuration or to failures on the native code side. Direct access would speed up fixing and testing the solution.
Comment 2 David Williams CLA 2008-09-23 10:32:47 EDT
(In reply to comment #1)
> Dave,
> the crash is connected to bug 232495. Please have a look at the discussion hold
> there since it explains possible causes of the crash.
> 
> One more thing Dave, is there a possibility to directly access the machine? I
> don't know whether the crash is related to JVM/OS configuration or failures on
> the native code side. This would speed up fixing and testing the solution.
> 

I've read through bug 232495 and don't really understand what I can/should do, from that bug. 

But, yes, since we luckily work for the same company, I can give you temporary SSH access to that machine. I'll send the info in a separate note.
Comment 3 Pawel Pogorzelski CLA 2008-09-25 08:27:28 EDT
I managed to narrow down the problem. It does not seem to be related to bug 232495, since that bug concerned a problem with linking the library that provides native Gnome proxy support.

In the case of this bug, the core dump is caused by errors on the native code side. The call System.loadLibrary("gnomeproxy-1.0.0") is successful, but the subsequent native method invocation, that is, gconfInit(), fails.

CC'ing Francis since he provided the native C code for Gnome support.
Comment 4 Pawel Pogorzelski CLA 2008-09-25 08:35:36 EDT
Created attachment 113445 [details]
Console screenshot

This is a screenshot illustrating what the crash looks like.

The class UnixProxyProvider executed there is a standalone class whose only purpose is to load libgnomeproxy-1.0.0.lib and execute gconfInit().

As I mentioned in the previous comment, the library gets loaded but the invocation of gconfInit() fails.
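
A minimal sketch of what such a standalone check might look like, for readers following along. This is not the actual UnixProxyProvider source: the class name GnomeProxyLoadCheck and the helper tryLoad are hypothetical, and the no-argument signature of gconfInit() is an assumption. It only separates the two steps described above: loading the library, then making the first native call.

```java
class GnomeProxyLoadCheck {

    // Native entry point named in this bug; its exact signature is an assumption.
    static native void gconfInit();

    /** Returns true if the named library could be loaded, false otherwise. */
    static boolean tryLoad(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            System.err.println("Could not load " + libraryName + ": " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // Step 1: load the library. A failure here is a linking problem.
        if (!tryLoad("gnomeproxy-1.0.0")) {
            return;
        }
        System.out.println("Library loaded, calling gconfInit()");
        try {
            // Step 2: first native call. A crash inside the native code
            // terminates the JVM and cannot be caught as a Java exception.
            gconfInit();
            System.out.println("gconfInit() returned");
        } catch (UnsatisfiedLinkError e) {
            // Thrown when the library has no symbol bound to this class.
            System.err.println("gconfInit() is not bound: " + e.getMessage());
        }
    }
}
```

JNI binds native methods by fully qualified class name, so a real reproduction would have to use the class and package the library was built for. With the hypothetical class above, a successfully loaded library would most likely end in the UnsatisfiedLinkError branch rather than in the crash.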
Comment 5 Francis Upton IV CLA 2008-09-25 13:25:21 EDT
(In reply to comment #4)
> Created an attachment (id=113445) [details]
> Console screenshot
> 
> This is a screenshot illustrating how the crash looks like.
> 
> The class UnixProxyProvider executed there is a stand alone class with only
> purpose to load the libgnomeproxy-1.0.0.lib and execute gconfInit().
> 
> As I mentioned in the previous comment library gets loaded but invocation of
> gconfInit() fails.
> 
It's been a long time since I looked at that code.  I think there is pretty extensive tracing that you can turn on which should allow you to see where it's actually dying in the gconfInit() routine.  And of course you can add tracing as necessary.  That's the way I would approach it.
Comment 6 David Williams CLA 2008-09-28 23:50:41 EDT
I'd like to change this severity to 'blocker' since it is blocking a set of our JUnit tests from working on one of our test build machines. I'm sure that as more Linux users start picking up M2 stacks, some subset of them will be blocked as well.

Is there a workaround?

Why does this occur on some machines and not others? 

Thanks, 
Comment 7 Pawel Pogorzelski CLA 2008-10-02 05:30:37 EDT
A workaround will be available once bugs 242057 and 249448 are fixed.
Comment 8 Pawel Pogorzelski CLA 2008-10-03 04:22:31 EDT
Bug 249448 has been fixed, which means that Eclipse should start without crashing. The crash will still occur on the first use of IProxyService, though. To avoid it, you have to change the proxy settings to manual or allow a direct connection. You can do this through the UI or the IProxyService API.

Bug 242057, on the other hand, will provide the ability to change the property through the plugin_customization.ini file.

David, does the workaround work for you?
Comment 9 David Williams CLA 2008-10-04 12:05:23 EDT
(In reply to comment #8)

> 
> David, do the workaround works for you?
> 

Do you mean conceptually? Or is there a specific build that has the fix and you are asking me to test that? (If so, let me know which I-build). 

Conceptually, I don't know if we deliberately use 'IProxyService'. How do I tell? Just look for references to that Interface in our code? I do see one place we use it directly in our code ... but, since there were 8 or 12 crashes, I assume it's called indirectly?

Conceptually, if I have to change the "plugin_customization.ini" file, then I'll say I don't know what that is and don't know what to change nor to what value. I'm sure this all seems obvious to you, but not to me. Plus, that's a hard workaround in practice, since in our automated tests we just download "the latest" build and run the tests ... sounds like now there will need to be some "manual" step for some machines that tweaks the plugin_customization.ini. That doesn't seem right, and I suspect many 'end-users' (and/or paying customers of adopter products) would still find this a regression (and the paying customers expect it to be fixed :)

Let me know if/how I can help further. 
Comment 10 Pawel Pogorzelski CLA 2008-10-06 11:35:49 EDT
> Do you mean conceptually?

I mean a specific build. The fix for bug 249448 is already in HEAD and will be available in tomorrow's I-build.

> Conceptually, I don't know if we deliberately use 'IProxyService'. How do
> I tell? Just look for references to that Interface in our code? I do see
> one place we use it directly in our code ... but, since there were 8
> or 12 crashes, I assume it's called indirectly?

You're right, IProxyService is used by the Eclipse Communication Framework.

> Conceptually, if I have to change the "plugin_customization.ini" file, then
> I'll say I don't know what that is and don't know what to change nor to what
> value.

Since the fix for bug 242057 is not ready, you cannot use the "plugin_customization.ini" file to avoid the crash yet. I'll provide such a file as soon as the fix is available. It will disable use of the system proxy.

The way to avoid the crash right now is to call IProxyService.setSystemProxiesEnabled(false) prior to making any connection through the service, probably in the test setup. This call sets the preference to false, which is equivalent to overriding it via "plugin_customization.ini".
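
As a rough sketch of that setup step: the snippet below uses a hypothetical stand-in interface, ProxySettings, that mirrors only the two IProxyService methods involved, so that it compiles without Eclipse on the classpath. InMemoryProxySettings, ProxyTestSetup and disableSystemProxies are illustrative names, and the default of "enabled" is an assumption made for the example. In a real test the IProxyService instance would be obtained from the platform rather than constructed by hand.

```java
/** Hypothetical stand-in mirroring two methods of IProxyService. */
interface ProxySettings {
    boolean isSystemProxiesEnabled();
    void setSystemProxiesEnabled(boolean enabled);
}

/** Illustrative in-memory implementation; not part of Eclipse. */
class InMemoryProxySettings implements ProxySettings {
    // Assumed default for this sketch only.
    private boolean systemProxiesEnabled = true;

    @Override
    public boolean isSystemProxiesEnabled() {
        return systemProxiesEnabled;
    }

    @Override
    public void setSystemProxiesEnabled(boolean enabled) {
        systemProxiesEnabled = enabled;
    }
}

class ProxyTestSetup {
    /** Call from test setup, before anything opens a connection through the service. */
    static void disableSystemProxies(ProxySettings settings) {
        if (settings.isSystemProxiesEnabled()) {
            settings.setSystemProxiesEnabled(false);
        }
    }

    public static void main(String[] args) {
        ProxySettings settings = new InMemoryProxySettings();
        disableSystemProxies(settings);
        System.out.println("System proxies enabled: " + settings.isSystemProxiesEnabled());
    }
}
```

The point of doing this in setup is ordering: the native Gnome provider is only reached when system proxies are in use, so the setting has to be switched off before the first connection goes through the service.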
Comment 11 David Williams CLA 2008-10-06 11:55:46 EDT
(In reply to comment #10)

> 
> I mean specific build. The fix for bug 249448 is already in the HEAD and it
> will be available in tomorrow's I-build.
> 

Ok, we'll try the I-build by about Wednesday. 

> 
> Since the fix for bug 242057 is not ready you cannot use the
> "plugin_customization.ini" file to avoid the crash. I'll provide such a file as
> soon as the fix is available. This will disable using system proxy.
> 
> The way to avoid the crash right now is to call
> IProxyService.setSystemProxiesEnabled(false) prior to making any connection
> through the service, probably in the tests setup. This call sets the preference
> to false which equals to overriding it by "plugin_customization.ini".
> 

Just to sanity check explicitly: is there nothing I can do on my system to help, such as create/set some system proxies, even if they are blank? Or use KDE instead of Gnome? I'm not sure I could or would do any of these ... I'm just wondering about the distant future and those who have headless products that might run into this.

Comment 12 David Williams CLA 2008-10-10 04:39:10 EDT
Just for the record, I am still seeing the problem using build I20081007-1600. 

Our JUnits still crash 6 times. It might have been 8 before ... but I think that means that even though we don't call it directly, it's still getting called indirectly.

This still seems like a regression to me ... if someone has to change their code to work around it ... but I guess you are saying the problem is in the native code and another "fix" just happened to expose this bug in the native code?

Comment 13 Pawel Pogorzelski CLA 2008-10-13 12:39:07 EDT
David, I'm still working on the fix for the native-side code.

In the meantime you can delete the fragment org.eclipse.core.net.linux.x86 from your target platform since this plugin contains only the problematic native library. This is an alternative workaround to calling IProxyService.setSystemProxiesEnabled(false) on the API level.
Comment 14 David Williams CLA 2008-10-17 08:37:44 EDT
Thanks for the tip about deleting org.eclipse.core.net.linux.x86. 

That at least got our JUnits working again.

So, I'll change from 'blocking' to 'major', since there is at least a workaround. Long term, if it gets to look like there will never be a fix, I'll look again at changing source code to work around it (by doing that 'init' somewhere), but deleting the file is definitely easier (and less "permanent") than changing so many JUnit tests.


Comment 15 Pawel Pogorzelski CLA 2008-10-17 10:26:00 EDT
Created attachment 115396 [details]
Patch_v01

The fix for the native code side...
Comment 16 Pawel Pogorzelski CLA 2008-10-17 10:36:59 EDT
Created attachment 115397 [details]
Rebuilt proxygnome

...and the recompiled library.

Just for the record: the crash was caused by not calling the GLib function g_type_init() prior to gconf_client_get_default(). On most machines it didn't cause any trouble because they run GTK+/GNOME on top of GLib, which handles the initialization of the library. Since GConf provides the backend of Eclipse's proxygnome library, it is possible to run it without GTK+/GNOME, so the explicit initialization is necessary.

I recompiled the library on Red Hat EL 3 update 3, as the previous patch explains. Previously the library was compiled on RHEL 4.
Comment 17 Francis Upton IV CLA 2008-10-17 10:42:09 EDT
(In reply to comment #16)
> Created an attachment (id=115397) [details]
> Rebuilded proxygnome
> 
> ...and the recompiled library.
> 
> Just for the record. The crash was caused by not calling GLib function
> g_type_init() prior to gconf_client_get_default(). On most machines is didn't
> cause any trouble because on the top of GLib they run GTK+/GNOME which handle
> the initialization of the library. Since GConf is providing the backend of the
> Eclipse's proxygnome library it is possible to run it without GTK+/GNOME. Thus
> the initialization is necessary.
> 
> I recompiled the library on Red Hat EL 3 update 3, as the previous patch
> explains. Previously the library was compiled on RHEL 4.
> 

Nice catch, glad you got this all worked out.  Sorry I was not able to help more.
Comment 18 Tomasz Zarna CLA 2008-10-17 10:51:34 EDT
The fix and lib have been released to HEAD.
Comment 19 David Williams CLA 2008-10-17 19:13:07 EDT
*** Bug 248166 has been marked as a duplicate of this bug. ***
Comment 20 Pawel Pogorzelski CLA 2008-10-23 09:36:32 EDT
CC'ing Mike because he wanted to have a look at the bug.
Comment 21 David Williams CLA 2008-10-30 10:02:21 EDT
Just to confirm, this has been working fine in our latest JUnit runs, on the latest I-builds. 

Much thanks.