Bug 316500 - org.eclipse.ecf.ssl is missing required Import-Package for javax.net
Summary: org.eclipse.ecf.ssl is missing required Import-Package for javax.net
Status: RESOLVED FIXED
Alias: None
Product: ECF
Classification: RT
Component: ecf.core
Version: 3.1.0
Hardware: PC Windows Vista
Importance: P3 normal
Target Milestone: ---
Assignee: ecf.core-inbox CLA
Duplicates: 323761
Reported: 2010-06-10 13:46 EDT by Tuukka Lehtonen CLA
Modified: 2010-09-07 17:10 EDT
Description Tuukka Lehtonen CLA 2010-06-10 13:46:00 EDT
Short version:
javax.net should be added to the Import-Package header of the org.eclipse.ecf.ssl fragment

Longer version:
I'm currently spawning separate headless OSGi applications from within another Eclipse Workbench application, and I've run into this problem, which originates from the org.eclipse.ecf.ssl fragment:

[log;+0300 2010.06.10 16:48:39:788;INFO;org.eclipse.ecf;org.eclipse.core.runtime.Status[plugin=org.eclipse.ecf;code=0;message=Unexpected Error in ECFPlugin.start;severity4;exception=java.lang.NoClassDefFoundError: javax/net/SocketFactory;children=[]]]
java.lang.NoClassDefFoundError: javax/net/SocketFactory
	at org.eclipse.ecf.internal.ssl.ECFTrustManager.start(ECFTrustManager.java:83)
	at org.eclipse.ecf.internal.core.ECFPlugin.start(ECFPlugin.java:316)
	at org.eclipse.osgi.framework.internal.core.BundleContextImpl$1.run(BundleContextImpl.java:783)
	at java.security.AccessController.doPrivileged(Native Method)
	at org.eclipse.osgi.framework.internal.core.BundleContextImpl.startActivator(BundleContextImpl.java:774)
	at org.eclipse.osgi.framework.internal.core.BundleContextImpl.start(BundleContextImpl.java:755)
	at org.eclipse.osgi.framework.internal.core.BundleHost.startWorker(BundleHost.java:352)
	at org.eclipse.osgi.framework.internal.core.AbstractBundle.resume(AbstractBundle.java:370)
	at org.eclipse.osgi.framework.internal.core.Framework.resumeBundle(Framework.java:1068)
	at org.eclipse.osgi.framework.internal.core.StartLevelManager.resumeBundles(StartLevelManager.java:557)
	at org.eclipse.osgi.framework.internal.core.StartLevelManager.incFWSL(StartLevelManager.java:464)
	at org.eclipse.osgi.framework.internal.core.StartLevelManager.doSetStartLevel(StartLevelManager.java:248)
	at org.eclipse.osgi.framework.internal.core.StartLevelManager.dispatchEvent(StartLevelManager.java:445)
	at org.eclipse.osgi.framework.eventmgr.EventManager.dispatchEvent(EventManager.java:227)
	at org.eclipse.osgi.framework.eventmgr.EventManager$EventThread.run(EventManager.java:337)
Caused by: java.lang.ClassNotFoundException: javax.net.SocketFactory
	at org.eclipse.osgi.internal.loader.BundleLoader.findClassInternal(BundleLoader.java:494)
	at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:410)
	at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:398)
	at org.eclipse.osgi.internal.baseadaptor.DefaultClassLoader.loadClass(DefaultClassLoader.java:105)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
	... 15 more

Looking at ECFTrustManager, it can be seen that it is attempting to use javax.net.ssl.SSLSocketFactory, which extends javax.net.SocketFactory.
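
The inheritance relationship can be confirmed with a few lines of plain Java. This is only an illustrative sketch: the class name SuperclassCheck is made up here and is not part of ECF. Because SSLSocketFactory is a subclass of SocketFactory, bundle code that treats an SSLSocketFactory as its supertype can cause the bundle's own class loader to be asked for javax.net.SocketFactory, so visibility of javax.net.ssl alone may not be enough.

```java
import javax.net.SocketFactory;
import javax.net.ssl.SSLSocketFactory;

// Hypothetical helper, not part of ECF: inspects the JDK class hierarchy.
final class SuperclassCheck {
    // Returns the fully qualified name of SSLSocketFactory's direct superclass.
    static String superclassOfSslSocketFactory() {
        return SSLSocketFactory.class.getSuperclass().getName();
    }

    // True if every SSLSocketFactory is also a SocketFactory.
    static boolean sslFactoryIsSocketFactory() {
        return SocketFactory.class.isAssignableFrom(SSLSocketFactory.class);
    }

    public static void main(String[] args) {
        System.out.println(superclassOfSslSocketFactory());
        System.out.println(sslFactoryIsSocketFactory());
    }
}
```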

According to the current Eclipse boot delegation rules, Eclipse runs in strict OSGi R4 mode by default (http://wiki.eclipse.org/Equinox_Boot_Delegation). To my understanding, this implies that plug-ins must list every non-java.* package they use in the Import-Package header of the plug-in's manifest. Such packages include javax.net and javax.net.ssl.

Looking at the manifest of org.eclipse.ecf.ssl, it can be seen that javax.net.ssl is listed as imported but javax.net is not. I suspect this is the reason for the activation problem above.
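
For illustration, the relevant manifest header after the change might look roughly like the fragment below. This is an abbreviated sketch, not a copy of the actual org.eclipse.ecf.ssl manifest: any other headers and imports the real file carries are omitted, and only the two packages named in this report are shown. Continuation lines in a manifest start with a single space.

```
Import-Package: javax.net,
 javax.net.ssl
```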

Looking at the 3.2 release of ECF, the manifest would still seem to have the same issue.

To verify that this indeed is the problem, I tried giving osgi.compatibility.bootdelegation=true to the application and the problem disappeared. The following settings fixed the problem also:
osgi.java.profile.bootdelegation=override
org.osgi.framework.bootdelegation=sun.*, com.sun.*, javax.*

I'm not exactly sure why this problem only seems to manifest when I'm launching another headless application from my workbench app.
Comment 1 Scott Lewis CLA 2010-06-22 23:00:28 EDT
Added javax.net to imported packages.  I don't quite understand why I'm not seeing this as a compile error in my Eclipse 3.6 environment, but in any case, it does not hurt much to put javax.net in the imported packages so that this doesn't occur in your environment.  Resolving as fixed.
Comment 2 Scott Lewis CLA 2010-08-26 17:49:06 EDT
*** Bug 323761 has been marked as a duplicate of this bug. ***
Comment 3 Scott Lewis CLA 2010-08-26 17:51:06 EDT
Adding Matt Flaher for verification of the fix for this bug, as Matt is one of the authors of the *.ssl code...I am not.
Comment 4 Scott Lewis CLA 2010-08-30 18:51:21 EDT
(In reply to comment #3)
> Adding Matt Flaher for verification of the fix for this bug, as Matt is one of
> the authors of the *.ssl code...I am not.

I would like to include this fix for Eclipse 3.6.1/Helios maintenance (coming up real soon now).  This is the only ECF bug fix that must/probably should be in 3.6.1.

But before including this fix, I feel like I need to get Matt's or Thomas' verification that adding the javax.net to imported packages is the way to go here...since with the VMs that I use and test on I never was able to reproduce the NoClassDefFoundError myself...and I am not the author of the org.eclipse.ecf.ssl fragment that references the javax.net class.

So Matt, Pascal, and/or Thomas...could you verify that adding the import package is the way to fix this bug?  This is what I have done as per previous comments, and am prepared to build a Helios stream maintenance release with this bug fix to include in Equinox/p2/Eclipse 3.6.1.

Also...since this is for the maintenance release (rather than Indigo stream), how do you want to coordinate with Eclipse platform/p2 build?  (we have been using bug 219499 for HEAD-based new builds...and I would like to/will continue doing that for Indigo, but need to know how to coordinate the ECF contribution for the maintenance release...to include the fix for this bug).

Thanks.
Comment 5 Scott Lewis CLA 2010-09-03 14:53:29 EDT
I need some input/help from the p2/equinox/platform releng on getting this fix into 3.6.1/Helios (which is coming up very soon now).  

Otherwise, the fix for this bug won't make it into 3.6.1.

Someone who is associated with the p2/Eclipse maintenance releng please contact me directly at slewis at composent.com about getting the ECF 3.3.1 platform integration build into the 3.6.1 platform build.
Comment 6 Thomas Watson CLA 2010-09-07 10:35:24 EDT
This fix looks correct.  But I don't think this is necessarily a critical bug for SR1 of Helios.  I suspect the failure launching with headless mode can easily be worked around by setting the following configuration property, which is set by default in most other configurations:

osgi.compatibility.bootdelegation=true
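
When the headless application is spawned as a separate JVM, one way to pass this property is as a -D system property on the child's command line. The following is a minimal sketch under assumptions: the class HeadlessLaunch, the launcher jar path, and the application id are all placeholders invented for illustration, and the reporter's actual launch code is not shown in this bug.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds a command line for a child Equinox process.
final class HeadlessLaunch {
    // launcherJar and applicationId are placeholders supplied by the caller.
    static List<String> command(String launcherJar, String applicationId) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        // Workaround from this bug: enable compatibility boot delegation in the child.
        cmd.add("-Dosgi.compatibility.bootdelegation=true");
        cmd.add("-jar");
        cmd.add(launcherJar);
        cmd.add("-application");
        cmd.add(applicationId);
        return cmd;
    }

    public static void main(String[] args) {
        // Both arguments below are made-up example values.
        List<String> cmd = command("plugins/org.eclipse.equinox.launcher.jar", "com.example.headless.app");
        System.out.println(String.join(" ", cmd));
        // To actually spawn the child: new ProcessBuilder(cmd).inheritIO().start();
    }
}
```

The JVM option has to appear before -jar so that it is treated as a system property rather than as a program argument.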
Comment 7 Scott Lewis CLA 2010-09-07 10:49:15 EDT
So that the bug author and everyone else on this bug know...the fix for this bug will apparently not be in Eclipse 3.6 SR1.  

See bug 324596 for discussion about the process issues that prevented inclusion in SR1.  As ECF project lead you have my apologies, but there is apparently nothing more I could do about it.
Comment 8 Alex Blewitt CLA 2010-09-07 11:12:06 EDT
Using the boot delegation is an insane hack that should never have been allowed in the first place. No-one in their right mind enables it for production systems since it completely defeats the point of OSGi, especially when there is a correct way of solving it. 

If "not being able to use this functionality at all" is not a critical bug for production systems, then what is the point of service releases? 

The only reason Eclipse runs with this flag set is due to historic behavioural issues with the initial port to OSGi. There is no equivalent in any other OSGi framework or client - the fact that it might happen to work in one particular system highlights the fact that Eclipse's continued use of this parameter will continue to hide erroneous calls. 

Since this is not being fixed, I will have no choice but to recommend against using ECF for OSGi remote services, and to use Felix's implementation instead.
Comment 9 Thomas Watson CLA 2010-09-07 11:19:36 EDT
(In reply to comment #8)
> Since this is not being fixed, I will have no choice but to recommend
> against using ECF for OSGi remote services, and to use Felix's implementation
> instead.

Since you don't like my original workaround, here is another one: attach your own fragment to ECF that imports the package. This will require no special configuration setting.
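
A manifest for such a workaround fragment could look roughly like the sketch below. Everything except the Import-Package value is an assumption made for illustration: the symbolic name, bundle name, and version are invented, and the Fragment-Host value assumes that the host bundle's symbolic name is org.eclipse.ecf, which should be checked against the installed bundle.

```
Manifest-Version: 1.0
Bundle-ManifestVersion: 2
Bundle-Name: ECF javax.net import workaround
Bundle-SymbolicName: com.example.ecf.javaxnet
Bundle-Version: 1.0.0
Fragment-Host: org.eclipse.ecf
Import-Package: javax.net
```

A fragment's imports are merged into its host when the host resolves, which is why the fragment needs no code of its own.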
Comment 10 Alex Blewitt CLA 2010-09-07 11:30:43 EDT
The right fix is to fix the bundle's imports and push it out with SR-1. Anything else is just (a) hacking and (b) requires the users to know about, and navigate to, this specific bug to download the attachment. 

Whilst the fragment is a better fix than the boot delegation, it still doesn't help anyone who wants to use ECF remote services in 3.6 since the SSL isn't resolvable from the update site as it stands - whereas Felix works out of the box. 

Why is this marked as "normal" anyway? It should be higher since it prevents usage.
Comment 11 John Arthorne CLA 2010-09-07 11:42:44 EDT
(In reply to comment #10)
> The right fix is to fix the bundle's imports and push it out with SR-1.
> Anything else is just (a) hacking and (b) requires the users to know about, and
> navigate to, this specific bug to download the attachment. 

Nobody is doubting what the correct long term fix is. The only issue is that the request to include this in SR1 came after our last scheduled build (Sept. 1st), and after our final test pass (Sept 2) [1]. The tentative build for this week (Sept. 8th) is only for real blockers, and a bug with multiple workarounds by definition isn't blocking. These end-game rules may seem pointless but they are critical to us being able to stabilize and deliver the massive amount of code that goes into the Eclipse release train. I suggest getting this fix into the first Helios SR2 build that will happen in another week or two, and in the meantime using one of the available workarounds.

[1] http://www.eclipse.org/eclipse/development/plans/freeze_plan_3_6_1.php
Comment 12 Scott Lewis CLA 2010-09-07 13:47:00 EDT
(In reply to comment #8)
> Using the boot delegation is an insane hack that should never have been
> allowed in the first  place. No-one in their right mind enables it for
> production systems since it completely defeats the point of OSGi, especially
> when there is the correct way of solving it. 
> 
> If "not being able to use this functionality at all" is not a critical bug for
> production systems, then what is the point of service releases? 
> 
> The only reason Eclipse runs with this flag set is due to historic behavioural
> issues with the initial port to OSGi. There is no equivalent in any other OSGi
> framework or client - the fact that it might happen to work in one particular
> system highlights the fact that Eclipse's continued use of this parameter will
> continue to hide erroneous calls. 
> 
> Since this is not being fixed, I will have no choice but to recommend
> against using ECF for OSGi remote services, and to use Felix's implementation
> instead.


Alex:  Another workaround is to install the ECF feature patch into Eclipse, which is included in existing builds...and will be in the ECF contribution for Helios SR1.  Installing the feature patch (name:  'ECF 3.4 Patch for Eclipse 3.5-3.6') will apply the fix associated with this bug.

Just for your/other's reference, ECF's builder is here:  https://ecf2.osuosl.org/hudson/ and with this one can access p2 repos of the latest (including this bug fix, of course).  For example, the ECF sdk (with the feature patch) is available here:  https://ecf2.osuosl.org/hudson/job/R-Release_3_3-sdk.feature/ and it includes the feature patch as well as remote services, as distinct features.

Alex, I would request that you not penalize the ECF project due to the limitations of the platform's SR1 release process.  We/ECF can't do anything about that process, and unfortunately I don't think the existing PMC really cares very much about ECF consumers' needs (like promptly fixing bugs)...at least relative to other concerns.  This does *not* reflect the ECF project's priorities, and I'm as upset about it as you appear to be.
Comment 13 John Arthorne CLA 2010-09-07 16:36:08 EDT
(In reply to comment #12)
> Alex I would request that you not penalize the ECF project due to the
> limitations of the platform's SR1 release process.  We/ECF can't do anything
> about that process, and unfortunately I don't think the existing PMC really
> cares very much about ECF consumers' needs (like promptly fixing bugs)...at
> least relative to other concerns.  This does *not* reflect the ECF project's
> priorities, and I'm as upset about it as you appear to be.

Scott, you released a fix for this bug in June. The fact that you waited until after our final scheduled build in September to request including it in the platform (bug 324596) is not "due to a limitation of the platform's release process". If promptly fixing bugs for your consumers was your priority, you could have contributed this fix to the platform's SR1 builds when they started over two months ago. 

Last minute changes are not without risk, as we learned from the ECF contribution to 3.6.0 RC4 which caused several build failures (bug 314901). If there is a barrier preventing you from making contributions earlier in the release cycle, we should all work together to address them for future releases.
Comment 14 Scott Lewis CLA 2010-09-07 17:10:31 EDT
(In reply to comment #13)
<stuff deleted>
> 
> Scott, you released a fix for this bug in June. The fact that you waited until
> after our final scheduled build in September to request including it in the
> platform (bug 324596) is not "due to a limitation of the platform's release
> process". If promptly fixing bugs for your consumers was your priority, you
> could have contributed this fix to the platform's SR1 builds when they started
> over two months ago. 


Since Helios, we/ECF have been doing needed structural work on our own build...as requested by ECF consumers...and this made our doing a build for the contribution impossible up until mid Aug.

Further, as per comment 3 and comment 4, I have been trying to get comment/review on the fix (and the problem) from the authors of the code before getting things actually contributed to the release.  No word was received from anyone in response to this comment/review request...and this delayed things further...until I finally concluded that no comment/review by the code authors was forthcoming...and contacted Pascal directly about the contribution.

As well...we/ECF have no resources...to do platform integration builds...or anything else...and we also have zero say in the Eclipse/Equinox planning release process.  

So I consider your admonition a case of process-induced 'blame the victim'...which is not very appealing IMHO.

> If there is a barrier preventing you from making contributions earlier in the
> release cycle, we should all work together to address them for future
> releases.

Agreed.  So instead of requiring that I/we do everything to accommodate what is IMHO now a stilted and cumbersome service release process, perhaps the platform and/or releng team...or someone with some resources...should do something about it...other than blaming me for not doing more free work. 

For example...perhaps the platform releng should build the ECF bundles from source instead of requiring us to do that building ourselves to contribute to the platform.  

Or contribute resources to ECF...for bug fixing, enhancements, or more timely builds...rather than simply demanding more free work to accommodate the platform's release process.