Bug 18598

Summary: Unable to cancel frozen connection
Product: [Eclipse Project] Platform
Reporter: Christophe Elek <celek>
Component: Update (deprecated - use Eclipse>Equinox>p2)
Assignee: Platform-Update-Inbox <platform-update-inbox>
Status: RESOLVED WONTFIX
QA Contact:
Severity: major
Priority: P3
CC: A.Kuckartz, antonio.petrelli, axb, btripkov, caniszczyk, daniel, Dave_Thomson, eagtstools, erich_gamma, fraenkel, gaetan, jacob, jacob, karasiuk, Kevin_McGuire, klicnik, manahan, nadment, nikolaymetchev, stephen.francisco, ursreupke, ysaillet, zina
Version: 2.0
Keywords: helpwanted
Target Milestone: ---
Hardware: PC
OS: Windows 2000
Whiteboard:
Attachments:
Description Flags
thread dump during hang of update manager none

Description Christophe Elek CLA 2002-06-01 10:54:56 EDT
Attempt to open a bookmark at w3.org
You cannot cancel the attempt to connect
Comment 1 Vlad Klicnik CLA 2002-06-04 10:10:03 EDT
*** Bug 18813 has been marked as a duplicate of this bug. ***
Comment 2 Vlad Klicnik CLA 2002-06-04 13:15:56 EDT
F3 candidate
Comment 3 Christophe Elek CLA 2002-07-29 16:32:41 EDT
*** Bug 21790 has been marked as a duplicate of this bug. ***
Comment 4 Christophe Elek CLA 2002-08-09 12:41:19 EDT
*** Bug 22310 has been marked as a duplicate of this bug. ***
Comment 5 Dejan Glozic CLA 2002-08-12 10:11:11 EDT
We will likely not be able to fix this problem for 2.0.1 release. We lack the 
suitable support in Java to have fine-grain control over attempts to open a 
network connection to a server with a random URL. We cannot set the timeout for 
the method to return, and even if we attempt to connect in another thread, we 
are not sure we can safely kill the connection by killing the thread (not sure 
if OS resources are cleanly recycled or we will have a resource leak that way).
Comment 6 Dejan Glozic CLA 2002-09-03 10:18:03 EDT
*** Bug 23106 has been marked as a duplicate of this bug. ***
Comment 7 Dejan Glozic CLA 2002-09-04 18:10:43 EDT
*** Bug 23128 has been marked as a duplicate of this bug. ***
Comment 8 Dejan Glozic CLA 2002-09-04 18:19:54 EDT
*** Bug 22143 has been marked as a duplicate of this bug. ***
Comment 9 Andy Brook CLA 2002-09-05 03:56:10 EDT
My 2p on this, granted timeouts etc:  It would be possible to dispose of the UI
component immediately and let the connection time out naturally and be cleaned up
(by executing on another thread) - i.e. not terminating the thread but allowing
it to time out naturally would ensure OS resources aren't gobbled up
indefinitely, and the user would see immediate feedback through the UI being disposed.

Maybe this is a leetle too hacky :)
Comment 10 Christophe Elek CLA 2002-09-05 08:28:04 EDT
That is one possibility; my concern is leaving unclosed connections (maybe 1 
out of 100) but still...
Also, looking at the JDK 1.3 doc, I realize there is no clean way of killing 
a spawned thread... I believe 1.4 has a way, but not for URL streams.

So we may have to hack anyway, and start a thread and use a deprecated call if 
the timer expires...

Unless I am missing something... anyone?
Comment 11 Michael Fraenkel CLA 2002-09-05 08:49:42 EDT
For 1.3, you don't have any options for the connect taking forever.  There is 
no good way to kill a thread regardless of deprecated methods.  They are all 
broken.
In my case, the reads were blocking which can be controlled via setSoTimeout.
Comment 12 Christophe Elek CLA 2002-09-05 09:34:10 EDT
Hmm, isn't that only for sockets?
Excuse my ignorance, but I thought in 1.3 you can set the socket timeout, not the 
URL connection timeout (even though they may run on sockets, right? ;-)

What we open is URL.openConnection()...
Comment 13 Steve Francisco CLA 2002-09-05 17:23:25 EDT
Would it be simple to attempt to open a socket (where a timeout can be defined) 
and only attempt to use the URL if the socket was possible?  This need only be 
attempted for http connections (not file:///).  If the socket fails to connect, 
then we report that it was not available and avoid the hang.
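
A minimal sketch of the probe suggested above, assuming JDK 1.4 or later (where Socket.connect accepts a timeout). The class and method names (ConnectProbe, canConnect) are hypothetical and are not part of the Update code. Two caveats: name resolution inside InetSocketAddress can itself block with no timeout, and a direct socket probe says nothing useful when the real request would go through an HTTP proxy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks whether a TCP connection can be
// established within a bounded time before trying URL.openConnection().
final class ConnectProbe {

    // Returns true if a TCP connection to host:port succeeds within
    // timeoutMillis, false on timeout, refusal, or an unresolvable host.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            // Unlike "new Socket(host, port)", this form of connect()
            // gives up after timeoutMillis with a SocketTimeoutException.
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers SocketTimeoutException, ConnectException and
            // UnknownHostException alike.
            return false;
        }
    }
}
```

A caller would probe only http URLs (not file:///) and report the site as unavailable when canConnect returns false.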
Comment 14 Michael Fraenkel CLA 2002-09-06 08:52:20 EDT
Found the easiest/only solution.

You can affect the connect timeout and read timeout of a HttpURLConnection via 
the following System properties:

sun.net.client.defaultConnectTimeout
sun.net.client.defaultReadTimeout

The defaults for both are -1.
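
For illustration, a sketch of setting these two properties programmatically. The class name DefaultHttpTimeouts is hypothetical. The properties are specific to Sun-derived VMs, and the handler may read them only once, so this assumes the call happens before the first HTTP connection is opened.

```java
// Hypothetical helper: sets the Sun-specific default timeouts (in
// milliseconds) used by the built-in HTTP protocol handler.
final class DefaultHttpTimeouts {

    static final String CONNECT_KEY = "sun.net.client.defaultConnectTimeout";
    static final String READ_KEY = "sun.net.client.defaultReadTimeout";

    // Call early, before any HTTP connection has been opened; other VMs
    // may simply ignore these properties.
    static void apply(int connectMillis, int readMillis) {
        System.setProperty(CONNECT_KEY, Integer.toString(connectMillis));
        System.setProperty(READ_KEY, Integer.toString(readMillis));
    }
}
```

The same effect can be had without code by passing -Dsun.net.client.defaultConnectTimeout=60000 (and the read equivalent) as VM arguments.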
Comment 15 Christophe Elek CLA 2002-09-06 09:18:03 EDT
Yep, good catch but ;-)
1) isn't that only in 1.4 ?
2) sun.* classes, never very good ;-)
3) will it work if Eclipse runs on J9 or QNX ?

nevertheless, I agree this is the 'perfect' solution in a 'perfect' world.

http://java.sun.com/j2se/1.4/docs/guide/net/properties.html
Comment 16 Christophe Elek CLA 2002-09-06 13:43:59 EDT
*** Bug 20099 has been marked as a duplicate of this bug. ***
Comment 17 Dejan Glozic CLA 2002-09-10 17:18:20 EDT
*** Bug 18505 has been marked as a duplicate of this bug. ***
Comment 18 Christophe Elek CLA 2003-01-10 10:01:44 EST
Action Taken: We should target this one for M5. It doesn't have a workaround.
Action Plan: Investigate HttpURLConnection.disconnect()
Comment 19 Dejan Glozic CLA 2003-01-20 15:55:02 EST
I am lowering the priority to P2. Even though there is no workaround, we cannot 
claim that we can fix it (we are limited by the underlying network support in 
java.net). P1 would mean that we will not ship without this, which is a bit too 
strong for this defect.
Comment 20 Dejan Glozic CLA 2003-02-13 13:07:09 EST
*** Bug 30993 has been marked as a duplicate of this bug. ***
Comment 21 Christophe Elek CLA 2003-02-18 13:08:36 EST
*** Bug 32140 has been marked as a duplicate of this bug. ***
Comment 22 Dejan Glozic CLA 2003-02-18 13:46:53 EST
*** Bug 32140 has been marked as a duplicate of this bug. ***
Comment 23 Dejan Glozic CLA 2003-02-19 16:02:50 EST
This is how the 2.1 implementation is going to look:

1) We will definitely set sun properties:

sun.net.client.defaultConnectTimeout
sun.net.client.defaultReadTimeout

to something moderate (say 60 seconds)

2) We will find out if equivalent properties are available for other 
implementations (J9, IBM VM) and set those as well.

3) We will not try to address all connections in 2.1 (too complex), 
but most of the problems happen at the front end i.e. when trying to
connect a site (wrong URL, too slow, network problems, no proxy etc.).
We will handle update site connection using a connection manager class.
When InputStream is needed from the HttpURLConnection, we will
spawn another thread to call 'getInputStream()'. In the main thread
(again, not the GUI thread but the main worker thread for the connection),
we will call 'join' on the connection thread with some small 
interval (say 200ms). When the connection thread is done or 'join' times out,
we will check if the user pressed the 'Cancel' button. We will let the
blocked connection thread die the natural 'timeout' death while the UI stays fully
responsive.
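
The join-and-poll scheme in point 3 could be sketched roughly as follows. This is not the actual connection manager class; CancellableOpen, its open method, and the opener and cancelled parameters are hypothetical names, and the sketch uses modern Java syntax rather than what was available for 2.1. A real caller would pass something like () -> connection.getInputStream() as the opener.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.Callable;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: run a blocking "open" on a helper thread and poll
// for cancellation from the worker thread.
final class CancellableOpen {

    // Returns the opened stream, or null if 'cancelled' became true first.
    // pollMillis must be greater than 0 (join(0) would wait forever).
    static InputStream open(Callable<InputStream> opener,
                            BooleanSupplier cancelled,
                            long pollMillis)
            throws IOException, InterruptedException {
        final InputStream[] result = new InputStream[1];
        final Exception[] failure = new Exception[1];

        Thread connector = new Thread(() -> {
            try {
                result[0] = opener.call();   // may block for a long time
            } catch (Exception e) {
                failure[0] = e;
            }
        }, "connection-opener");
        connector.setDaemon(true);           // never keeps the VM alive
        connector.start();

        while (connector.isAlive()) {
            connector.join(pollMillis);      // wait a short interval
            if (connector.isAlive() && cancelled.getAsBoolean()) {
                // Abandon the helper thread; it ends when the connection
                // attempt times out or completes on its own.
                return null;
            }
        }
        if (failure[0] != null) {
            throw new IOException("open failed", failure[0]);
        }
        return result[0];
    }
}
```

One limitation of this sketch: if the abandoned thread eventually succeeds, the stream it opened is never closed, which is the resource-leak concern raised earlier in this bug.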

In the unlikely event that the connection never times out, we have a 
limit on the number of connection threads we can spawn (10). Again, the UI will be 
fully responsive and the worst that can happen would be having to restart Eclipse
in an orderly fashion without losing our work. Restarting would force
these threads to close.

This change will affect attempts to expand the site bookmark in the UI
and search for new updates. In both cases, 'Cancel' will cause the UI
to immediately return and be operational.

The change will not affect problems with the network that happen
in the middle of a read. We read bytes in buffer-size blocks and
if the network is slow, we will eventually read the buffer and react
to the 'cancel' button between the two reads. If the 'read' method
blocks, the only solution is for the timeout to throw IOException.
You need to be on JDK 1.4 for this.
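
The "check for cancel between two reads" behaviour described above might look like this in outline. CancellableCopy and its parameters are hypothetical names, not the Update code. Note that the cancel flag is only consulted between reads, so a read that blocks is still interrupted only by a read timeout.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: copy a stream in buffer-size blocks and check for
// cancellation between reads.
final class CancellableCopy {

    // Returns the number of bytes copied before end of stream or cancel.
    static long copy(InputStream in, OutputStream out, BooleanSupplier cancelled)
            throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        // The flag is tested before each read; a blocked read() cannot
        // see it and only ends via data, end of stream, or IOException.
        while (!cancelled.getAsBoolean() && (n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }
}
```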

I am resolving this defect as 'Later' to be reopened in 2.2 when we 
will rework the whole network connection layer in Update.
Comment 24 David Williams CLA 2006-04-30 02:18:37 EDT
Is later now? :) 

I'm reopening because I just saw a "hang" on a socket read (I'll attach a dump) which not only was not cancelable, but froze up the whole UI/Display (it didn't repaint). 

This was starting off with the RC1a platform, and trying the Callisto RC1a site. 
Not sure, exactly, why it would hang, but if it helps, I did try both the "plain" Callisto site.xml from a browser, and tried, from a browser, the download mirrors script, and both responded right away. 

I was on Windows XP, SP2.


Comment 25 David Williams CLA 2006-04-30 02:20:11 EDT
Created attachment 39898 [details]
thread dump during hang of update manager
Comment 26 David Williams CLA 2006-04-30 02:54:15 EDT
FYI, as can be seen from the thread dump, I was using a Sun 1.5 VM. 

I re-opened this bug, instead of opening a new one, partly so the whole 1.4 history could be recalled. I'd suggest one improvement might be made to 
ConnectionThreadManager 
so that it detects when it is running on a 1.5 VM and then uses the 
setConnectTimeout API on URLConnection. 
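
A sketch of the per-connection API available from Java 5 onwards. TimedConnections is a hypothetical helper, not a change to ConnectionThreadManager; setConnectTimeout and setReadTimeout are standard methods on java.net.URLConnection.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper: applies per-connection timeouts (Java 5+) instead
// of relying on VM-wide system properties.
final class TimedConnections {

    // openConnection() only creates the connection object; the network
    // connect happens later, e.g. in connect() or getInputStream().
    static URLConnection open(URL url, int connectMillis, int readMillis)
            throws IOException {
        URLConnection connection = url.openConnection();
        connection.setConnectTimeout(connectMillis); // 0 means wait forever
        connection.setReadTimeout(readMillis);       // 0 means wait forever
        return connection;
    }
}
```

With these set, a stalled connect or read surfaces as a java.net.SocketTimeoutException rather than an indefinite hang.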

As a second, more minor, improvement to ConnectionThreadManager ... those 
1.4 Sun properties are set in the constructor, and never reset to what they were, if any. A slightly better pattern would be to remember their current values, if any, and reset them when done. 
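
The remember-and-reset pattern could be sketched like this. ScopedSystemProperty is a hypothetical class, written with try-with-resources syntax that postdates this comment. It only restores the property value; whether the VM re-reads the property after the first HTTP connection is a separate question.

```java
// Hypothetical sketch: set a system property for a limited scope and put
// back whatever was there before (or clear it if nothing was set).
final class ScopedSystemProperty implements AutoCloseable {

    private final String key;
    private final String previous; // null if the property was unset

    ScopedSystemProperty(String key, String value) {
        this.key = key;
        // System.setProperty returns the previous value, or null.
        this.previous = System.setProperty(key, value);
    }

    @Override
    public void close() {
        if (previous == null) {
            System.clearProperty(key);
        } else {
            System.setProperty(key, previous);
        }
    }
}
```

Usage would be along the lines of try (ScopedSystemProperty p = new ScopedSystemProperty("sun.net.client.defaultConnectTimeout", "60000")) { ... }.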

Comment 27 David Williams CLA 2006-04-30 03:00:28 EDT
Branko ... If I'd noticed the long (old) CC list on this one I would have just opened a new one :) ... but, I'm adding you ... I wanted to be sure the right people "saw" it ... and not sure who that might be ... except perhaps you? 

Comment 28 Chris Aniszczyk CLA 2006-10-25 00:32:34 EDT
Can we look at this issue again somewhat? I hit this with J9 and other lovely JVMs. I think the approach David suggests may be reasonable. However, the argument can be made that people are responsible for setting the proper system properties for timeouts for each JVM.
Comment 29 Dejan Glozic CLA 2007-04-13 11:09:47 EDT
We had a problem with JDK 1.3 in the past but with 1.4 the connection timeout was set much shorter. I don't see how JDK 1.5 would help us other than to make this timeout shorter still. In addition, I don't have the cycles for playing with the method and testing it out. Patches welcome.
Comment 30 Michael Berg CLA 2007-07-27 12:02:22 EDT
It seems there are a number of problems with Cancel not working. When I try to cancel the workspace rebuild, it just ignores me and keeps on building.
Comment 31 John Arthorne CLA 2008-04-14 13:24:22 EDT
*** Bug 147803 has been marked as a duplicate of this bug. ***
Comment 32 John Arthorne CLA 2008-04-14 13:25:25 EDT
*** Bug 165311 has been marked as a duplicate of this bug. ***
Comment 33 John Arthorne CLA 2008-04-14 13:26:10 EDT
*** Bug 198282 has been marked as a duplicate of this bug. ***
Comment 34 John Arthorne CLA 2008-04-14 13:26:29 EDT
*** Bug 166810 has been marked as a duplicate of this bug. ***
Comment 35 Antonio Petrelli CLA 2008-04-29 06:09:49 EDT
I solved, in a dirty way, the problem by using Simple DNS Plus under Windows XP.
The local DNS server returns "refused" when an external address is queried, while for internal ones (local servers, proxy) they are forwarded to the "normal" DNS servers.
Anyway this is a terrible hack, the real solution is *not* doing DNS queries when behind a proxy!
Comment 36 Zina Mostafia CLA 2008-11-07 15:02:39 EST
Considering the severity of this bug, please provide a target milestone for when this defect will be fixed.
Comment 37 John Arthorne CLA 2008-11-10 20:37:31 EST
This bug is against the legacy Update Manager that has been replaced in Eclipse 3.4 with a new provisioning system from Equinox. I suggest trying out recent builds, particularly 3.5 M3 or greater, and this situation should be better. We now use Apache HTTP client, which deals with frozen connections much better than the java.net HTTP client. If you are still seeing problems in 3.5 M3 or later, please open a new bug against Equinox p2.