Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.
Bug 325824 - Proxy access to dependencies using Ivy erratic and all Virgo Build jobs failing.
Summary: Proxy access to dependencies using Ivy erratic and all Virgo Build jobs failing.
Status: CLOSED FIXED
Alias: None
Product: Virgo
Classification: RT
Component: unknown (show other bugs)
Version: unspecified   Edit
Hardware: PC Mac OS X - Carbon (unsup.)
: P3 critical (vote)
Target Milestone: ---   Edit
Assignee: Project Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
: 325820 (view as bug list)
Depends on:
Blocks: 325820
  Show dependency tree
 
Reported: 2010-09-21 06:02 EDT by Steve Powell CLA
Modified: 2010-10-18 11:04 EDT (History)
4 users (show)

See Also:


Attachments
Screenshot (102.93 KB, image/jpeg)
2010-09-21 10:47 EDT, Denis Roy CLA
no flags Details
Screenshot (49.17 KB, image/jpeg)
2010-09-23 10:31 EDT, Denis Roy CLA
no flags Details
Ant log file showing successful proxy access (38.87 KB, application/x-gzip)
2010-09-28 08:27 EDT, Denis Roy CLA
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Steve Powell CLA 2010-09-21 06:02:00 EDT
Proxy variables are not correctly set in build environments. Affects all virgo.* builds
Comment 1 Denis Roy CLA 2010-09-21 10:47:53 EDT
Created attachment 179317 [details]
Screenshot

Here is a screenshot of the global options for the master and both slaves.  I've confirmed that proxy.eclipse.org is the right IP and is responding correctly.
Comment 2 Steve Powell CLA 2010-09-22 06:31:24 EDT
Your screenshot not-withstanding, this is what I see from our ant target 'set-hudson-proxy':
(taken from job virgo-osgi-extensions.snapshot  #38)

--8<------
[build-osgi-extensions] $ /shared/common/apache-ant-1.7.1/bin/ant -file build.xml -Declipse.buildId=hudsonbuild -Dbuild.stamp=CI-${BUILD_ID} -Dci.build=true set-hudson-proxy clean clean-integration test javadoc-api javadoc-all package
Buildfile: build.xml

set-hudson-proxy:

BUILD FAILED
/opt/users/hudsonbuild/workspace/virgo.osgi-extensions.snapshot/virgo-build/multi-bundle/common.xml:31: Property 'http.proxyPort' is not set for set-hudson-proxy target.
----------

and the definition of the target 'set-hudson-proxy' is (now):

--8<-----
<target name="set-hudson-proxy" description="Sets the jets3t property file to proxy settings from env variables.">
	<fail unless="http.proxyHost">Property 'http.proxyHost' is not set for set-hudson-proxy target.</fail>
	<fail unless="http.proxyPort">Property 'http.proxyPort' is not set for set-hudson-proxy target.</fail>
	<echo file="../virgo-build/lib/ivy/jets3t.properties"># Hudson Generated Proxy Configuration
httpclient.proxy-autodetect=false
httpclient.proxy-host=${http.proxyHost}
httpclient.proxy-port=${http.proxyPort}
</echo>
</target>
---------

Notice that the host is set but the port is not.
Comment 3 Steve Powell CLA 2010-09-22 10:17:24 EDT
We have abandoned the use of the s3 protocol and removed set-hudson-proxy form our target list, and instead use the http protocol for ivy resolution.

The jobs are all failing to resolve dependencies.  Why does this not work???

This is now critical.
Comment 4 Steve Powell CLA 2010-09-22 10:19:41 EDT
Extract from virgo.apps.snapshot build -- all rest similar...

--8<--------
resolve.compile:
[ivy:cachepath] :: Ivy 2.1.0 - 20090925235825 :: http://ant.apache.org/ivy/ ::
[ivy:cachepath] :: loading settings :: file = /opt/users/hudsonbuild/workspace/virgo.apps.snapshot/virgo-build/common/ivysettings.xml
[ivy:cachepath] Sep 22, 2010 10:07:03 AM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
[ivy:cachepath] INFO: I/O exception (java.net.ConnectException) caught when processing request: Connection timed out
[ivy:cachepath] Sep 22, 2010 10:07:03 AM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
[ivy:cachepath] INFO: Retrying request
[ivy:cachepath] Sep 22, 2010 10:10:12 AM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
[ivy:cachepath] INFO: I/O exception (java.net.ConnectException) caught when processing request: Connection timed out
[ivy:cachepath] Sep 22, 2010 10:10:12 AM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
[ivy:cachepath] INFO: Retrying request
[ivy:cachepath] Sep 22, 2010 10:13:21 AM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
[ivy:cachepath] INFO: I/O exception (java.net.ConnectException) caught when processing request: Connection timed out
[ivy:cachepath] Sep 22, 2010 10:13:21 AM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
[ivy:cachepath] INFO: Retrying request
------------

None of the Virgo builds are running to completion.  The use of proxies by ant for http connections ought to be automatic -- am I right??
Comment 5 Steve Powell CLA 2010-09-22 10:24:13 EDT
Some jobs appear to get through the first use of the  proxy for dependencies, buyt fail on subsequent ones (by which I mean stick at them)  E.g. virgo.test.snapshot has the same symptoms as reported in Comment 4 but AFTER it has successfully resolved other stuff.

Is the proxy giving us access to all public external sites equally?  Could this be why Bug 325254 is happening?
Comment 6 Denis Roy CLA 2010-09-22 10:35:43 EDT
There must be something specific to Ivy, since from the console, I can access any page using the proxy:

droy@hudson:~> wget http://cbc.ca
--2010-09-22 10:30:01--  http://cbc.ca/
Resolving proxy.eclipse.org... 206.191.52.57
Connecting to proxy.eclipse.org|206.191.52.57|:9898... connected.
Proxy request sent, awaiting response... 301 Moved Permanently
Location: http://www.cbc.ca/ [following]
--2010-09-22 10:30:01--  http://www.cbc.ca/
Connecting to proxy.eclipse.org|206.191.52.57|:9898... connected.
Proxy request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: `index.html'

    [ <=>                                                                                                                                                                                                ] 38,453      --.-K/s   in 0.008s  

2010-09-22 10:30:01 (4.54 MB/s) - `index.html' saved [38453]



If I look at the logs on proxy.eclipse.org, I can see why this is failing:

172.30.206.142 - - [22/Sep/2010:10:34:01 -0400] "HEAD /ivy/libraries/external/org.junit/com.springsource.org.junit/4.7.0/com.springsource.org.junit-4.7.0.jar HTTP/1.1" 404 - "-" "Apache Ivy/2.1.0"
172.30.206.142 - - [22/Sep/2010:10:34:01 -0400] "HEAD /ivy/bundles/external/org.junit/com.springsource.org.junit/4.7.0/ivy-4.7.0.xml HTTP/1.1" 404 - "-" "Apache Ivy/2.1.0"
172.30.206.142 - - [22/Sep/2010:10:34:01 -0400] "HEAD /ivy/bundles/external/org.junit/com.springsource.org.junit/4.7.0/com.springsource.org.junit-4.7.0.jar HTTP/1.1" 404 - "-" "Apache Ivy/2.1.0"
172.30.206.142 - - [22/Sep/2010:10:34:01 -0400] "HEAD /ivy/libraries/external/org.junit/com.springsource.org.junit/4.7.0/ivy-4.7.0.xml HTTP/1.1" 404 - "-" "Apache Ivy/2.1.0"
172.30.206.142 - - [22/Sep/2010:10:34:01 -0400] "HEAD /ivy/libraries/external/org.junit/com.springsource.org.junit/4.7.0/com.springsource.org.junit-4.7.0.jar HTTP/1.1" 404 - "-" "Apache Ivy/2.1.0"
172.30.206.142 - - [22/Sep/2010:10:34:01 -0400] "HEAD /ivy/bundles/external/org.eclipse.osgi/org.eclipse.osgi/3.6.1.R36x_v20100806/ivy-3.6.1.R36x_v20100806.xml HTTP/1.1" 404 - "-" "Apache Ivy/2.1.0"
172.30.206.142 - - [22/Sep/2010:10:34:01 -0400] "HEAD /ivy/bundles/external/org.eclipse.osgi/org.eclipse.osgi/3.6.1.R36x_v20100806/org.eclipse.osgi-3.6.1.R36x_v20100806.jar HTTP/1.1" 404 - "-" "Apache Ivy/2.1.0"
172.30.206.142 - - [22/Sep/2010:10:34:01 -0400] "HEAD /ivy/libraries/external/org.eclipse.osgi/org.eclipse.osgi/3.6.1.R36x_v20100806/ivy-3.6.1.R36x_v20100806.xml HTTP/1.1" 404 - "-" "Apache Ivy/2.1.0"
172.30.206.142 - - [22/Sep/2010:10:34:01 -0400] "HEAD /ivy/libraries/external/org.eclipse.osgi/org.eclipse.osgi/3.6.1.R36x_v20100806/org.eclipse.osgi-3.6.1.R36x_v20100806.jar HTTP/1.1" 404 - "-" "Apache Ivy/2.1.0"
172.30.206.142 - - [22/Sep/2010:10:34:01 -0400] "HEAD /ivy/bundles/external/org.eclipse.osgi/org.eclipse.osgi/3.6.1.R36x_v20100806/ivy-3.6.1.R36x_v20100806.xml HTTP/1.1" 404 - "-" "Apache Ivy/2.1.0"
172.30.206.142 - - [22/Sep/2010:10:34:01 -0400] "HEAD /ivy/bundles/external/org.eclipse.osgi/org.eclipse.osgi/3.6.1.R36x_v20100806/org.eclipse.osgi-3.6.1.R36x_v20100806.jar HTTP/1.1" 404 - "-" "Apache Ivy/2.1.0"
172.30.206.142 - - [22/Sep/2010:10:34:01 -0400] "HEAD /ivy/libraries/external/org.eclipse.osgi/org.eclipse.osgi/3.6.1.R36x_v20100806/ivy-3.6.1.R36x_v20100806.xml HTTP/1.1" 404 - "-" "Apache Ivy/2.1.0"
172.30.206.142 - - [22/Sep/2010:10:34:01 -0400] "HEAD /ivy/libraries/external/org.eclipse.osgi/org.eclipse.osgi/3.6.1.R36x_v20100806/org.eclipse.osgi-3.6.1.R36x_v20100806.jar HTTP/1.1" 404 - "-" "Apache Ivy/2.1.0"


What is the full URL to those files?
Comment 7 Denis Roy CLA 2010-09-22 10:40:54 EDT
The FAQ states this:

First, check if the ibiblio site is ok with your favorite browser. If the site is ok, maybe it's a problem of proxy configuration. Set your ANT_OPTS environment variable
to configure your proxy if you have one.
For instance:

set ANT_OPTS=-Dhttp.proxyHost=myproxy -Dhttp.proxyPort=3128

Or for authenticated proxy:

set ANT_OPTS=-Dhttp.proxyHost=myproxyhost -Dhttp.proxyPort=8080 -Dhttp.proxyUserName=myproxyusername -Dhttp.proxyPassword=myproxypassword -Dhttps.proxyHost=myproxyhost -Dhttps.proxyPort=8080

If it still doesn't work, maybe it's your dependency file which is not ok. Check
if the module name you depend on is actually a name of directory under
www.ibiblio.org/maven/. If this is the case, check if the jar with a name like
[module]-[revision].jar is present under the jars directory of this module on ibiblio.
For instance: www.ibiblio.org/maven/commons-httpclient/jars/commons-httpclient-2.0.jar

If this is the case, check your ivy configuration to see if you actually use the ibiblio
or ivyrep resolver.

Finally, you can check if the files were not downloaded but corrupted
(Ivy has no md5 checking for the moment) by checking your lib directory and opening
the jars if any with an unzip program.

If you still have problems post on the Mailing lists
mentioning your OS, your version of ant, your version of ivy, your configuration file
and your ivy file.


Is there anything in there that can be helpful?
Comment 8 Denis Roy CLA 2010-09-22 10:48:07 EDT
I noticed in your console output for virgo.test.snapshot :

[ivy:cachepath] :: loading settings :: file = /opt/users/hudsonbuild/workspace/virgo.test.snapshot/virgo-build/common/ivysettings.xml


I cannot find that file.
Comment 9 Chris Frost CLA 2010-09-22 10:49:44 EDT
This log output is what I expect, it's trying several locations and only one will have it but the full path it should be trying should start with:

http://repository.springsource.com 

So for the org.eclipse.osgi-3.6.1.R36x_v20100806.jar file the full and correct path is:

http://repository.springsource.com/ivy/bundles/external/org.eclipse.osgi/org.eclipse.osgi/3.6.1.R36x_v20100806/org.eclipse.osgi-3.6.1.R36x_v20100806.jar

Chris.
Comment 10 Denis Roy CLA 2010-09-22 10:52:08 EDT
From the console I can fetch that file without any issues:

hudson:~ # wget http://repository.springsource.com/ivy/bundles/external/org.eclipse.osgi/org.eclipse.osgi/3.6.1.R36x_v20100806/org.eclipse.osgi-3.6.1.R36x_v20100806.jar
--2010-09-22 10:49:15--  http://repository.springsource.com/ivy/bundles/external/org.eclipse.osgi/org.eclipse.osgi/3.6.1.R36x_v20100806/org.eclipse.osgi-3.6.1.R36x_v20100806.jar
Resolving proxy.eclipse.org... 206.191.52.57
Connecting to proxy.eclipse.org|206.191.52.57|:9898... connected.
Proxy request sent, awaiting response... 200 OK
Length: 1147555 (1.1M) [application/octet-stream]
Saving to: `org.eclipse.osgi-3.6.1.R36x_v20100806.jar'

100%[===================================================================================================================================================================================================>] 1,147,555    876K/s   in 1.3s    

2010-09-22 10:49:17 (876 KB/s) - `org.eclipse.osgi-3.6.1.R36x_v20100806.jar' saved [1147555/1147555]
Comment 11 Steve Powell CLA 2010-09-22 11:00:37 EDT
(In reply to comment #8)
The file virgo.test.snapshot/virgo-build/common/ivysettings.xml file is visible in the workspace of the test.snapshot build from the user interface.  It ought to be there before every run.

Where were you looking?
Comment 12 Steve Powell CLA 2010-09-22 11:01:56 EDT
(In reply to comment #10)
OK So why can't we?

Is there an ant/ivy thing I need to do to use the proxy system you have set up?  I was under the impression that this was automatic.
Comment 13 Denis Roy CLA 2010-09-22 11:02:54 EDT
In the resolve.compile section of the console output.

https://hudson.eclipse.org/hudson/job/virgo.test.snapshot/222/console


The referenced ivysettings.xml does not exist.
Comment 14 Denis Roy CLA 2010-09-22 11:06:35 EDT
> Is there an ant/ivy thing I need to do to use the proxy system you have set up?
>  I was under the impression that this was automatic.


*shrug*  I don't know.  I don't even know what Ivy is.

As you can see, I didn't have to tell wget to use a proxy, but wget did issue a correct Proxy request after it was told to do so.  Perhaps Ivy is not that smart, hence the reason I'd like to look at your ivysettings.xml file.
Comment 15 Steve Powell CLA 2010-09-22 11:07:16 EDT
(In reply to comment #13)
I don't think so.  Otherwise it would say it couldn't find it, don't you think.

I'll run the job with -v just to make sure....
Comment 16 Steve Powell CLA 2010-09-22 11:14:00 EDT
See https://hudson.eclipse.org/hudson/me/my-views/view/Virgo/job/virgo.test.snapshot/224/console

for the same job running with -v

The ivysettings are being loaded  (where ARE you looking for this file? I can see it 

(here is a link: 
https://hudson.eclipse.org/hudson/me/my-views/view/Virgo/job/virgo.test.snapshot/ws/virgo-build/common/ivysettings.xml

which is what I get from my hudson page -- of course it might not work for you)

[I notice the term 'ws' in the path -- that's not the same as workspace....???]

Lots more evidence in the log now.....
Comment 17 Denis Roy CLA 2010-09-22 11:14:33 EDT
> I don't think so.  Otherwise it would say it couldn't find it, don't you think.

Yep, it's there on slave2.  I was looking on the master.

So at this point I can safely assume the problem lies within Ivy since:

a) the proxy works fine from the console of each slave system
b) other projects can fetch dependencies without issues
c) all your builds using Ivy seem to be failing

Of course there could be other build configurations outside of Hudson that could be interfering.  Unfortunately I'm not a build engineer and all the files I've looked at are Greek to me.

While we can spend days/weeks/months troubleshooting this, are there any alternatives to Ivy that you can possibly use?  I realize the dogmatic answer to this question is 'no' but I'm just trying to help you get a green build.  I didn't write Ivy and no matter how long I stare at the configs, everything seems OK and it _should_ work, but it doesn't.
Comment 18 Steve Powell CLA 2010-09-23 04:08:20 EDT
Sorry to go quiet -- we had a network outage -- preceded by some glitches, and I thought we would abandon it until the network calmed down.

I will modify (and rename) set-hudson-proxy to do proxy checks; refresh all the workspaces (we may have a modified properties file from the previous set-hudson-proxy target) and run again.  Hopefully I can tease apart the two issues:  proxy settings and proxy usage.
Comment 19 Steve Powell CLA 2010-09-23 05:27:49 EDT
Denis -- your proxy variables are NOT SET -- and the following excerpt from the job log clearly shows why!!!

TYPO in your ANT_OPTS??

---8<------------
check-proxy:

BUILD FAILED
/opt/users/hudsonbuild/workspace/virgo.test.snapshot/virgo-build/multi-bundle/common.xml:54: Property 'http.proxyHost' set to 'proxy.eclipse.org-Dhttp.proxyPort=9898' but 'http.proxyPort' not set.
	at org.apache.tools.ant.taskdefs.Exit.execute(Exit.java:142)
	at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:288)
	at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
	at org.apache.tools.ant.Task.perform(Task.java:348)
	at org.apache.tools.ant.Target.execute(Target.java:357)
	at org.apache.tools.ant.Target.performTasks(Target.java:385)
	at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1337)
	at org.apache.tools.ant.Project.executeTarget(Project.java:1306)
	at org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41)
	at org.apache.tools.ant.Project.executeTargets(Project.java:1189)
	at org.apache.tools.ant.Main.runBuild(Main.java:758)
	at org.apache.tools.ant.Main.startAnt(Main.java:217)
	at org.apache.tools.ant.launch.Launcher.run(Launcher.java:257)
	at org.apache.tools.ant.launch.Launcher.main(Launcher.java:104)

.
.
.
Finished: FAILURE
-----------------

We will leave the check-proxy target in from now on.

When you've fixed this we will proceed to test ant resolution.
Comment 20 Steve Powell CLA 2010-09-23 05:28:51 EDT
(Re: Comment 19) This excerpt was taken from virgo.test.snapshot build number 226 (triggered).
Comment 21 Denis Roy CLA 2010-09-23 09:48:05 EDT
(In reply to comment #19)
> Denis -- your proxy variables are NOT SET -- and the following excerpt from the
> job log clearly shows why!!!
> 
> TYPO in your ANT_OPTS??
[snip]
> When you've fixed this we will proceed to test ant resolution.


Well, that's interesting.  If you look at the screenshot attached to this bug, you can see the space between each parameter.  I even copied the ANT_OPTS string and pasted it into a monospaced font text editor, and there is a space there.

Just for kicks, I've set the string to this:

-Dhttp.proxyHost="bambam" -Dhttp.proxyPort=9898 -Dhttps.proxyHost=proxy.eclipse.org -Dhttps.proxyPort=9898 -Dhttp.noProxyHosts="*.eclipse.org" -Dhttps.noProxyHosts="*.eclipse.org"

Can you see if it's picking up the change?
Comment 22 Denis Roy CLA 2010-09-23 09:48:47 EDT
The above change will not actually produce a functional build -- I just want to see if your debug sees "bambam" and correctly identifies the parameters.
Comment 23 Steve Powell CLA 2010-09-23 10:11:59 EDT
Tried this (with a clean workspace) and job failed (see Bug 326060).  Ho Hum... tried again....
Comment 24 Steve Powell CLA 2010-09-23 10:18:10 EDT
Virgo.test.snapshot build fails with bad proxy settings (#229).

Did not see bambam at all....

BUILD FAILED
/opt/users/hudsonbuild/workspace/virgo.test.snapshot/virgo-build/multi-bundle/common.xml:54: Property 'http.proxyHost' set to 'proxy.eclipse.org-Dhttp.proxyPort=9898' but 'http.proxyPort' not set.

Workspace looks fine after the mess cleaning it caused.....

but otherwise I don't see the correct properties being set.

Incidentally, build 138 of virgo.web.snapshot (which I have also disabled) ran before we 'spoke' today and picked up the correct proxy settings (!) but still went into a black hole trying to resolve things with Ivy.

This is not looking good.
Comment 25 Denis Roy CLA 2010-09-23 10:31:47 EDT
Created attachment 179463 [details]
Screenshot

So I've set all the options to different values, as shown in the attached screenshot.  Can you please try again?  If your build is not picking up one of these, then we need to track down where exactly your build are getting the settings from.
Comment 26 Steve Powell CLA 2010-09-23 10:45:02 EDT
OK -- don't touch anything; I'll re-run test just once...
Comment 27 Steve Powell CLA 2010-09-23 10:47:39 EDT
Build virgo.test.snapshot/230 exhibits the bad proxy behaviour and the values are NOT as you have just 'set' them.  

--8<----
check-proxy:

BUILD FAILED
/opt/users/hudsonbuild/workspace/virgo.test.snapshot/virgo-build/multi-bundle/common.xml:54: Property 'http.proxyHost' set to 'proxy.eclipse.org-Dhttp.proxyPort=9898' but 'http.proxyPort' not set.
--------

over to you.
Comment 28 Steve Powell CLA 2010-09-23 10:49:24 EDT
(Let's get rid of the proxies altogether... I'm certain there are firewall solutions that do not require a proxy host setup for apps running behind it.)
Comment 29 Denis Roy CLA 2010-09-23 11:01:32 EDT
I found it!

So apparently 'global' doesn't mean 'all of hudson'.  hudson-slave2's configuration page showed another set of ANT_OPTS and there was indeed a space missing in the options.

Please try now.
Comment 30 Steve Powell CLA 2010-09-23 11:03:38 EDT
OK (One more go...)
Comment 31 Denis Roy CLA 2010-09-23 12:59:29 EDT
on slave2, I noticed that the -Dhttp.nonProxyHosts did not have the double-quotes, whereas the slave1 and master do have quotes.

-Dhttp.nonProxyHosts="*.eclipse.org"

I have set the double quotes for slave2.
Comment 32 Steve Powell CLA 2010-09-24 07:11:51 EDT
I've noticed on the net that there are issues that were around Ant 1.7.1  around a couple years back, and we are running Ant 1.7.1 (2008) version on hudson.  Could I prevail upon you to upgrade (or provide a later version of) Ant -- preferably 1.8.1 (2010) which we are using?
Comment 33 Denis Roy CLA 2010-09-24 11:10:57 EDT
I've added ant 1.8.1 to the list of available ants.  Let me know how that works out.
Comment 34 Steve Powell CLA 2010-09-24 11:17:36 EDT
OK Thanks -- just tried running the job on slave-1 -- and the eclipse timeout occurs.  Have you made progress on the build.eclipse.org  access problem?
Comment 35 Steve Powell CLA 2010-09-24 11:25:20 EDT
Well, that appears to make no difference.

I tried with and without -autoproxy virgo.test.snapshot on slave2.

There is an interesting line I've just spotted on the ivy.init target:

ivy.init:
parsing buildfile jar:file:/opt/users/hudsonbuild/workspace/virgo.test.snapshot/virgo-build/lib/ivy/ivy.jar!/org/apache/ivy/ant/antlib.xml with URI = jar:file:/opt/users/hudsonbuild/workspace/virgo.test.snapshot/virgo-build/lib/ivy/ivy.jar!/org/apache/ivy/ant/antlib.xml from a zip file
[antlib:org.apache.ivy.ant] Could not load definitions from resource org/apache/ivy/ant/antlib.xml. It could not be found.

so I'll chase that up to see if it is a problem.
Comment 36 Steve Powell CLA 2010-09-24 12:04:21 EDT
Found this in hudson job log:


+GConf Error: Failed to contact configuration server; some possible causes are that you need to enable TCP/IP networking for ORBit, or you have stale NFS locks due to a system crash. See http://www.gnome.org/projects/gconf/ for information. (Details -  1: Not running within active session)

See if it makes any sense to you...
Comment 37 Denis Roy CLA 2010-09-24 14:23:10 EDT
"GConf is a system used by the GNOME desktop environment for storing configuration settings for the desktop and applications."

We don't use GNOME and/or GConf, so I'm not quite sure why it's in your logs.
Comment 38 Steve Powell CLA 2010-09-27 06:07:41 EDT
(In reply to comment #35)
The ivy.init line message is benign -- happens here, too. Not a problem.

The GConf message indicates that there might be an NFS problem or TCP/IP networking issue; though the message itself appears to be irrelevant.

I've run out of things to check up on.

Running the build locally and comparing it with the hudson build (exactly the same targets, but one on mac and one on Linux, and one proxied and one not proxied) The only significant discrepancy is the GConf line (which occurs immediately before the failing IVY dependency resolutions).

Any suggestions?
Comment 39 Denis Roy CLA 2010-09-27 09:57:23 EDT
I'm not sure what gconf is trying to connect to, or why.  I have not set anything up specifically for it.
Comment 40 Steve Powell CLA 2010-09-27 10:48:02 EDT
OK I've set up a ubuntu VM image her and note that there are 'network proxy' System/Preferences.

If I set the manual proxy config (same for all protocols) and press the 'Apply System-wide' button I get a password/authority prompt TWICE -- one the system one and one for gconf (?!?).  Clearly gconf is doing something in the background by default. 

(The build fails on this system -- with and without http,proxy... property settings -- because I don't have a valid proxy to point it to -- if I choose proxy.eclipse.org (which doesn't know me, of course) then it times out: 

INFO: I/O exception (java.net.SocketException) caught when processing request: Can't connect to SOCKS proxy:Connection refused

and takes a long time in a retry loop; rather like one of our inaccessible errors.)

It might be worth checking the settings on the linux slaves.  Have they been properly 'propagated'?

This sort of setup looks like ant/ivy shouldn't have to do anything at all to use the proxy.
Comment 41 Steve Powell CLA 2010-09-28 07:04:01 EDT
OK I have reset my local Linux Ubuntu image to proxy through 'proxy.vmware.com' port '3128'.

Set up Network Proxy under system/preferences and set these as 'system wide' (this may mean /etc/lynx.cfg).

Wget responds thus:

--8<----
11:52:05 ~ : wget http://repository.springsource.com/ivy/bundles/external/org.junit/com.springsource.org.junit/4.7.0/ivy-4.7.0.xml
--2010-09-28 11:52:15--  http://repository.springsource.com/ivy/bundles/external/org.junit/com.springsource.org.junit/4.7.0/ivy-4.7.0.xml
Resolving proxy.vmware.com... 10.16.67.187
Connecting to proxy.vmware.com|10.16.67.187|:3128... connected.
Proxy request sent, awaiting response... 200 OK
Length: 1041 (1.0K) [application/octet-stream]
Saving to: `ivy-4.7.0.xml'

100%[======================================>] 1,041       --.-K/s   in 0s      

2010-09-28 11:52:16 (127 MB/s) - `ivy-4.7.0.xml' saved [1041/1041]
--------

although I cannot ping repository.springsource.com.

The build command:

ant -file build.xml -Dhttp.proxyHost=proxy.vmware.com -Dhttp.proxyPort=3128 -Declipse.buildId=hudsonbuild -Dbuild.stamp=CI-${BUILD_ID} -Dci.build=true check-proxy clean-ivy clean clean-integration test javadoc-api javadoc-all package -v >../antlog.txt 2>1

(which ought to be what we get on slave-2)

runs successfully!  The resolve messages don't mention the proxy, but without the http.proxy... properties it fails.

--8<----
[ivy:cachepath] external-lookup: Checking cache for: dependency: org.junit#com.springsource.org.junit;4.7.0 {*=[*]}
[ivy:cachepath] 		tried /tmp/local-repository/org.junit/com.springsource.org.junit/4.7.0/ivy-4.7.0.xml
[ivy:cachepath] 	local-external-repository: no ivy file found for org.junit#com.springsource.org.junit;4.7.0
[ivy:cachepath] 		tried http://repository.springsource.com/ivy/bundles/external/org.junit/com.springsource.org.junit/4.7.0/ivy-4.7.0.xml
[ivy:cachepath] using commons httpclient 3.x helper
[ivy:cachepath] 	bundle-external-repository-s3.read.resolver: found md file for org.junit#com.springsource.org.junit;4.7.0
[ivy:cachepath] 		=> http://repository.springsource.com/ivy/bundles/external/org.junit/com.springsource.org.junit/4.7.0/ivy-4.7.0.xml (4.7.0)
[ivy:cachepath] downloading http://repository.springsource.com/ivy/bundles/external/org.junit/com.springsource.org.junit/4.7.0/ivy-4.7.0.xml ...
[ivy:cachepath] 	bundle-external-repository-s3.read.resolver: downloading http://repository.springsource.com/ivy/bundles/external/org.junit/com.springsource.org.junit/4.7.0/ivy-4.7.0.xml
[ivy:cachepath] 	bundle-external-repository-s3.read.resolver: downloading http://repository.springsource.com/ivy/bundles/external/org.junit/com.springsource.org.junit/4.7.0/ivy-4.7.0.xml.sha1
[ivy:cachepath] sha1 OK for http://repository.springsource.com/ivy/bundles/external/org.junit/com.springsource.org.junit/4.7.0/ivy-4.7.0.xml
[ivy:cachepath] 	[SUCCESSFUL ] org.junit#com.springsource.org.junit;4.7.0!ivy.xml(ivy.original) (470ms)
--------


So, I think your linux slaves are not set up for proxies correctly.  Please ensure that your system-wide proxy settings are set (I'm afraid I don't know how you are supposed do this, nor do I know what distro of linux you are running with).
Comment 42 Steve Powell CLA 2010-09-28 07:41:10 EDT
Denis,
If you want any info about or files from my linux vm set-up, please just ask -- I have frozen the VM with the proxy settings so I can refer to it later.
Steve
Comment 43 Denis Roy CLA 2010-09-28 08:26:37 EDT
> So, I think your linux slaves are not set up for proxies correctly.

I agree.

I adapted your command and ran it on hudson-slave2:
hudsonbuild@hudson-slave2:~ant -file workspace/virgo.test.snapshot/build-test-framework/build.xml -Dhttp.proxyHost=proxy.eclipse.org -Dhttp.proxyPort=9080 -Declipse.buildId=hudsonbuild -Dbuild.stamp=CI-${BUILD_ID} -Dci.build=true check-proxy clean-ivy clean clean-integration test javadoc-api javadoc-all package -v >antlog.txt 2>1


It seems to have succeeded.  I will attach the antlog.txt file.
Comment 44 Denis Roy CLA 2010-09-28 08:27:56 EDT
Created attachment 179733 [details]
Ant log file showing successful proxy access
Comment 45 Denis Roy CLA 2010-09-28 09:04:19 EDT
I re-ran the command, omitting these two parameters:

 -Dhttp.proxyHost=proxy.eclipse.org -Dhttp.proxyPort=9898

It ran successfully, fetching all the remote dependencies.

Using the 'export' command, I can see the set environment variables:

hudsonbuild@hudson-slave2:~> export
declare -x ANT_ARGS="-Dhttp.proxyHost=proxy.eclipse.org -Dhttp.proxyPort=9898 -Dhttps.proxyHost=proxy.eclipse.org -Dhttps.proxyPort=9898 -Dhttp.nonProxyHosts=*.eclipse.org -Dhttps.nonProxyHosts=*.eclipse.org"
declare -x ANT_OPTS="-Dhttp.proxyHost=proxy.eclipse.org -Dhttp.proxyPort=9898 -Dhttps.proxyHost=proxy.eclipse.org -Dhttps.proxyPort=9898 -Dhttp.nonProxyHosts=*.eclipse.org -Dhttps.nonProxyHosts=*.eclipse.org"

[snip]
declare -x JAVA_ARGS="-Dhttp.proxyHost=proxy.eclipse.org -Dhttp.proxyPort=9898 -Dhttps.proxyHost=proxy.eclipse.org -Dhttps.proxyPort=9898 -Dhttp.nonProxyHosts=*.eclipse.org -Dhttps.nonProxyHosts=*.eclipse.org"
declare -x JVM_OPTS="-Dhttp.proxyHost=proxy.eclipse.org -Dhttp.proxyPort=9898 -Dhttps.proxyHost=proxy.eclipse.org -Dhttps.proxyPort=9898 -Dhttp.nonProxyHosts=*.eclipse.org -Dhttps.nonProxyHosts=*.eclipse.org"

[snip]
declare -x http_proxy="http://proxy.eclipse.org:9898"
declare -x https_proxy="http://proxy.eclipse.org:9898"
declare -x no_proxy="localhost, 127.0.0.1, 172.30.206.0, eclipse.org"

I've also checked the hudson-slave2 Hudson configuration for the node, and all the above variables are set in the node's config.

When I start the Hudson slave node, this is what I see in the log:

[09/28/10 08:52:18] [SSH] Opening SSH connection to 172.30.206.142:22.
[09/28/10 08:52:18] [SSH] Authenticating as hudsonbuild with /opt/users/hudsonbuild/.ssh/id_rsa.
[09/28/10 08:52:18] [SSH] Authentication successful.
[09/28/10 08:52:18] [SSH] The remote users environment is:
ANT_ARGS='-Dhttp.proxyHost=proxy.eclipse.org -Dhttp.proxyPort=9898 -Dhttps.proxyHost=proxy.eclipse.org -Dhttps.proxyPort=9898 -Dhttp.nonProxyHosts=*.eclipse.org -Dhttps.nonProxyHosts=*.eclipse.org'
ANT_OPTS='-Dhttp.proxyHost=proxy.eclipse.org -Dhttp.proxyPort=9898 -Dhttps.proxyHost=proxy.eclipse.org -Dhttps.proxyPort=9898 -Dhttp.nonProxyHosts=*.eclipse.org -Dhttps.nonProxyHosts=*.eclipse.org'
BASH=/bin/bash

[snip]
HOME=/opt/users/hudsonbuild
HOST=hudson-slave2
HOSTNAME=hudson-slave2
HOSTTYPE=x86_64

[snip]
JAVA_ARGS='-Dhttp.proxyHost=proxy.eclipse.org -Dhttp.proxyPort=9898 -Dhttps.proxyHost=proxy.eclipse.org -Dhttps.proxyPort=9898 -Dhttp.nonProxyHosts=*.eclipse.org -Dhttps.nonProxyHosts=*.eclipse.org'
JVM_OPTS='-Dhttp.proxyHost=proxy.eclipse.org -Dhttp.proxyPort=9898 -Dhttps.proxyHost=proxy.eclipse.org -Dhttps.proxyPort=9898 -Dhttp.nonProxyHosts=*.eclipse.org -Dhttps.nonProxyHosts=*.eclipse.org'

[snip]
http_proxy=http://proxy.eclipse.org:9898
https_proxy=http://proxy.eclipse.org:9898
no_proxy='localhost, 127.0.0.1, 172.30.206.0, eclipse.org'


I'll look into the GConf error more closely.  It seems like it's the only thing that is different.
Comment 46 Steve Powell CLA 2010-09-28 09:09:04 EDT
Denis you ran with http.proxyPort=9080, but the setting you have always told us is http.proxyPort 9898

Can you explain the difference, please?

I see in your antlog that 9080 was used and seems to have worked.
Comment 47 Steve Powell CLA 2010-09-28 09:09:45 EDT
Also, can you ensure that the ant command is run under the correct user from the slave (compared to your hand tests)?
Comment 48 Denis Roy CLA 2010-09-28 10:52:52 EDT
(In reply to comment #46)
> Denis you ran with http.proxyPort=9080, but the setting you have always told us
> is http.proxyPort 9898
> 
> Can you explain the difference, please?
> 
> I see in your antlog that 9080 was used and seems to have worked.

9080 was a typo, but it seems the parameter is ignored when the environment specifies proxy settings since it worked.  When I removed the parameters it worked anyway.
Comment 49 Denis Roy CLA 2010-09-28 10:54:24 EDT
(In reply to comment #47)
> Also, can you ensure that the ant command is run under the correct user from
> the slave (compared to your hand tests)?

As per comment 43 and comment 44 I run all this as user 'hudsonbuild', which is the same user id the Hudson slave process is running under.
Comment 50 Steve Powell CLA 2010-09-28 11:04:51 EDT
(In reply to comment #49)
This is precisely why I asked.  Though I can tell that you are running your tests under hudsonbuild, I have no evidence that the ant jobs kicked off on slave-2 are running as user hudsonbuild.  How do we know? There is nothing I can see in the Console Log.
Comment 51 Denis Roy CLA 2010-09-28 11:13:20 EDT
The slave is launched via ssh as user hudsonbuild.  ps aux | grep java reveals that all the java processes are being run as hudsonbuild.  You could print the value of System.getenv("USER") in your job -- it should output hudsonbuild.
Comment 52 Denis Roy CLA 2010-09-28 11:27:14 EDT
I am unable to find anything conclusive with GConf.  Oddly, your job #247 does not mention the GConf error, but it still fails to fetch the dependencies.

There is either a problem with your job(s), or something intricate with Hudson/Ivy.  The proxy works, the ant job works from the command line.

At this time I've exhausted my troubleshooting options.  From here, may I suggest fetching dependencies another way?  Other projects seem to be building just fine while fetching external dependencies.

I'm sorry -- but there's not much more I can do here.
Comment 53 Glyn Normington CLA 2010-09-28 11:53:14 EDT
(In reply to comment #52)
> At this time I've exhausted my troubleshooting options.  From here, may I
> suggest fetching dependencies another way?  Other projects seem to be building
> just fine while fetching external dependencies.

I'm afraid that option isn't really viable. We are about to release Virgo 2.1.0 next month and this is not a good time to rearchitect the build. I feel we are making steady progress towards identifying the problem, so let's not give up hope now.

Remember all was well before we had a proxy...
Comment 54 Steve Powell CLA 2010-09-28 12:26:47 EDT
Cannot find gconftool (or gconftool-2) commands; nor etc/lynx.cfg files (another suggestion).

On my linux gconftool -R /system
lists a lot of system properties, including the proxy settings.  maybe you should install gconf.

What distro of linux are you running?

How do you test what the linux firewall/proxy settings are for your distro.  How are you setting them?
Comment 55 Denis Roy CLA 2010-09-29 13:53:10 EDT
(In reply to comment #54)
> On my linux gconftool -R /system
> lists a lot of system properties, including the proxy settings.  maybe you
> should install gconf.

gconf is installed.  I am curious to know what part of your build is trying to access GNOME configuration?

> What distro of linux are you running?

SuSE Linux Entreprise 11

I've configured GConf for the hudsonbuild user on hudson-slave2.  The set values seem to persist in local files.

hudsonbuild@hudson-slave2:~> gconftool-2 -R /system/http_proxy
 port = 8080
 authentication_password = 
 ignore_hosts = [localhost,127.0.0.0/8]
 authentication_user = 
 host = 
 use_authentication = false
 use_http_proxy = false

hudsonbuild@hudson-slave2:~> gconftool-2 --type bool --set /system/http_proxy/use_http_proxy true
hudsonbuild@hudson-slave2:~> gconftool-2 --type string --set /system/http_proxy/host proxy.eclipse.org
hudsonbuild@hudson-slave2:~> gconftool-2 --type int --set /system/http_proxy/port 9898
hudsonbuild@hudson-slave2:~> gconftool-2 --type string --set /system/http_proxy/host proxy.eclipse.org

hudsonbuild@hudson-slave2:~> gconftool-2 -R /system/http_proxy
 port = 9898
 authentication_password = 
 ignore_hosts = [localhost,127.0.0.0/8]
 authentication_user = 
 host = proxy.eclipse.org
 use_authentication = false
 use_http_proxy = true

hudsonbuild@hudson-slave2:~> ls -l .gconf/system/http_proxy/
total 4
-rw------- 1 hudsonbuild callisto-dev 365 2010-09-29 13:49 %gconf.xml

hudsonbuild@hudson-slave2:~> cat .gconf/system/http_proxy/%gconf.xml 
<?xml version="1.0"?>
<gconf>
        <entry name="port" mtime="1285782570" type="int" value="9898">
        </entry>
        <entry name="host" mtime="1285782572" type="string">
                <stringvalue>proxy.eclipse.org</stringvalue>
        </entry>
        <entry name="use_http_proxy" mtime="1285782570" type="bool" value="true">
        </entry>
</gconf>


I'm not sure if I need to restart the slave process for this new environment to take effect.  Do you feel like giving this a try?
Comment 56 Steve Powell CLA 2010-10-01 04:36:31 EDT
With the gconftool additions (thanks) I see a timeout response rather than an immediate failure:

--8<----
[ivy:cachepath] external-lookup: Checking cache for: dependency: org.junit#com.springsource.org.junit;4.7.0 {*=[*]}
[ivy:cachepath] 		tried /tmp/local-repository/org.junit/com.springsource.org.junit/4.7.0/ivy-4.7.0.xml
[ivy:cachepath] 	local-external-repository: no ivy file found for org.junit#com.springsource.org.junit;4.7.0
[ivy:cachepath] 		tried http://repository.springsource.com/ivy/bundles/external/org.junit/com.springsource.org.junit/4.7.0/ivy-4.7.0.xml
[ivy:cachepath] Oct 1, 2010 4:31:35 AM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
[ivy:cachepath] INFO: I/O exception (java.net.ConnectException) caught when processing request: Connection timed out
[ivy:cachepath] Oct 1, 2010 4:31:35 AM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
[ivy:cachepath] INFO: Retrying request
--------

I'm going to issue gconftool /R system to see if these are set like my local Ubuntu settings are when executing the ant job.

Can you look to see if we are still sending bad HEAD requests to the proxy?
Comment 57 Steve Powell CLA 2010-10-01 04:39:43 EDT
(In reply to comment #56)
typo: gconftool -R /system
Comment 58 Steve Powell CLA 2010-10-01 05:42:06 EDT
Results:

I find no significant difference between my Ubuntu settings (using vmware proxy) and the hudson.ecliupse.org slave-2 settings (as revealed by gconftool).

I see the following settings on hudson (sorted for comparison):

     [exec]  authentication_password = 
     [exec]  authentication_user = 
     [exec]  autoconfig_url = 
     [exec]  ftp_host = 
     [exec]  ftp_port = 0
     [exec]  host = proxy.eclipse.org
     [exec]  ignore_hosts = [localhost,127.0.0.0/8]
     [exec]  mode = none
     [exec]  port = 9898
     [exec]  secure_host = 
     [exec]  secure_port = 0
     [exec]  socks_host = 
     [exec]  socks_port = 0
     [exec]  use_authentication = false
     [exec]  use_http_proxy = true
     [exec]  use_system_settings = only_if_mode_not_set

and am running with the following settings locally:

     [exec]  authentication_password = 
     [exec]  authentication_user = 
     [exec]  autoconfig_url = 
     [exec]  ftp_host = proxy.eclipse.org
     [exec]  ftp_port = 9898
     [exec]  host = proxy.eclipse.org
     [exec]  ignore_hosts = [localhost,127.0.0.0/8,*.local]
     [exec]  mode = manual
     [exec]  old_ftp_host = ftp-proxy.vmware.com
     [exec]  old_ftp_port = 3128
     [exec]  old_secure_host = proxy.vmware.com
     [exec]  old_secure_port = 2121
     [exec]  old_socks_host = proxy.vmware.com
     [exec]  old_socks_port = 3128
     [exec]  port = 9898
     [exec]  secure_host = proxy.eclipse.org
     [exec]  secure_port = 9898
     [exec]  socks_host = proxy.eclipse.org
     [exec]  socks_port = 9898
     [exec]  use_authentication = false
     [exec]  use_http_proxy = true
     [exec]  use_same_proxy = true

successfully.

I tried switching off the socks host etc but I still build successfully here; this is not showing a discrepancy.

The ball is still in your court Denis. If you can build manually on Slave_2 but the hudson server cannot kick off a build OF EXACTLY THE SAME FORM successfully, it must be the environment of the slave job that is wrong somehow -- but I don't know how.

The symptoms have changed since the gconf installation -- the builds now timeout waiting for a resolution response (and retry making the job last hours, potentially) and this implies that the proxy is not passing on some of the requests correctly.  We need to see if they are well-formed, and if they are the problem might be in the proxy server, perhaps 'authentication' of the requestor is what is happening (different from your manual test).

This problem stays at critical.
Comment 59 Steve Powell CLA 2010-10-01 05:43:38 EDT
(In reply to comment #58)
NOTE: the local settings listed are not successful, they are set to use the eclipse proxy (to aid my comparison) -- however, they work correctly with the vmware proxy replacing the eclipse one.
Comment 60 Denis Roy CLA 2010-10-13 11:55:29 EDT
Working with Steve and Glyn, I have come to conclude that our Hudson instance, including the master and both slaves, as well as our Proxy environment, are all working as expected. I was able to create a job which uses the Virgo Git repo and build.xml file to fetch external dependencies successfully using Ivy.

https://hudson.eclipse.org/hudson/job/z_denis_test2/

I'll move/reassign this to the Virgo team since it is not a Hudson or webmaster/infra issue.
Comment 61 Steve Powell CLA 2010-10-13 19:25:32 EDT
Denis:

You took it almost to the point of diagnosing exactly what was wrong; and then didn't do the comparison.

The issue is simply that we specify java options (permgen space is too small by default for many of our builds).  This, in the end, was the only difference between our job and your test one that made a difference.  It took me an hour of painstaking examination to find this.

The Virgo builds are now going through except where we need greater permgen space (e.g. kernel build).  We need to make some more experiments (half an hour each, perhaps?) to see what value we can set on these options and whether this allows our jobs to run.

I'll be closing this bug -- but there are still issues:

1) We need more permgen space and trial and error to discover what we can specify is too slow and error-prone.

2) Deleting  workspace makes the next job fail because of a badly constructed workspace (from git).

I'll rationalise these problems in other bugzillas when I can spare the time.
Comment 62 Steve Powell CLA 2010-10-14 10:40:22 EDT
*** Bug 325820 has been marked as a duplicate of this bug. ***
Comment 63 Steve Powell CLA 2010-10-18 11:04:28 EDT
Closing, as promised.