Bug 322636

Summary: All Virgo builds are failing
Product: Community
Reporter: Steve Powell <zteve.powell>
Component: CI-Jenkins
Assignee: CI Admin Inbox <ci.admin-inbox>
Status: CLOSED FIXED
QA Contact:
Severity: major
Priority: P3
CC: d_a_carver, eclipse, webmaster
Version: unspecified
Target Milestone: ---
Hardware: PC
OS: Mac OS X - Carbon (unsup.)
Whiteboard:

Description Steve Powell CLA 2010-08-13 06:03:04 EDT
-----------------------
builds on master
-----------------------
virgo.util.snapshot runs on master, and fails trying to download dependencies...

[ivy:cachepath] resolving dependencies for configuration 'runtime'
[ivy:cachepath] == resolving dependencies for org.springframework.build#org.springframework.build.ant-caller;working [runtime]
[ivy:cachepath] == resolving dependencies org.springframework.build#org.springframework.build.ant-caller;working->org.springframework.build#org.springframework.build.ant;1.1.0.RELEASE [runtime->runtime]
[ivy:cachepath] spring-portfolio-lookup: Checking cache for: dependency: org.springframework.build#org.springframework.build.ant;1.1.0.RELEASE {runtime=[runtime]}
[ivy:cachepath] 		tried /opt/public/jobs/virgo.util.snapshot/workspace/org.eclipse.virgo.util.common/../integration-repo/org.springframework.build/org.springframework.build.ant/1.1.0.RELEASE/ivy-1.1.0.RELEASE.xml
[ivy:cachepath] 	integration: no ivy file found for org.springframework.build#org.springframework.build.ant;1.1.0.RELEASE
[ivy:cachepath] 		tried /tmp/local-repository/org.springframework.build/org.springframework.build.ant/1.1.0.RELEASE/ivy-1.1.0.RELEASE.xml
[ivy:cachepath] 	local: no ivy file found for org.springframework.build#org.springframework.build.ant;1.1.0.RELEASE
[ivy:cachepath] 		tried s3://repository.springsource.com/ivy/bundles/release/org.springframework.build/org.springframework.build.ant/1.1.0.RELEASE/ivy-1.1.0.RELEASE.xml
[ivy:cachepath] Aug 13, 2010 5:46:46 AM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
[ivy:cachepath] INFO: I/O exception (org.apache.commons.httpclient.ConnectTimeoutException) caught when processing request: The host did not accept the connection within timeout of 60000 ms

This appears to be the first dependency resolution Ivy does in the build -- is there an access issue?
-------------------------------
other master builds appear to have the same issue.
-------------------------------
builds on Build2
-------------------------------
virgo.test.snapshot fails earlier with git issues:

Started by user spowell
Building remotely on hudson-slave2
Checkout:virgo.test.snapshot / /opt/users/hudsonbuild/workspace/virgo.test.snapshot - hudson.remoting.Channel@77ae9de4:hudson-slave2
Using strategy: Default
Last Built Revision: Revision e9f4e4b42536389ae3b648b6e0451d8795750970 (test/master)
Checkout:virgo.test.snapshot / /opt/users/hudsonbuild/workspace/virgo.test.snapshot - hudson.remoting.LocalChannel@332437f7
GitAPI created
Cloning the remote Git repository
Cloning repository test
$ git clone -o test git://git.eclipse.org/gitroot/virgo/org.eclipse.virgo.test.git /opt/users/hudsonbuild/workspace/virgo.test.snapshot
ERROR: Error cloning remote repo 'test' : Could not clone git://git.eclipse.org/gitroot/virgo/org.eclipse.virgo.test.git
ERROR: Cause: Error performing git clone -o test git://git.eclipse.org/gitroot/virgo/org.eclipse.virgo.test.git /opt/users/hudsonbuild/workspace/virgo.test.snapshot
Trying next repository
ERROR: Could not clone from a repository
FATAL: Could not clone
hudson.plugins.git.GitException: Could not clone
	at hudson.plugins.git.GitSCM$2.invoke(GitSCM.java:587)
	at hudson.plugins.git.GitSCM$2.invoke(GitSCM.java:535)
	at hudson.FilePath$FileCallableWrapper.call(FilePath.java:1899)
	at hudson.remoting.UserRequest.perform(UserRequest.java:114)
	at hudson.remoting.UserRequest.perform(UserRequest.java:48)
	at hudson.remoting.Request$2.run(Request.java:270)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:619)

-----------------------------------
This seems the same for all build2 builds.
Comment 1 Eclipse Webmaster CLA 2010-08-13 13:43:17 EDT
Hmmm, I think that's the result of Ivy not getting the proxy info when it started (due to a typo in the Global var name). I've fixed that, but it seems to be 'stuck' in the compile.init phase. Can you tell me more about what Ivy is trying to reach at that time?

And the Git problem should be fixed.

-M.
Comment 2 Steve Powell CLA 2010-08-16 07:40:11 EDT
Thanks for the Git fix...

build: https://hudson.eclipse.org/hudson/job/virgo.util.snapshot/82/ shows the compile.init failures with option -v on the ant command (I included a section of that in my original bug description).

This is all the information I can get from this distance.  It is definitely trying to download a dependency -- maybe it isn't using the proxy properly?  I guess this is happening on both the master and build2 now?

Steve Powell
Comment 3 Eclipse Webmaster CLA 2010-08-16 14:48:17 EDT
It sure looks like the Ivy S3 resolver is ignoring the proxy. I went digging with lsof while the build was running and found this:

java    8636 hudsonbuild   40r  IPv6 142777      0t0        TCP hudson.eclipse.org:50736->72.21.203.146:https (SYN_SENT)

In looking at the ps listing I can see that the http.proxy options are set as I expect.
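
A minimal sketch of how the lsof observation above could be checked programmatically: it extracts the remote endpoint and TCP state from an lsof row and reports whether the connection targets something other than the proxy. The function names are hypothetical, the regular expression assumes the single-row layout quoted above, and the proxy address 206.191.52.34 is the one quoted in a later comment on this bug.

```python
import re

# Matches the NAME column of an lsof TCP row: "local->remote:port (STATE)".
_TCP_NAME = re.compile(r"TCP\s+(\S+)->(\S+):(\S+)\s+\((\w+)\)")

# The row quoted in this comment.
SAMPLE = ("java    8636 hudsonbuild   40r  IPv6 142777      0t0        "
          "TCP hudson.eclipse.org:50736->72.21.203.146:https (SYN_SENT)")


def remote_endpoint(lsof_line):
    """Return (remote_host, remote_port, state) for an lsof TCP row, or None."""
    match = _TCP_NAME.search(lsof_line)
    if match is None:
        return None
    _local, host, port, state = match.groups()
    return host, port, state


def bypasses_proxy(lsof_line, proxy_host):
    """True if the row shows a connection to a host other than the proxy.

    Assumption: lsof prints the proxy in the same form as proxy_host
    (for example, both as numeric addresses).
    """
    endpoint = remote_endpoint(lsof_line)
    return endpoint is not None and endpoint[0] != proxy_host


if __name__ == "__main__":
    print(remote_endpoint(SAMPLE))
    print(bypasses_proxy(SAMPLE, "206.191.52.34"))
```

A connection stuck in SYN_SENT to an address that is not the proxy is consistent with the client attempting a direct outbound connection that the network does not allow.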

Is it possible to use the http/https resolver instead of S3?


-M.
Comment 4 Steve Powell CLA 2010-08-17 04:20:27 EDT
I don't think we can modify the resolution mechanisms -- these are pretty crucial in our set-up. However, I'll copy Chris (our build-meister) on this bug.
Comment 5 Steve Powell CLA 2010-08-17 04:38:56 EDT
The most recent failures are Git-related...


-----------------------------------------
Started by user mward
Building on master
Checkout:workspace / <https://hudson.eclipse.org/hudson/job/virgo.util.snapshot/ws/> - hudson.remoting.LocalChannel@6493a8fa
Using strategy: Default
Last Built Revision: Revision f4a3300f3c8d755282559d5c5df002d0f907ef67 (util/master)
Checkout:workspace / <https://hudson.eclipse.org/hudson/job/virgo.util.snapshot/ws/> - hudson.remoting.LocalChannel@6493a8fa
GitAPI created
Cloning the remote Git repository
Cloning repository util
$ git clone -o util git://git.eclipse.org/gitroot/virgo/org.eclipse.virgo.util.git <https://hudson.eclipse.org/hudson/job/virgo.util.snapshot/ws/>
Fetching upstream changes from git://git.eclipse.org/gitroot/virgo/org.eclipse.virgo.util.git
[workspace] $ git fetch -t git://git.eclipse.org/gitroot/virgo/org.eclipse.virgo.util.git +refs/heads/*:refs/remotes/util/*
[workspace] $ git ls-tree HEAD
GitAPI created
Fetching upstream changes from git://git.eclipse.org/gitroot/virgo/org.eclipse.virgo.virgo-build.git
[virgo-build] $ git fetch -t git://git.eclipse.org/gitroot/virgo/org.eclipse.virgo.virgo-build.git +refs/heads/*:refs/remotes/util/*
warning: no common commits
From git://git.eclipse.org/gitroot/virgo/org.eclipse.virgo.virgo-build
+ f4a3300...37d00ac master     -> util/master  (forced update)
* [new tag]         1.0        -> 1.0
* [new tag]         1.1        -> 1.1
* [new tag]         1.10       -> 1.10
* [new tag]         1.11       -> 1.11
* [new tag]         1.12       -> 1.12
* [new tag]         1.13       -> 1.13
* [new tag]         1.14       -> 1.14
* [new tag]         1.15       -> 1.15
* [new tag]         1.16       -> 1.16
* [new tag]         1.17       -> 1.17
* [new tag]         1.18       -> 1.18
* [new tag]         1.19       -> 1.19
* [new tag]         1.2        -> 1.2
* [new tag]         1.20       -> 1.20
* [new tag]         1.21       -> 1.21
* [new tag]         1.22       -> 1.22
* [new tag]         1.23       -> 1.23
* [new tag]         1.24       -> 1.24
* [new tag]         1.3        -> 1.3
* [new tag]         1.4        -> 1.4
* [new tag]         1.5        -> 1.5
* [new tag]         1.6        -> 1.6
* [new tag]         1.7        -> 1.7
* [new tag]         1.8        -> 1.8
* [new tag]         1.9        -> 1.9
[workspace] $ git submodule init
[workspace] $ git submodule update
[workspace] $ git tag -l master
[workspace] $ git rev-parse util/master
Commencing build of Revision 37d00acfba71bc5d90a8997ae20ae8897bd3af5c (util/master)
GitAPI created
Checking out Revision 37d00acfba71bc5d90a8997ae20ae8897bd3af5c (util/master)
[workspace] $ git checkout -f 37d00acfba71bc5d90a8997ae20ae8897bd3af5c
[workspace] $ git tag -a -f -m "Hudson Build #96" hudson-virgo.util.snapshot-96
Recording changes in branch util/master
[workspace] $ git whatchanged --no-abbrev -M --pretty=raw f4a3300f3c8d755282559d5c5df002d0f907ef67..37d00acfba71bc5d90a8997ae20ae8897bd3af5c
Cleaning workspace
[workspace] $ git clean -fdx
FATAL: Error performing git clean -fdx
Command returned status code 1: warning: failed to remove '.nfs000000003f0fe63400000002'
Removing .nfs000000003f0fe63400000002
Removing virgo-build/

hudson.plugins.git.GitException: Error performing git clean -fdx
Command returned status code 1: warning: failed to remove '.nfs000000003f0fe63400000002'
Removing .nfs000000003f0fe63400000002
Removing virgo-build/

	at hudson.plugins.git.GitAPI.launchCommandIn(GitAPI.java:356)
	at hudson.plugins.git.GitAPI.launchCommand(GitAPI.java:321)
	at hudson.plugins.git.GitAPI.launchCommand(GitAPI.java:331)
	at hudson.plugins.git.GitAPI.clean(GitAPI.java:187)
	at hudson.plugins.git.GitSCM$4.invoke(GitSCM.java:935)
	at hudson.plugins.git.GitSCM$4.invoke(GitSCM.java:860)
	at hudson.FilePath.act(FilePath.java:753)
	at hudson.FilePath.act(FilePath.java:735)
	at hudson.plugins.git.GitSCM.checkout(GitSCM.java:860)
	at hudson.model.AbstractProject.checkout(AbstractProject.java:1038)
	at hudson.model.AbstractBuild$AbstractRunner.checkout(AbstractBuild.java:479)
	at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:411)
	at hudson.model.Run.run(Run.java:1248)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
	at hudson.model.ResourceController.execute(ResourceController.java:88)
	at hudson.model.Executor.run(Executor.java:129)
Caused by: hudson.plugins.git.GitException: Command returned status code 1: warning: failed to remove '.nfs000000003f0fe63400000002'
Removing .nfs000000003f0fe63400000002
Removing virgo-build/

	at hudson.plugins.git.GitAPI.launchCommandIn(GitAPI.java:351)
	... 15 more
Comment 6 Steve Powell CLA 2010-08-18 06:16:06 EDT
Deleted Workspace and tried again -- identical error as before.

I don't know where the files .nfs* are coming from -- is this related to your proxy fix?

In any case they are probably created with the wrong permissions, and cannot be removed by git clean.

I'm going to try a build without the clean step (which we normally do regardless) and see if we get any further.
Comment 7 Steve Powell CLA 2010-08-18 06:24:35 EDT
After clearing the workspace and removing the clean -fdx step I get this:

FATAL: Unable to find build script at /opt/users/hudsonbuild/.hudson/jobs/virgo.util.snapshot/workspace/build-util/build.xml

after what looks like a clean clone of the repo.

The problem is that the build script should be in

/opt/users/hudsonbuild/.hudson/jobs/virgo.util.snapshot/workspace/util/build-util/build.xml

The configuration of the job hasn't changed -- and the specification of the repository directory as util is still there (the clone command is:

$ git clone -o util git://git.eclipse.org/gitroot/virgo/org.eclipse.virgo.util.git /opt/users/hudsonbuild/.hudson/jobs/virgo.util.snapshot/workspace

which clearly states the output directory is util)

so this must be a change in the git retrieval mechanism.

Raising the severity to major since none of our CI builds have been working for several days now.
Comment 8 Steve Powell CLA 2010-08-18 06:26:03 EDT
(In reply to comment #7)
PS: the workspace is incorrectly set up apparently.  Has virgo-build (the submodule) overwritten the parent git repo?
Comment 9 Steve Powell CLA 2010-08-18 06:31:09 EDT
(In reply to comment #8)
PPS:  Just checked configuration and saw this (optional) configuration property:

Local subdirectory for repo (optional)		
Specify a local directory (relative to the workspace root) where the Git repository will be checked out. If left empty, the workspace root itself will be used.

which I have now set to util.  This appears to be new (or else the default behaviour has changed).  

Where are the options for the submodule disposition?  (We normally expect a git submodule update --init to be performed at the root of the (parent) git repo - which will pull in a (read-only) copy of the submodule at the correct commit level.)

Thank you for your attention.
Comment 10 Steve Powell CLA 2010-08-18 06:36:27 EDT
(In reply to comment #9)
The addition of the util option seems to work better and we are back to the download from s3 problems.  Have you updated the job configuration in your testing of the problem?

If the s3 download problem is a proxy issue, the solution might mean modifying our virgo-build ant/ivy tasks and settings. Please apprise us of the status of this problem.

Thank you.
Comment 11 Chris Frost CLA 2010-08-18 07:44:25 EDT
I don't know why our s3 resolver isn't picking up any proxy settings. It is possible to switch it to an https-type resolver, but this really should be considered a last resort as it will be a big performance hit, not just for the Hudson builds but for anyone.
Comment 12 Steve Powell CLA 2010-08-18 09:06:45 EDT
Although some of the Virgo builds are running, this is because they are building fine against back-level dependencies (git clean -fdx is not used pre-build). If their dependencies change they will fail in the same way as the other builds.

The s3 resolver failure is a show-stopper.
Comment 13 Eclipse Webmaster CLA 2010-08-18 09:33:00 EDT
The git sub-module option is part of the Git plugin update that was done, if you leave it blank it should behave as it used to.

The .nfs files exist because some process (presumably the clean) has opened and 'unlinked' the file but hasn't terminated, so it's being held open (since it's on NFS).
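
A minimal sketch of the open-then-unlink behaviour described above, assuming a POSIX system. It creates a temporary file, removes its name while a handle is still open, and shows that the data remains readable through the handle. On a local filesystem the name simply disappears; on an NFS mount the client instead renames the file to a `.nfsXXXX` placeholder until the last handle is closed, which is where such files come from. The function name is hypothetical.

```python
import os
import tempfile


def read_after_unlink(payload):
    """Write payload to a temp file, unlink it while open, then read it back.

    Returns (name_gone, data): whether the original path no longer exists,
    and the bytes read through the still-open handle.
    """
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "w+b") as handle:
        handle.write(payload)
        handle.flush()
        os.unlink(path)                       # remove the name; handle stays open
        name_gone = not os.path.exists(path)
        handle.seek(0)
        data = handle.read()                  # contents are still reachable
    return name_gone, data


if __name__ == "__main__":
    print(read_after_unlink(b"still here"))
```

Until the process holding the handle exits or closes it, an attempt to delete the `.nfs` placeholder typically fails or the file reappears, which matches the warning in the `git clean -fdx` output.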

If there is a way to specify the Ivy proxy via a maven style .m2/settings.xml file I'm willing to try that.  But it has proven resistant to my efforts to make it use a proxy.  Perhaps it's an issue within the S3 resolver itself.

-M.
Comment 14 Steve Powell CLA 2010-08-18 10:08:37 EDT
(In reply to comment #13)
Thank you.
Sub-modules appear to be working now that I set the 'local subdirectory for repo' option (which I didn't have to do before).

The git clean -fdx command was failing the first time it was running trying to delete .nfs files so it is unlikely that clean caused these files to be created in the first place -- but now we are using the correct directories (see above) this problem has gone away, so it is not significant.

The s3 resolver is universally used for ant spring-builds, in many locations and configurations, and is unlikely to be at fault. If you have to modify anything I suspect it is in the slave configuration.

Is the new  hudson slave configuration considerably different from the old one?  These builds all worked cleanly on that.
Comment 15 Eclipse Webmaster CLA 2010-08-18 10:24:30 EDT
Hardware aside, the only real difference should be the proxy.

I suggested an issue within the s3 resolver only because I've found little documentation on using it with a proxy (all of the Ivy proxy examples I found used http), and it's been resistant to any proxy information I've tried to set for Ivy.

If you can give me an ivy command line I can run repeatedly(without involving Hudson) I'm willing to give it another try.

-M.
Comment 16 Chris Frost CLA 2010-08-18 10:52:07 EDT
Hi,

In each repo there is a 'build-xxxx' folder and a number of project directories. The command 'ant resolve' can be run in any of these locations; it will cause Ivy to pull some things down using the Ivy S3 repo and do nothing else. It will pull some other things down using other repos, but it should be obvious which ones are using S3, as their paths start with 's3' in the list of what it is doing.
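
A minimal sketch of picking the S3 lookups out of verbose Ivy output, assuming the `tried <location>` line format shown in the logs earlier in this bug. The function name is hypothetical and the sample lines are copied from the original description.

```python
def s3_lookups(log_text):
    """Return the s3:// locations that Ivy reports having tried."""
    urls = []
    for line in log_text.splitlines():
        # Ivy prints "tried <location>" for every place it looks.
        _before, sep, rest = line.partition("tried ")
        rest = rest.strip()
        if sep and rest.startswith("s3://"):
            urls.append(rest)
    return urls


SAMPLE_LOG = """\
[ivy:cachepath] \t\ttried /tmp/local-repository/org.springframework.build/org.springframework.build.ant/1.1.0.RELEASE/ivy-1.1.0.RELEASE.xml
[ivy:cachepath] \tlocal: no ivy file found for org.springframework.build#org.springframework.build.ant;1.1.0.RELEASE
[ivy:cachepath] \t\ttried s3://repository.springsource.com/ivy/bundles/release/org.springframework.build/org.springframework.build.ant/1.1.0.RELEASE/ivy-1.1.0.RELEASE.xml
"""

if __name__ == "__main__":
    for url in s3_lookups(SAMPLE_LOG):
        print(url)
```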

Chris.
Comment 17 Eclipse Webmaster CLA 2010-08-19 16:01:40 EDT
When I run the ant task in the builds/ subdirectories I get the following error:

BUILD FAILED
/opt/public/jobs/virgo.util.snapshot/builds/2010-08-18_13-01-05/build.xml:2: Unexpected element "{}build" {antlib:org.apache.tools.ant}build

So I tried in the workspace directory, and while that doesn't generate the above error it still fails to connect. At this time I've tried different JVMs, different Ants, and editing the build.xml and ivysettings.xml files, but nothing seems to make any difference.

Is there a way for us to figure out whether it's ant or ivy that isn't setting the proxy correctly?

-M.
Comment 18 David Carver CLA 2010-08-19 16:42:56 EDT
It sounds like they may need to use the setproxy task.

http://stackoverflow.com/questions/2921364/proxy-settings-with-ivy
Comment 19 Steve Powell CLA 2010-09-10 10:43:38 EDT
Bug 324976 raised during testing to try to set/use proxy settings.
Comment 20 Steve Powell CLA 2010-09-10 11:08:07 EDT
We put the target:

<target name="testproxy">
    <property name="proxy.host" value="206.191.52.34"/>
    <property name="proxy.port" value="9898"/>
    <setproxy proxyhost="${proxy.host}" proxyport="${proxy.port}"/>
</target>

in our root build file and invoked the target from ant.

This produced:  

testproxy:
 [setproxy] Setting proxy to 206.191.52.34:9898

in the log and the problem we have with proxy downloads from s3 remains:

[ivy:cachepath] external-lookup: Checking cache for: dependency: org.junit#com.springsource.org.junit;4.7.0 {*=[*]}
[ivy:cachepath] 		tried /tmp/local-repository/org.junit/com.springsource.org.junit/4.7.0/ivy-4.7.0.xml
[ivy:cachepath] 	local-external-repository: no ivy file found for org.junit#com.springsource.org.junit;4.7.0
[ivy:cachepath] 		tried s3://repository.springsource.com/ivy/bundles/external/org.junit/com.springsource.org.junit/4.7.0/ivy-4.7.0.xml
[ivy:cachepath] Sep 10, 2010 11:01:25 AM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
[ivy:cachepath] INFO: I/O exception (org.apache.commons.httpclient.ConnectTimeoutException) caught when processing request: The host did not accept the connection within timeout of 60000 ms
[ivy:cachepath] Sep 10, 2010 11:01:25 AM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
[ivy:cachepath] INFO: Retrying request

(repeating up to 60 secs). Never works.

The s3 client implementation we use does not respect the proxy as set by ant task.  Suggestions please?

We question the need for a proxy on the slaves....
Comment 21 Steve Powell CLA 2010-09-10 11:08:42 EDT
Bug 324976 has been resolved and no longer blocks this bug.
Comment 22 Steve Powell CLA 2010-09-10 11:19:16 EDT
We also tried this:

/shared/common/apache-ant-1.7.1/bin/ant -file build.xml -Declipse.buildId=hudsonbuild -Dbuild.stamp=CI-${BUILD_ID} -Djavaplugin.proxy.config.list=http=206.191.52.34:9898 -Dci.build=true clean-ivy clean clean-integration testproxy test -v

and ran again (our s3 implementation might look in the javaplugin... variable)

but it made no difference.
Comment 23 Eclipse Webmaster CLA 2010-09-10 11:32:22 EDT
I don't have any suggestions other than switching to http/https. But this sounds like a bug for the S3 implementation team, as proxies are a fact of life.

While I understand the questioning of proxy usage, with our network setup the only way to allow outbound connections was via a proxy. 

-M.
Comment 24 Steve Powell CLA 2010-09-10 12:34:50 EDT
We can get our implementation to work with a proxy if we hand-modify a properties file (in our build submodule). It is not possible to introduce the properties in this file by any other means.

The 'solution', if we do not switch to http/s is to introduce a special 'enableproxy' target in our build groups which not only uses setproxy, but also explicitly modifies this properties file in the workspace before resolving with S3.

This would only 'work' if the proxy settings can be obtained from the environment (e.g. from environment variables) programmatically at run-time.

Can you document here exactly what proxy settings are available in the environment for the slaves? No values need be put here, though the names of all the variables set need to be given.

Is this information documented on the Wiki (or somewhere)?  We cannot find explicit enough information, bar the fact that ANT_OPTS, ANT_ARGS and JVM_OPTS have "proxy data" -- whatever that is.

I suggest it is given explicitly.
Comment 25 Eclipse Webmaster CLA 2010-09-10 13:37:12 EDT
The ANT and JVM environment variables (ANT_OPTS, ANT_ARGS, JVM_OPTS, JAVA_ARGS) listed on the wiki contain the proxy data as specified in their documentation (-Dhttp.proxyPort=# -Dhttp.proxyHost=#.#.#.# -Dhttps.nonProxyHosts=*.eclipse.org).

The shell also sets 'http_proxy', 'https_proxy' and 'no_proxy'. The first two are set as http(s)://#.#.#.#:port, and the no_proxy value is a comma-separated list of IPs and names.
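
A minimal sketch of reading the proxy from the environment at run time, as asked for in comment 24, assuming `http_proxy` has the `http://host:port` form described above. It converts that value into the `-Dhttp.proxyHost` and `-Dhttp.proxyPort` flags named in this comment. The function name is hypothetical, and the sample address is the one quoted in comment 20. Producing these flags does not by itself help a client that ignores the standard proxy properties, which is the problem reported for the S3 resolver.

```python
import os
from urllib.parse import urlsplit


def proxy_jvm_flags(proxy_url, scheme="http"):
    """Turn a proxy URL such as http://host:port into JVM -D flags."""
    parts = urlsplit(proxy_url)
    if parts.hostname is None or parts.port is None:
        raise ValueError("expected a URL of the form scheme://host:port")
    return [
        f"-D{scheme}.proxyHost={parts.hostname}",
        f"-D{scheme}.proxyPort={parts.port}",
    ]


if __name__ == "__main__":
    # Fall back to the address quoted in comment 20 if the variable is unset.
    url = os.environ.get("http_proxy", "http://206.191.52.34:9898")
    print(" ".join(proxy_jvm_flags(url)))
```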

-M.
Comment 26 Steve Powell CLA 2010-09-15 12:42:27 EDT
Currently we are modifying a build properties file to force the correct use of the proxy information -- this is untenable for the long term so we will eventually stop using the s3 resolver in this way and use http instead.  This will be slower but works in more environments.

The proxy issue still causes problems, however -- see Bug 325354!
Comment 27 Steve Powell CLA 2010-09-22 10:30:55 EDT
Bug 325824 documents that we (the virgo project team) have removed set-hudson-proxy and reverted to http protocol access to our dependencies. This ought to work normally.  However, url resolution for IVY appears not to work (reliably at least) on hudson.
Comment 28 Denis Roy CLA 2010-09-22 10:36:05 EDT
(In reply to comment #27)
> Bug 325824 documents that we (the virgo project team) have removed
> set-hudson-proxy and reverted to http protocol access to our dependencies. This
> ought to work normally.  However, url resolution for IVY appears not to work
> (reliably at least) on hudson.

Do we need to discuss the same issue on two bugs?
Comment 29 Steve Powell CLA 2010-10-21 12:23:12 EDT
Builds now working (except for javadoc access to language lists in some builds).