Bug 321680 - All build2 builds failing (repeatedly) due to GIT failures
Summary: All build2 builds failing (repeatedly) due to GIT failures
Status: CLOSED FIXED
Alias: None
Product: Community
Classification: Eclipse Foundation
Component: CI-Jenkins
Version: unspecified
Hardware: PC Mac OS X - Carbon (unsup.)
Importance: P3 major
Target Milestone: ---
Assignee: CI Admin Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-08-04 04:18 EDT by Steve Powell CLA
Modified: 2010-08-06 04:16 EDT (History)
Description Steve Powell CLA 2010-08-04 04:18:22 EDT
For example on virgo.apps.snapshot:
-8<---------
Fetching changes from the remote Git repository
Fetching upstream changes from git://git.eclipse.org/gitroot/virgo/org.eclipse.virgo.apps.git
[virgo.apps.snapshot] $ /usr/local/bin/git fetch -t git://git.eclipse.org/gitroot/virgo/org.eclipse.virgo.apps.git +refs/heads/*:refs/remotes/apps/*
git.eclipse.org[0: 206.191.52.51]: errno=Connection refused
fatal: unable to connect a socket (Connection refused)
ERROR: Problem fetching from apps / apps - could be unavailable. Continuing anyway
------------
followed by:
-8<---------
FATAL: Error performing /usr/local/bin/git clean -fdx
hudson.plugins.git.GitException: Error performing /usr/local/bin/git clean -fdx
	at hudson.plugins.git.GitAPI.launchCommandIn(GitAPI.java:330)
	at hudson.plugins.git.GitAPI.launchCommand(GitAPI.java:295)
	at hudson.plugins.git.GitAPI.launchCommand(GitAPI.java:305)
	at hudson.plugins.git.GitAPI.clean(GitAPI.java:186)
	at hudson.plugins.git.GitSCM$4.invoke(GitSCM.java:777)
	at hudson.plugins.git.GitSCM$4.invoke(GitSCM.java:707)
	at hudson.FilePath$FileCallableWrapper.call(FilePath.java:2018)
	at hudson.remoting.UserRequest.perform(UserRequest.java:114)
	at hudson.remoting.UserRequest.perform(UserRequest.java:48)
	at hudson.remoting.Request$2.run(Request.java:270)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:453)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:315)
	at java.util.concurrent.FutureTask.run(FutureTask.java:150)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at hudson.remoting.Engine$1$1.run(Engine.java:58)
	at java.lang.Thread.run(Thread.java:736)
Caused by: hudson.plugins.git.GitException: Command returned status code 1: warning: failed to remove 'build-apps/target/'
------------

All my other build2 builds are failing in the same way. Builds on the master seem to be OK -- using the same git repository server.
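
The two log excerpts above show two different kinds of failure: a refused network connection during the fetch, and a file that could not be removed during the clean. A minimal sketch for telling them apart in a saved console log follows. The function name classify_failures and the labels it prints are illustrative assumptions, not part of Hudson or git.

```shell
# Hypothetical helper: read a console log on stdin and print one label per
# failure class found in it.
classify_failures() {
    log=$(cat)
    # A refused socket means the git server could not be reached.
    printf '%s\n' "$log" | grep -q 'Connection refused' && echo network
    # These messages mean the job could not modify its own workspace.
    printf '%s\n' "$log" | grep -Eq 'failed to remove|Permission denied|Unable to delete' && echo filesystem
    return 0
}
```

For example, `classify_failures < console.log` (where console.log is a saved copy of the job output) prints network, filesystem, or both.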
Comment 1 Denis Roy CLA 2010-08-04 10:23:14 EDT
This is a dupe of bug 321647 (which is now fixed).

*** This bug has been marked as a duplicate of bug 321647 ***
Comment 2 Steve Powell CLA 2010-08-04 13:02:37 EDT
This is not a duplicate of 321647; the jobs are not trying to WRITE to a git repository, just read from one.  Not even doing a clone.

After the other problem was 'fixed' the jobs are still failing with precisely the same error as reported here originally.

Raising to major.
Comment 3 Denis Roy CLA 2010-08-04 13:08:56 EDT
How is this not a dupe?  Your error is exactly the same:

Bug 321680:
git.eclipse.org[0: 206.191.52.51]: errno=Connection refused

Bug 321647:
git.eclipse.org[0: 206.191.52.51]: errno=Connection refused


Are you still getting a Connection Refused trying to connect to git://git.eclipse.org?
Comment 4 Eclipse Webmaster CLA 2010-08-04 13:17:41 EDT
I've just tried and I can't replicate this from the command line:

hudsonbuild@build2:/tmp/g> git clone git://git.eclipse.org/gitroot/virgo/org.eclipse.virgo.apps.git
Initialized empty Git repository in /tmp/g/org.eclipse.virgo.apps/.git/
remote: Counting objects: 1929, done.
remote: Compressing objects: 100% (1479/1479), done.
remote: Total 1929 (delta 548), reused 1617 (delta 324)
Receiving objects: 100% (1929/1929), 1.85 MiB, done.
Resolving deltas: 100% (548/548), done.
hudsonbuild@build2:/tmp/g> cd /tmp/g/org.eclipse.virgo.apps
hudsonbuild@build2:/tmp/g/org.eclipse.virgo.apps> git fetch -t git://git.eclipse.org/gitroot/virgo/org.eclipse.virgo.apps.git +refs/heads/*:refs/remotes/apps/*
From git://git.eclipse.org/gitroot/virgo/org.eclipse.virgo.apps
 * [new branch]      master     -> apps/master
hudsonbuild@build2:/tmp/g/org.eclipse.virgo.apps> 

-M.
Comment 5 Steve Powell CLA 2010-08-05 05:00:59 EDT
I have 70-odd attempts at builds on build2 initiated by the git changes, and all of them fail in precisely the same way as before, and all of them failed like that before and after the other bug (apparent duplicate) was fixed.  Apparently Hudson can detect that a build is necessary (changes in git repo) but is not able to get those changes.

I'm sorry--I realise the same error is reported. (My remarks earlier were due to a mix-up between various bugs in this area.)

My problem has not gone away when the other was 'fixed'. It is not clear how the 'duplicate' was fixed, but if the fix doesn't clear both problems then this one has to be re-opened.

(reply to comment 4)
Thank you for trying this. What can we deduce from this discrepancy?  What permissions/authority are you running with when you 'try this from the command line'?

I'll leave this at Major, though, because a majority of our integration builds are failing.
Comment 6 Steve Powell CLA 2010-08-05 05:03:24 EDT
build.eclipse.org is inaccessible this morning.  The host is not responding (according to Safari).

ping works:
PING build.eclipse.org (206.191.52.57): 56 data bytes
64 bytes from 206.191.52.57: icmp_seq=0 ttl=56 time=95.278 ms
64 bytes from 206.191.52.57: icmp_seq=1 ttl=56 time=91.838 ms
64 bytes from 206.191.52.57: icmp_seq=2 ttl=56 time=91.932 ms

(fyi)
Comment 7 Denis Roy CLA 2010-08-05 08:01:49 EDT
(In reply to comment #6)
> build.eclipse.org is inaccessible this morning.  The host is not responding
> (according to Safari).


Apache had died because of a bad config I had made.  I've fixed the config and restarted it.
Comment 8 Steve Powell CLA 2010-08-05 09:20:04 EDT
Thank you -- now I see the failures have changed (still all failing).

E.g.:
-8<--------
[iajc] error at /opt/users/hudsonbuild/workspace/virgo.web.snapshot/org.eclipse.virgo.web.dm/target/classes/org/eclipse/virgo/web/dm/ServerOsgiBundleXmlWebApplicationContext.class::0 unable to write out class file: 'org/eclipse/virgo/web/dm/ServerOsgiBundleXmlWebApplicationContext.class' - reason: /opt/users/hudsonbuild/workspace/virgo.web.snapshot/org.eclipse.virgo.web.dm/target/classes/org/eclipse/virgo/web/dm/ServerOsgiBundleXmlWebApplicationContext.class (Permission denied)
     [iajc] warning at /opt/users/hudsonbuild/workspace/virgo.web.snapshot/ivy-cache/repository/org.eclipse.virgo.medic/org.eclipse.virgo.medic/2.1.0.D-20100805112401/org.eclipse.virgo.medic-2.1.0.D-20100805112401.jar!org/eclipse/virgo/medic/log/EntryExitTrace.class:49::0 advice defined in org.eclipse.virgo.medic.log.EntryExitTrace has not been applied [Xlint:adviceDidNotMatch]
     [iajc] warning at /opt/users/hudsonbuild/workspace/virgo.web.snapshot/ivy-cache/repository/org.eclipse.virgo.medic/org.eclipse.virgo.medic/2.1.0.D-20100805112401/org.eclipse.virgo.medic-2.1.0.D-20100805112401.jar!org/eclipse/virgo/medic/log/EntryExitTrace.class:54::0 advice defined in org.eclipse.virgo.medic.log.EntryExitTrace has not been applied [Xlint:adviceDidNotMatch]
     [iajc] warning at /opt/users/hudsonbuild/workspace/virgo.web.snapshot/ivy-cache/repository/org.eclipse.virgo.medic/org.eclipse.virgo.medic/2.1.0.D-20100805112401/org.eclipse.virgo.medic-2.1.0.D-20100805112401.jar!org/eclipse/virgo/medic/log/EntryExitTrace.class:59::0 advice defined in org.eclipse.virgo.medic.log.EntryExitTrace has not been applied [Xlint:adviceDidNotMatch]
     [iajc] MessageHolder:  (8 info)  (3 warning)  (1 error) 
     [iajc] [error   0]: error at /opt/users/hudsonbuild/workspace/virgo.web.snapshot/org.eclipse.virgo.web.dm/target/classes/org/eclipse/virgo/web/dm/ServerOsgiBundleXmlWebApplicationContext.class::0 unable to write out class file: 'org/eclipse/virgo/web/dm/ServerOsgiBundleXmlWebApplicationContext.class' - reason: /opt/users/hudsonbuild/workspace/virgo.web.snapshot/org.eclipse.virgo.web.dm/target/classes/org/eclipse/virgo/web/dm/ServerOsgiBundleXmlWebApplicationContext.class (Permission denied)
-8<--------
Apparently the iajc (aspectj compiler) can't write out the results of its work.
Permission denied implies the Workspace permissions are wrong.
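
One way to check that deduction from a shell on the slave is to list the paths under the workspace that the current user cannot write. This is only a sketch: list_unwritable is a hypothetical helper, it tests effective write access rather than ownership, and it is only meaningful when run as the user the job runs as (root can usually write everywhere).

```shell
# Hypothetical helper: print every path under the given directory that the
# current user cannot write to.
# Assumes path names contain no newlines.
list_unwritable() {
    find "$1" -print | while IFS= read -r path; do
        [ -w "$path" ] || printf '%s\n' "$path"
    done
}

# Example (path taken from the log above):
# list_unwritable /opt/users/hudsonbuild/workspace/virgo.web.snapshot
```

Empty output would suggest the permissions are fine and the cause lies elsewhere.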
Comment 9 Steve Powell CLA 2010-08-05 09:24:24 EDT
Other jobs exhibit other errors; all to do with writing to the workspace area. Permissions problem, probably.
Comment 10 Denis Roy CLA 2010-08-05 09:29:53 EDT
> /opt/users/hudsonbuild/workspace/virgo.web.snapshot/org.eclipse.virgo.web.dm

That path looks odd, because the /opt/users/hudsonbuild/workspace directory looks about empty:

build:/opt/users/hudsonbuild/workspace # ls -al
total 4
drwxrwsr-x  3 hudsonbuild callisto-dev   80 Oct 17  2009 .
drwxrwsr-x 79 hudsonbuild callisto-dev 4280 Aug  5 09:27 ..
drwxrwsr-x  3 hudsonbuild callisto-dev  416 Aug  4 10:44 .metadata
build:/opt/users/hudsonbuild/workspace

The permissions look right, assuming what is being written is being done by the hudsonbuild user.
Comment 11 Steve Powell CLA 2010-08-05 10:08:49 EDT
/opt/users/hudsonbuild/workspace/virgo.web.snapshot
is nothing to do with me... the rest of the path is (although I expected it to be under a directory called 'web').

Haven't changed anything.  Other jobs are failing trying to do a git clean -fdx (having possibly failed to write changes with a git fetch).

Perhaps it isn't permissions, but instead some build configuration parameters aren't set up correctly.  The git repository should go into a directory called (in this case) web, even though the git repo is actually called org.eclipse.virgo.web.git.

I would expect this to be put in the root of the jobs' workspace.

Are there any configuration parameters/environment vars that haven't been set up correctly?
Comment 12 Steve Powell CLA 2010-08-05 10:09:45 EDT
Attempted to wipe the workspace of virgo.apps.snapshot. I get this:

-------8<----------------
hudson.util.IOException2: remote file operation failed: /opt/users/hudsonbuild/workspace/virgo.apps.snapshot at hudson.remoting.Channel@7f637f63:build2
	at hudson.FilePath.act(FilePath.java:743)
	at hudson.FilePath.act(FilePath.java:729)
	at hudson.FilePath.deleteRecursive(FilePath.java:813)
	at hudson.model.AbstractProject.doDoWipeOutWorkspace(AbstractProject.java:1563)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:48)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
	at java.lang.reflect.Method.invoke(Method.java:600)
	at org.kohsuke.stapler.Function$InstanceFunction.invoke(Function.java:169)
	at org.kohsuke.stapler.Function.bindAndInvoke(Function.java:101)
	at org.kohsuke.stapler.Function.bindAndInvokeAndServeResponse(Function.java:54)
	at org.kohsuke.stapler.MetaClass$1.doDispatch(MetaClass.java:74)
	at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:30)
	at org.kohsuke.stapler.Stapler.invoke(Stapler.java:519)
	at org.kohsuke.stapler.MetaClass$6.doDispatch(MetaClass.java:181)
	at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:30)
	at org.kohsuke.stapler.Stapler.invoke(Stapler.java:519)
	at org.kohsuke.stapler.MetaClass$6.doDispatch(MetaClass.java:181)
	at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:30)
	at org.kohsuke.stapler.Stapler.invoke(Stapler.java:519)
	at org.kohsuke.stapler.MetaClass$12.dispatch(MetaClass.java:319)
	at org.kohsuke.stapler.Stapler.invoke(Stapler.java:519)
	at org.kohsuke.stapler.MetaClass$6.doDispatch(MetaClass.java:181)
	at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:30)
	at org.kohsuke.stapler.Stapler.invoke(Stapler.java:519)
	at org.kohsuke.stapler.Stapler.invoke(Stapler.java:435)
	at org.kohsuke.stapler.Stapler.service(Stapler.java:123)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:45)
	at winstone.ServletConfiguration.execute(ServletConfiguration.java:249)
	at winstone.RequestDispatcher.forward(RequestDispatcher.java:335)
	at winstone.RequestDispatcher.doFilter(RequestDispatcher.java:378)
	at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:94)
	at hudson.plugins.greenballs.GreenBallFilter.doFilter(GreenBallFilter.java:51)
	at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:97)
	at hudson.plugins.audit_trail.AuditTrailFilter.doFilter(AuditTrailFilter.java:64)
	at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:97)
	at hudson.util.PluginServletFilter.doFilter(PluginServletFilter.java:86)
	at winstone.FilterConfiguration.execute(FilterConfiguration.java:195)
	at winstone.RequestDispatcher.doFilter(RequestDispatcher.java:368)
	at hudson.security.csrf.CrumbFilter.doFilter(CrumbFilter.java:47)
	at winstone.FilterConfiguration.execute(FilterConfiguration.java:195)
	at winstone.RequestDispatcher.doFilter(RequestDispatcher.java:368)
	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:84)
	at hudson.security.UnwrapSecurityExceptionFilter.doFilter(UnwrapSecurityExceptionFilter.java:51)
	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
	at org.acegisecurity.ui.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:166)
	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
	at org.acegisecurity.providers.anonymous.AnonymousProcessingFilter.doFilter(AnonymousProcessingFilter.java:125)
	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
	at org.acegisecurity.ui.rememberme.RememberMeProcessingFilter.doFilter(RememberMeProcessingFilter.java:142)
	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
	at org.acegisecurity.ui.AbstractProcessingFilter.doFilter(AbstractProcessingFilter.java:271)
	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
	at org.acegisecurity.ui.basicauth.BasicProcessingFilter.doFilter(BasicProcessingFilter.java:173)
	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
	at org.acegisecurity.context.HttpSessionContextIntegrationFilter.doFilter(HttpSessionContextIntegrationFilter.java:249)
	at hudson.security.HttpSessionContextIntegrationFilter2.doFilter(HttpSessionContextIntegrationFilter2.java:66)
	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
	at hudson.security.ChainedServletFilter.doFilter(ChainedServletFilter.java:76)
	at hudson.security.HudsonFilter.doFilter(HudsonFilter.java:164)
	at winstone.FilterConfiguration.execute(FilterConfiguration.java:195)
	at winstone.RequestDispatcher.doFilter(RequestDispatcher.java:368)
	at winstone.RequestDispatcher.forward(RequestDispatcher.java:333)
	at winstone.RequestHandlerThread.processRequest(RequestHandlerThread.java:244)
	at winstone.RequestHandlerThread.run(RequestHandlerThread.java:150)
	at java.lang.Thread.run(Thread.java:736)
Caused by: java.io.IOException: Unable to delete /opt/users/hudsonbuild/workspace/virgo.apps.snapshot/.git/objects/32/aafe3ca585a25b871ad55d52d2da04a9f7d0ae
	at java.lang.Throwable.<init>(Throwable.java:67)
	at hudson.Util.deleteFile(Util.java:228)
	at hudson.Util.deleteRecursive(Util.java:290)
	at hudson.Util.deleteContentsRecursive(Util.java:219)
	at hudson.Util.deleteRecursive(Util.java:289)
	at hudson.Util.deleteContentsRecursive(Util.java:219)
	at hudson.Util.deleteRecursive(Util.java:289)
	at hudson.Util.deleteContentsRecursive(Util.java:219)
	at hudson.Util.deleteRecursive(Util.java:289)
	at hudson.Util.deleteContentsRecursive(Util.java:219)
	at hudson.Util.deleteRecursive(Util.java:289)
	at hudson.FilePath$9.invoke(FilePath.java:815)
	at hudson.FilePath$9.invoke(FilePath.java:813)
	at hudson.FilePath$FileCallableWrapper.call(FilePath.java:2018)
	at hudson.remoting.UserRequest.perform(UserRequest.java:114)
	at hudson.remoting.UserRequest.perform(UserRequest.java:48)
	at hudson.remoting.Request$2.run(Request.java:270)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:453)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:315)
	at java.util.concurrent.FutureTask.run(FutureTask.java:150)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at hudson.remoting.Engine$1$1.run(Engine.java:58)
	... 1 more
----------------------------

Looks like I can't write to or delete from the job's workspace!
Comment 13 Steve Powell CLA 2010-08-05 10:31:02 EDT
(In reply to comment #10)
Could it look empty because you don't have permission to see it?

According to Hudson the workspace files are all there (look in Workspace for the job), but jobs cannot update files in it.
Comment 14 Eclipse Webmaster CLA 2010-08-05 10:37:34 EDT
Odd, so /opt/users/hudsonbuild/workspace/virgo.apps.snapshot/.git/objects/32 is actually owned by root and not the Hudson user.

I'm running a script right now to find and correct the ownership of everything under  /opt/users/hudsonbuild/workspace .

-M.
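
The script itself is not shown in this report. As a rough sketch of the kind of check involved, the following lists paths whose owner or group differs from what is expected. The helper name list_misowned is hypothetical; the hudsonbuild user and callisto-dev group are taken from the directory listing in comment 10.

```shell
# Hypothetical helper: print every path under a directory whose owner or
# group is not the expected one.
# Usage: list_misowned DIR USER GROUP
list_misowned() {
    find "$1" \( ! -user "$2" -o ! -group "$3" \) -print
}

# Example:
# list_misowned /opt/users/hudsonbuild/workspace hudsonbuild callisto-dev
# Correcting what it reports would need root, for instance:
# chown -R hudsonbuild:callisto-dev /opt/users/hudsonbuild/workspace
```

Listing before changing anything keeps a record of which paths were affected, which may help when working out how the ownership went wrong.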
Comment 15 Steve Powell CLA 2010-08-05 11:04:11 EDT
Wow -- this seems to do the trick.

How did this happen?  This has been a problem for several days, and it sounds like some sort of recovery glitch.

I'll monitor the other jobs to see if they are all now working.
Comment 16 Steve Powell CLA 2010-08-05 11:26:06 EDT
Reporting: all builds now work (except apps).  Did your script miss out that one?

job virgo.apps.snapshot
apparently cannot write to workspace.

Thanks.
Comment 17 Eclipse Webmaster CLA 2010-08-05 11:46:49 EDT
The apps workspace looks clean:

/opt/users/hudsonbuild/workspace/virgo.apps.snapshot # find . -not -user hudsonbuild
/opt/users/hudsonbuild/workspace/virgo.apps.snapshot # find . -not -group callisto-dev
/opt/users/hudsonbuild/workspace/virgo.apps.snapshot #              

-M.
Comment 18 Steve Powell CLA 2010-08-05 11:49:31 EDT
Hmm; doesn't appear to be the same sort of error -- the git repository cannot
be fetched into -- this is because the workspace is apparently badly formed -- no .git
directory??

I'm going to try deleting the apps workspace (to force a re-clone).
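
Before deleting a workspace, the "no .git directory" guess can be tested by asking git whether the directory is the top level of a repository. A sketch, assuming a plain clone (not a submodule or linked worktree) and that GIT_DIR is not set in the environment; has_git_dir is an illustrative name.

```shell
# Hypothetical helper: succeed only if DIR is the top level of a git
# working tree, i.e. git resolves its repository to DIR/.git.
has_git_dir() {
    [ "$( { cd "$1" && git rev-parse --git-dir; } 2>/dev/null )" = ".git" ]
}

# Example:
# has_git_dir /opt/users/hudsonbuild/workspace/virgo.apps.snapshot \
#     || echo "workspace has no usable .git directory"
```

Comparing against ".git" rather than only checking the exit status avoids a false positive when the workspace happens to sit inside some other repository.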
Comment 19 Steve Powell CLA 2010-08-05 11:51:51 EDT
That seemed to clear it. I guess that the .git directory was not there.

This mess seems to be caused by corrupted file systems on build2.  Any idea when and how this might have occurred?
Comment 20 Steve Powell CLA 2010-08-06 04:09:59 EDT
I guess the cause of this problem is now academic -- unless it can happen again on the new 'stable' hudson.
Comment 21 Steve Powell CLA 2010-08-06 04:16:35 EDT
Closing this to clear the decks for the new hudson instance.