Bug 332945 - new job jgit.gerrit and new plugin gerrit-trigger
Summary: new job jgit.gerrit and new plugin gerrit-trigger
Status: RESOLVED FIXED
Alias: None
Product: Community
Classification: Eclipse Foundation
Component: CI-Jenkins
Version: unspecified
Hardware: All All
Importance: P3 normal
Target Milestone: ---
Assignee: Eclipse Webmaster CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-12-20 09:12 EST by Matthias Sohn CLA
Modified: 2011-03-09 17:16 EST (History)

See Also:


Description Matthias Sohn CLA 2010-12-20 09:12:51 EST
We want to use the gerrit-trigger hudson plugin to build and test jgit changes
submitted for code review to Gerrit *before* they reach the master branch. 

This requires installation of the Gerrit Trigger plugin
http://wiki.hudson-ci.org/display/HUDSON/Gerrit+Trigger

So we would probably start this on the sandbox Hudson until we have gathered experience about its influence on Hudson stability.

Since this job needs a different configuration we want a new job here. 

If this works out well we will later follow the same approach for egit.

The plugin needs some configuration which is only accessible to hudson admins. 
Could you please configure that in the following way:

under Manage Hudson > Gerrit Hudson Trigger set
- Hostname: egit.eclipse.org
- Frontend URL: http://egit.eclipse.org/r/
- SSH Port: 29418
- Username: HudsonVoter
- SSH Keyfile: private SSH key
- SSH Keyfile Password: keyphrase for private SSH key

Please generate a new SSH keypair (see [1]), fill in the latter two parameters
accordingly, and send us the public key so that we can configure Gerrit
to accept it.

Click "Test Connection" to verify the connection.

When everything seems ok, save the settings and restart 
the connection in the "Control" section at the bottom of the page.

[1] http://help.github.com/linux-key-setup/
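
For reference, the plugin listens to the event stream that Gerrit exposes through its `gerrit stream-events` SSH command, so the same settings can be checked by hand. The sketch below only assembles that command from the values listed above; the helper name and the key file path are placeholders made up for illustration, not part of the plugin.

```python
import shlex

def stream_events_command(host, port, user, keyfile):
    """Build the ssh command line that opens Gerrit's event stream."""
    return [
        "ssh",
        "-i", keyfile,        # private key whose public half is registered in Gerrit
        "-p", str(port),      # Gerrit's own SSH port, not the system sshd
        f"{user}@{host}",
        "gerrit", "stream-events",
    ]

# Host, port and user come from the description; the key path is a placeholder.
cmd = stream_events_command("egit.eclipse.org", 29418, "HudsonVoter", "/path/to/private_key")
print(shlex.join(cmd))
```

If the printed command is run from the Hudson host and stays connected without an authentication error, the hostname, port, user and key are probably consistent with what Gerrit expects.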
Comment 1 Matthias Sohn CLA 2010-12-20 09:15:20 EST
The following users should have job admin permissions (same as in the jgit job):
- caniszczyk
- msohn
- spearce
Comment 2 Eclipse Webmaster CLA 2010-12-20 11:25:29 EST
Ok, I've created the job and installed the plugin.  I've sent the public key to Matthias so let me know if things aren't working.

-M.
Comment 3 Matthias Sohn CLA 2010-12-20 15:44:48 EST
Maven build is failing with 

FATAL: command execution failed
java.io.IOException: Cannot run program "mvn" (in directory "/opt/users/hudsonbuild/workspace/jgit.gerrit"): java.io.IOException: error=2, No such file or directory
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
	at hudson.Proc$LocalProc.<init>(Proc.java:192)
	at hudson.Proc$LocalProc.<init>(Proc.java:164)
	at hudson.Launcher$LocalLauncher.launch(Launcher.java:638)
	at hudson.Launcher$ProcStarter.start(Launcher.java:273)
	at hudson.Launcher$RemoteLaunchCallable.call(Launcher.java:793)
	at hudson.Launcher$RemoteLaunchCallable.call(Launcher.java:767)
	at hudson.remoting.UserRequest.perform(UserRequest.java:114)
	at hudson.remoting.UserRequest.perform(UserRequest.java:48)
	at hudson.remoting.Request$2.run(Request.java:270)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: java.io.IOException: error=2, No such file or directory
	at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
	at java.lang.ProcessImpl.start(ProcessImpl.java:65)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
	... 15 more

any idea what's wrong here ?
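
A note on reading this trace: error=2 is ENOENT, so the launcher could not find any executable named "mvn" on the PATH of the build process. A minimal sketch of the same lookup, assuming Python is available on the build machine (the helper name is illustrative only):

```python
import shutil

def find_tool(name):
    """Return the full path of an executable found on PATH, or None if it is missing."""
    return shutil.which(name)

mvn = find_tool("mvn")
if mvn is None:
    print("mvn is not on PATH; the job probably needs a Maven installation selected")
else:
    print(f"mvn found at {mvn}")
```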
Comment 4 Matthias Sohn CLA 2010-12-20 15:45:16 EST
full log is here https://hudson.eclipse.org/sandbox/job/jgit.gerrit/2/console
Comment 5 Matthias Sohn CLA 2010-12-21 06:07:04 EST
I forgot to select the Maven version ;-)

The newest Maven version on sandbox hudson is 3-beta1.
In the meantime Maven 3 has been released.
Could you please install the latest release 3.0.1 ?
Comment 6 Matthias Sohn CLA 2010-12-21 06:30:58 EST
Retrying with Maven 3.0-beta1, I now face the problem that the Hudson job seems to be unable to reach the internet: downloading artifacts from the Maven Central repository times out [1] (tried multiple times). I can reach Maven Central from other build jobs running on the main Hudson.
I tried adding the proxy configuration as described in [2] but this doesn't solve the problem.

[1] https://hudson.eclipse.org/sandbox/job/jgit.gerrit/9/console
     https://hudson.eclipse.org/sandbox/job/jgit.gerrit/10/console
[2] http://wiki.eclipse.org/Hudson#Accessing_the_Internet_using_Proxy
Comment 7 Eclipse Webmaster CLA 2010-12-21 11:45:35 EST
I've added the Maven 3.0 install.

I created a test job to verify that the sandbox can connect to the outside world and everything seems happy.

-M.
Comment 8 Matthias Sohn CLA 2010-12-21 18:16:33 EST
Now it looks better: builds succeed. But the build job only reports back that it started the build, not that it succeeded; see the comments of Hudson CI on [1].

Could you please check the settings under Hudson > Manage Hudson > Gerrit Trigger ?

in section "Gerrit Reporting Values" the following voting rules should be set:

Verify
- Started 0
- Successful 1
- Failed -1
- Unstable 0

Code Review
- Started 0
- Successful 0
- Failed 0
- Unstable -1

[1] http://egit.eclipse.org/r/#change,2118
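
To make the intended voting rules easier to compare against the configuration page, here they are as a small lookup table. This is only a sketch of the values listed above; the names `REPORTING_VALUES` and `vote_for` are made up for illustration and are not part of the plugin.

```python
# Scores per build outcome, copied from the two lists above.
REPORTING_VALUES = {
    "STARTED":    {"verified": 0,  "code_review": 0},
    "SUCCESSFUL": {"verified": 1,  "code_review": 0},
    "FAILED":     {"verified": -1, "code_review": 0},
    "UNSTABLE":   {"verified": 0,  "code_review": -1},
}

def vote_for(outcome):
    """Return the (Verify, Code Review) scores expected for a build outcome."""
    values = REPORTING_VALUES[outcome.upper()]
    return values["verified"], values["code_review"]

for outcome in REPORTING_VALUES:
    print(outcome, vote_for(outcome))
```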
Comment 9 Eclipse Webmaster CLA 2010-12-22 09:53:05 EST
(In reply to comment #8)

All of the values on the sandbox match those you listed.

-M.
Comment 10 Matthias Sohn CLA 2010-12-22 19:16:54 EST
That's strange, something is wrong here. I can trigger a build from
https://hudson.eclipse.org/sandbox/gerrit_manual_trigger/?
but the other direction (gerrit notifying hudson) seems not to work.

Could you restart the gerrit connection from the main gerrit plugin configuration page
under "Manage Hudson > Gerrit Trigger" ? Maybe it needs a kick.
Comment 11 Eclipse Webmaster CLA 2010-12-23 10:20:59 EST
I pushed the 'restart' button and then the test connection button and it seemed happy.

-M.
Comment 12 Matthias Sohn CLA 2011-03-01 07:44:37 EST
Gerrit still doesn't trigger the builds when new changes arrive in the code review queue.

I filed a bug for the gerrit-trigger plugin [1] and they recommend restarting Hudson
to see if this helps. Could you try that at a time that suits the sandbox Hudson?
Could you also check if there are any log entries about Hudson failing to connect to Gerrit ?

Thx, Matthias

[1] http://issues.jenkins-ci.org/browse/JENKINS-8913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=146342#comment-146342
Comment 13 Eclipse Webmaster CLA 2011-03-01 10:57:39 EST
There are a couple of untrapped servlet errors on the 21st and 25th, but they seem to be related to the update engine.

I only see a few mentions of gerrit like this in the log:

Feb 28, 2011 11:11:18 AM com.sonyericsson.hudson.plugins.gerrit.trigger.gerritnotifier.ToGerritRunListener onTriggered
INFO: Project [jgit.gerrit] triggered by Gerrit: [[ManualPatchsetCreated Change: Change: 2171 PatchSet: PatchSet: 1]]
Feb 28, 2011 11:11:18 AM com.sonyericsson.hudson.plugins.gerrit.trigger.hudsontrigger.GerritTrigger schedule
INFO: Project jgit.gerrit Build Scheduled: true By event: 2171/1
Feb 28, 2011 11:11:21 AM com.sonyericsson.hudson.plugins.gerrit.trigger.gerritnotifier.ToGerritRunListener onStarted
INFO: Gerrit build [jgit.gerrit #16] Started for cause: [GerritCause: [ManualPatchsetCreated Change: Change: 2171 PatchSet: PatchSet: 1] silent: false].
Feb 28, 2011 11:11:21 AM com.sonyericsson.hudson.plugins.gerrit.trigger.gerritnotifier.ToGerritRunListener onStarted
INFO: MemoryStatus:
  Project/Build: [jgit.gerrit]: [#16: null] Completed: false

From Hudson's audit log I see only one entry:

Feb 28, 2011 11:11:21 AM hudson.plugins.audit_trail.AuditTrailPlugin$AuditTrailRunListener
CONFIG: job/jgit.gerrit/ #16 Manually triggered by user anonymous for Gerrit: http://egit.eclipse.org/r/2171

I've restarted the Hudson sandbox process.

-M.
Comment 14 Matthias Sohn CLA 2011-03-03 06:26:49 EST
I pushed some new changes to gerrit but the plugin still doesn't trigger a build for these
changes. Could you have another look in the server logs if there are any logs from the gerrit-trigger plugin ?
Comment 15 Eclipse Webmaster CLA 2011-03-03 09:49:27 EST
I'm not seeing anything captured in the system logs or in Hudson's event log (https://hudson.eclipse.org/sandbox/log/all) that seems to be related to Gerrit.

Using the Gerrit plugin's test function 'works' (it returns: success).  I hit the 'restart' button for the plugin to see if that produced anything but it seemed fine.

If you can tell me what logger 'class' to look for I'm willing to add another hudson eventlog.

-M.
Comment 16 Matthias Sohn CLA 2011-03-03 10:22:14 EST
Looks like restarting hudson and kicking "restart" did the trick. I pushed another change and now gerrit-trigger picked up the change :-)

Thanks for the help, I keep my fingers crossed that it keeps working now.
Comment 17 Matthias Sohn CLA 2011-03-03 10:30:26 EST
Found one more problem with this setup:

The link that gerrit-trigger puts in the comments in Gerrit
    http://egit.eclipse.org/r/#change,2632
points to the wrong Hudson:
    https://hudson.eclipse.org/hudson/job/jgit.gerrit/17/
Instead this should be
    https://hudson.eclipse.org/sandbox/job/jgit.gerrit/17/
as this job runs on the sandbox Hudson.

Is there a Hudson configuration parameter which can fix this ?
Comment 18 Eclipse Webmaster CLA 2011-03-03 10:41:53 EST
I've updated the 'hudsonurl' setting in the sandbox config (the only option that seemed to fit).  Let me know if that fixes this.

-M.
Comment 19 Matthias Sohn CLA 2011-03-03 10:58:36 EST
yeah, this fixed the problem
Comment 20 Matthias Sohn CLA 2011-03-04 04:39:30 EST
Unfortunately this only worked a couple of times, now again it's not picking up new changes.
Could you kick it again ? I will look into the gerrit-trigger sources to find out what it does in case it's experiencing connection problems which could be caused by e.g. spurious network failures.
Comment 21 Eclipse Webmaster CLA 2011-03-04 09:56:39 EST
I've 're-started' the Gerrit plugin.

-M.
Comment 22 Matthias Sohn CLA 2011-03-05 17:19:40 EST
Thanks. 

Again it worked a couple of times and then stopped working. 

Talked to Robert again [1]; it looks like we are suffering from [2]. Robert suggested deploying the latest nightly, which has a bug fix for [2]. Could you install that from [3] ?

[1] http://issues.jenkins-ci.org/browse/JENKINS-8913
[2] http://issues.jenkins-ci.org/browse/JENKINS-6965
[3] http://ci.jenkins-ci.org/job/plugins_gerrit-trigger-plugin/lastSuccessfulBuild/com.sonyericsson.hudson.plugins.gerrit$gerrit-trigger/
Comment 23 Denis Roy CLA 2011-03-07 09:48:12 EST
We don't typically run 'nightly'-quality code on eclipse.org servers.  Is this plugin not stable enough for production use?
Comment 24 Matthias Sohn CLA 2011-03-07 16:47:18 EST
We use this plugin at SAP very successfully in at least a hundred build jobs, and we don't face these connection problems there. Something seems to be different in the eclipse.org environment. According to Robert the mentioned patch [1] (setting keep-alive for the ssh connection to 30 secs) fixed the problem for other users.

Due to the Hudson / Jenkins fork there was no new release for a while. Robert is waiting for some more patches which are under way before he will create a new release. 

In our experience this plugin behaves well and is well maintained. We also tried the other gerrit plugin [2] and found that it provides much less functionality and that it puts a much higher load on hudson, since it polls gerrit for new changes whereas gerrit-trigger listens for events sent by gerrit via a permanently open ssh connection. It looks like it's this connection which is facing the connection timeout problem described in [3]. Could you check the logs if there are similar log entries ?

At SAP we recently switched to the same nightly and the patches we wanted had the promised effect. Could we agree on giving this nightly a try on sandbox hudson at least to verify if the contained patch fixes our problem ? Otherwise we'll wait until the next release of gerrit-trigger is available.

[1] https://github.com/jenkinsci/gerrit-trigger-plugin/commit/3640a2470d36df5b143489698583ec9b34757692
[2] http://wiki.jenkins-ci.org/display/JENKINS/Gerrit+Plugin
[3] http://issues.jenkins-ci.org/browse/JENKINS-6965
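
To illustrate the event-driven approach: Gerrit's stream-events output is one JSON object per line, and a listener only has to react to the "patchset-created" events. The sketch below is not the plugin's code; the function name is hypothetical and the two sample lines are hand-written for illustration, reduced to the fields used here.

```python
import json

def patchsets_created(lines):
    """Yield (change_number, patchset_number) for each patchset-created event."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines in the stream
        event = json.loads(line)
        if event.get("type") == "patchset-created":
            yield str(event["change"]["number"]), str(event["patchSet"]["number"])

# Hand-written sample events, not captured from a real server.
sample = [
    '{"type": "patchset-created", "change": {"number": "2632"}, "patchSet": {"number": "1"}}',
    '{"type": "comment-added", "change": {"number": "2632"}, "patchSet": {"number": "1"}}',
]
print(list(patchsets_created(sample)))
```

Because nothing arrives on this connection while no changes are uploaded, an idle stream that is silently dropped somewhere on the network would look exactly like "no new changes", which fits the symptoms described here.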
Comment 25 Denis Roy CLA 2011-03-07 16:59:09 EST
(In reply to comment #23)
> We don't typically run 'nightly'-quality code on eclipse.org servers.  Is this
> plugin not stable enough for production use?

Apologies -- I thought we were talking about a plugin that was running on the production Hudson instance.


> Could we agree on giving this nightly a try on sandbox
> hudson at least to verify if the contained patch fixes our problem ?

I discussed with Matt, and he agrees this is the way to go.
Comment 26 Matthias Sohn CLA 2011-03-08 09:39:00 EST
(In reply to comment #25)
> (In reply to comment #23)
> > We don't typically run 'nightly'-quality code on eclipse.org servers.  Is this
> > plugin not stable enough for production use?
> 
> Apologies -- I thought we were talking about a plugin that was running on the
> production Hudson instance.
> 
> 
> > Could we agree on giving this nightly a try on sandbox
> > hudson at least to verify if the contained patch fixes our problem ?
> 
> I discussed with Matt, and he agrees this is the way to go.

thanks, this helps. Let me know when the new version is deployed and just to be on the safe side also restart the connection to Gerrit after deploying the new version.
Comment 27 Eclipse Webmaster CLA 2011-03-08 13:34:12 EST
I've installed the snapshot plugin, restarted hudson and hit the gerrit restart button.  The connection test returns success.

-M.
Comment 28 Matthias Sohn CLA 2011-03-09 17:16:39 EST
Looks like the nightly brought the fix we needed.