Bug 366696 - Revert recent Hudson change
Summary: Revert recent Hudson change
Status: RESOLVED FIXED
Alias: None
Product: Community
Classification: Eclipse Foundation
Component: CI-Jenkins
Version: unspecified
Hardware: PC Linux
Importance: P3 normal
Target Milestone: ---
Assignee: Eclipse Webmaster CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 363222
 
Reported: 2011-12-14 09:22 EST by Eclipse Webmaster CLA
Modified: 2012-01-03 11:02 EST (History)
8 users

See Also:


Attachments

Description Eclipse Webmaster CLA 2011-12-14 09:22:15 EST
The change made last week to Hudson (http://dev.eclipse.org/mhonarc/lists/cross-project-issues-dev/msg06832.html) seems to be causing more pain than it fixed.

So I put this to the community: which setup do you prefer?

-M.
Comment 1 Jesse McConnell CLA 2011-12-14 10:07:02 EST
Denis asked for use-case descriptions, so I figured I would toss in how Jetty uses and abuses the Hudson setup.

---

Typical Jetty: Our normal jetty build (stock Maven usage) simply does a git checkout followed by a Maven build. It polls the SCM for updates and rebuilds as necessary. We have two builds that operate like this, one for jetty7 and one for jetty8. These builds operate on any machine and are self-contained.
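
(For illustration only, a minimal Ant sketch of such a build step; the repository URL, directory and target names are placeholders, and the SCM polling itself lives in the Hudson job configuration rather than in any script.)

    <project name="jetty-typical-build-sketch" default="build">
        <!-- Hypothetical sketch, not the real Jetty job configuration. -->
        <property name="git.url" value="PLACEHOLDER-jetty-repository-url"/>
        <property name="checkout.dir" value="checkout"/>
        <target name="build">
            <!-- git checkout of the sources -->
            <exec executable="git" failonerror="true">
                <arg value="clone"/>
                <arg value="${git.url}"/>
                <arg value="${checkout.dir}"/>
            </exec>
            <!-- stock Maven build -->
            <exec executable="mvn" dir="${checkout.dir}" failonerror="true">
                <arg value="clean"/>
                <arg value="install"/>
            </exec>
        </target>
    </project>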

---

Atypical Jetty: To produce our jetty bundles and jetty products (the p2-contributed stuff), we have two chains of three builds each.

jetty7:
jetty-nightly -> jetty-rt-bundles -> jetty-rt-products

jetty8:
jetty-nightly-8 -> jetty-rt-bundles-8 -> jetty-rt-products-8

These build chains have to operate on the same filesystem. To keep a clean local repository, I have the builds massaged to put the local Maven repository under /tmp/jetty-builds/jetty[7|8]/localRepo, and they all run on Fastlane because that was the only machine that seemed to work when I was trying to get this going a while back.
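
(As a rough sketch, assuming the Maven call is wrapped in a script, the "massaging" boils down to pointing Maven at a chain-specific local repository; the property names here are made up.)

    <project name="jetty-chain-localrepo-sketch" default="build">
        <!-- Hypothetical sketch of isolating the local Maven repository per build chain. -->
        <property name="jetty.stream" value="7"/>
        <property name="local.repo" value="/tmp/jetty-builds/jetty${jetty.stream}/localRepo"/>
        <target name="build">
            <exec executable="mvn" failonerror="true">
                <arg value="-Dmaven.repo.local=${local.repo}"/>
                <arg value="clean"/>
                <arg value="install"/>
            </exec>
        </target>
    </project>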

Anyway, the nightly job installs all of the jetty artifacts into the local Maven repository. The rt-bundles job then consumes the Maven artifacts out of the local repository and converts them into a signed p2 repository containing the conditioned jetty bundles. This p2 repository is then pushed over to our download space under jetty/updates/jetty-bundles-[7|8].x/development so the product build (and anyone needing nightly jetty snapshots) can consume them. The product build then generates the jetty products and conditions those, shoving them under the download site at jetty/updates/jetty-products-[7|8].x/development for safekeeping.
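
(A hedged Ant sketch of the "push to the download space" step; the directory names below are guesses at the layout described above, not the actual build scripts.)

    <project name="jetty-publish-sketch" default="publish">
        <!-- Hypothetical sketch: copy the signed p2 repository into the download area. -->
        <property name="jetty.stream" value="7"/>
        <property name="p2.repo.dir" value="target/p2-repository"/>
        <property name="download.dir"
                  value="/home/data/httpd/download.eclipse.org/jetty/updates/jetty-bundles-${jetty.stream}.x/development"/>
        <target name="publish">
            <copy todir="${download.dir}">
                <fileset dir="${p2.repo.dir}"/>
            </copy>
        </target>
    </project>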

When we want to release these bundles and products, we use the build configuration of those jobs to specify the exact version we want to release. It operates in the same fashion, except that instead of shoving things under development on the download site, it puts them under a version-specific URL.

Yes, this is a house of cards and it's a pain to maintain through the changes that have been going on. Some we are isolated from, but others, like that corrupt-artifacts issue, have torpedoed our ability to consistently and _calmly_ keep this system working.
Comment 2 Dennis Huebner CLA 2011-12-14 10:43:36 EST
(In reply to comment #0)
> The change made last week to
> hudson(http://dev.eclipse.org/mhonarc/lists/cross-project-issues-dev/msg06832.html)
> seems to be causing more pain than it fixed.
> 
> So I put this to the community: which setup do you prefer?
> 
> -M.
I would like to explain what our needs are:
First requirement: 
Our Xtext-nightly-HEAD job depends on two others (Xpand and MWE). We need the artifacts archived by those jobs (last successful or an exact build number). This is normally done using HTTP(S) URLs, but that is error-prone and slow, so we switched to file URIs.
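
(For example, a minimal Ant sketch of the two retrieval styles; the job name, artifact name, URL and /shared path are placeholders, not the exact configuration.)

    <project name="xtext-upstream-fetch-sketch" default="fetch-file">
        <!-- Hypothetical sketch: two ways to pick up an artifact archived by an upstream job. -->
        <property name="upstream.job" value="PLACEHOLDER-mwe-job"/>
        <property name="artifact" value="PLACEHOLDER-site.zip"/>
        <target name="fetch-http">
            <!-- slow and error-prone variant: download over HTTP(S) from Hudson -->
            <mkdir dir="build"/>
            <get src="https://hudson.eclipse.org/hudson/job/${upstream.job}/lastSuccessfulBuild/artifact/${artifact}"
                 dest="build/${artifact}" usetimestamp="true"/>
        </target>
        <target name="fetch-file">
            <!-- faster variant: plain file access to the archived build -->
            <copy file="/shared/jobs/${upstream.job}/lastSuccessful/archive/${artifact}"
                  tofile="build/${artifact}" overwrite="true"/>
        </target>
    </project>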

Second requirement:
Hudson archives some artifacts that our jobs, e.g. Xtext-nightly-HEAD, produce. These are some zip files, test reports, a p2 repository, a promotion Ant script and some build-dependent property files. We keep the latest five builds and some release builds.
These artifacts are promoted/published once a day using the archived promotion script mentioned above. It's important that not only the latest build but all archived builds are accessible, so that we can redeploy a particular build using its build number (hudson-jobs-location/Xtext-nightly-HEAD/builds/1234/). I launch this promotion script with my user id, either manually or as a cron job (nightly builds). Promotion is nothing more than copying some files to location X, expanding some zips to location Y using properties under location Z, and cleaning up some old files. All of this has to be done with my user id or our project group id (the project's download location is write-protected for "other").
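
(Roughly, in Ant terms, with entirely made-up file and directory names:)

    <project name="xtext-promote-sketch" default="promote">
        <!-- Hypothetical sketch of the promotion steps described above. -->
        <property name="build.archive" value="/shared/jobs/Xtext-nightly-HEAD/builds/1234/archive"/>
        <property name="location.X" value="PLACEHOLDER-download-location"/>
        <property name="location.Y" value="PLACEHOLDER-updates-location"/>
        <!-- build-dependent properties (the "location Z" part) loaded from the archive -->
        <property file="${build.archive}/promote.properties"/>
        <target name="promote">
            <!-- copy some files to location X -->
            <copy todir="${location.X}">
                <fileset dir="${build.archive}" includes="*.zip"/>
            </copy>
            <!-- expand some zips to location Y -->
            <unzip src="${build.archive}/p2-repository.zip" dest="${location.Y}"/>
            <!-- clean up some old files -->
            <delete>
                <fileset dir="${location.X}" includes="*-old-*.zip"/>
            </delete>
        </target>
    </project>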

Sounds simple, doesn't it? :)

I hope this helps somehow in finding a good solution for this NFS dilemma.

By the way, the "could not delete .nfs file" error also came up in some of our unit tests, during JUnit workspace cleanup.
Comment 3 Bouchet Stéphane CLA 2011-12-14 10:55:48 EST
My turn to explain our build process.

I manage all the Obeo modeling jobs [1] in almost the same way.

- one job per project branch/trunk
- jobs have parameters like buildType (NISR) and build_alias for promotion and artifact creation
- each job is separate and produces the necessary files to be promoted (no chain)

We then use our SSH account on build.eclipse.org to execute an Ant script that promotes the artifacts into the download area. These scripts are also called from cron.

The main problem is accessing the job workspace from outside Hudson, because the Ant scripts use /shared to get the artifacts rather than HTTP.
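
(In Ant terms, the current scripts do something roughly like this; the job name and paths are placeholders.)

    <project name="obeo-promote-sketch" default="promote">
        <!-- Hypothetical sketch: the promotion script reads the artifacts straight
             from the job workspace under /shared instead of going through HTTP. -->
        <property name="job.name" value="PLACEHOLDER-obeo-job"/>
        <property name="download.dir" value="PLACEHOLDER-download-area"/>
        <target name="promote">
            <copy todir="${download.dir}">
                <fileset dir="/shared/jobs/${job.name}/workspace/build/output" includes="*.zip"/>
            </copy>
        </target>
    </project>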

Of course, we could change ALL our scripts to do a GET from Hudson, but accessing the artifacts directly via the filesystem is clearly faster and more secure.

As a side note, I am managing builds using Tycho, Buckminster and the old CBI Athena, all of which rely on /shared/jobs paths.

[1] https://hudson.eclipse.org/hudson/user/sbouchet/my-views/view/Obeo/
Comment 4 Eike Stepper CLA 2011-12-14 11:19:58 EST
(In reply to comment #0)
> So I put this to the community: which setup do you prefer?

In our case the Hudson build is also an important card in the house, but it's not the one that's impacted by this Hudson change. We have a very powerful automatic promotion service, run as a cron job. It consists of a small bash script that is supposed to reliably determine, in minimum time, whether new Hudson builds are available for promotion, and only if so to start the promoter. This script uses the "internal" nextBuildNumber files of the Hudson jobs. If a change is detected, a Java application is started to copy over the build and do lots of things with it.
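
(The real thing is a small bash cron script; purely to illustrate the idea, here is an equivalent Ant sketch with invented job and file names.)

    <project name="check-and-promote-sketch" default="check">
        <!-- Hypothetical sketch: compare the job's nextBuildNumber with the last value
             we promoted, and only start the (Java) promoter when it has changed. -->
        <property name="job.dir" value="/shared/jobs/PLACEHOLDER-cdo-job"/>
        <property name="state.file" value="${user.home}/.promoted-build-number"/>
        <target name="check">
            <loadfile property="next.build" srcFile="${job.dir}/nextBuildNumber">
                <filterchain><striplinebreaks/></filterchain>
            </loadfile>
            <loadfile property="last.promoted" srcFile="${state.file}" failonerror="false"/>
            <condition property="new.build.available">
                <not><equals arg1="${next.build}" arg2="${last.promoted}"/></not>
            </condition>
            <antcall target="promote"/>
        </target>
        <target name="promote" if="new.build.available">
            <java jar="promoter.jar" fork="true" failonerror="true"/>
            <echo file="${state.file}" message="${next.build}"/>
        </target>
    </project>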

Back to your question: 

We prefer the old setup because it has always worked well for us, and the new setup creates a lot of (otherwise unnecessary) effort for us.
Comment 5 David Williams CLA 2011-12-14 11:53:36 EST
The recent changes broke the aggregation build we do for the Simultaneous Release, so I'll explain what I was doing, why, and the changes I made (am making) to "fix" it.

I was "writing" directly to my job's "build area" by specifying that area for the "eclipse workspace" (for part of the build that uses Eclipse) .. on Eclipse's -data argument.  

In short, this was accomplished with a property similar to 

        <property
            name="buildWorkarea"
            value="${jobsHome}/jobs/${jobName}/builds/${buildTimestamp}"/> 

Where "jobsHome" was the one thing that was "hard coded" elsewhere in my scripts (the others being derived from "normal" Hudson variables, publically "advertised" for use. 

The _reason_ I did this was that, I'd swear, I saw it recommended somewhere, years ago, in a bug report or something (though I could not find it when I went to look for it), as an easy way to a) save a copy of your Eclipse workspace easily (in case I needed to look at the log later, for errors, etc.) and b) not have to worry about "cleaning it"; Hudson would do that automatically as it erased job history, etc.

So, in my case, I temporarily "moved" the log to the shared area where other build output goes, /shared/juno.
That's not a bad solution, but I need to worry about cleaning/managing/saving that myself. I think there are better ways, namely to simply use "the current workspace", but then I need to tell Hudson explicitly what to save from there, for each job (e.g. the .log). I'm sure I can do that, it's just a little more work for me. So ... just providing data for this bug entry. I think even if you "reverted" the change, I'd try not to write to
${jobsHome}/jobs/${jobName}/builds/${buildTimestamp}
since, when I tried, I could not find where I originally thought that was recommended ... I may have misunderstood or "over-interpreted" the variables that Hudson provides.
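
(For reference, a hedged sketch of how a workspace location like that would be passed to Eclipse via -data from Ant; the launcher jar path, application and job name are placeholders, and in the real job the jobName/buildTimestamp values came from Hudson-provided variables.)

    <project name="eclipse-data-sketch" default="run">
        <!-- Hypothetical sketch: point Eclipse's -data (workspace) at the chosen build area. -->
        <property name="eclipse.home" value="PLACEHOLDER-eclipse-install"/>
        <property name="jobsHome" value="PLACEHOLDER-jobs-home"/>
        <property name="jobName" value="PLACEHOLDER-job-name"/>
        <tstamp>
            <format property="buildTimestamp" pattern="yyyy-MM-dd_HH-mm-ss"/>
        </tstamp>
        <property name="buildWorkarea"
                  value="${jobsHome}/jobs/${jobName}/builds/${buildTimestamp}"/>
        <target name="run">
            <java jar="${eclipse.home}/plugins/org.eclipse.equinox.launcher.jar"
                  fork="true" failonerror="true">
                <arg value="-application"/>
                <arg value="org.eclipse.ant.core.antRunner"/>
                <arg value="-data"/>
                <arg value="${buildWorkarea}"/>
            </java>
        </target>
    </project>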
Comment 6 David Williams CLA 2011-12-14 13:59:50 EST
BTW, no one has mentioned "when" to revert, but since this is the very end of M4, I'd recommend waiting until after M4 to revert. I think those who were having issues with this have "manually" done what they needed for M4 and are not blocked. If all goes as planned, the M4 bits will be done/ready by Friday morning, so technically the revert could be done on Friday ... or the first of next week might be better? While the revert is being done to "make things like they were", I would not be too surprised if some (new?) glitches were exposed (which might interfere with delivering M4, if done right now). IMHO
Comment 7 Denis Roy CLA 2011-12-14 14:01:56 EST
(In reply to comment #6)
> BTW, no one has mentioned "when" to revert, but, since this is the very end of
> M4, I'd recommend waiting

I just read this three minutes too late.  I was under the impression that the change was blocking many from contributing to m4, so I've asked Matt to "undo" the change while he's waiting for a full restart of Hudson.
Comment 8 David Williams CLA 2011-12-14 14:20:32 EST
(In reply to comment #7)
> (In reply to comment #6)
> 
> I just read this three minutes too late.  I was under the impression that the
> change was blocking many from contributing to m4, so I've asked Matt to "undo"
> the change while he's waiting for a full restart of Hudson.

Ok ... keep your fingers crossed :/ ... and it could have been blocking someone ... maybe I lost track.

Thanks for your support.
Comment 9 Eclipse Webmaster CLA 2011-12-14 15:54:46 EST
Ok, I've finished the data replication and Hudson has been restarted.

-M.
Comment 10 Matthias Sohn CLA 2011-12-14 18:05:51 EST
EGit ran into the problem of not being able to wipe the workspace several times. We always needed to wipe the workspace in order to delete the job-private Maven repository, to recover from corrupt artifacts cached there. If the job-private Maven repository were mounted on a local filesystem and we had a way to erase it separately, that would avoid the need to wipe the entire workspace when the private Maven repository gets corrupted. When referring to build results from other build jobs (e.g. JGit), we do this over HTTP.
Comment 11 Eclipse Webmaster CLA 2012-01-03 11:02:51 EST
Closing.

-M.