| Summary: | Set up a high-priority slave | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Community | Reporter: | Denis Roy <denis.roy> | ||||
| Component: | CI-Jenkins | Assignee: | Denis Roy <denis.roy> | ||||
| Status: | RESOLVED FIXED | QA Contact: | |||||
| Severity: | normal | ||||||
| Priority: | P3 | CC: | david_williams, d_a_carver, kim.moir, webmaster | ||||
| Version: | unspecified | ||||||
| Target Milestone: | --- | ||||||
| Hardware: | PC | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Attachments: |
|
||||||
|
Description
Denis Roy
Only issue I can see here: when release time comes around, every project on the release train is going to try to use Slave 3, so you will get a backlog there.

Additional concerns: how many concurrent executors will Slave 3 have? If a job takes 6 hrs or longer to run, it'll tie up an executor for that entire time. It would be great to have another slave. Not sure how you will enforce "fast lane" use, given that by default all slaves in the install are available for use. Another regular slave might just solve the resource issue.

I agree with Kim: just add another regular slave to the existing "build" group, and if projects are set up right, their jobs will try to run on any available slave in that group.

That's all swell, but there is concern that during a release, while everyone is trying to put bits out, a bunch of nightlies or CI builds (which could perhaps otherwise wait a while) will just occupy the queue and processing resources.

> Not sure how you will enforce "fast lane"
I didn't intend to. I figured social convention would take care of that.
Closing as RESOLVED/STUPID IDEA.
It wasn't a stupid idea :-) I just wasn't sure how you were going to enforce it. If I really needed a slave during a release and the queue was full of nightly builds, I would personally just send a note to cross project and ask if a nightly build could be stopped to allow me to build :-)

Not sure if this is doable, but if release time is a concern, then you might actually want a separate farm of slaves just for releases. They can be managed by the same Hudson instance, but a new release-specific job can be created and tied to that particular farm. It's like having test, QA, and release environments, but with build servers. It all depends on how complex we want to get.

I actually like this idea. I'm not sure it would solve the entire problem of "crunch time", but I think it might be a step in the right direction. We committers and release engineers must start to develop a culture that includes priority and limited resources. I know it usually seems like we have unlimited resources ... because Denis does such a good job of accommodating us :) ... but every year we end up filling up all available resources at some point, and eventually it will make us miss our dates, or at least add a lot of last-minute stress, confusion and uncertainty.

As one example, almost no one sets "niceness" ... I understand no one wants to "be last", and also, I've heard, there's nothing in Hudson that helps with priorities ... so doing priority-by-slave might be a start. And I'd say it'd be fine to have some strict rules in addition to "social conventions". For example, I think we could say the high-priority slave is for builds completing in under 20 minutes, at most, and jobs over that time will be automatically killed. (Then we'd probably set the auto-time-limit to 30 minutes, just to accommodate variability.) And then there would still be social conventions to say nightlies should not be run there (even if short and quick).
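For illustration, the proposed rule (builds expected to finish within 20 minutes, with an automatic kill at 30 minutes) could be sketched roughly as follows. This is only a standalone model of the policy, not a Hudson feature or plugin; the names `RunningBuild`, `classify` and `builds_to_kill`, and the example job names, are all made up for the sketch.

```python
from dataclasses import dataclass

# Thresholds taken from the proposal above.
SOFT_LIMIT_MIN = 20  # builds on the fast lane are expected to finish within this
HARD_LIMIT_MIN = 30  # builds running longer than this would be killed


@dataclass
class RunningBuild:
    """Hypothetical record of a build currently running on the fast-lane slave."""
    job: str
    elapsed_min: float


def classify(build: RunningBuild) -> str:
    """Return 'ok', 'over-budget' (past 20 min) or 'kill' (past 30 min)."""
    if build.elapsed_min > HARD_LIMIT_MIN:
        return "kill"
    if build.elapsed_min > SOFT_LIMIT_MIN:
        return "over-budget"
    return "ok"


def builds_to_kill(builds: list) -> list:
    """Names of the jobs that have exceeded the hard limit."""
    return [b.job for b in builds if classify(b) == "kill"]


# Example with made-up job names and run times.
running = [
    RunningBuild("quick-release-build", 12),
    RunningBuild("slow-release-build", 24),
    RunningBuild("stray-nightly", 95),
]
print(builds_to_kill(running))
```

The gap between the two thresholds is the allowance for variability mentioned above: a build that drifts past 20 minutes is flagged but not killed until it passes 30.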
I think the benefit of this approach is that there is some "reward" to encourage people to do the right/best thing ... short, quick builds can be done on the HOV fast-lane server. If we don't do something like this, I think we'll end up with the stick instead of the carrot ... such as telling that pesky webtools team that their builds just take too many resources and we'll have to publicly shame them into making improvements :) Or we'll scurry around at the last minute asking people if we can kill those equinox nightly tests :)

So, again, I don't think this would completely solve the whole problem of traffic jams at releases and milestones ... but it might help get people thinking in terms of the resources they use and how to prioritize their work.

Another approach -- to increase awareness -- might be to track how much of the build server's CPU or other resources each project uses ... similar to how we track disk space usage. But ... not sure how to do that sort of tracking, exactly. Might be hard to implement anything meaningful.

Created attachment 187424 [details]
screenshot
Matt has set up a slave called 'fastlane' which resides on the same host as the master, so it has plenty of CPU cycles. However, since most jobs are configured to use any slave in the group, some nightly jobs are using the fastlane as depicted in the screenshot.
Matt, is there a way we can mark fastlane as not being part of the regular node group, so that jobs must be tied to it explicitly?
I've set its availability to 'Leave this machine for tied jobs only', which seems to be our only option. I've taken a look at the jobs that have run, and I don't see any indication that they are 'tied' to this node, so they should simply run elsewhere. -M.

We're done here. Thanks Matt. |
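As a rough illustration of what the 'Leave this machine for tied jobs only' setting is meant to achieve, here is a simplified model of node selection. It is an assumption-laden sketch, not Hudson's actual scheduler: the `Node` and `Job` classes, the `exclusive` flag, and the node names `slave1` and `slave2` are invented for the example, and a job is assumed to carry at most one label.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Simplified stand-in for a build slave."""
    name: str
    labels: set = field(default_factory=set)
    exclusive: bool = False  # models 'Leave this machine for tied jobs only'


@dataclass
class Job:
    """Simplified stand-in for a job; label is None when the job is not tied."""
    name: str
    label: Optional[str] = None


def can_run(node: Node, job: Job) -> bool:
    """A job with no label may use any non-exclusive node; a labelled job
    may only use nodes whose name or label set matches its label."""
    if job.label is None:
        return not node.exclusive
    return job.label == node.name or job.label in node.labels


def eligible_nodes(nodes: list, job: Job) -> list:
    """Names of all nodes the job is allowed to run on."""
    return [n.name for n in nodes if can_run(n, job)]


nodes = [
    Node("slave1", {"build"}),
    Node("slave2", {"build"}),
    Node("fastlane", exclusive=True),
]

nightly = Job("nightly", label="build")     # tied to the regular group
untied = Job("ci")                          # not tied to anything
release = Job("release", label="fastlane")  # explicitly tied to fastlane

print(eligible_nodes(nodes, nightly))
print(eligible_nodes(nodes, untied))
print(eligible_nodes(nodes, release))
```

In this model, jobs tied to the "build" group and untied jobs stay off `fastlane`, and only a job tied to it by name lands there. Note that if `fastlane` also carried the "build" label, jobs tied to that group would still match it, so the node would need to stay out of the group.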