
Bug 479460

Summary: Would appreciate help with performance machine ...
Product: Community
Reporter: David Williams <david_williams>
Component: CI-Jenkins
Assignee: CI Admin Inbox <ci.admin-inbox>
Status: RESOLVED FIXED
QA Contact:
Severity: normal
Priority: P3
CC: denis.roy, mikael.barbero, nobody, webmaster
Version: unspecified
Target Milestone: ---
Hardware: PC
OS: Linux
Whiteboard:

Description David Williams CLA 2015-10-09 15:02:19 EDT
I'd appreciate help with 3 things. 

This is the performance machine with URL, 
https://hudson.eclipse.org/perftests/
(if there's any confusion). 

1. Our jobs were running out of disk space. I cleaned up as much as I think is reasonable, and it still shows 75% full. It appears 55G is allocated. I'm not sure where it all goes, since after my cleaning the "job use" is as below, which does not appear to be anywhere near 55G (I guess the rest is OS and/or infrastructure?). Note also that the "manage hudson" page warns about "low on disk space".

29M	/home/hudson/hudsonbuild/.hudson/jobs/ep44MLR-perf-lin64
33M	/home/hudson/hudsonbuild/.hudson/jobs/ep44MLR-perf-lin64-baseline
30M	/home/hudson/hudsonbuild/.hudson/jobs/ep44M-perf-lin64
28M	/home/hudson/hudsonbuild/.hudson/jobs/ep44M-perf-lin64-baseline
38M	/home/hudson/hudsonbuild/.hudson/jobs/ep45ILR-perf-lin64
37M	/home/hudson/hudsonbuild/.hudson/jobs/ep45ILR-perf-lin64-baseline
30M	/home/hudson/hudsonbuild/.hudson/jobs/ep45I-perf-lin64
29M	/home/hudson/hudsonbuild/.hudson/jobs/ep45I-perf-lin64-baseline
1.3G	/home/hudson/hudsonbuild/.hudson/jobs/ep45MLR-perf-lin64
1.4G	/home/hudson/hudsonbuild/.hudson/jobs/ep45MLR-perf-lin64-baseline
1.3G	/home/hudson/hudsonbuild/.hudson/jobs/ep45M-perf-lin64
1.3G	/home/hudson/hudsonbuild/.hudson/jobs/ep45M-perf-lin64-baseline
38M	/home/hudson/hudsonbuild/.hudson/jobs/ep45N-perf-lin64
28M	/home/hudson/hudsonbuild/.hudson/jobs/ep45N-perf-lin64-baseline
16K	/home/hudson/hudsonbuild/.hudson/jobs/ep46ILR-perf-lin64
277M	/home/hudson/hudsonbuild/.hudson/jobs/ep46ILR-perf-lin64-baseline
1.3G	/home/hudson/hudsonbuild/.hudson/jobs/ep46I-perf-lin64
1.3G	/home/hudson/hudsonbuild/.hudson/jobs/ep46I-perf-lin64-baseline
1.3G	/home/hudson/hudsonbuild/.hudson/jobs/ep46N-perf-lin64
1.3G	/home/hudson/hudsonbuild/.hudson/jobs/ep46N-perf-lin64-baseline
410K	/home/hudson/hudsonbuild/.hudson/jobs/ep-collectResults
20K	/home/hudson/hudsonbuild/.hudson/jobs/ep-perf-lin64
48K	/home/hudson/hudsonbuild/.hudson/jobs/ep-probe
16K	/home/hudson/hudsonbuild/.hudson/jobs/ep-unit

2. Can you please update Hudson to the latest version? Version 3.2.2.

3. From what I can tell, that Hudson instance is "running" under the "root" userid. Not sure if that's intentional, or an oversight, but it doesn't seem like a good idea? Well, maybe it isn't running as root after all. If I use "whoami", I get
whoami:  hudsonbuild
but if I print all env variables, I see 
USER=root. 

So ... maybe that "USER" is some fake thing? 
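
The discrepancy in item 3 can be checked directly. `whoami` prints the name associated with the effective user ID, while `USER` is only an environment variable the process inherited, so the two can disagree when a service is started from a different account without resetting the environment. The following is a minimal sketch, not something taken from the perftests machine; the function names `effective_user` and `user_mismatch` are hypothetical.

```python
import os
import pwd


def effective_user():
    """Name of the account the process actually runs as (what whoami reports)."""
    uid = os.geteuid()
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # No passwd entry for this UID (possible in minimal containers).
        return str(uid)


def user_mismatch(env=None):
    """Return (effective_user, $USER, differs?) for the given environment."""
    env = os.environ if env is None else env
    actual = effective_user()
    claimed = env.get("USER")  # inherited variable, not checked by the kernel
    return actual, claimed, claimed is not None and claimed != actual
```

Calling `user_mismatch()` from inside a job would report whether `USER` is stale; permissions are governed by the effective user, not by the variable.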

As always, I defer to your expertise. 

Thanks,
Comment 1 Mikaël Barbero CLA 2015-10-10 04:48:56 EDT
1) Your numbers match what 'du -sh' gives on the command line. So jobs are using ~11G currently. /tmp is 704M, so the space is not used there either. However, /opt/users/hudsonbuild is using about 25GB and there are files in there that have been modified in the last 30 days. I don't want to clean it myself as it still seems relevant. Let me know what I should do with this folder. FYI, /opt/users/hudsonbuild is "user.home" / $HOME for the hudsonbuild user.

2) I can do the update to 3.3.1 if you want (the very latest Hudson version). I can do the update on Monday. Is it ok for you?

3) Fixed. Thanks.
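
The check described in point 1 (how much a directory holds, and whether anything in it was modified in the last 30 days) could be sketched as below. This is an illustration only: `dir_usage` is a hypothetical helper, and it sums apparent file sizes, so its totals will differ somewhat from what 'du -sh' reports in disk blocks.

```python
import os
import time


def dir_usage(root, recent_days=30, now=None):
    """Return (total_bytes, recent_file_count) for all files under root."""
    now = time.time() if now is None else now
    cutoff = now - recent_days * 86400  # seconds in a day
    total = 0
    recent = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            total += st.st_size
            if st.st_mtime >= cutoff:
                recent += 1
    return total, recent
```

A non-zero second value for a directory such as /opt/users/hudsonbuild is the signal that it may still be in use and should not be deleted blindly.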
Comment 2 David Williams CLA 2015-10-11 15:57:14 EDT
(In reply to Mikael Barbero from comment #1)
> 1) your numbers match what 'du -sh' gives on the commandline. So jobs are
> using ~11G currently. /tmp is 704M so the space is not used there neither.
> However /opt/users/hudsonbuild is using about 25GB and there are files in
> there that have been modified in the last 30 days. I don't want to clean it
> by myself as it seems still relevant. Let me know what I should do with this
> folder. FYI, /opt/users/hudsonbuild is "user.home" / $HOME for hudsonbuild
> user.

Remember that $HOME and $HUDSON_HOME overlap, so some data is counted twice. But, not sure it matters. I'd leave the 'hudsonbuild' home alone, for now.
I think that 55G is too small for /dev/xvda1. The more we run, the larger it will need to be (to a point). I think it would have to be around 85G to maintain a "steady, minimum state" of about 50% full (and that steady state would grow at times, so it might end up 70 or 80% used before being reduced again to 50% used). I'm not sure if /dev/xvda1 is literally a hard disk partition (I think so) or some sort of virtual drive that would be easy to adjust. But if it cannot be done "right now", please put it in your plans to increase. (As always, more is better, if that can be worked out, since, again, the more we do, and the longer we do it, the more disk space will be used.)
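
One way to arrive at a figure near 85G is the arithmetic sketched below. It assumes the steady-state data volume is roughly what is on disk now (75% of 55G, as reported in the description); that assumption and the helper name `required_size_gb` are illustrative, not something stated in this bug.

```python
def required_size_gb(used_gb, target_fraction):
    """Disk size needed so that used_gb occupies target_fraction of it."""
    if not 0 < target_fraction <= 1:
        raise ValueError("target_fraction must be in (0, 1]")
    return used_gb / target_fraction


# Assumption: current usage is about 75% of the 55G volume.
used_now_gb = 0.75 * 55                         # 41.25
needed_gb = required_size_gb(used_now_gb, 0.5)  # 82.5
```

Under that assumption the volume would need about 82.5G to sit at 50% full, which is in line with the "around 85G" estimate.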


> 2) I can do the update to 3.3.1 if you want (the very latest Hudson
> version). I can do the update on Monday. Is it ok for you?

Yes, please. Monday would be fine. I will warn you though, when I made the change on my local test machine, I "lost" all my configurations. So, be sure to make a backup of everything in 'jobs' that can be restored if necessary. Be sure all existing plugins are installed/updated before starting the new version (with the old configs); I think that's where I went wrong. But I was changing "lots of stuff", so I might have made other errors too. Luckily it wasn't too hard to restore the config.xml files ... but I'd hate to have to do it again!
Comment 3 Mikaël Barbero CLA 2015-10-12 08:26:32 EDT
(In reply to David Williams from comment #2)
> (In reply to Mikael Barbero from comment #1)
> > 1) your numbers match what 'du -sh' gives on the commandline. So jobs are
> > using ~11G currently. /tmp is 704M so the space is not used there neither.
> > However /opt/users/hudsonbuild is using about 25GB and there are files in
> > there that have been modified in the last 30 days. I don't want to clean it
> > by myself as it seems still relevant. Let me know what I should do with this
> > folder. FYI, /opt/users/hudsonbuild is "user.home" / $HOME for hudsonbuild
> > user.
> 
> Remember that $HOME and $HUDSON_HOME overlap, so some data counted twice.

Actually, not on the perf tests machine. It is true on the HIPP machines, but for some reason it is not the case here.

> But, not sure it matters. I'd leave 'hudsonbuild' home alone, for now. 
> I think that 55G is too small for /dev/xvda1.  The more we run, the larger
> is will need to be (to a point). I think it would have to be around 85G to
> maintain a "steady, minimum state" of about 50% full (and, that steady state
> would grow, at times so it might end up 70 or 80% used, before being reduced
> again to 50% used. Not sure if /dev/xvda1 is a literally hard disk partition
> (I think so) or some of of virtual drive that'd be easy to adjust. But, if
> can not be done "right now", please put it in your plans to increase. (As
> always, more is better, if that can be worked out! since, again, the more we
> do, and longer we do it, the more disk space will be used. 

Denis, Matt, can we do something about this?
 
> > 2) I can do the update to 3.3.1 if you want (the very latest Hudson
> > version). I can do the update on Monday. Is it ok for you?
> 
> Yes, please. Monday would be fine. I will warn you though, when I made the
> change on my local test machine, I "lost" all my configurations. So, be sure
> to make a back up of everything in 'jobs', that can be restored if
> necessary. Be sure all existing plugins are installed/updated, before
> starting the new version (with the old configs) I think that's where I went
> wrong? But, I was changing "lots of stuff", so I might have made other
> errors too. Luckily wasn't too hard to restore config.xml's ... but, I'd
> hate to have to do it again!

I am doing that now. Of course, I will do a full backup before touching anything.
Comment 4 David Williams CLA 2015-10-12 08:42:22 EDT
(In reply to Mikael Barbero from comment #3)
> (In reply to David Williams from comment #2)
> > (In reply to Mikael Barbero from comment #1)
> > > 1) your numbers match what 'du -sh' gives on the commandline. So jobs are
> > > using ~11G currently. /tmp is 704M so the space is not used there neither.
> > > However /opt/users/hudsonbuild is using about 25GB and there are files in
> > > there that have been modified in the last 30 days. I don't want to clean it
> > > by myself as it seems still relevant. Let me know what I should do with this
> > > folder. FYI, /opt/users/hudsonbuild is "user.home" / $HOME for hudsonbuild
> > > user.
> > 
> > Remember that $HOME and $HUDSON_HOME overlap, so some data counted twice.
> 
> Actually, not on the perf tests machine. It is true on HIPP machine but for
> some reasons, here it is not.
> 

Are we looking at the same thing? 
Can you see 
https://hudson.eclipse.org/perftests/view/Eclipse%20and%20Equinox/job/ep-probe/38/console

I see there, when running 'env', 

USER_HOME=/home/hudson/hudsonbuild
HOME=/home/hudson/hudsonbuild
HUDSON_HOME=/home/hudson/hudsonbuild/.hudson

Are there some sort of symbolic links, or something, I don't see?
Comment 5 Mikaël Barbero CLA 2015-10-12 08:47:59 EDT
(In reply to David Williams from comment #4)
> Are we looking at the same thing? 
> Can you see 
> https://hudson.eclipse.org/perftests/view/Eclipse%20and%20Equinox/job/ep-probe/38/console
> 
> I see there, when running 'env', 
> 
> USER_HOME=/home/hudson/hudsonbuild
> HOME=/home/hudson/hudsonbuild
> HUDSON_HOME=/home/hudson/hudsonbuild/.hudson
> 
> Are there some sort of symbolic links, or something, I don't see?

It may be changed when hudson is started, but the home folder (regarding its unix account) of the user hudsonbuild is /opt/users/hudsonbuild while hudson is started from /home/hudson/hudsonbuild and the jobs are in /home/hudson/hudsonbuild/.hudson/jobs.

I will check on that after the update.
Comment 6 Mikaël Barbero CLA 2015-10-12 09:14:19 EDT
The upgrade is now completed. I updated all the plugins. Let me know if you see anything weird.

Regarding the HOME, it is what I suspected: the init script that runs hudson resets the HOME folder. So the 25G in /opt/users/hudsonbuild really is old data. In particular, I see a 15G /opt/users/hudsonbuild/.hudson/jobs.droy/ dating from 2012.

I think I can safely remove a lot of things from there. That would buy us some time before upgrading the disk space of this machine (see comment #3). WDYT?
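
The HOME finding can be confirmed from inside a running job by comparing the home directory recorded for the Unix account with the `HOME` variable the process actually sees. The sketch below is an assumption-laden illustration (the names `home_dirs` and `home_is_overridden` are hypothetical), not the check that was run on the perftests machine.

```python
import os
import pwd


def home_dirs(env=None):
    """Return (passwd_home, env_home) for the current effective user."""
    env = os.environ if env is None else env
    try:
        passwd_home = pwd.getpwuid(os.geteuid()).pw_dir
    except KeyError:
        passwd_home = None  # UID has no passwd entry
    return passwd_home, env.get("HOME")


def home_is_overridden(env=None):
    """True when $HOME points somewhere other than the account's home."""
    passwd_home, env_home = home_dirs(env)
    if passwd_home is None or env_home is None:
        return False
    return os.path.normpath(passwd_home) != os.path.normpath(env_home)
```

On a machine set up like the one described here, the first value would be /opt/users/hudsonbuild and the second /home/hudson/hudsonbuild, so `home_is_overridden()` would return True.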
Comment 7 David Williams CLA 2015-10-12 09:33:26 EDT
(In reply to Mikael Barbero from comment #6)
> The upgrade is now completed. I updated all the plugins. Let me know if you
> see anything weird.
> 
> Regarding the HOME, it is what I suspected, the init script that run hudson
> reset the HOME folder. So the 25G in /opt/users/hudsonbuild really are old
> data. Esp. I see a 15G	/opt/users/hudsonbuild/.hudson/jobs.droy/ dating of
> 2012. 
> 
> I think I can safely remove a lot of things from there. That would buy us
> some time before upgrade the disk space of this machine (see comment #3).
> WDYT?

Ok, by me, if that's who you are asking  ... I always "tar up" old stuff, and give it a name like "jobs.droy_toBeDeletedOn<someDateAboutAMonthAway>.tar.gz" just in case it's discovered something is needed.
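
The "tar it up with a delete-by date in the name" habit could be scripted roughly as follows. This is a sketch under assumptions: the helper name, the 30-day default, and the ISO date format in the file name are illustrative choices. It only creates the archive; removing the original directory is left as a separate, deliberate step.

```python
import datetime
import os
import tarfile


def archive_for_later_deletion(src_dir, dest_dir, keep_days=30, today=None):
    """Create <name>_toBeDeletedOn<date>.tar.gz in dest_dir; return its path."""
    today = datetime.date.today() if today is None else today
    delete_on = today + datetime.timedelta(days=keep_days)
    base = os.path.basename(os.path.normpath(src_dir))
    name = f"{base}_toBeDeletedOn{delete_on.isoformat()}.tar.gz"
    archive_path = os.path.join(dest_dir, name)
    # "w:gz" writes a gzip-compressed tar; arcname keeps paths relative.
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(src_dir, arcname=base)
    return archive_path
```

Keeping the intended deletion date in the file name means anyone who finds the archive later can tell at a glance whether it is safe to remove.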
Comment 8 David Williams CLA 2015-10-12 09:44:26 EDT
Since I see no way to restart this instance myself, 
can you also install 3 plugins for me? And, restart. 
These are all in the "other" category. (I've used them 
quite a bit on older versions, so assume "solid", but 
admit, I've not used them with 3.3.1; my own 3.3.1 test install
isn't quite running, yet.) 

Priority Sorter Plugin
Rebuild Plugin
Timestamper

Thanks,
Comment 9 Mikaël Barbero CLA 2015-10-12 09:47:36 EDT
(In reply to David Williams from comment #8)
> Since I see no way to restart this instance myself, 
> can you also install 3 plugins for me? And, restart. 
> These are all in the "other" category. (I've used them 
> quiet a bit on older versions, so assume "solid", but 
> admit, I've not use them with 3.3.1 (my own 3.3.1 test install
> isn't quiet running, yet). 
> 
> Priority Sorter Plugin
> Rebuild Plugin
> Timestamper
> 
> Thanks,

Done.
Comment 10 Mikaël Barbero CLA 2015-10-12 13:09:09 EDT
(In reply to David Williams from comment #7)

> Ok, by me, if that's who you are asking  ... I always "tar up" old stuff,
> and give it a name like
> "jobs.droy_toBeDeletedOn<someDateAboutAMonthAway>.tar.gz" just in case it's
> discovered something is needed.

I created a bunch of tar files with things to remove in a month.
Comment 11 David Williams CLA 2015-10-13 03:03:39 EDT
(In reply to Mikael Barbero from comment #10)
> (In reply to David Williams from comment #7)
> 
> > Ok, by me, if that's who you are asking  ... I always "tar up" old stuff,
> > and give it a name like
> > "jobs.droy_toBeDeletedOn<someDateAboutAMonthAway>.tar.gz" just in case it's
> > discovered something is needed.
> 
> I created a bunch of tar file with things to remove in a month.

Thanks, but from what I see, using 
df -h .

That did not seem to help much: 

Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1       55G   41G   15G  74% /

Well, I didn't measure the tars that will be deleted in a month. 

I can limp along until then, but I do suggest you keep it on your "issues list" (or, equipment list, or whatever you have) because if it fills up again soon, 
then I will be panicking, and requesting a fast solution. :) 

Thanks for your help.
Comment 12 Mikaël Barbero CLA 2015-10-13 04:23:41 EDT
The tar files take 23GB. I will move them to another server in order to free some space and to buy us a bit more time before we come up with a solution about disk space on this machine. We will keep track of this need on our list ;)
Comment 13 Mikaël Barbero CLA 2015-10-13 04:33:52 EDT
(In reply to Mikael Barbero from comment #12)
> The tar files take 23GB. I will move them to another server in order to free
> some space and to buy us a bit more time before we come up with a solution
> about disk space on this machine. We will keep track of this need on our
> list ;)

Done.

Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1       55G   20G   36G  36% /
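
For tracking this figure over time, the Use% column can be reproduced programmatically. The sketch below assumes df's usual behaviour of computing used / (used + available) and rounding up; the function names are hypothetical. Feeding it the rounded G values from the two df outputs in this bug gives the same percentages.

```python
import math
import shutil


def use_percent(used, avail):
    """Percentage used, rounded up, from used and available amounts."""
    return math.ceil(100 * used / (used + avail))


def current_use_percent(path="/"):
    """Use% for the filesystem containing path, from live statistics."""
    usage = shutil.disk_usage(path)
    # usage.free is the space available to unprivileged users.
    return use_percent(usage.used, usage.free)


before = use_percent(41, 15)  # 74, as in comment 11
after = use_percent(20, 36)   # 36, as in comment 13
```

A periodic job could call `current_use_percent()` and warn well before the volume approaches the level that triggered this report.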
Comment 14 Eclipse Webmaster CLA 2015-10-13 16:35:52 EDT
(In reply to Mikael Barbero from comment #3)
> Denis, Matt, can we do something about this?

Not really.  While the perfmaster instance is virtualized, we've had issues with trying to 're-size' the images to add space.  Our usual solution here would be to create another larger disk image and then simply mount it, however the host only has ~20G free so that's not an option.

However, when we turn the old hudson master off we could reclaim its disk space (~60G) for use here.

-M.
Comment 15 David Williams CLA 2015-10-13 17:59:34 EDT
(In reply to Eclipse Webmaster from comment #14)
> (In reply to Mikael Barbero from comment #3)
> > Denis, Matt, can we do something about this?
> 
> Not really.  While the perfmaster instance is virtualized, we've had issues
> with trying to 're-size' the images to add space.  Our usual solution here
> would be to create another larger disk image and then simply mount it,
> however the host only has ~20G free so that's not an option.
> 
> However when we turn the old hudson master off we could reclaim it's disk
> space (~60G) for use here.
> 
> -M.

Assuming Mikael's clean-up holds, we are below 50% now and I suspect we'll be good for at least 6 months, if not twice that.

Thanks,
Comment 16 Mikaël Barbero CLA 2015-11-25 05:11:18 EST
As nothing broke during the last month and a half, I've deleted the backup of the files I removed from the perfmaster machine.