Bug 428561 - HIPP Help - poor performance on https://hudson.eclipse.org/xtext/
Status: RESOLVED FIXED
Alias: None
Product: Community
Classification: Eclipse Foundation
Component: CI-Jenkins
Version: unspecified
Hardware: PC Mac OS X
Importance: P3 normal
Target Milestone: ---
Assignee: CI Admin Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on: 422764
Blocks:
Reported: 2014-02-19 11:17 EST by Dennis Huebner CLA
Modified: 2014-03-03 05:40 EST
Description Dennis Huebner CLA 2014-02-19 11:17:43 EST
I've noticed a performance regression on the https://hudson.eclipse.org/xtext/ instance. Sometimes it's incredibly slow. Jobs need more than five times as long to finish a build.

See https://hudson.eclipse.org/xtext/view/Xtext-Xtend/job/xtext-xtend/buildTimeTrend
Comment 1 Denis Roy CLA 2014-02-19 11:34:33 EST
We're investigating this via bug 426734 - Investigate super high server loads

We don't yet know what's causing the problem, but there are some HIPP processes (including Xtext's) which seem to consume large amounts of CPU cycles.

Thanh, I've noticed Xtext and Sirius on hipp2 that tend to behave badly... Perhaps in the short term we could deploy hipp4 and move one of them over?
Comment 2 Denis Roy CLA 2014-02-19 12:03:20 EST
Right now, one of the Xtext jobs is using 17/24 CPU cores @ 100%.

https://hudson.eclipse.org/xtext/job/xtend-xtext-playground/30/console

On the console, I see this message:

buckminster-resolve:
     [echo] IMPORTANT: Populating an empty target platform may took over 10 minutes.

That message seems to indicate that whatever it is about to do will take a long time... Does that happen on each build? And does that job run at the same time as https://hudson.eclipse.org/xtext/view/Xtext-Xtend/job/xtext-xtend ?
Comment 3 Denis Roy CLA 2014-02-19 12:04:12 EST
In other words, if you have two jobs triggered by the same SCM change, and one takes 10 minutes just to create an environment, you might want to optimize that.
Comment 4 Denis Roy CLA 2014-02-19 12:13:43 EST
https://hudson.eclipse.org/xtext/job/xtend-xtext-playground/30/console

It "hung" for about 18 minutes at this line:
[java] INFO:  resolve '/home/hudson/genie.modeling.tmf.xtext/.hudson/jobs/xtend-xtext-playground/workspace/git-repo/releng/org.eclipse.xtext.releng/releng/xtext-platform.mspec'

During that time, the process ran for 25 minutes, but consumed 205 minutes of CPU time (multiple cores).

Do we know what it's doing during that step?
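
For scale, the figures quoted in this comment can be turned into an average core count. The snippet below is only a minimal sketch of that arithmetic: the function name is made up for illustration, and the 24-core total is taken from comment 2 rather than measured here.

```python
def average_busy_cores(cpu_minutes, wall_minutes):
    """Average number of cores kept fully busy over the run."""
    if wall_minutes <= 0:
        raise ValueError("wall_minutes must be positive")
    return cpu_minutes / wall_minutes


# Figures from this comment: 205 CPU-minutes over 25 wall-clock minutes.
cores = average_busy_cores(cpu_minutes=205, wall_minutes=25)

# Assumption: the host has 24 cores, as suggested by "17/24" in comment 2.
share_of_host = cores / 24

print(f"{cores:.1f} cores busy on average ({share_of_host:.0%} of the host)")
```

This is an average over the whole run, so peaks such as the 17 cores seen in comment 2 can be much higher.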
Comment 5 Thanh Ha CLA 2014-02-19 12:20:20 EST
(In reply to Denis Roy from comment #1)
> Thanh, I've noticed Xtext and Sirius on hipp2 that tend to behave badly...
> Perhaps in the short term we could deploy hipp4 and move one of them over?

HIPP4 has already been deployed and we've been using it for a while now (I think it has over 10 instances now). I've also already moved a few of the HIPP2 projects over to HIPP4.

Maybe it's worth deploying HIPP5 and putting xtext there?

This way we can keep the load balanced and hopefully avoid reproducing on hipp4 the issues that are happening on hipp2.
Comment 6 Denis Roy CLA 2014-02-19 13:56:58 EST
+1

I still think there's a lot of black magic happening in some builds that leads to resource waste.
Comment 7 Dennis Huebner CLA 2014-02-20 02:43:26 EST
(In reply to Denis Roy from comment #2)
> Right now, one of the Xtext jobs is using 17/24 CPU cores @ 100%.
p2 target platform resolution 

> https://hudson.eclipse.org/xtext/job/xtend-xtext-playground/30/console
> 
> On the console, I see this message:
> 
> buckminster-resolve:
>      [echo] IMPORTANT: Populating an empty target platform may took over 10
> minutes.
> That message seems to indicate that whatever it about to do, it will take a
> long time...
Don't believe everything you read :) This particular job reuses the same script that we also use to build locally. "May" means that we in Europe sometimes get Eclipse mirrors from Australia instead of, e.g., Germany, so refreshing the p2 repository state and downloading some jars can really take a while here. On an Eclipse machine, downloading artifacts is not a big deal.

> Does that happen on each build? And does that job run at the
> same time as
> https://hudson.eclipse.org/xtext/view/Xtext-Xtend/job/xtext-xtend ?
Sure. Creating an Eclipse target platform is the first step in producing a build, no matter which build system you use, Buckminster or Tycho. Resolving a target platform consumes CPU, that is true, but this issue cannot be attributed to a specific build framework or even a specific job; it's the p2 functionality we all rely on.
Comment 8 Dennis Huebner CLA 2014-02-20 02:51:00 EST
(In reply to Thanh Ha from comment #5)
> (In reply to Denis Roy from comment #1)
> > Thanh, I've noticed Xtext and Sirius on hipp2 that tend to behave badly...
> > Perhaps in the short term we could deploy hipp4 and move one of them over?
> 
> HIPP4 has already been deployed and we've been using it for awhile now (I
> think it has over 10 instances now). I've also already moved a few of the
> HIPP2 projects over to HIPP4.
> 
> Maybe it's worth deploying HIPP5 and putting xtext there?
> 
> This way we can keep the load balanced and hopefully don't reproduce the
> issues happening on hipp2 on hipp4 instead.
+1
I think moving to hipp5 is a good idea. Xtext HIPP hosts 4 projects, so we need more resources to waste than others! :)
Comment 9 Thanh Ha CLA 2014-02-20 14:33:30 EST
Matt set up HIPP5, so I will try to find a time today when xtext isn't running a build and move the instance over.
Comment 10 Thanh Ha CLA 2014-02-20 15:35:33 EST
xtext HIPP is now running on hipp5.
Comment 11 Dennis Huebner CLA 2014-02-21 04:02:25 EST
(In reply to Thanh Ha from comment #10)
> xtext HIPP is now running on hipp5.

Cool. Could you also check the permissions?

[ant] Queueing site_121034503.zip for signing
ERROR: org.eclipse.core.runtime.CoreException: /opt/public/common/buckminster-4.2/configuration/org.eclipse.osgi/bundles/25/1/.cp/org/eclipse/buckminster/jarprocessor/antscript/signing.ant:208: Directory /shared/download-staging.priv/modeling/m2t/xpand creation was not successful for an unknown reason
org.eclipse.core.runtime.CoreException: /opt/public/common/buckminster-4.2/configuration/org.eclipse.osgi/bundles/25/1/.cp/org/eclipse/buckminster/jarprocessor/antscript/signing.ant:208: Directory /shared/download-staging.priv/modeling/m2t/xpand creation was not successful for an unknown reason

See Bug 422764
Comment 12 Thanh Ha CLA 2014-02-21 08:46:57 EST
(In reply to Dennis Huebner from comment #11)
> (In reply to Thanh Ha from comment #10)
> > xtext HIPP is now running on hipp5.
> 
> Cool. Could you also check the permissions?
> 

It wasn't a permissions problem. Something deleted the entire directory, so I recreated it. It might be worth double-checking your jobs to make sure none of them are accidentally deleting this directory in its entirety.
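
One rough way to do that double check is to scan the job definitions for lines that mention the staging directory together with a delete-like command. The following is only a sketch under assumptions: the helper names and the list of delete patterns are hypothetical, the one-`config.xml`-per-job layout is assumed, and the jobs path in the usage note is inferred from the console log in comment 4. It would miss paths built from variables, XML-escaped Ant tasks, and deletions performed by scripts in the repository.

```python
import re
from pathlib import Path

# Directory named in the signing error from comment 11.
STAGING_DIR = "/shared/download-staging.priv/modeling/m2t/xpand"

# Hypothetical set of delete-like tokens (shell and Ant) worth a manual look.
DELETE_PATTERN = re.compile(r"\brm\s+-|<delete\b|\brmdir\b")


def find_suspect_lines(text, directory=STAGING_DIR):
    """Return (line_number, line) pairs mentioning the directory and a delete-like command."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if directory in line and DELETE_PATTERN.search(line):
            hits.append((number, line.strip()))
    return hits


def scan_job_configs(jobs_root):
    """Scan every <job>/config.xml under jobs_root (assumed layout)."""
    report = {}
    for config in sorted(Path(jobs_root).glob("*/config.xml")):
        hits = find_suspect_lines(config.read_text(errors="replace"))
        if hits:
            report[str(config)] = hits
    return report
```

For example, `scan_job_configs("/home/hudson/genie.modeling.tmf.xtext/.hudson/jobs")` would return a mapping from config file to suspect lines; an empty result only means nothing obvious was found.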
Comment 13 Dennis Huebner CLA 2014-03-03 05:40:34 EST
Works better now.