
Bug 454902

Summary: Make Gerrit run a subset of our test suites when validating a change
Product: [Modeling] Sirius
Reporter: Pierre-Charles David <pierre-charles.david>
Component: Core
Assignee: Pierre-Charles David <pierre-charles.david>
Status: CLOSED FIXED
QA Contact: Pierre-Charles David <pierre-charles.david>
Severity: normal
Priority: P1
CC: florian.barbin
Version: 2.0.0
Keywords: triaged
Target Milestone: 3.0.0M7
Hardware: All
OS: All
See Also: https://bugs.eclipse.org/bugs/show_bug.cgi?id=445371
https://git.eclipse.org/r/48925
https://git.eclipse.org/c/sirius/org.eclipse.sirius.git/commit/?id=e61f8736172eb84d66b6016eaa5a44a53fe49bc3
https://git.eclipse.org/r/49181
https://git.eclipse.org/c/sirius/org.eclipse.sirius.git/commit/?id=a74170d807cbed69ad82ea001558ee9291f9a9a0
Whiteboard:

Description Pierre-Charles David CLA 2014-12-11 10:31:54 EST
We do not have the time right now to fully understand and fix the systematic test failures and slowdowns when running our full suites on the Sirius HIPP (see bug #445371). However, recent regressions have shown once again that not having the tests run systematically is too dangerous, so I propose an iterative, lower-cost approach to get some benefits quickly.

* For each of our 3 kinds of tests (JUnit, SWTBot Sequence and SWTBot), we will create a new suite, say *GerritTestsSuite, which only contains a subset of tests which are known to be reasonably fast and to succeed on the Sirius HIPP.
* Create new Maven profiles to run these suites instead of the complete ones.
* Configure the Gerrit jobs to launch these as part of the validation of each change. Gerrit normally builds Sirius for all supported platforms, but we probably want to execute these tests only for the reference platform (Luna for now, soon Mars). If any test fails with the new change, Gerrit will vote Verified-1 and prevent the commit from being merged.

For this to be practical, we need a subset of the suites which is:
* representative (i.e. with a relatively good coverage)
* reliable (i.e. we must be able to trust the verdict of the tests)
* fast (i.e. we do not want a permanent backlog of dozens of Gerrit jobs, which would introduce a lag of hours or days before a committer gets feedback).

We'll start slow, with almost empty suites at first, just to test the overall process, and then add more and more tests to the *GerritSuites (in addition to the complete ones) until we reach a point where we feel it takes too long for results to be available (I think we can shoot for a "budget" of about an hour for now).
Comment 1 Florian Barbin CLA 2014-12-18 11:35:39 EST
The new test suites with the corresponding maven profiles have been created: http://git.eclipse.org/c/sirius/org.eclipse.sirius.git/commit/?id=6fb1939a347798fe9f2ca6c3e090568e90fc5c72
Comment 2 Pierre-Charles David CLA 2014-12-22 05:16:29 EST
After a few mistakes (e.g. forgot to enable Xvnc, then forgot to start a window manager...), this is starting to work, on a small subset of the test suites. See https://hudson.eclipse.org/sirius/view/gerrit/job/sirius.gerrit/3199/PLATFORM=luna/consoleText:

Running org.eclipse.sirius.tests.suite.GerritJUnitSuite
Tests run: 90, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.225 sec

Running org.eclipse.sirius.tests.suite.tree.AllSiriusTestSuite
[...]
Tests run: 49, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 29.673 sec

Running org.eclipse.sirius.tests.swtbot.suite.GerritSWTBotSuite
[...]
Tests run: 13, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 51.168 sec

Running org.eclipse.sirius.tests.swtbot.suite.GerritSequenceSWTBotSuite
[...]
Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 57.343 sec
Comment 3 Pierre-Charles David CLA 2014-12-23 05:01:13 EST
More tests added by commit a42b85fa1d51b9935757e960aadaf441c97cdace. Complete Gerrit validation of the change that added these took 27 minutes, but the current scope does not include any non-sequence SWTBot tests, and in the JUnit suite only the standalone subsets (which take less than 5 seconds) are included.

Given how the Gerrit job is currently structured, with no parallelism, we will very quickly reach the initial budget of about 1h. After that the options are:
1. Leave the situation as is, with only a relatively small subset of the tests run by Gerrit. It's still better than the situation before.
2. Add more tests, at the cost of longer feedback from Gerrit. 
3. Add more tests but invest the time required to make them run faster.
4. Restructure the Gerrit jobs to run the separate suites in parallel.

Options 3 and 4 are non-exclusive, and ideally we should try to do both.
Comment 4 Florian Barbin CLA 2014-12-30 12:05:28 EST
This Gerrit patch adds more SWTBot tests: https://git.eclipse.org/r/#/c/38850/4

I selected quick and reliable tests to add. The last Gerrit build took 31 min. That leaves time to add new JUnit tests and several other SWTBot tests.
Comment 5 Pierre-Charles David CLA 2014-12-31 09:20:25 EST
Currently the JUnit suite is the one with the fewest tests run by Gerrit. On the Eclipse HIPP, the full suite takes about 39 minutes, but:
* Two tests fail systematically on the HIPP: BorderMarginTest.testAutoSize and DiagramMigrationTestCampaign10.testAllCustomisationsKeeped[0]. We've already seen these two fail systematically on some machines and pass reliably on others, with no hint as to the actual reasons (maybe a difference in the system fonts installed?).
* Two tests (AcceleoMTInterpreterOnPackageImportTests and SiriusLayoutDataManagerForSemanticElementsApplyWithPredefinedDataTest) seem responsible for a disproportionate amount of the total time (resp. 12 minutes and 9 minutes).

I'll move all the JUnit tests except the 4 mentioned above into the Gerrit JUnit suite, and this should give us something close to 1h of Gerrit-triggered validation tests. We'll see after that if we can add some more of the SWTBot ones or if we need to remove a few of the JUnit ones.
Comment 6 Pierre-Charles David CLA 2014-12-31 11:15:21 EST
(In reply to Pierre-Charles David from comment #3)
> 4. Restructure the Gerrit jobs to run the separate suites in parallel.
> 
> Options 3 and 4 are non-exclusive, and ideally we should try to do both.

For option 4, it might be possible to use https://wiki.jenkins-ci.org/display/JENKINS/Parameterized+Trigger+Plugin to launch sub-jobs in parallel, one for each suite.

http://strongspace.com/rtyler/public/gerrit-jenkins-notes.pdf might also contain some hints and tips.
Comment 7 Pierre-Charles David CLA 2015-01-09 04:41:37 EST
Just an update: we have switched to a matrix job with two dimensions: PLATFORM(juno,kepler,luna)×SUITE(gerrit-junit,gerrit-sequence,gerrit-swtbot).

The tests are only executed when building for Luna for now, in parallel on 3 different slots/slaves. For Juno and Kepler, we use the "gerrit-junit" SUITE to perform a simple build, but do not do anything for the two other suites (there is no point in rebuilding the same thing 3 times).

We also now publish the test results in a form that Hudson can present properly. For the matrix elements where we do not actually execute any tests, we publish an empty test report to make Hudson happy (otherwise it treats the missing report as a failure).
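
As an illustration of the empty report mentioned above, here is a minimal sketch, not the actual job configuration. It assumes the publisher reads JUnit-style XML files. The function name `write_empty_report`, the file name `TEST-empty.xml` and the exact attributes are hypothetical, and whether a zero-test report is accepted depends on the publisher in use.

```shell
#!/bin/sh
# Hypothetical helper: write a JUnit-style report that declares zero tests,
# so that a matrix cell which runs nothing still has a report to publish.
write_empty_report() {
    report_dir="$1"
    mkdir -p "$report_dir" || return 1
    # The file name and attribute set below are assumptions for illustration.
    cat > "$report_dir/TEST-empty.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<testsuite name="empty" tests="0" failures="0" errors="0" skipped="0" time="0"/>
EOF
}
```

A matrix cell that runs no tests could then call, for instance, `write_empty_report "$WORKSPACE/reports"` (the directory name is assumed) before the publishing step.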

The shell code which implements these different behaviors depending on the branch, the platform, and the suite is starting to be a little complex. It should probably be moved into the repo itself, and the job could simply fetch it with curl and execute it. At least it would be properly versioned.
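
The per-cell decision described in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions, not the script used by the job: the function name `decide_action` and the action labels are hypothetical, and the dependency on the branch is left out because the comment does not detail it.

```shell
#!/bin/sh
# Hypothetical sketch: decide what one cell of the PLATFORM x SUITE matrix does.
# Prints "test" (build and run the suite), "build" (build only) or "skip".
decide_action() {
    platform="$1"
    suite="$2"
    case "$platform" in
        luna)
            # Reference platform: every suite is built and executed.
            echo "test"
            ;;
        juno|kepler)
            # Build once per platform, in the gerrit-junit cell only.
            if [ "$suite" = "gerrit-junit" ]; then
                echo "build"
            else
                echo "skip"
            fi
            ;;
        *)
            echo "unknown platform: $platform" >&2
            return 1
            ;;
    esac
}

# List the action taken for each of the 9 cells.
for p in juno kepler luna; do
    for s in gerrit-junit gerrit-sequence gerrit-swtbot; do
        echo "$p/$s: $(decide_action "$p" "$s")"
    done
done
```

With this layout, 4 of the 9 cells (the non-JUnit suites on Juno and Kepler) resolve to "skip", which matches the number of do-nothing jobs mentioned in this comment.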

We also increased the number of executors to 9. This corresponds to the number of jobs launched in parallel by the sirius.gerrit matrix, even though 4 of the concrete jobs will do nothing and return in just a few seconds.

With all this, we are down to about 25 minutes (from 1h before) to get the "Verified" vote on a push to Gerrit. The individual suite times are:
* junit: 23 min. This includes almost all the JUnit tests we have (see comment 5).
* swtbot-sequence: 24 min, with 140 of the 440 available tests.
* swtbot: 15 min, with only 242 of the 1319 available tests.

25 minutes looks like a sustainable feedback time for now. We can start to add more tests to the Gerrit SWTBot suite until it reaches runtimes similar to the others.
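
The timings in this comment can be cross-checked with a little arithmetic: when the suites run in parallel, the feedback time is bounded by the slowest suite rather than by the sum. The sketch below uses the three suite durations quoted above. The sequential figure is only an estimate, since the content of the suites changed between the earlier runs and this one.

```shell
#!/bin/sh
# Suite durations in minutes, as reported in this comment.
junit=23
sequence=24
swtbot=15

# Sequential execution: the durations add up.
sequential=$((junit + sequence + swtbot))

# Parallel execution: the slowest suite dominates.
parallel=$junit
for t in $sequence $swtbot; do
    if [ "$t" -gt "$parallel" ]; then
        parallel=$t
    fi
done

echo "sequential: ${sequential} min, parallel: ${parallel} min"
# prints "sequential: 62 min, parallel: 24 min"
```

The 24 minutes of the slowest suite is consistent with the observed feedback time of about 25 minutes, and the 62-minute sum is in line with the earlier figure of about 1h.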
Comment 8 Pierre-Charles David CLA 2015-03-17 04:47:08 EDT
I'm tempted to close this, as the system mostly works and it seems from now on it will only need small adjustments, but we still only run a relatively small subset of our complete suites, so moving to M7 instead: if time permits, we'll have one more look at what can be done to speed up the tests enough to include more tests in the Gerrit suites.
Comment 9 Pierre-Charles David CLA 2015-03-23 10:03:12 EDT
The gerrit-verify.sh script has proven to be unreliable, with builds passing green even when Tycho/p2 crashes, for example during target platform resolution.
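
The comment does not say why failed builds were reported green. One common cause of this kind of problem in shell-driven jobs is a failing command whose exit status is lost before the script ends. The sketch below shows one way to propagate the status explicitly. It is an assumption about a possible remedy, not a description of gerrit-verify.sh, and the helper name `run_step` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: run a command and propagate its exit status,
# logging the failure so that it cannot be silently ignored.
run_step() {
    "$@" || {
        status=$?
        echo "step failed (exit $status): $*" >&2
        return "$status"
    }
}

# Usage with stand-in commands; a real job would wrap its build command,
# e.g. "run_step mvn clean verify || exit 1".
run_step true && echo "build ok"
run_step sh -c 'exit 3' || echo "build failed with status $?"
```

The last line of the script prints "build failed with status 3", showing that the original exit code reaches the caller instead of being replaced by the status of a later command.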

Trying to handle all the cases with a single job is too complex and fragile. The sirius.gerrit job is being retired and replaced with two, more focused jobs:
* sirius.gerrit.build: a simple matrix job on PLATFORM={juno,kepler,luna,mars}, which only builds the code (Core and Tests) on all the platforms we support.
* sirius.gerrit.tests: a matrix job on PLATFORM={luna,mars} and SUITE={gerrit-junit,gerrit-swtbot,gerrit-sequence}, which builds and executes the "Gerrit Tests Suites" only for the current and next reference platforms.
Comment 10 Pierre-Charles David CLA 2015-04-20 08:20:47 EDT
While not perfect, the current solution works fine for now. Further improvements in feedback speed, test coverage and additional checks (e.g. CheckStyle) will be handled separately.
Comment 11 Pierre-Charles David CLA 2015-05-21 08:36:22 EDT
The actual content of the suites executed by Gerrit will evolve over time, but the overall organization works fine and has already proved its value.
Comment 12 Eclipse Genie CLA 2015-05-29 04:26:20 EDT
New Gerrit change created: https://git.eclipse.org/r/48925
Comment 14 Eclipse Genie CLA 2015-06-02 05:53:34 EDT
New Gerrit change created: https://git.eclipse.org/r/49181
Comment 16 Pierre-Charles David CLA 2015-06-24 11:13:26 EDT
Available in Sirius 3.0.0. See https://wiki.eclipse.org/Sirius/3.0.0.