Running functional tests in parallel with Codeception

Functional tests in Codeception let you verify that the components within your application work well together, and they are perhaps the most worthwhile way to test controllers. The downside of these tests is that they can take a long time to run and consume a significant amount of memory.

At some point this can become a problem, as the people at Future500 B.V. found out. Their build pipeline took approximately 40 minutes to run and consumed over 1 gigabyte of memory while doing so. And this was even after they had done their own optimizations and split the execution of their test suite into three parts.

By changing the way they run their test suites to use 4 parallel threads I was able to reduce their execution time to less than half of the original duration.

Disclaimer: the benefits of parallelization differ significantly based on your chosen set-up, software stack and the hardware of your continuous integration server. In this instance the tests ran within a Vagrant-provisioned virtual machine (using VirtualBox) with 2 cores and 2 GB of RAM.

Overview (TL;DR)

In order to be able to run your tests in parallel you will need to perform the following steps:

Decide how many threads you want to have running simultaneously.
Define as many environments as you will have threads running simultaneously.
Set up as many database schemas as you will have threads running simultaneously.
Split your tests equally in as many groups as you will have threads running simultaneously.
Use a task runner (such as Ant, Phing or Robo) to execute a Codeception run for each environment/group combination in parallel.
Merge the reports of all executions together into a single report if you want your CI environment to interpret the results.

In the rest of this post I am going to make use of Robo as our task runner.

Why Robo? Codeception offers the package codeception/robo-paracept, which provides traits and helper methods/classes that you can use out of the box. When using Ant or Phing you would need to put in more effort.

Getting Started

In order to get started you will need to install the codeception/robo-paracept package using Composer. This will automatically install the codegyre/robo package containing the Robo task runner that we will be using.

composer require --dev "codeception/robo-paracept:dev-master@dev"

Robo uses a file called RoboFile.php that contains the tasks that can be executed by it. The easiest way to create this file is by running the Robo application once. It will ask whether to create a skeleton RoboFile.php to which you only have to answer Yes.

./vendor/bin/robo

In Robo we will be defining three tasks:

parallel:split – this task will create a series of group definition files that Codeception can use to distribute tests automatically between the given number of threads.
parallel:functional – this task is responsible for preparing the threads and executing them simultaneously.
parallel:merge – this task will merge the reports generated in the previous task into a single output file named report.xml in your Codeception log folder.

But before we do this we should first edit the configuration files used by Codeception with the correct number of environments and groups.

Setting up separate databases

Each thread must use its own database schema; if you don’t do this then you will encounter locking issues. In order to tell Codeception which database to use you will have to create an environment per thread and set the database in it.

Important: If you have been using environments in your test suites then things are going to get a whole lot messier. Codeception does not support changing configuration options outside of environments and so you will need to get creative to still be able to switch databases.

So let’s assume that you have a suite called functional. This means that you have a configuration file called functional.suite.yml in your tests folder that looks roughly like this:

class_name: FunctionalTester
modules:
  enabled: [Db]
  config:
    Db:
      dsn: 'mysql:host=127.0.0.1;dbname=myDatabase'
      user: 'root'

Now let’s assume that we are going to use four (4) simultaneous threads called p1, p2, p3 and p4. This will mean that we are going to add a new section to our test suite configuration file for each of these environments like this:

env:
  p1:
    modules:
      config:
        Db:
          dsn: 'mysql:host=127.0.0.1;dbname=myDatabase1'
  p2:
    modules:
      config:
        Db:
          dsn: 'mysql:host=127.0.0.1;dbname=myDatabase2'
  p3:
    modules:
      config:
        Db:
          dsn: 'mysql:host=127.0.0.1;dbname=myDatabase3'
  p4:
    modules:
      config:
        Db:
          dsn: 'mysql:host=127.0.0.1;dbname=myDatabase4'

This example does not show how to populate the individual databases, but you can do that using the dump option of the Db module.
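The four environment blocks above differ only in the database name. As a minimal sketch of that naming convention, the mapping from environment name to DSN could be generated as shown below, for example to drive a script that creates and populates each database. The function buildThreadDsns and its parameters are hypothetical and not part of Codeception or Robo.

```php
<?php

/**
 * Build one DSN per thread, keyed by environment name (p1, p2, ...).
 * Hypothetical helper for illustration; not part of Codeception or Robo.
 *
 * @return array<string, string> environment name => DSN
 */
function buildThreadDsns(string $baseName, int $threads, string $host = '127.0.0.1'): array
{
    $dsns = [];
    for ($i = 1; $i <= $threads; $i++) {
        // Mirrors the convention used above: environment "p1" uses database "myDatabase1".
        $dsns["p{$i}"] = "mysql:host={$host};dbname={$baseName}{$i}";
    }

    return $dsns;
}

// Print the environment/DSN pairs for four threads.
foreach (buildThreadDsns('myDatabase', 4) as $env => $dsn) {
    echo "{$env}: {$dsn}", PHP_EOL;
}
```

Keeping the environment name and the database suffix in lockstep like this makes it harder for two threads to end up pointing at the same database by accident.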

Please note: one of the challenges that you might face here is that you are using a framework and cannot rely on the Db module. This was the case with the project that I did with Future500. We had to create a workaround where we injected the database name into the framework’s configuration right before Doctrine was initialized, using a custom-built module.

Dividing tests into groups

Codeception’s Robo package provides a convenient way to divide all tests evenly among a fixed set of groups. In the previous section we assumed there are going to be 4 threads, so we will continue with that assumption here as well.

For this to work you will need to change two files:

Your codeception configuration file
Your newly created RoboFile

Configuration

Once we have split our files into groups Codeception will not automagically know where to find these. You will need to provide a configuration option that will inform Codeception of that.

You can do this using the groups option, like this:

groups:
    p*: tests/_log/p*

When you add this to your codeception.yml configuration file, Codeception will be able to locate any group file that matches the pattern above.

RoboFile

After setting up your configuration, the next step in dividing your tests is to add the \Codeception\Task\SplitTestsByGroups trait to your RoboFile like this:

class RoboFile extends \Robo\Tasks
{
    use \Codeception\Task\SplitTestsByGroups;
    ...
}

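This trait will, among other things, provide a method named taskSplitTestFilesByGroups that will allow you to split the tests into a number of groups of your choosing. The number of groups should of course match your number of threads and thus in this case it will be four. The snippets in this post refer to that number as self::AMOUNT_OF_THREADS, a class constant that you need to define in your RoboFile yourself.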

The following snippet demonstrates the usage of the taskSplitTestFilesByGroups method:

    public function parallelSplit()
    {
        $this->taskSplitTestFilesByGroups(self::AMOUNT_OF_THREADS)
            ->projectRoot('.')
            ->testsFrom('tests/functional')
            ->groupsTo('tests/_log/p')
            ->run();
    }

What we see here is that we provide the taskSplitTestFilesByGroups call with the number of threads, where to find the project’s root and which test folder to divide (in this case ‘tests/functional’). As one of the last steps you will also have to provide the prefix for a path where the group files will be written; in the example above you can see that we write them to the tests/_log folder where each group file will start with a p. The splitting method will automatically append a thread number to the given path.

Once you have invoked the command robo parallel:split from the command line you will see that four files are written in the tests/_log folder named p1 through p4. In those files you can find the names of the tests that will be executed by each thread.
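To get a feel for what ends up in those group files, here is a minimal sketch that distributes a list of test files over a fixed number of groups in round-robin fashion. This is an illustration under assumptions: the function splitIntoGroups and the test file names are hypothetical, and the real split task may order or balance the files differently.

```php
<?php

/**
 * Distribute test file names over a fixed number of groups, round-robin.
 * Illustration only; this is not the algorithm used by robo-paracept.
 *
 * @param string[] $testFiles
 * @return array<string, string[]> group name => test files
 */
function splitIntoGroups(array $testFiles, int $groups, string $prefix = 'p'): array
{
    // Create every group up front so that empty groups are still present.
    $result = [];
    for ($i = 1; $i <= $groups; $i++) {
        $result[$prefix . $i] = [];
    }

    foreach (array_values($testFiles) as $index => $file) {
        // Index 0 goes to p1, index 1 to p2, and so on, wrapping around.
        $group = $prefix . (($index % $groups) + 1);
        $result[$group][] = $file;
    }

    return $result;
}

// Hypothetical test files, used only to demonstrate the distribution.
$files = [
    'tests/functional/LoginCest.php',
    'tests/functional/LogoutCest.php',
    'tests/functional/ProfileCest.php',
    'tests/functional/SearchCest.php',
    'tests/functional/CheckoutCest.php',
];

foreach (splitIntoGroups($files, 4) as $group => $members) {
    echo $group, ': ', implode(', ', $members), PHP_EOL;
}
```

A round-robin split balances the number of files per group, not their run time, so one slow test file can still make a single thread the bottleneck.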

Running the tests in parallel

In order to run each group of tests in parallel we need to prepare a number of Codeception runs equal to the number of threads that you want to use.

You can do this using the taskParallelExec task that is enabled by default with Robo. This task exposes a method named process, with which you register the tasks that should run simultaneously, and a method named run, which starts all registered tasks at once.

Let’s take a look at an example:

public function parallelFunctional()
{
    $parallel = $this->taskParallelExec();
    for ($i = 1; $i <= self::AMOUNT_OF_THREADS; $i++) {
        $parallel->process(
            $this->taskCodecept()   // use built-in Codecept task
                ->suite("functional")   // run functional tests
                ->group("p{$i}")        // for all p* groups
                ->configFile("codeception.yml") // Using the codeception config file
                ->env("p{$i}")          // each in its own environment
                ->xml("../tests/_log/result_p{$i}.xml") // save XML results
        );
    }
     
    return $parallel
        ->printed(true) // comment this to hide the output
        ->run();
}

In the above code you can see how we loop a number of times equal to the number of threads that we want and prepare a Codeception run by using the Codecept task of Robo. With this task we define:

which suite to run (remember that we named it functional?).
to use a group named p1 through p4 (those are our generated group files).
that we want to use the configuration file with name ‘codeception.yml’.
to apply the settings for environment p1 through p4 and thus select the right database.
and to write an xml report in our tests/_log folder named result_p1.xml through result_p4.xml.

Once this task has been run, using the command robo parallel:functional, you will have 4 result files in your Codeception log folder that are just dying to be merged into one XML report.
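The idea behind running the groups in parallel can be sketched in plain PHP without Robo: start every process first, and only then wait for each of them to finish. This is a conceptual sketch under assumptions, not Robo's implementation. The function runInParallel is hypothetical, and the commands are stand-ins that only exit with a chosen code.

```php
<?php

/**
 * Start every command first, then wait for all of them to finish.
 * Conceptual sketch only; it does not reproduce Robo's implementation.
 *
 * @param array<string, string> $commands label => shell command
 * @return array<string, int> label => exit code
 */
function runInParallel(array $commands): array
{
    $handles = [];
    foreach ($commands as $label => $command) {
        $pipes = [];
        // An empty descriptor spec lets the child inherit our stdin/stdout/stderr.
        $handle = proc_open($command, [], $pipes);
        if ($handle === false) {
            throw new RuntimeException("Could not start: {$command}");
        }
        $handles[$label] = $handle;
    }

    $exitCodes = [];
    foreach ($handles as $label => $handle) {
        // proc_close() blocks until that particular process has exited.
        $exitCodes[$label] = proc_close($handle);
    }

    return $exitCodes;
}

// Stand-in commands so the sketch runs anywhere; in a real build each entry
// would be one Codeception run for a single group/environment pair.
$commands = [];
for ($i = 1; $i <= 4; $i++) {
    $code = $i === 3 ? 1 : 0; // pretend that thread p3 fails
    $commands["p{$i}"] = escapeshellarg(PHP_BINARY) . ' -r ' . escapeshellarg("exit({$code});");
}

foreach (runInParallel($commands) as $label => $exitCode) {
    echo "{$label} exited with code {$exitCode}", PHP_EOL;
}
```

Collecting the exit code per thread is useful because a non-zero code tells you which group failed, even when that thread did not manage to write a result file.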

Merging the results

And here we are: the Great Finale.

After all the changes that we have made above, one last step remains to produce a usable piece of output for your Continuous Integration environment: the XML file containing your test results.

For this to work we are going to add another command to our RoboFile named parallel:merge. But before that we need to add the trait \Codeception\Task\MergeReports to our RoboFile like this:

class RoboFile extends \Robo\Tasks
{
    use \Codeception\Task\SplitTestsByGroups;
    use \Codeception\Task\MergeReports;
    ...
}

Once this is done we can add our merge task. Let’s show an example again that we can examine:

    public function parallelMerge()
    {
        $merger = $this->taskMergeXmlReports();

        for ($i = 1; $i <= self::AMOUNT_OF_THREADS; $i++) {
            $fileName = "tests/_log/result_p{$i}.xml";
            if (file_exists($fileName)) {
                $merger->from($fileName);
            }
        }

        $merger->into("tests/_log/report.xml")->run();
    }

What happens here is that we use the task taskMergeXmlReports and provide it with a series of source file names (which match the output file names of the previous section) using the from() method. Once we have collected all files we tell the task which file they must be merged into using the into() method, and finally we perform the actual merge using the run() method.

Something to take into account is that if anything goes wrong during the execution of your tests, such as a parser error, no result_p*.xml file is generated for that thread. The merger will throw an exception if you try to merge a non-existent file, which is why the example above checks that each file exists first; without that check your CI environment will probably not understand what is going on.

Another thing to take into account is that if there are no successful runs, the report.xml file will still be created but it will be empty, and your CI environment will report an error saying that the XML is not well-formed (because an empty file is not valid XML).
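One way to make both situations visible is to check, before merging, which of the expected result files are missing or empty. The sketch below assumes the file naming used in this post (result_p1.xml and so on). The function findMissingReports is hypothetical and not part of robo-paracept.

```php
<?php

/**
 * Return the expected per-thread result files that are missing or empty.
 * Hypothetical helper; the file naming follows the examples in this post.
 *
 * @return string[] paths of missing or empty result files
 */
function findMissingReports(string $logDir, int $threads): array
{
    $missing = [];
    for ($i = 1; $i <= $threads; $i++) {
        $fileName = rtrim($logDir, '/') . "/result_p{$i}.xml";
        // A thread that crashed leaves no file at all, or possibly an empty one.
        if (!is_file($fileName) || filesize($fileName) === 0) {
            $missing[] = $fileName;
        }
    }

    return $missing;
}

// Report the state of the result files for four threads.
$missing = findMissingReports('tests/_log', 4);
echo $missing === []
    ? 'All result files are present.'
    : 'Missing or empty result files: ' . implode(', ', $missing);
echo PHP_EOL;
```

In a RoboFile you could call such a check at the start of the merge task and fail the build with a clear message when the returned list is not empty, instead of silently producing an incomplete or empty report.xml.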

Conclusion

At last you will have a report.xml that you can pass directly into your Continuous Integration environment (because it is based on the xUnit output format) and you have a working, fast series of tests that use less memory in the process as well.

It does take some effort to set this up and you will probably hit a few bumps along the way, but in the end it is worth the effort if you want to keep your Continuous Integration running smoothly and fast.