Hi, I'm Tanin (@tanin). I live in Seattle and currently work at GIVE.asia. This is my technical blog which contains tricks and code nuggets I've discovered. My main blog is here. Enjoy!

Parallelize tests in SBT on CircleCI

The approach from my previous post on parallelising tests in SBT doesn’t work well in practice or, at least, on CircleCI. Its main disadvantage is that it doesn’t balance the tests by their run time. Balancing tests by their run time would reduce the total time significantly.

CircleCI offers a command-line tool, named (obviously) circleci, for splitting tests. One of its modes splits tests based on how long individual tests take. If scala_test_classnames contains a list of test classes, we can split using circleci tests split --split-by=timings --timings-type=classname scala_test_classnames. The output is the list of test classes that should run on the machine numbered CIRCLE_NODE_INDEX out of the CIRCLE_NODE_TOTAL machines. circleci conveniently reads these two environment variables automatically.
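To see why timings matter, here is a small illustration of balancing by run time. It is only a sketch: the function name assign_by_timing and the greedy strategy (longest test first, each one to the machine with the least work so far) are my own assumptions for illustration, not CircleCI's actual algorithm.

```shell
#!/bin/sh
# Illustration only: this is NOT how circleci is implemented.
# Usage: assign_by_timing <number of machines>
# stdin:  lines of "<class name> <seconds>"
# stdout: lines of "<machine index> <class name>"
assign_by_timing() {
  # Longest tests first, then give each test to the least-loaded machine.
  sort -k2,2nr | awk -v total="$1" '
    BEGIN { for (i = 0; i < total; i++) secs[i] = 0 }
    {
      best = 0
      for (i = 1; i < total; i++) if (secs[i] < secs[best]) best = i
      secs[best] += $2
      print best, $1
    }'
}
```

For example, with four tests taking 10, 9, 1 and 1 seconds on two machines, cutting the alphabetical list in half gives one machine 19 seconds of work and the other 2 seconds, while the greedy assignment above gives 11 and 10 seconds.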

At a high level, we want sbt to print out all the test classes. We feed those classes to circleci. Then, we feed the output of circleci to sbt testOnly. Finally, we use CircleCI’s store_test_results to store the test results, which include the timings. circleci uses this info to split tests accordingly in subsequent runs.

Now it’s time to write an SBT task again. This task is straightforward because, when googling “sbt list all tests”, the answer is one of the first results. Here’s how I do it in my codebase:

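val printTests = taskKey[Unit]("Print full class names of tests to the file `test-full-class-names.log`.")

printTests := {
  import java.io._

  println("Print full class names of tests to the file `test-full-class-names.log`.")

  // `.value` is evaluated before the task body runs, so read it up front.
  val tests = (definedTests in Test).value.sortBy(_.name)

  val pw = new PrintWriter(new File("test-full-class-names.log"))
  try {
    // One fully qualified class name per line.
    tests.foreach { t => pw.println(t.name) }
  } finally {
    // Close the file even if writing fails.
    pw.close()
  }
}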

Then, in .circleci/config.yml, we can use the commands below:

...
  - run: sbt printTests
  - run: sbt "testOnly  $(circleci tests split --split-by=timings --timings-type=classname test-full-class-names.log | tr '\n' ' ') -- -u ./test-results/junit"
...

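Please notice that tr '\n' ' ' puts the class names on a single line so that they can be passed to testOnly as arguments, and that everything after -- is passed to the test framework. With ScalaTest, -u ./test-results/junit writes JUnit-style XML reports into that directory.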
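If you want to assemble the testOnly argument outside of config.yml, a helper along these lines might do. This is a minimal sketch: the function name testonly_arg is hypothetical, it assumes ScalaTest (for the -u option), and it uses paste instead of tr, which joins the lines in the same way but leaves no trailing space.

```shell
#!/bin/sh
# Usage: testonly_arg <file with one test class name per line>
# Prints the argument to hand to sbt, e.g. sbt "$(testonly_arg split.txt)".
# Returns 1 and prints nothing when the file lists no classes,
# so the caller can decide to skip the test step.
testonly_arg() {
  # Join all lines with single spaces.
  classes="$(paste -s -d ' ' "$1")"
  [ -n "$classes" ] || return 1
  printf 'testOnly %s -- -u ./test-results/junit\n' "$classes"
}
```

The explicit check for an empty list is a design choice on my part; without it, a machine that receives no classes would end up calling testOnly with no class names at all.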

Finally, we need to add the store_test_results step. It looks like this in our .circleci/config.yml:

...
  - store_test_results:
      path: ./test-results
...

Please note that store_test_results requires the xml file to be in a subdirectory of ./test-results (See reference).
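A quick sanity check for the subdirectory requirement could look like the sketch below. The function name has_nested_reports is hypothetical, and -mindepth is an extension found in GNU and BSD find rather than part of POSIX.

```shell
#!/bin/sh
# Usage: has_nested_reports <results directory>
# Succeeds only if at least one .xml file sits in a subdirectory of the
# given directory (for example ./test-results/junit/), not directly in it.
has_nested_reports() {
  [ -n "$(find "$1" -mindepth 2 -type f -name '*.xml' 2>/dev/null | head -n 1)" ]
}
```

Running has_nested_reports ./test-results right after the test step would fail the build early if the reports ended up in the wrong place.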

And there you go! Now your SBT tests are parallelised on CircleCI with sensible balancing.

Parallelize tests in SBT with frustration

SBT, the official build tool for Scala, is a very complex build tool. It’s one of those things that makes me wonder if I am stupid or the tool’s complexity surpasses average human intelligence.

I’ve done a few things with SBT (e.g. printing the list of all tests), usually using the trial-and-error approach, and this time I want to add test parallelism to my project.

The requirement is straightforward. Heroku CI or CircleCI can run our tests on multiple machines. On each machine, two environment variables, say, MACHINE_INDEX and MACHINE_NUM_TOTAL, are set. We can use these two environment variables to shard our tests.
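As a rough idea of what sharding with these two variables means, here is a minimal sketch that keeps every line whose position, modulo the number of machines, matches the machine index. The function name shard_lines is hypothetical, the index is assumed to be zero-based, and this is not necessarily how the full post implements it in SBT.

```shell
#!/bin/sh
# Usage: shard_lines <machine index, zero-based> <number of machines>
# stdin:  one test class name per line
# stdout: only the lines that belong to this machine
shard_lines() {
  awk -v idx="$1" -v total="$2" '(NR - 1) % total == idx'
}
```

On a machine it would be called as shard_lines "$MACHINE_INDEX" "$MACHINE_NUM_TOTAL" with the list of test classes on stdin. Note that this splits by position only and knows nothing about how long each test takes.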
