The software we build at work combines several tech stacks that make up different applications. Each part has its own unit tests. With thousands of tests, a full run was taking a long time, so it was time to optimize them and shorten the feedback loop.
Language stacks and applications
The main part of the software is written in Java and consists of different modules (libraries) shared between 3 applications in a mono-repository.
Each application is independent, but its code also depends on shared libraries:
      L1            -- base library
     /  \
    /    \
  L2      L3   L4   -- shared libraries, can be in different languages
 /  \     |   /
A1   A2   A3        -- applications
The applications also include other language stacks, organized in the same way:
- Python libraries and scripts
- a front-end stack in JavaScript and TypeScript
Goals
The goals are pretty straightforward:
- reduce unit test time by skipping the tests that do not need to run
- do not lose test coverage: we should have the same confidence in the unit tests whether they all run or only a subset does. A regression should not slip through because some tests did not run
- manual override: we still want to be able to run all tests if needed
Module dependencies
Because modules depend on each other, which tests are triggered depends on what has changed.
Taking Java as an example:
- if a file has changed in the base module common to all applications, unit tests are triggered for the whole Java codebase
- if a file has changed only in a single application, only the tests in that application are triggered
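The rule above amounts to walking the dependency graph from the changed module down to everything that depends on it. A minimal sketch, assuming a hand-written dependency map: the `DEPENDS_ON` table and the `impacted_modules` helper are hypothetical, loosely modeled on the diagram earlier in the post, and in a real pipeline the edges would come from the build files.

```python
from collections import deque

# Hypothetical dependency map (module -> modules it depends on).
# The edges are an assumption for illustration, not the real project layout.
DEPENDS_ON = {
    "L1": [],
    "L2": ["L1"],
    "L3": ["L1"],
    "L4": [],
    "A1": ["L2"],
    "A2": ["L2"],
    "A3": ["L3", "L4"],
}


def impacted_modules(changed, depends_on=DEPENDS_ON):
    """Return the changed modules plus every module that transitively depends on them."""
    # Invert the map: module -> modules that depend on it directly.
    dependents = {}
    for module, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(module)

    # Breadth-first walk from the changed modules through their dependents.
    impacted = set()
    queue = deque(changed)
    while queue:
        module = queue.popleft()
        if module in impacted:
            continue
        impacted.add(module)
        queue.extend(dependents.get(module, ()))
    return impacted
```

With this map, a change in `L2` selects `L2`, `A1` and `A2`, while a change in `L1` selects every module that sits below it.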
Change detection
Change detection is pretty straightforward. As unit tests run on a Pull Request, the files that changed between the base branch and the target branch are found with git:
    git diff --name-only ${TARGET_BRANCH_NAME} -- '*.gradle'
(here Gradle files are used as an example file type)
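If the pipeline step is scripted, the same git call can be wrapped in a small helper. This is only a sketch: `git_diff_command` and `changed_files` are hypothetical names, and the branch name and patterns are supplied by the caller.

```python
import subprocess


def git_diff_command(target_branch, patterns):
    """Build the argument list for `git diff --name-only <branch> -- <patterns>`."""
    return ["git", "diff", "--name-only", target_branch, "--", *patterns]


def changed_files(target_branch, patterns):
    """Run git in the current repository and return the changed paths, one per entry."""
    result = subprocess.run(
        git_diff_command(target_branch, patterns),
        capture_output=True,
        text=True,
        check=True,  # raise if git exits with an error
    )
    return [line for line in result.stdout.splitlines() if line.strip()]
```

Passing the arguments as a list avoids shell quoting problems: the patterns reach git untouched and git itself matches them as pathspecs.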
Files are categorized by their type and/or module:
- .py for Python
- .ts or .js for the front-end modules
- .java or .scala for Java
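As an illustration, the categorization by extension can be expressed as a lookup table. The names `STACK_BY_SUFFIX` and `group_by_stack` are hypothetical, and the sketch assumes that the extension alone is enough to pick the stack.

```python
from pathlib import PurePosixPath

# Extension -> stack, following the categories listed above.
STACK_BY_SUFFIX = {
    ".py": "python",
    ".ts": "front-end",
    ".js": "front-end",
    ".java": "java",
    ".scala": "java",
}


def group_by_stack(paths):
    """Group changed paths by stack; files with an unknown extension are ignored."""
    groups = {}
    for path in paths:
        stack = STACK_BY_SUFFIX.get(PurePosixPath(path).suffix)
        if stack is not None:
            groups.setdefault(stack, []).append(path)
    return groups
```

A stack that does not appear in the result has no changed source file, so its tests are candidates for skipping.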
Exclusions
Some files are excluded from these filters: whenever one of them changes, it always retriggers a full test run of the module or project, depending on its scope:
- build.gradle: test the whole project, as this is the base of the build pipeline
- package.json and package-lock.json: trigger a module retest for the front-end tests
- requirements.txt: trigger a module retest for the Python modules
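One possible way to encode these exclusions is a table keyed by file name. The `FULL_RETEST_SCOPE` table and the `full_retest_scopes` helper are hypothetical, and the scope labels are only shorthand for the scopes described above.

```python
from pathlib import PurePosixPath

# File name -> scope of the full retest it forces (labels are illustrative).
FULL_RETEST_SCOPE = {
    "build.gradle": "whole project",
    "package.json": "front-end",
    "package-lock.json": "front-end",
    "requirements.txt": "python",
}


def full_retest_scopes(changed_paths):
    """Return the scopes that need a full retest because a build file changed."""
    scopes = set()
    for path in changed_paths:
        # Only the file name matters, whatever directory it sits in.
        scope = FULL_RETEST_SCOPE.get(PurePosixPath(path).name)
        if scope is not None:
            scopes.add(scope)
    return scopes
```

An empty result means no build file was touched, so the per-module selection can be used.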
Execution
The execution in the Pull Request pipeline is pretty simple:
- get the list of Java files that have changed between the base and the target branch:
    git diff --name-only ${TARGET_BRANCH_NAME} -- '*.java' '*.scala'
- identify the Gradle modules those files belong to
- if a Java module has changed (for example L2 from above, in which case only applications A1 and A2 are impacted) and the run-all-unit-tests tag is not present, execute:
    ./gradlew :L2:tests :A1:tests :A2:tests
- if a build file has changed or the run-all-unit-tests tag is present, execute everything:
    ./gradlew tests
- repeat the same logic for the Python and front-end modules
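The decision at the end of these steps can be sketched as a small function that returns the Gradle invocation. Both helpers are hypothetical: `module_of` assumes that the top-level directory of a path is the Gradle module name, which may not match the real repository layout, and `gradle_command` expects the set of impacted modules to have been computed beforehand.

```python
from pathlib import PurePosixPath


def module_of(path):
    """Assumption: the first path component is the Gradle module name."""
    return PurePosixPath(path).parts[0]


def gradle_command(impacted, build_file_changed=False, run_all_tag=False):
    """Return the Gradle command to run, or None when no Java test is needed."""
    if build_file_changed or run_all_tag:
        # Manual override or build change: run everything.
        return ["./gradlew", "tests"]
    if not impacted:
        return None
    # Sorted only to keep the command stable from one run to the next.
    return ["./gradlew", *[f":{module}:tests" for module in sorted(impacted)]]
```

For the L2 example, `gradle_command({"L2", "A1", "A2"})` yields the same targets as the command shown in the list, in sorted order.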
The above changes were made in the main Pull Request pipeline. They were tested by modifying a file in different modules and checking the unit test results.
At first, build files were forgotten, so adding or modifying a dependency didn’t trigger unit tests and the build broke. The same thing happened with resource files, which are not code.
That was easily fixed for all modules.
Conclusion
With those changes, unit test feedback is much faster, depending on which module has changed, because the base libraries change less often than the code of the main applications.
The worst case is just like before with all tests being executed. So no losses, only gains!