Reducing unit test time

2024-11-23

The software I work on combines several tech stacks, which together make up several applications. Each part has its own unit tests. With thousands of tests, a full run was taking a long time, so it was time to optimize them and shorten the feedback loop.

Language stacks and applications

The main part of the software is written in Java and split into modules (libraries) shared between three applications in a mono-repository.
Each application is independent and has its own code, which depends on the shared libraries:

        L1  -- base library
       / \
      /   \
     L2   L3  L4 -- shared libraries, can be in different languages
    / \    | /
   A1  A2  A3  -- applications

The applications also include other language stacks, organized in the same way:

Python libraries and scripts
front-end stack in JavaScript and TypeScript

Goals

The goals are pretty straightforward:

reduce unit test time by skipping the tests that do not need to run
do not lose test coverage: we should have the same confidence in unit tests whether all of them run or just a subset. A regression should not happen because some tests did not run
manual override: we still want to be able to run all tests if needed

Module dependencies

Because modules depend on each other, which tests are triggered depends on what has changed.
Taking Java as an example:

if a file has changed in the base module common to all applications, unit tests will be triggered for the whole Java codebase
if a file has only changed in a single application, only tests in that application will be triggered
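The two rules above amount to walking the dependency graph from the changed module down to everything that depends on it. Here is a minimal Python sketch of that idea. The `DEPENDS_ON` map and the `impacted_modules` function are hypothetical names, and the map only mirrors the diagram shown earlier (L4 is left without a link to L1 because the diagram does not show one); the real pipeline may compute this differently.

```python
from collections import deque

# Hypothetical map mirroring the diagram: module -> modules it depends on.
DEPENDS_ON = {
    "L1": [],
    "L2": ["L1"],
    "L3": ["L1"],
    "L4": [],            # assumption: the diagram shows no link from L4 to L1
    "A1": ["L2"],
    "A2": ["L2"],
    "A3": ["L3", "L4"],
}


def impacted_modules(changed, depends_on=DEPENDS_ON):
    """Return the changed modules plus everything that transitively depends on them."""
    # Invert the map: module -> modules that depend on it.
    dependents = {name: [] for name in depends_on}
    for module, deps in depends_on.items():
        for dep in deps:
            dependents[dep].append(module)

    # Breadth-first walk from the changed modules towards the applications.
    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)
```

With this map, a change in L2 impacts L2, A1 and A2, while a change in A3 impacts only A3.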

Change detection

The change detection is pretty straightforward. As unit tests run on a Pull Request, the files changed between the base branch and the target branch are found using Git: git diff --name-only ${TARGET_BRANCH_NAME} -- '*.gradle' (here, Gradle files are the example file type)

Files are categorized by their type and/or module:

.py for Python
.ts or .js for the front-end modules
.java or .scala for the Java (JVM) modules
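As an illustration, the categorization can be sketched in Python as a lookup on the file extension. The names `STACK_BY_SUFFIX`, `changed_files` and `categorize` are made up for this example, and the stack labels ("python", "frontend", "java") are arbitrary; only the `git diff --name-only` call comes from the text above.

```python
import subprocess
from pathlib import PurePosixPath

# Extension -> stack, following the list above. The labels are arbitrary.
STACK_BY_SUFFIX = {
    ".py": "python",
    ".ts": "frontend",
    ".js": "frontend",
    ".java": "java",
    ".scala": "java",
}


def changed_files(target_branch):
    """List the files that differ from the target branch, using git diff."""
    result = subprocess.run(
        ["git", "diff", "--name-only", target_branch],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()


def categorize(paths):
    """Group changed paths by stack; files with an unknown extension are ignored."""
    groups = {}
    for path in paths:
        stack = STACK_BY_SUFFIX.get(PurePosixPath(path).suffix)
        if stack is not None:
            groups.setdefault(stack, []).append(path)
    return groups
```

In a pipeline, this would be called as categorize(changed_files(target_branch)), where target_branch is whatever the CI system provides.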

Exclusions

Some files are excluded from the filters. If one of them changes, it always retriggers a full test of the module or of the whole project, depending on its scope:

build.gradle: test the whole project as this is the base of the build pipeline
package.json and package-lock.json: trigger a module retest for front-end modules
requirements.txt: trigger a module retest for Python modules
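A small Python sketch of this rule, under the assumption that these files are recognized by file name alone. `RETEST_SCOPE` and `forced_retests` are hypothetical names, and the "project" and "module" labels simply restate the scopes from the list above.

```python
from pathlib import PurePosixPath

# File name -> (stack, scope of the forced retest), following the list above.
RETEST_SCOPE = {
    "build.gradle": ("java", "project"),
    "package.json": ("frontend", "module"),
    "package-lock.json": ("frontend", "module"),
    "requirements.txt": ("python", "module"),
}


def forced_retests(paths):
    """Return {path: (stack, scope)} for every changed file that forces a full retest."""
    forced = {}
    for path in paths:
        name = PurePosixPath(path).name  # match on the file name, wherever it lives
        if name in RETEST_SCOPE:
            forced[path] = RETEST_SCOPE[name]
    return forced
```

An empty result means no build or dependency file changed, so the regular per-module filtering applies.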

Execution

The execution in the Pull Request pipeline is pretty simple:

get the list of Java files that have changed between the base and the target branch: git diff --name-only ${TARGET_BRANCH_NAME} -- '*.java' '*.scala'
identify the Gradle modules those files belong to
if a Java module has changed (for example L2 from above, which impacts only applications A1 and A2) and the run-all-unit-tests tag is not present: execute ./gradlew :L2:tests :A1:tests :A2:tests
if a build file has changed or the run-all-unit-tests tag is present, execute everything: ./gradlew tests
repeat the same logic for Python and front-end modules
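The Java steps above could be sketched in Python as follows. This is an illustration under stated assumptions, not the actual pipeline code: `modules_for_files` assumes each Gradle module lives in a top-level directory named after it, and `gradle_command` expects the caller to pass the full list of impacted modules (the changed module plus the applications depending on it). Both function names are hypothetical; the `tests` task name and the `./gradlew` invocations are taken from the steps above.

```python
from pathlib import PurePosixPath


def modules_for_files(paths):
    """Map changed files to module names, assuming <module>/... as the path layout."""
    modules = (PurePosixPath(path).parts[0] for path in paths)
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return list(dict.fromkeys(modules))


def gradle_command(impacted, build_file_changed=False, run_all_tag=False):
    """Build the Gradle invocation as an argument list; empty if nothing needs to run."""
    if build_file_changed or run_all_tag:
        # A build file changed or the manual override is set: run everything.
        return ["./gradlew", "tests"]
    if not impacted:
        return []
    return ["./gradlew"] + [f":{module}:tests" for module in impacted]
```

For a change in L2, which per the diagram impacts A1 and A2, gradle_command(["L2", "A1", "A2"]) yields the arguments for ./gradlew :L2:tests :A1:tests :A2:tests. Returning a list rather than a string keeps the result ready for subprocess.run without shell quoting.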

The above changes were made in the main Pull Request pipeline. They were tested by modifying a file in different modules and checking the unit test result.
At first, build files were forgotten, so adding or modifying a dependency did not trigger unit tests and the build broke. The same thing happened with resource files, which are not code.
That was easily fixed for all modules.

Conclusion

With those changes applied, the feedback from running unit tests is much faster, depending on the module that has changed. The gain is frequent because the base libraries change less often than the code of the main applications.

The worst case is just like before with all tests being executed. So no losses, only gains!