Write Fewer Unit Tests

April 8, 2025

Testing bigger chunks is better in the long run.

History

In 1994 I started my first real programming job, at Trilogy. Eight engineers supported a huge amount of revenue. Besides overworking the engineers, the company relied on rigorous automated testing to succeed. Our C++ tests wrote to stdout, and we diffed that against the expected output. Every change required a test so we could detect regressions. I was told at the time this was novel, but I don’t know if that’s really true.

Around 1999 I started working in Java. We used JUnit, which was better than diff. We talked about testing individual methods, but we weren’t using dependency injection and Mockito was still years away. As a result, the tests tended to exercise large pieces of code.

A few years after that “writing tests” became “writing unit tests.” This is when I started encountering zealots who loved to argue about whether each test was a “unit test,” “functional test,” or “integration test.” The implication was that unit tests were superior. The adoption of inversion of control (IoC) and mocking made it practical to test such small units in isolation. Since then I have found IoC and mocking useful, but writing mostly unit tests leads to a maintenance nightmare.

Test Inputs and Outputs

All software has input and output. Example inputs are a REST call, a mouse click, a time trigger, and a queue message. Example outputs are a REST response, a screen update, a log message, and a file dumped to S3. The user doesn’t care how the software arrives at its output; they just expect that their input produces the expected output. I visualize it like this:

Unit tests verify each little box in the drawing. However, the user just cares about the output. Imagine the software was rewritten:

From the user’s perspective, the software behaves the same, but from the programmer’s perspective, the guts changed dramatically. Tests that focus on the input and output are still valid after this rewrite and confirm the software still works. Tests that verify all the little boxes inside are worthless after the rewrite and don’t serve their intended purpose of making sure the software works over the long term.

Generating inputs and verifying outputs requires fewer tests, exercises the software more completely, and is easier to maintain.
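A minimal sketch of the idea in Python, using a made-up word-counting function (the names `word_counts_v1`, `word_counts_v2`, and `check_behavior` are hypothetical, not from any real project). The test only looks at input and output, so the same test passes against both the original and the rewritten internals:

```python
from collections import Counter


def word_counts_v1(text):
    """First implementation: explicit loop and a plain dict."""
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def word_counts_v2(text):
    """Rewritten internals: same input and output, different guts."""
    return dict(Counter(text.lower().split()))


def check_behavior(word_counts):
    """An input/output test: it knows nothing about how the result is built."""
    expected = {"the": 2, "cat": 1, "saw": 1, "dog": 1}
    assert word_counts("The cat saw the dog") == expected
    assert word_counts("") == {}


# The same test validates both versions, so it survives the rewrite.
check_behavior(word_counts_v1)
check_behavior(word_counts_v2)
```

A test that asserted on the intermediate dict updates in the first version would have had to be thrown away along with the loop.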

Test Big Chunks

Sometimes (often?) testing the true inputs and outputs is difficult. However, you can still test big chunks. Continuing with our example:

The test exercises everything in the grey polygon and ignores the little blue boxes. The blue boxes ideally have as little code as possible.

Test the biggest reasonable chunks.

Mocks

Mocks are invaluable for generating edge cases and simulating remote services, but I use as few mocks as possible.

Tests often need to generate situations that rarely happen, for example race conditions or exceptions that “aren’t supposed to happen.” Mocks are a great way to generate these conditions.
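As one possible illustration, here is a Python sketch that uses the standard library's `unittest.mock` to force a failure that would be hard to trigger on demand. The `load_profile` function and the `store` dependency are hypothetical, invented only for this example:

```python
from unittest.mock import Mock


def load_profile(store, user_id):
    """Return the stored profile, or a safe default if the store times out."""
    try:
        return store.get(user_id)
    except TimeoutError:
        return {"id": user_id, "name": "unknown"}


# Force the failure that "isn't supposed to happen".
store = Mock()
store.get.side_effect = TimeoutError("store did not respond")

result = load_profile(store, 42)
assert result == {"id": 42, "name": "unknown"}
store.get.assert_called_once_with(42)
```

The mock is used only to produce the rare condition; the assertion is still about the output the caller sees.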

Likewise, mocks are good for removing dependencies on remote services. If my tests connect to a remote service, then I usually make a client interface that has as little code as possible. Then I mock the client in the tests. This allows the tests to run in isolation, so they don’t fail when the remote service is not available.

I would mock the blue box, which allows everything else to be tested.
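A rough Python sketch of that arrangement, assuming a hypothetical exchange-rate service. `RatesClient`, `fetch_rate`, and `price_order` are names made up for this example; the point is that the client holds almost no logic, so mocking it leaves nearly all the real code under test:

```python
from unittest.mock import Mock


class RatesClient:
    """Thin wrapper around the remote service; deliberately has almost no logic."""

    def fetch_rate(self, currency):
        # Assumption: the real version would make the network call here.
        raise NotImplementedError("network call omitted in this sketch")


def price_order(client, items, currency):
    """The big chunk under test: totals the order and converts the currency."""
    subtotal = sum(price * quantity for price, quantity in items)
    return round(subtotal * client.fetch_rate(currency), 2)


# Mock only the thin client; spec= keeps the mock honest about its methods.
client = Mock(spec=RatesClient)
client.fetch_rate.return_value = 0.5

total = price_order(client, [(10.0, 2), (5.0, 1)], "EUR")
assert total == 12.5
client.fetch_rate.assert_called_once_with("EUR")
```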

If my application is heavily dependent on a database, then I don’t usually consider that a “remote service” because I want my tests to verify that all the database interactions function correctly.
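To make that concrete, here is a small Python sketch that runs the queries against a real database engine instead of a mock. It assumes an in-memory SQLite database purely so the example is self-contained; a real project would more likely point the tests at a disposable instance of its production database. The table and function names are hypothetical:

```python
import sqlite3


def create_schema(conn):
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE NOT NULL)"
    )


def register_user(conn, email):
    """Insert a user with a normalized email and return the new row id."""
    cur = conn.execute(
        "INSERT INTO users (email) VALUES (?)", (email.strip().lower(),)
    )
    conn.commit()
    return cur.lastrowid


def find_user_id(conn, email):
    """Look a user up by email; returns None when there is no match."""
    row = conn.execute(
        "SELECT id FROM users WHERE email = ?", (email.strip().lower(),)
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
create_schema(conn)

new_id = register_user(conn, "  Ada@Example.com ")
assert find_user_id(conn, "ada@example.com") == new_id
assert find_user_id(conn, "nobody@example.com") is None

# The UNIQUE constraint is real, so the test also catches duplicates.
try:
    register_user(conn, "ADA@example.com")
    raise AssertionError("expected the duplicate email to be rejected")
except sqlite3.IntegrityError:
    pass
```

A mocked connection would have accepted the duplicate insert without complaint, which is exactly the kind of bug these tests are meant to find.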

I have seen tests that mock every single dependency. Tests like this do not survive refactoring well and usually don’t do effective validation. Test in bigger chunks.

Unit Tests

I usually end up with just a few true unit tests. For example, if I need to strip all whitespace and control characters from input, I would write unit tests for the strip method, not send all the possible inputs through the entire system. Small tests like this for situations with a lot of edge cases often run much faster than feeding data through the whole system.
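A sketch of what such a test might look like in Python. The function name `strip_whitespace_and_control` is hypothetical, and treating "control characters" as Unicode category `Cc` is an assumption made for this example:

```python
import unicodedata


def strip_whitespace_and_control(text):
    """Remove every whitespace character and every control character (Cc)."""
    return "".join(
        ch
        for ch in text
        if not ch.isspace() and unicodedata.category(ch) != "Cc"
    )


# Many edge cases, each cheap to run because nothing else is involved.
cases = [
    ("", ""),
    ("abc", "abc"),
    ("a b", "ab"),
    ("\tab\n", "ab"),
    ("a\x00b", "ab"),      # NUL control character
    ("a\u00a0b", "ab"),    # non-breaking space
    ("\x7f", ""),          # DEL control character
    ("  \r\n  ", ""),
]

for raw, expected in cases:
    assert strip_whitespace_and_control(raw) == expected, repr(raw)
```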

Write Automated Tests

Some engineers like to argue about whether a test is a “unit test,” “functional test,” or “integration test.” The fact that we use a tool called JUnit only complicates the discussion. I try to use the term “automated test” to avoid wasting time on these arguments.

Write automated tests. Test the biggest chunks you can. Save unit tests and mocks for edge cases.