Embrace Test Composability: A Dive into Test Desiderata
It's difficult to talk about good vs. bad tests without an axis of measurement and some discipline about what those terms mean. Luckily, we have quite a bit of empirical data to rely on for common cases.
Desiderata
A few years ago Kent Beck established a cohesive set of principles known as Test Desiderata. I know… I know… I struggle pronouncing it as well. The marketing team is still out on that one.
The desiderata serve as a guiding light for crafting effective tests. I highlighted effective in that sentence because what exactly effective means depends heavily on the business. Think of it as Effective<T>, substitute T for business value in the context of your strategy. I digress, nerding out here. Back to the rules!
These principles encapsulate the essence of what makes tests valuable in the realm of Test-Driven Development (TDD). They are principles and qualities that give developers a roadmap for navigating the intricacies of testing.
Edit [2024]: Kent Beck replied to this post with an important distinction
Thank you for the careful reading of the Test Desiderata. I agree the name is a mouthful.
One disagreement--I don't think the desiderata have any necessary connection to TDD. Yes, when you are TDDing you want the tests to be fast & predictive, but you also make tradeoffs between fast & predictive in a test suite run during a deployment build.
My bias shows here and I’m glad we get to explore this early on. I agree that the concepts are orthogonal. I failed to highlight the context: I am guiding this conversation through the lens of an engineer adopting XP practices with TDD.
They are exploring with a team member the limits of the quality trade-offs, to the point where they may need a test outside the bounds of developer tests, i.e. acceptance tests, end-to-end tests, UI tests. I sense we’ll have to go into some amount of detail to explore what developer tests are in the future (note to self).
This article’s focus is a challenge in perspective. Rather than thinking of developer tests as “this and that” or “this runs on XUnit”, consider the nature of the qualities first: “I need a fast test that is outside-in and predictive of production behavior.”
Like a chef, you’re balancing saltiness and acidity, rather than putting mayo on everything (I’m from Europe).
Thank you for the response, Kent.
Your engineering team deliberately calibrates its software design to emphasise certain qualities, so that each test contributes meaningfully to the software's robustness and reliability.
Think of it like a recipe for a spice mix that makes up the flavour of your engineering organisation’s testing style.
Key Qualities for Composability
In the quest for test composability we zero in on specific qualities that form the bedrock of effective testing.
These qualities are akin to the essential ingredients in a chef's pantry:
structure dependence—how likely refactoring will cause tests to fail
readability—how likely an engineer is to guess the correct intent of the code
isolation—how clear separation from out-of-process dependencies is
composability itself—the ability to talk about your system in terms of rules
Just as a chef carefully selects and combines ingredients to create a harmonious dish, developers cultivate these qualities to ensure that their tests blend seamlessly into a cohesive suite.
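To make the first three qualities a little more concrete, here is a minimal sketch. Everything in it (place_order, PaymentGateway, FakeGateway) is hypothetical and invented for illustration, not taken from a real codebase. The tests assert only on the observable outcome, so refactoring the internals should not break them; their names state the intent; and the fake keeps the out-of-process payment provider out of the picture.

```python
from typing import Protocol


class PaymentGateway(Protocol):
    """Hypothetical boundary to an out-of-process payment provider."""

    def charge(self, amount_cents: int) -> bool: ...


class FakeGateway:
    """In-memory stand-in, so the tests never leave the process (isolation)."""

    def __init__(self, accepts: bool = True) -> None:
        self.accepts = accepts

    def charge(self, amount_cents: int) -> bool:
        return self.accepts


def place_order(gateway: PaymentGateway, amount_cents: int) -> str:
    # Hypothetical behaviour under test: an order is confirmed only if the charge succeeds.
    return "confirmed" if gateway.charge(amount_cents) else "declined"


# The tests check outcomes, not how place_order gets there (low structure dependence),
# and the names spell out the intent (readability).
def test_order_is_confirmed_when_payment_succeeds():
    assert place_order(FakeGateway(accepts=True), 2_500) == "confirmed"


def test_order_is_declined_when_payment_fails():
    assert place_order(FakeGateway(accepts=False), 2_500) == "declined"
```

If these live in a file a runner such as pytest can discover, both tests run without any network or database.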
You may notice composability repeating in the wider sense and as a quality. This is neither a trick nor an oversight.
Composability comes in a few contexts:
business domain. How clear are the natural rules of your underlying business processes?
implementation. How composable are your atomic pieces, given the details of the programming language and coding paradigm you have chosen?
tests. How strongly is the composability of your business and implementation reflected in your tests?
Composability Example
Take an e-commerce system that has two types of customers and four checkout flows:
customers: personal and business
checkout flows: credit card, Apple Pay, Google Wallet and crypto
There are 2 customer rules, and 4 payment rules, giving us 6 rules in total and 8 unique combinations.
As a rough goal, high composability allows us to write six tests, one isolating each rule, plus an integration test to check contracts: seven in total.
In contrast, low composability would require us to write a test for each combination, a minimum of eight, plus integrations.
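Here is a rough sketch of what the high-composability side could look like. The discount percentages, fee amounts and function names are all made up for illustration; the only point is the shape: each rule is a small function testable on its own, and checkout_total merely wires them together.

```python
# Hypothetical rules: all names and numbers below are invented for illustration.
CUSTOMER_DISCOUNT_PERCENT = {"personal": 0, "business": 10}
PAYMENT_FEE_CENTS = {"credit_card": 30, "apple_pay": 20, "google_wallet": 20, "crypto": 50}


def apply_customer_rule(customer_type: str, amount_cents: int) -> int:
    """Customer rule: apply the discount for this customer type."""
    discount = CUSTOMER_DISCOUNT_PERCENT[customer_type]
    return amount_cents * (100 - discount) // 100


def apply_payment_rule(payment_method: str, amount_cents: int) -> int:
    """Payment rule: add the flat fee for this checkout flow."""
    return amount_cents + PAYMENT_FEE_CENTS[payment_method]


def checkout_total(customer_type: str, payment_method: str, amount_cents: int) -> int:
    """Composition: customer rule first, then payment rule."""
    return apply_payment_rule(payment_method, apply_customer_rule(customer_type, amount_cents))


# Two customer rule tests
def test_personal_customer_pays_full_price():
    assert apply_customer_rule("personal", 10_000) == 10_000


def test_business_customer_gets_discount():
    assert apply_customer_rule("business", 10_000) == 9_000


# Four payment rule tests
def test_credit_card_fee():
    assert apply_payment_rule("credit_card", 10_000) == 10_030


def test_apple_pay_fee():
    assert apply_payment_rule("apple_pay", 10_000) == 10_020


def test_google_wallet_fee():
    assert apply_payment_rule("google_wallet", 10_000) == 10_020


def test_crypto_fee():
    assert apply_payment_rule("crypto", 10_000) == 10_050


# One integration test: checks that the two rules are wired together correctly
def test_checkout_composes_customer_and_payment_rules():
    assert checkout_total("business", "crypto", 10_000) == 9_050
```

That is six rule tests plus one integration test. Adding a fifth checkout flow under this design would cost one new rule test, not two new combination tests.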
The numbers may seem trivial in this example. However, imagine scenarios where you add axes to this equation: product type, country of residence, gifts… The number of combinations is the product of the options on every axis, so it multiplies with each new axis, while the number of rules is only their sum. The number of tests can remain on the product side (millions) or the sum side (hundreds) of this equation. Database tests that require deeply nested database fixtures are a notorious symptom of low composability.
Other examples
The other eleven qualities are either trivial to explain or require code examples to demonstrate, so I’ll be following up with future posts this week. Pinky promise.
Trade-offs
There is no free lunch. You’d be misguided, and overwhelmed, trying to maximise each quality individually. Ah, the eternal dance of trade-offs in the world of software development, where every decision carries its weight in consequences.
Navigating these trade-offs requires a delicate balance of priorities and compromises. It's like walking a tightrope, where maximising one quality often comes at the expense of another. A common example is fast vs. predictive: developer tests lean towards fast, while acceptance tests running in a production-like environment lean towards predictive.
Understanding and embracing these trade-offs is the hallmark of a seasoned developer. It's about making informed choices that align with the overarching goals of the project and the unique needs of the business.
I wish I could tell you a simple formula to follow. As you can see in our latest stream with several guest participants, every individual developer (and even developer pair!) optimises for a different level of compromise.