Simplify Your Control Flow Code, Part 1
Refactoring Tricks to Help you Design Better Ifs and Loops
This issue of Crafting Tech Teams is a Tactical one! There’s code examples, there’s theory. And more importantly— advice on how to approach this matters with your team should you spot the obvious code smells.
We will explore a few examples where I’ve hand drawn a few flow examples. If you want to go into more detail you should read till the end for the 2 hour video by Kevlin Henney with an example in C#.
Indentation
The purpose of refactoring is to change the shape and readability of the code while preserving its behavior. Usually this is performed with the safety net of automated tests—granted that they are testing behavior, not the underlying structure.
Tests also have to be refactored to maintain their cohesion. This can often be paired with a micro-step when you know the code is in a working state and you refactor the tests to check if they pass— and if they fail correctly via mutation testing.
Today’s issue won’t cover mutation testing, we are focusing on code. Specifically control flow and indentation.
But why?
Code Smell
Control flow causes indentation. Complex indentation is a sign of complex control flow. Chaotically indented code is a code smell— “Bumpy Road.”
Control-flow code smells are the most pervasive type of readability and understanding impediment. It’s what makes working with complex, legacy code bases YUCK!
Here is a list of a few that you may know—that are also picked up automatically by Code Health integration systems like Sonarqube and Codescene:
Bumpy Road
Long Method
Long Class
God Method/God Class
Brainy Method
Speculative Abstractions (notably Inheritance)
Nested Conditionals
We are exploring these with two examples, covering part 1 today:
Part 1: A CSV export that filters rows
Part 2: An e-commerce order checkout that automatically tries to apply an optional discount
Loops
Brute force quick implementation
Notice how the important information is at the end and deeply nested. This is also expressive of the imperative—or push—form of aggregation due to its operation-flow-oriented style.
This is in opposition to a pull-style, declarative (or functional) approach where the data is expressed in its leaf form and folded into its output format.
const keepRow = (r:any) => true;
function exportToCsv(selectedColumns: string[]): string {
const criteria = new Criteria('some criteria');
const data = await DataWarehouse.fetch(criteria);
const csvCells: string[][] = [];
for (const row of data) {
if (!keepRow(row))
continue;
const csvRow = [];
for (const column of selectedColumns) {
csvRow.push(row[column]);
}
csvCells.push(csvRow);
}
return csvCells.map(r => r.join(',')).join('\n');
}
Let’s check out what’s going on. I’m adding some commentary on the different concerns. Using different colors for what appear to me as different boundaries.
So it seems there are 3 separate concerns at work here. Hm… interesting isn’t it? But it’s just a CSV export!
Filtering the input data
Honoring the user’s wish to only include selected columns (and having them in the right order!). The order is implicit and may not be visible without a test.
Formatting it to a csv string output
We can analyse two more things before we take an incision: overlap of concerns and control flow.
Concern Overlap
Let’s draw boxes around the three concerns so the top of the box touches the input and the bottom of the box touches its output.
Aha! Now we are getting somewhere. This is certainly drawing a map for our potential refactoring. Think of the map as such:
A large box shows an opportunity to extract
An overlap of boxes shows an opportunity for interface segregation
Small boxes that do one thing speak of high cohesion— signalling a lack of naming
Control flow analysis
Let’s hide all the curly braces, highlight control flow operators and methods and focus on the lines that change state.
for (const row of data)
if (!keepRow(row))
continue;
for (const column of selectedColumns)
csvRow.push(row[column]);
csvCells.push(csvRow);
return csvCells // Split to highlight the flow
.
map(r => r.
join(','))
.
join('\n');
We can notice an immediate symmetry to mapping and collection, especially if we take the last step into account.
Overall we have 5 loops, an if statement (two if our keepRow was more complex), a continue and array manipulation.
Focusing only on this section a naive reader may think that each row has different columns or even data fields. We know that not to be the case from context but it doesn’t translate well to the code.
If we were to add sanitisation to each row to escape special characters that may violate the CSV format we may also decide to do that on the row-level rather than the entire document.
We may highlight the refactor to these box shapes:
Putting it all Together
So the actual behavior of the code may be expressed in the push format:
Filter data input (outside)
Construct the row (leaf)
Return: Join the rows (outside)
And from the pull format, this ends up being:
Define the leaf and filter
Return Folded in…
… filtered data
Giving this shape:
Editorial note: I picked a simple example that has very few smells on the clean-stinky-STANK scale. The code is too trivial to cover complex issues like performance. We’ll be doing that in future issues. You can refer to the video below for a deep-dive.
Also, I’ve been making updates to the example in this article so I’ve made a few changes that may no longer match the images. I’m learning!
You may find my original code scribbles on before and after.
Ok Denis, I’m hooked— I want more!
Here’s the video by Kevlin Henney on the famous Gilded Rose Kata in C#.