A Philosophy of Software Design
“Complexity is more apparent to readers than writers.”
“Code is harder to read than to write.” I first came across this perplexing idea some 20 years ago, from Joel Spolsky. I remember how profoundly it struck me.
Why is code harder to read than to write? How can this even be?
We read (human language) every day, seemingly effortlessly. Reading books, whether novels or nonfiction, is common, but writing books is not. Or if we consider media more generally, the ratio between consuming and producing media skews heavily towards consumption. For example, most of us listen to a lot more music than we play — let alone compose — ourselves.
Could it be that programmers just don’t read enough code? That if our read-to-write ratio were similar to our other media consumption, we’d find reading code much easier?
Maybe it’s how it’s (not) taught. Paul Graham compared hackers to painters, noting that it was “part of the traditional education of painters to copy the works of the great masters, because copying forces you to look closely at the way a painting is made.”
Or maybe this comparison just doesn’t work: maybe programming is not like reading and writing or the spoken language or listening to music.
But what if we could write code in a way that made reading it easier? This is a central tenet of John Ousterhout’s A Philosophy of Software Design.
Coming in at 170 pages, A Philosophy is a delightfully small book. Concise and matter of fact, it eschews formal design patterns and instead focuses on design principles.
When contemplating the act of programming and the nature of code, my inclination is to squint my eyes and pattern-match, trying to conjure up illustrative analogies; Ousterhout sticks to concrete principles and examples.
I’m tempted to list out all of Ousterhout’s 15 principles and 14 red flags, even if just to have a handy, searchable reference for myself. But sharing these alone, without the full explanations and examples behind them, wouldn’t do them justice or really make sense.
Instead, I’ve picked a few of Ousterhout’s principles and used them as a starting point for my own musing about code.
Design principle 12: “Design it twice.”
Over the years I’ve come up with a handful of rules of thumb, such as “hardcode it first,” “never rebuild your front-end and backend at the same time,” and “you have to build something three times before you can systematize it.”
When writing code, it is tempting to parametrize it as you go, adding variables that alter how the code behaves. This premature systemization is an attempt to avoid having to modify the code for future uses. Instead, we should try to create as simple a system as we can and only modify it once we actually have a need for some alternative behaviour.
The problem with premature systemization is that you are trying to predict future use, which is about as successful as predicting the future generally is. Doing this not only slows you down (pondering potential use cases) but also makes the code harder to understand and test. And in my experience, these alternate behaviours are rarely used, so this added complexity is for nought.
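A sketch of the difference (the function and its parameters are invented for illustration): the first version guesses at future needs, the second does only what today’s caller requires.

```python
# Premature systemization: parameters added to guess at future needs.
# Every flag multiplies the cases to understand and test, and most of
# these alternatives may never be used.
def export_report_flexible(rows, delimiter=",", header=None,
                           uppercase=False, line_ending="\n"):
    lines = []
    if header is not None:
        lines.append(delimiter.join(header))
    for row in rows:
        cell_text = delimiter.join(str(v) for v in row)
        lines.append(cell_text.upper() if uppercase else cell_text)
    return line_ending.join(lines)


# "Hardcode it first": the simplest version that serves the only
# caller that exists today. Parameters can be added later, once a
# real need has appeared.
def export_report(rows):
    return "\n".join(",".join(str(v) for v in row) for row in rows)
```

The hardcoded version is trivial to read and test; when a second use case arrives, we can generalize it knowing exactly what varies.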
When designing a system we should be open to exploring an approach and then revising it as we’ve learned more about the system’s form and function. One cannot know the best design choices up front. Or, more simply put: you’re not going to get it right first time.
This is a lesson we could take from writers: they write first and edit later. Writers expect to go through many drafts before reaching the finished article.
Writing something quick and dirty is a great way to understand the nature and boundaries of a system. “Hardcoding it first” is an acknowledgement that we don’t yet know what the best level of abstraction is or what the parameters of change are.
Once you have something working, you are much better placed to make informed decisions about the design of your system.
Design principle 14: “Software should be designed for ease of reading, not ease of writing.”
When you write code, you start from an idea of what you want or how something should work — you have a discrete goal. This allows you to figure out what you need to do, in code, step by step. Each step, when completed, gives you a clear sense of achievement and progression.
Contrast this with the challenge of trying to understand how a system works by reading its source code: the task is loaded with uncertainty and the effort required can feel unbounded. And often this is accompanied by an irrational fear: what if, even after reading every last line of code, you still don’t understand how it all works?
To get a sense of what this feels like, imagine having the task of trying to find a specific phone number in a (paper) phone book. The book is organised alphabetically by name, as phone books are, and you don’t know the name associated with the number. Your only option is to read each number, line by line. Your only sign of progress is when you occasionally turn the page. And all the while, you have to contend with a looming sense of jeopardy, of knowing that if you lose your focus, if you go too fast, you might miss the number you are looking for (or perhaps you already have?).
The challenge in reading code line by line is that you are working at the lowest level at which a system operates. Understanding what specific lines of code are doing is not difficult, but understanding how these lines fit into and shape the overall behaviour is. It’s like trying to put together a puzzle without knowing what the finished picture looks like.
Design principle 4: “Modules should be deep.”
One way to make a system easier to comprehend is to structure it into discrete parts, or modules. This way, the reader needs only examine how the modules interact with each other. Alternatively, if there is a need to understand how a specific module works, that module can be examined in isolation. It’s like being given whole sections of a puzzle pre-assembled.
Design principle 5: “Interfaces should be designed to make the most common usage as simple as possible.”
Defining modules with clear utility allows the reader to reason about each module, and its role in the system, at a higher level.
This is why structuring code is so important: abstractions and modules help software engineers to not only reason about — and test the operation of — code, but also to comprehend and use the code correctly.
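Ousterhout gauges a module’s depth by how much functionality sits behind how small an interface. A minimal sketch in that spirit (the `Config` class and its file format are my own invention, not from the book): a two-method interface concealing parsing details the caller never needs to see.

```python
# A "deep" module: a simple interface (constructor plus get) hiding
# a non-trivial implementation (comment stripping, key/value parsing,
# whitespace handling). Callers reason about "a config" as one thing.
class Config:
    def __init__(self, text):
        self._values = {}
        for line in text.splitlines():
            line = line.split("#", 1)[0].strip()  # drop comments
            if not line:
                continue  # skip blank and comment-only lines
            key, _, value = line.partition("=")
            self._values[key.strip()] = value.strip()

    def get(self, key, default=None):
        return self._values.get(key, default)


config = Config("timeout = 30\n# a comment\nhost = localhost")
```

If the parsing logic grows (quoting, sections, includes), the interface need not change at all; that is the payoff of depth.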
Design principle 6: “It’s more important for a module to have a simple interface than a simple implementation.”
I’ve always disliked the “tiny functions, each in their own file” trend. I find the need to keep jumping from location to location jarring. For me, this makes the code less readable.
The smallness of a function, class or module has no intrinsic value. In fact, overly small modules might be worse than large ones. The value of a module is that you can reason about it as a single, discrete “thing”. This allows you to reason at a higher level than individual lines of code.
The trend for tiny functions, like all trends du jour, is justified as “best practice”. However, I think this stems from an overly simplistic interpretation of other aspects of good software design. (Mechanistic rules to improve code quality don’t work.)
Testability, encapsulation, reuse. These are the benefits we’re after. But is unit testing a three-line function really worth it? And how often will it really be reused?
Why bad code begets bad code
When faced with the unfamiliar terrain of a poorly structured codebase, it can be tempting to just figure out the smallest part that needs to be changed and change that. While I’m all for the principle of making the smallest possible change necessary, when taking this approach you risk missing the bigger picture and compounding the complexity of a codebase.
When modifying code, there is always the risk of introducing a bug. Catching undesirable behaviour is usually pretty straightforward when it manifests locally, where the change is being made. But if another part of the system relies on the code being written in a certain way (as opposed to relying on a defined interface), a change (even in form, with no actual change to the code’s behaviour) can cause it to break. These kinds of bugs can be difficult to immediately notice, as they manifest elsewhere in the system, in a place that the developer is not paying attention to.
For example, say you are working on software to manage a restaurant. You have a function that accepts a menu item and returns its price. In the hope of drumming up more business, the restaurant decides to create an offer where all items are half price on Mondays. No problem, just check within the function if today is Monday, and if it is, halve the price before returning it. What you didn’t know is that the code that generates a printable menu also uses this function. Guess what will happen if someone happens to print new copies of the menu on a Monday…
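In code, the trap might look something like this (a minimal sketch; the menu items and function names are invented, and `today` is injectable only to make the example testable):

```python
from datetime import date

PRICES = {"soup": 6.00, "steak": 22.00}


def get_price(item, today=None):
    """Return the price of a menu item.

    The "quick fix": half price on Mondays. Correct for the till,
    but every other caller now silently gets Monday prices too.
    """
    price = PRICES[item]
    if (today or date.today()).weekday() == 0:  # Monday
        price = price / 2
    return price


# Elsewhere, unbeknownst to us, the printable menu also calls it:
def menu_line(item, today=None):
    return f"{item}: {get_price(item, today):.2f}"
```

Run `menu_line` on a Monday and the printed menu advertises half prices all week. The discount belongs at the point of sale, not buried inside a function that many callers treat as “the price”.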
Software engineers are well-versed in ways to avoid this kind of side-effectful code and tight coupling. The problem, when dealing with an unfamiliar codebase, is one of trust: can you trust that those who came before you observed these practices?
When faced with the prospect that modifying code could cause potentially undiscoverable side-effects, it can feel safest to “leave well enough alone” and just add some new code to cover the case you need. This leads to more conditions, more cases, more branching, more duplicated code. In other words, the codebase becomes bigger, more complicated, and harder to test.
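A sketch of how this accretion plays out (the discount rules and names are hypothetical): each bolted-on case duplicates logic rather than restructuring it.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    is_member: bool
    days_since_signup: int


# Fear of touching the existing branches leads to bolting new cases
# on underneath them. Each addition copies logic instead of
# restructuring it, and the branching keeps multiplying.
def get_discount(customer, order_total):
    if customer.is_member:
        if order_total > 100:
            return 0.15
        return 0.10
    # Added later by someone afraid to modify the branch above:
    if customer.days_since_signup < 30:
        if order_total > 100:
            return 0.15  # duplicated large-order rule
        return 0.05
    return 0.0
```

Two more “safe” additions like this and the large-order rule lives in four places, each a chance for them to drift apart.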
The harder it is to understand code, the scarier it is to modify, thus compounding the problem. When we are scared of changing something that already exists, we add more, bloating the codebase.
No wonder developers are constantly crying out for rewrites!
Reading code is hard, but it’s not that hard
Once you understand the gist of a system, it is easy to examine it from different angles, to think through different options, to follow things to their logical conclusions. I tend to think of this as loading the problem into working memory: once it’s in your head, it’s cheap and easy to manipulate. But “booting it up” isn’t free; it takes mental effort!
While reading code does take effort, it is an effort that we developers tend to overestimate. When faced with making sense of an unfamiliar codebase, I’ve seen developers give up quickly and declare that it is a mess (swiftly followed by the declaration that the whole thing needs to be rewritten).
I’ve found two techniques useful in overcoming this reticence. The first is to read the code as if it were poetry: not trying to grasp each line as you go, but letting it wash over you. The second is to timebox the activity: “I will read this code for two hours and then, if I still don’t understand it, I can try something different.”
Learning how to overcome the discomfort of reading code is a skill every software engineer would do well to master.
Even better, software engineers would do well to read A Philosophy of Software Design and to put its principles into practice, making their code easier to read, understand and maintain.