The Tower of Abstraction

When in doubt, build a wrapper 🍬

Abstraction is a powerful tool, but like any tool it can be misused. Let’s discuss.

The Goal of Abstraction

Abstraction is everywhere in the modern world, and it makes our lives easier. Web sites don’t need to worry about which browser you’re using,* you can open a file without thinking about how the hard drive is formatted,* and the gas pedal means go, regardless of who made your car and whether it runs on gas, diesel, or batteries.*

Abstraction is a form of ideatic removal. It’s basically the concept of “here’s a box, it does a thing, don’t worry about how.*” This is useful because it lets you use technology (and lets technology use other technology) without having to consider every little thing that’s going on under the hood.

Layers upon Layers

The modern web is a good example of abstraction put to use. Traditionally, the web is HTML over HTTP over TCP/IP. We could look further up and see what the HTML supports: multimedia and JavaScript and the frameworks built on that. Or we could look down and see how the kernel handles TCP over IP over Ethernet. Further down and we see how the hardware sends and receives Ethernet over the wire; or how the WiFi card, after adding its own layer of encryption, modulates the signal to transmit on a radio band.

That is a lot of abstraction. The result is highly effective, but the modern web is also a very big thing. Most systems don’t require quite as many layers.

As another example, consider what happens when you open a file: the program makes a syscall to the kernel, which (typically) accesses the raw data via block device, which itself is an abstraction over SATA/SAS/NVMe protocols, which themselves obscure the inner workings of the physical storage medium. Does the drive spin? The OS doesn’t need to know or care.*
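To make the layering concrete, here is a minimal Python sketch that reads the same file at two levels. The built-in open() handles buffering, decoding, and cleanup, while os.open and os.read sit much closer to the underlying syscalls. The helper names (read_high_level, read_low_level, demo) and the file name are made up for illustration.

```python
import os
import tempfile


def read_high_level(path):
    # Built-in open(): buffering, decoding, and cleanup are handled for us.
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


def read_low_level(path):
    # os.open/os.read are thin wrappers over the OS-level open and read calls.
    fd = os.open(path, os.O_RDONLY)
    try:
        chunks = []
        while True:
            chunk = os.read(fd, 4096)
            if not chunk:  # an empty read means end of file
                break
            chunks.append(chunk)
    finally:
        os.close(fd)
    return b"".join(chunks).decode("utf-8")


def demo():
    # Write a small file, then read it back through both layers.
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "hello.txt")
        with open(path, "w", encoding="utf-8") as f:
            f.write("hello, layers")
        return read_high_level(path), read_low_level(path)
```

Both functions return the same text. Neither one knows, or needs to know, what kind of drive sits underneath.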

Temptation and Misuse

If you write software, you probably create a lot of abstraction as well as use it. Every class, method, wrapper script, etc., is a box that does a thing, with the how tucked away neatly.*

And this is where a problem comes in: it can be tempting to create too much abstraction.

The Tower of Pancakes

Suppose you have a tool that mostly does what you want, but it’s not optimized for your use-case. (It’s missing something you need, or requires a lot of boilerplate, etc.) So you build a simple wrapper around the tool to make it do exactly what you want. If this is for a new feature you push it into production, or if it’s a workstation script you share it with your colleagues in hopes they benefit the same as you.

It serves its purpose, but eventually the use-case will change. Typically the best thing to do is update the wrapper to support the new use-case, or build a new one around the original tool. But unfortunately, often the easiest thing to do is just build a wrapper around your wrapper.

And so it goes. Each time the use-case changes, there’s a good chance someone will build yet another wrapper around the previous one. Heck, the original tool might have been a wrapper too.
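Here is a toy sketch of how the stack grows, next to the flatter alternative. Everything in it is hypothetical: tool stands in for the original tool, and the wrapper names are invented for the example.

```python
def tool(text, width=80, upper=False):
    """Hypothetical original tool: pad or trim text to a fixed width."""
    out = text.upper() if upper else text
    return out[:width].ljust(width)


# The pancake stack: each layer hides the one beneath it.
def wrapper_v1(text):
    # First use-case: always 20 columns.
    return tool(text, width=20)


def wrapper_v2(text):
    # The use-case changed: we want uppercase. wrapper_v1 hid the tool's
    # `upper` option, so this layer reimplements it on top.
    return wrapper_v1(text.upper())


# The flatter alternative: update the one wrapper, staying one layer from the tool.
def wrapper(text, shout=False):
    return tool(text, width=20, upper=shout)
```

Both paths give the same output here, but wrapper_v2 duplicates a feature the tool already had, and anyone debugging it has to read through two layers to find that out.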

The Pile of Band-Aids

At least the tower of pancakes actually works, and with enough maple syrup might even taste good. But sometimes you get a worse version of this, usually when you’re dealing with workstation scripts and other non-production things:

Suppose you have a tool that mostly does what you want, but it’s not optimized for your use-case. So you try to build a simple wrapper around the tool, but you’re a crappy dev who only tested with their own workflow. You share it with your colleagues in an attempt to be useful, but since they’re as crappy as you, they don’t collaborate to develop your buggy wrapper into something good. Instead they put band-aids on it in a haphazard way, perhaps by building their own wrappers which use both your wrapper and the original tool directly. Each of these band-aids makes things easier for its author, but more difficult for anyone else who tries to use it.

Using Abstraction Well

1. Create the Right Abstraction

This can be surprisingly difficult, largely because you’re creating an interface for others to use, and you can’t anticipate everything they’re going to do with it. For your abstraction to be truly useful, it needs to cover the user’s needs such that they never need to touch (or even understand) the underlying substrate.*

It’s also possible to have the opposite problem: to target such a wide range of use-cases that your interface becomes excessively complex, difficult for you to implement and possibly difficult for the user to understand.

To balance the above issues, I usually try to start with a narrow use-case and cover it fully, and then see what other use-cases I can cover without sacrificing simplicity. But everyone’s situation is different.

If something’s really complicated, you might need multiple interfaces at different abstraction levels. For example, Python’s subprocess.run is a pancake resting atop subprocess.Popen, which in turn is a heavier abstraction atop os.posix_spawn or _winapi.CreateProcess. The user can choose the interface that suits their needs.

On a related note, a key benefit of the subprocess module is that you can use the exact same code to run external processes on both POSIX and Windows, and the subtle differences between macOS, GNU/Linux, Android, Solaris, etc., are handled for you. subprocess.Popen gives you a wide range of flexibility, while subprocess.run simplifies things for the most common use-cases.
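As a rough illustration of the two levels, the sketch below runs the same command through both interfaces. The command (the current Python interpreter printing a word) and the function names via_run and via_popen are just placeholders for the example.

```python
import subprocess
import sys

# A portable command: ask the current interpreter to print one word.
CMD = [sys.executable, "-c", "print('pancake')"]


def via_run():
    # subprocess.run: one call covers spawning, waiting, capturing, and error checking.
    result = subprocess.run(CMD, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def via_popen():
    # subprocess.Popen: the same job, with each step spelled out by hand.
    with subprocess.Popen(CMD, stdout=subprocess.PIPE, text=True) as proc:
        out, _ = proc.communicate()
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, CMD)
    return out.strip()
```

The Popen version is longer, but it is also the one to reach for if you need to stream output or talk to the process while it runs.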

2. Quality is Key

Consider a standalone executable, written in just about any language. It has a single entry point, and full control of its internal subroutines, data structures, etc. Now consider a class or library in the same language, common forms taken when implementing an abstraction. It may have a multitude of public methods and a wide variety of possible operational sequences as demanded by the caller. The class or library will have its functionality exercised more thoroughly, and less predictably, than the executable. As a result, edge cases are more likely to happen, surfacing any associated bugs.

For libraries in particular, the code may need aggressive invocation-sequence checking, bordering on idiot-proofing. And sometimes we’re the idiots: I’ve misused my own libraries quite a few times, and was often saved by my own seatbelt.
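One way such a seatbelt might look is a small state check at the top of each public method. The Connection class below is entirely hypothetical, with a required call order of open, then send, then close.

```python
class Connection:
    """Hypothetical resource with a required call order: open, send, close."""

    def __init__(self):
        self._state = "new"
        self.sent = []

    def open(self):
        # Refuse to open twice or to reopen a closed connection.
        if self._state != "new":
            raise RuntimeError(f"open() called in state {self._state!r}")
        self._state = "open"

    def send(self, msg):
        # Fail loudly instead of silently dropping or corrupting data.
        if self._state != "open":
            raise RuntimeError(f"send() called in state {self._state!r}")
        self.sent.append(msg)

    def close(self):
        # Closing twice is harmless, so it is allowed.
        if self._state == "closed":
            return
        self._state = "closed"
```

Calling send() before open() raises immediately with a message naming the bad state, which is much easier to debug than a mysterious failure three calls later.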

3. Create the Substance First

This one may eventually deserve its own page. In short, resist the temptation to rigorously define interfaces before you’ve started implementing them. Sometimes we have no choice, but in many cases we unnecessarily lock ourselves into a bad design.

I’ve found a better approach is to figure out roughly what the interfaces should be, and then write the underlying code, and adjust the interfaces as I go. Sometimes I even write the code first, and worry about interfaces and invocation patterns later.

The Big Caveat

And now, we need to talk about the asterisks that pepper this page. As the saying goes, all abstractions are leaky:

Websites often do care which browser you’re using, because each browser has its own quirks that the site needs to take into account.
You do need to worry about hard drive formats when dealing with large files. FAT32 can’t handle 4GiB+ files, VM images don’t belong on COW volumes, etc.
Every car has a gas pedal and a brake, but acceleration profiles vary. Your usual footwork might be smooth on your daily driver but jerky on a rental. Same goes for steering wheels.
Your OS does care if your hard drive spins. While not strictly necessary, different types of storage hardware are often treated differently to improve performance and longevity.

As such, perhaps we should rephrase: abstraction provides a box that does a thing, such that the how is demoted from “necessary knowledge” to merely “useful knowledge.”

Conclusion

Abstraction, like recursion, is one of the most powerful programming concepts out there. Use it when needed and don’t go too crazy.

Created:	2024-12-10
Updated:	2025-10-06
Tags:	philosophy