This piece has been making the rounds among tech heads this week, and there’s one key point made by the author that I wish I could make to every stakeholder in every organization on the planet at once rather than having to explain it here and there as I encounter them personally.
The problem with code, in general, is that it rots.
There is no shortage of perfect code in the world. Perfect code is written every day. Probably every serious software developer has written perfect code several times in their career. The problem is that perfect code is temporary. Time ticks away quickly, and the perfection of the code decays. Perfect code is not impervious code.
So why and how does code rot? It’s so frustrating to stakeholders. “We” haven’t changed that code in months, sometimes years! That code was just fine last April! Now you’re telling us it needs to be reworked? What happened? You weren’t even with the company when that feature was built. What did you do?
The circumstances changed. The code’s just an innocent victim. At best it was just over there in the code management repositories silently passing automated tests, or at worst it was running in production on someone’s server, minding its own business, doing exactly what it’s supposed to do: telling some computers to do something exactly as the code commands.
As time passes, the circumstances under which the code was originally written change, and so does the ability of the code to tell the computers to do the right things. Particularly in a business context, circumstances necessarily change. Otherwise, the business has become static. A static business is not a business pursuing competitive advantages. A static business is a dead business; possibly a walking-dead business, but dead nonetheless.
Code in a business context is a codification of how at least some part of that business operates. And so as the business pursues new competitive advantages with stakeholders making decisions both big and small, the circumstances under which the code operates will change. And when I say “decisions,” I mean every decision. Implemented a new notification system? Code’s rotting. Decided to start doing company all-hands every 2nd Thursday of the month? Code’s rotting. Using a new Notion.io “app” to track inbound sales? Code’s rotting. Changed the IPA in the office kegerator? Sorry, code’s rotting.
Further, every other business competing with the business for which that code runs will itself pursue competitive advantage. As such, the code’s circumstances change again, no matter what the owning business did. The old code doesn’t work as well as it used to because the business has lost some competitive advantage.
Even beyond that, the world in which that business runs changes. Even if the entire industry of that business colluded and agreed to do nothing at all in pursuit of competitive advantage, the code running in all of those businesses will, before long, find that the circumstances have changed. Maybe the code is now running on a newer version of an operating system, one made necessary by security updates and patches. Or a cloud vendor running the virtual machines on which the code is installed has decided that this particular flavor of cloud hosting is no longer a business worth pursuing, and so the code needs to be moved. Whatever the cause, the code is rotting, and there’s not really anything anyone can do to prevent it.
My favorite example of “The world changed, and now the code that worked just fine is rotten,” is the Gangnam Style video on YouTube.
In 2014, the video – already YouTube’s most watched ever, and 2 years old – actually broke YouTube’s view counter as the number of views outran the maximum value of the field (a signed 32-bit integer) YouTube was using to store that count.
The world changed. The world started to watch more and more videos on the internet. The world couldn’t stop loving Gangnam Style. The code sat, static, and rotted. At the time the code was written, 2,147,483,647 probably seemed so huge a maximum field value that an engineer or product manager likely thought, “by the time enough people are watching videos online at that scale, we’ll have had a chance to re-engineer this.” Whoops.
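For anyone who wants to see that ceiling in action, here is a minimal sketch of what happens when a counter stored in a signed 32-bit field gets one view too many. To be clear about the assumptions: this is not YouTube’s code. Python integers never overflow on their own, so `wrap_int32` simulates the field by hand, `record_view` is a hypothetical helper invented for the illustration, and wraparound is only one of the ways a real system might misbehave at the limit.

```python
INT32_MAX = 2**31 - 1   # 2,147,483,647, the ceiling quoted above
INT32_MIN = -(2**31)


def wrap_int32(n: int) -> int:
    """Return n as a signed 32-bit field would hold it (two's-complement wraparound)."""
    return (n - INT32_MIN) % 2**32 + INT32_MIN


def record_view(stored_count: int) -> int:
    """Hypothetical helper: add one view to a counter kept in a 32-bit field."""
    return wrap_int32(stored_count + 1)


views = INT32_MAX - 1
views = record_view(views)    # this view lands exactly on the ceiling
at_ceiling = views
views = record_view(views)    # one more view and the field wraps around
after_one_more = views

print(at_ceiling)
print(after_one_more)
```

Nothing in the function changed between the last good view and the first bad one. The only thing that changed was how many people had pressed play.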
What’s great is that by then, everyone freshly remembered the Y2K Bug – itself a “simple” issue of field type maximums. But YouTube’s rot was different from that. The Y2K Bug was a problem of a very intuitive code rot. Of course decades-old software would run into problems. Of course the code would rot. But YouTube’s rot was because everything about the world changed so quickly. YouTube enjoyed more success. Success breaks things.
Gangnam Style and Y2K are well known, singular examples of code rot, but very similar examples of rot occur every day in every code base. And as every business becomes more and more a software company, every business must account for that rot. While rot is unavoidable, it can be mitigated. I’ll find another 30 minutes to write about how to mitigate the rot some other time. I really just wanted to hammer home the point about rot and get you singing Psy’s biggest hit in your head.