The lies about the necessity of a Big Rewrite

This post is about real-world software products that make money and have required multiple man-years of development time to build. This is an industry in which quality and costs, and thus professional software development, matter. The pragmatism and realism of Joel Spolsky’s blog on this type of software development is refreshing. He is also not afraid to speak up when he believes he is right, as on the “Big Rewrite” subject. In this post, I will argue why Joel Spolsky is right. Next, I will show the real reasons software developers want to do a Big Rewrite and how they justify it. Finally, I will share a quote from Neil Gunton that will help you convince software developers that there is a different path that should be taken.

Big Rewrite: The worst strategic mistake?

Whenever a developer says a software product needs a complete rewrite, I always think of Joel Spolsky saying:

… the single worst strategic mistake that any software company can make: (They decided to) rewrite the code from scratch. – Joel Spolsky

You should definitely read the complete article, because it contains a lot of strong arguments to back up the statement, which I will not repeat here. He made this statement in the context of the Big Rewrite Netscape did that led to Mozilla Firefox. In an interesting, very well-written counter-post, Adam Turoff writes:

Joel Spolsky is arguing that the Great Mozilla rewrite was a horrible decision in the short term, while Adam Wiggins is arguing that the same project was a wild success in the long term. Note that these positions do not contradict each other.

Indeed! I fully agree that these positions do not contradict each other. The long-term result was not bad, but the rewrite was still the worst mistake the software company could make. Then he goes on to say:

Joel’s logic has got more holes in it than a fishing net. If you’re dealing with a big here and a long now, whatever work you do right now is completely inconsequential compared to where the project will be five years from today or five million users from now. – Adam Turoff

Wait, what? Now he chooses Netscape’s side?! And this argument makes absolutely no sense to me. Who knows what the software will require five years or five million users from now? For this to be true, the guys at Netscape must have been able to look into the future. If so, then why did they not buy Apple stock? In my opinion the observation that one cannot predict the future is enough reason to argue that deciding to do a Big Rewrite is always a mistake.

But what if you don’t want to make a leap into the future, but you are trying to catch up? What if your software product has gathered so much technical debt that a Big Rewrite is necessary? While this argument also feels logical, I will argue that it is not. Let us look at the different causes of technical debt and what should be done to counter them:

  • Lack of a test suite: this can be countered by adding tests
  • Lack of documentation: writing it is not a popular task, but it can be done
  • Tightly coupled components: dependency injection can be introduced one software component at a time; your test suite guards against regressions
  • Parallel development: do not rewrite big pieces of code, keep the change sets small and merge often
  • Delayed refactoring: is refactoring much more expensive than rewriting? It may seem so due to the 80/20 rule, but it probably is not; just start doing it
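
To make the first and third bullets concrete, here is a minimal Python sketch of introducing dependency injection into one function and then pinning its behavior down with a test. The function names and the greeting logic are hypothetical, invented purely for illustration; the point is only that the change is small and existing callers are not affected.

```python
import datetime


# Hypothetical legacy code: the clock is hard-wired, so the result
# depends on the time of day at which the code happens to run.
def legacy_greeting(name):
    hour = datetime.datetime.now().hour
    return ("Good morning, " if hour < 12 else "Good afternoon, ") + name


# Step 1: inject the dependency through a defaulted parameter, so that
# every existing caller keeps working unchanged.
def greeting(name, now=datetime.datetime.now):
    hour = now().hour
    return ("Good morning, " if hour < 12 else "Good afternoon, ") + name


# Step 2: pin the current behavior down with tests that control the clock.
def test_greeting():
    morning = lambda: datetime.datetime(2020, 1, 1, 9, 0)
    afternoon = lambda: datetime.datetime(2020, 1, 1, 15, 0)
    assert greeting("Ann", now=morning) == "Good morning, Ann"
    assert greeting("Ann", now=afternoon) == "Good afternoon, Ann"
```

Once such tests exist, the next component can get the same treatment, without any need to stop feature work.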

And then we immediately get back to the reality, which normally prevents us from doing a big rewrite – we need to tend the shop. We need to keep the current software from breaking down and we need to implement critical bug fixes and features. If this takes up all our time, because there is so much technical debt, then that debt may become a hurdle that seems too big ever to overcome. So realize that not being able to reserve time (or people) to get rid of technical debt can be the real reason to ask for a Big Rewrite.

To conclude: a Big Rewrite is always a mistake, since we cannot look into the future, and if there is technical debt, then it should be acknowledged and countered the normal way.

The lies to justify a Big Rewrite

When a developer suggests a “complete rewrite”, this should be a red flag to you. The developer is most probably lying about the justification. The real reasons the developer is suggesting a Big Rewrite or “build from scratch” are:

  1. Not-Invented-Here syndrome (not understanding the software)
  2. Hard-to-solve bugs (which are not fun to work on)
  3. Technical debt, including debt caused by missing tests and documentation (which are not fun to work on)
  4. The developer wants to work with a different technology (which is more fun to work with)

The lie is that fixing the bugs and technical debt is presented as a structural/fundamental change to the software that cannot realistically be achieved without a Big Rewrite. Five other typical lies (according to Chad Fowler) that the developer will promise in return for a Big Rewrite include:

  1. The system will be more maintainable (fewer bugs)
  2. It will be easier to add features (higher development speed)
  3. The system will be more scalable (lower computation time)
  4. System response time will improve for our customers (less on-demand computation)
  5. We will have greater uptime (better high availability strategy)

Any code can be replaced incrementally and all code must be replaced incrementally, just like bugs need to be solved and technical debt needs to be removed. Even when technology migrations are needed, they need to be done incrementally, one part or component at a time, and not with a Big Bang.
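
One possible way to replace code one component at a time is to put a small routing layer in front of the old code and move features behind it individually. The Python sketch below assumes this approach; all names (`call`, `MIGRATED`, `LEGACY` and the two example features) are hypothetical and only serve as illustration.

```python
# Hypothetical legacy implementations of two features.
def legacy_invoice_total(lines):
    total = 0
    for price, quantity in lines:
        total = total + price * quantity
    return total


def legacy_format_name(first, last):
    return last + ", " + first


# Hypothetical replacement for one of them; the other is not migrated yet.
def new_invoice_total(lines):
    return sum(price * quantity for price, quantity in lines)


LEGACY = {
    "invoice_total": legacy_invoice_total,
    "format_name": legacy_format_name,
}

# Features listed here have been migrated. Migrating one more feature
# (or rolling one back) is a one-line change.
MIGRATED = {
    "invoice_total": new_invoice_total,
}


def call(feature, *args):
    """Route to the new implementation if present, else to the legacy one."""
    implementation = MIGRATED.get(feature) or LEGACY[feature]
    return implementation(*args)
```

With this in place, `call("invoice_total", ...)` already runs on the new code while `call("format_name", ...)` still runs on the old code, and the product stays shippable the whole time.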

Conclusion

Joel Spolsky is right: you don’t need a Big Rewrite. Doing a Big Rewrite is the worst mistake a software company can make. Or as Neil Gunton puts it, more gently and positively:

If you have a very successful application, don’t look at all that old, messy code as being “stale”. Look at it as a living organism that can perhaps be healed, and can evolve. – Neil Gunton

If a software developer is arguing that a Big Rewrite is needed, then remind them that the software is alive and that they are responsible for keeping it healthy and helping it grow to full maturity.


Refactoring, upgrading and other software maintenance

In 1976, Lientz and Swanson categorized software maintenance activities into four classes (from Wikipedia):

  • Corrective: diagnosing and fixing errors, possibly ones found by users
  • Perfective: implementing new or changed user requirements which concern functional enhancements to the software
  • Adaptive: modifying the system to cope with changes in the software environment (DBMS, OS)
  • Preventive: increasing software maintainability or reliability to prevent problems in the future

“Corrective” and “perfective” maintenance are very visible and bring value to the customer. The need for “adaptive” and “preventive” maintenance on the other hand is harder to defend.

Customer value or waste?

The customer will demand that any “corrective” change be executed by the software developers with high priority, especially when it impacts the customer’s business. Any customer will also have a (wish) list of “perfective” enhancements to be executed, and is willing to pay for and prioritize this maintenance.

“Adaptive” maintenance, on the other hand, brings no (direct) value to the customer. It will be hard to get the customer to pay for or even prioritize an operating system upgrade. Customers do like to argue “it works, do not touch it”. But the Internet has changed our world. We live in an all-connected world where security fixes need to be applied monthly. At some point the vendor stops providing (security) updates. At this point the product is called end-of-life (EOL). Not upgrading is considered a risk nowadays. This is why software developers can often argue that upgrading needs to get priority and that the costs are justified. Examples of “adaptive” maintenance:

  1. Dependency upgrades
  2. Framework upgrades
  3. Operating system upgrades
  4. Database upgrades

“Preventive” maintenance is even harder to defend. Like “adaptive” maintenance, it does not bring (direct) value to the customer. In Agile (and Lean) thinking, everything that does not add value for the customer is called “waste”. Examples of “preventive” maintenance:

  1. Small code refactoring
  2. Big code refactoring
  3. Database refactoring

What is refactoring?

Wikipedia has a rather long definition that I have tried to condense to:

Refactoring: technique for restructuring, altering internal structure without changing external behavior, undertaken to improve maintainability and extensibility

It may seem obvious to conclude that refactoring is waste: it does not change behavior (functionality) and therefore does not bring direct value to the customer. Interestingly, it is not mentioned as a waste in The Seven Wastes of Software Development by Matt Stine, nor in How to Manage the “7 Wastes” of Agile Software Development by Vijaya Kumar Bandaru. Both posts refer to the 2006 book Implementing Lean Software Development: From Concept to Cash by Mary and Tom Poppendieck. Vijaya only says that “not refactoring” can lead to “partially done work”. From this remark we can conclude that code that has not been properly refactored is not “done”. This is confirmed by Mayank Gupta, who writes that “A common mistake is to not keep refactoring in the definition of done.”

Refactoring needs unit tests

Changing code without a functional reason seems counterproductive at first. You spend money and risk breaking the software, without promising any business benefits. This is why in software development the change management “change committee” should include software architects and security engineers, not only business people. The truth is that refactoring leads to lower costs for fixing bugs and adding new features. This is mainly because refactoring leads to software developers “touching” (and thinking about) the code. To counter the risk of refactoring (breaking the software) one can introduce (and require) unit tests. These are programmed (automated) tests that check that the code still works as before after refactoring.
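
As an illustration, the Python sketch below runs one set of unit tests against a function before and after a refactoring. The shipping cost rules and all names are made up for this example; what matters is that the same expectations hold for both versions, so the external behavior is demonstrably unchanged for the tested cases.

```python
import unittest


def shipping_cost_before(weight_kg, express):
    # Original version: nested conditionals with duplicated arithmetic.
    if express:
        if weight_kg > 10:
            return 20 + weight_kg * 2
        else:
            return 10 + weight_kg * 2
    else:
        if weight_kg > 10:
            return 10 + weight_kg * 1
        else:
            return 5 + weight_kg * 1


def shipping_cost_after(weight_kg, express):
    # Refactored version: same external behavior, clearer structure.
    base = 10 if weight_kg > 10 else 5
    rate = 1
    if express:
        base, rate = base * 2, rate * 2
    return base + weight_kg * rate


class ShippingCostTest(unittest.TestCase):
    # (weight_kg, express, expected cost), including the 10 kg boundary.
    CASES = [
        (1, False, 6), (10, False, 15), (11, False, 21),
        (1, True, 12), (10, True, 30), (11, True, 42),
    ]

    def test_behavior_is_unchanged(self):
        for weight_kg, express, expected in self.CASES:
            for implementation in (shipping_cost_before, shipping_cost_after):
                self.assertEqual(implementation(weight_kg, express), expected)
```

In practice you would write the tests against the old code first, refactor, and run them again (for example with `python -m unittest`); the old version is only kept here to show both sides.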

Selling refactoring to the business

Refactoring code that is touched by user stories in the sprint is easily done, since people are working on that code anyway. Making refactoring part of your definition-of-done encourages refactoring every time a piece of code is touched. To ensure high quality, make sure the touched code also ticks all the other boxes on your definition-of-done. Some of our important definition-of-done topics:

  1. Unit tests written/run
  2. Refactoring done
  3. Documentation updated
  4. Committed to Git
  5. Functional tests written/run
  6. Accepted by customer

Refactoring code unrelated to user stories in the sprint needs to be sold to the customer. Otherwise the customer will not understand why certain bugs suddenly arise. If you do not succeed in doing so, you must be patient. Every time you touch a part of the code for a user story, take the chance to refactor the related code or database. This way, after a year or two, you will have renewed your code base and data structure one feature or table at a time. Just make sure every user story is estimated a little more generously, so that you can squeeze in the preventive maintenance needed.
