»I will clean up later«—No, you won't.

Posted on 09/02/2015 by Dr. Daniela Steidl

Often, time pressure forces you to quickly write dirty code. You do not choose the most elegant solution. But at least the change is done and it works. You can always clean it up next time, right? Let me tell you: No, you won’t.

Why is that? Because the probability that you will change the code again is actually rather small. (Needless to say, you will not have much more time at hand next time either.) With our tool »Teamscale«, we studied how software systems evolve and how developers change their code. In particular, we examined how often a Java method is changed during its history. It turns out that methods are changed only about two to three times on average. Two or three times? In a history of three, four, five, even up to 15 years? You might wonder how this can be true.

All systems we studied (well-known open source systems like Subclipse or ArgoUML, but also commercial systems) were changed frequently and grew in size. But in fact, they grow because developers keep adding new methods rather than modifying existing ones. Many existing methods are not modified at all: they remain unchanged throughout the entire history. To be more precise, our research has shown that, on average, half the methods are not changed again after their initial commit to the repository.
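To make the two numbers above concrete (changes per method on average, and the share of methods never changed after their initial commit), here is a minimal sketch of how such figures could be computed. It is not how Teamscale works internally. The class `MethodChangeStats`, its methods, and the simplified history model (one list of touched method identifiers per commit, oldest first, each method listed at most once per commit) are assumptions made purely for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Teamscale. */
final class MethodChangeStats {

    /**
     * Each inner list holds the identifiers of the methods touched by one
     * commit, oldest commit first. The first commit that mentions a method
     * is treated as its initial commit; every later mention counts as one change.
     */
    static Map<String, Integer> changesAfterInitialCommit(List<List<String>> commits) {
        Map<String, Integer> changes = new LinkedHashMap<>();
        for (List<String> touchedMethods : commits) {
            for (String method : touchedMethods) {
                Integer previous = changes.get(method);
                // First sighting: zero changes so far. Later sightings: one more change.
                changes.put(method, previous == null ? 0 : previous + 1);
            }
        }
        return changes;
    }

    /** Mean number of changes per method, or 0.0 for an empty history. */
    static double averageChanges(Map<String, Integer> changes) {
        return changes.values().stream()
                .mapToInt(Integer::intValue)
                .average()
                .orElse(0.0);
    }

    /** Fraction of methods that were never touched again after their initial commit. */
    static double shareNeverChanged(Map<String, Integer> changes) {
        if (changes.isEmpty()) {
            return 0.0;
        }
        long unchanged = changes.values().stream().filter(count -> count == 0).count();
        return (double) unchanged / changes.size();
    }
}
```

A real analysis would also have to track methods across renames and moves, which this sketch ignores; it only shows that a system can be changed frequently while most of its individual methods stay untouched.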

Now, you might think: If my quick-and-dirty method is not modified again, why should I bother cleaning it up in the first place? The answer is simple: It will still be read when other changes are made. According to Robert C. Martin, »the ratio of time spent reading (code) versus writing is well over 10 to 1« (»Clean Code«). And cleaner code is not just easier to change, but also much easier to read.

Despite the many unchanged methods, there are some that are changed quite frequently over time. In particular, very long methods (methods with more than 75 statements) are changed between 6 and 18 times on average, depending on the system. Yet, they are very unlikely to be refactored: Only between 10 and 20% of the very long methods are significantly shortened. So even if you do change your quick-and-dirty method again, you are very unlikely to refactor it, at least not in terms of its length. The results of this study have been published in our research paper »How Do Java Methods Grow?«, accepted for publication at the Working Conference on Source Code Analysis and Manipulation, SCAM, Bremen, Germany (27. – 28.8.2015).

What is the lesson learned from this? Refactor your code before you check it in! In the absence of rigorous manual code reviews, you are unlikely to change it again, and refactoring it now will improve its readability tremendously.