Posted on 06/24/2015 by Dr. Elmar Juergens
We have had countless discussions about code clones with the teams responsible for their maintenance. These teams generally accept that some of the clones in their software are the product of copy & paste. In many cases this is obvious, since the clones share some quirks that can only be plausibly explained by copy & paste (e.g. identical typos in comments).
One hypothesis that comes up time and again, however, is that some or many of the clones were not created by copy & paste, but instead were written independently and then evolved into the same form.
This hypothesis reminds me of convergent evolution, where environmental factors drive the independent evolution of similar traits in species whose ancestors do not show those traits. For example, both pill bugs and pill millipedes have evolved similar defenses, and consequently look similar, but belong to different branches of
Every software system has been built by copy & paste, at least to some degree. Some of this redundancy is caused by technical limitations of the languages and tools. Some is caused by time pressure, where duplicating and modifying an existing piece of code is faster than developing suitable abstractions. Some is also caused by developer laziness or a lack of awareness of the possible long-term consequences. While these duplications often work fine when they are created, they typically cause additional maintenance overhead during software evolution, since duplicates have to be changed consistently. The real risk of these duplications, however, lies in inconsistent changes, which frequently lead to actual bugs in the system.
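To make the risk of inconsistent changes concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: the function names, the discount logic and the "clamping" fix do not come from any real system. It assumes two clones of the same calculation, where a later bug fix was applied to only one copy.

```python
# Hypothetical example: two copy & paste clones of a discount calculation.
# All names and the clamping fix are assumptions made for illustration.


def invoice_discount(total, percent):
    """Clone 1: later fixed to clamp the percentage to the range 0..100."""
    percent = max(0, min(100, percent))
    return total * percent / 100


def quote_discount(total, percent):
    """Clone 2: the copy never received the clamping fix."""
    return total * percent / 100


def clones_agree(total, percent):
    """Report whether both clones return the same result for an input."""
    return invoice_discount(total, percent) == quote_discount(total, percent)
```

For ordinary inputs such as `clones_agree(200, 10)` the two copies still behave identically, which is why the inconsistency can go unnoticed. Only for an out-of-range input such as `clones_agree(200, 150)` do they diverge, and the unfixed copy grants a discount larger than the total.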