Reasons for so many ABAP Clones

In the code audits and quality control we do for ABAP projects, we observe again and again that custom ABAP code tends to contain a relatively high rate of duplication. The data from our benchmark confirms this impression: of the ten projects with the highest rate of duplicated code, six are written in ABAP, although only 16% of all projects in the benchmark are ABAP projects. In this post I will discuss the reasons for this tendency towards clones in ABAP.

What is cloning and why is it important?

Code clones are duplicated fragments (of a certain minimal length) in your source code. A large amount of duplicated code is considered to clearly increase maintenance effort in the long term. Furthermore, clones bear a high risk of introducing bugs, e.g. if a change should affect all copies but was missed in one instance. For more background information see e.g. Benjamin’s earlier post or »Do Code Clones Matter?«, a scientific study on that topic.

The following figure shows a typical example of an ABAP clone:

Example of an ABAP clone in compare view. The part with the light-blue marker in the middle is duplicated. The only difference is the name of the internal table being iterated over, which is highlighted.

The code is fully identical except for the name of the variable that is iterated over. As mentioned before, we see many such clones in many ABAP projects; frequently the cloned part is much longer, and several hundred lines are no surprise.
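
As a rough illustration of this clone shape, here is a minimal sketch. It is not the code from the figure: the report name, the table names and the summing logic are hypothetical, and the syntax assumes ABAP 7.40 or later.

```abap
REPORT z_clone_shape_demo.

" Hypothetical item type, used only for this illustration.
TYPES: BEGIN OF ty_item,
         quantity TYPE i,
       END OF ty_item,
       ty_items TYPE STANDARD TABLE OF ty_item WITH EMPTY KEY.

DATA: gt_open_items   TYPE ty_items,
      gt_closed_items TYPE ty_items,
      gs_item         TYPE ty_item,
      gv_total        TYPE i.

START-OF-SELECTION.
  gt_open_items   = VALUE #( ( quantity = 3 ) ( quantity = 4 ) ).
  gt_closed_items = VALUE #( ( quantity = 5 ) ).

  " First copy: iterates over gt_open_items.
  LOOP AT gt_open_items INTO gs_item.
    IF gs_item-quantity > 0.
      gv_total = gv_total + gs_item-quantity.
    ENDIF.
  ENDLOOP.

  " Second copy: identical body, only the internal table differs.
  LOOP AT gt_closed_items INTO gs_item.
    IF gs_item-quantity > 0.
      gv_total = gv_total + gs_item-quantity.
    ENDIF.
  ENDLOOP.

  WRITE: / gv_total.
```

A fix to the loop body, for example a changed condition, would have to be repeated in both copies.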

So, what might be the reasons for the high tendency towards code cloning in ABAP?

First, it is not a lack of language features for re-using code: the most important mechanism is the ability to structure code in re-usable procedures. ABAP offers form routines, function modules and methods, but it seems the barrier to using these concepts consistently is higher than in other languages. Why? I see three main causes:

  • Poor IDE support
  • Constraints in the development process
  • Dependency fear

Besides these constructive reasons, there is also a lack of analysis tools to detect duplicated code. The SAP standard tools are not able to detect clones within custom code, so an external tool, like Teamscale, is required for clone detection. In this post, however, I will focus on the constructive reasons mentioned above.

Poor IDE support

In every language, the fastest way to implement a function that differs only in a tiny detail from an already existing function is to copy the source code and modify it. To avoid the duplication, these are common practices:

  • Extract the common code to a separate procedure where it can be used from both the old and the new functionality
  • Add a parameter to a procedure’s signature to make it more generic
  • Rename a procedure (to reflect the adapted common function)
  • Move a procedure (method, function module) to a different development object (class, function group) for common functionality
  • Introduce a base class and move common members there
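
The first two practices can be sketched as follows. This is only an illustration under assumptions: the class lcl_quantity_sum, its method total and the item type are hypothetical names, and the syntax assumes ABAP 7.40 or later. The logic that would otherwise be copied for each internal table lives in one method, and the table becomes a parameter.

```abap
REPORT z_extract_procedure_demo.

" Hypothetical helper class holding the formerly duplicated logic.
CLASS lcl_quantity_sum DEFINITION FINAL.
  PUBLIC SECTION.
    TYPES: BEGIN OF ty_item,
             quantity TYPE i,
           END OF ty_item,
           ty_items TYPE STANDARD TABLE OF ty_item WITH EMPTY KEY.
    " The internal table is passed in instead of being hard-coded.
    CLASS-METHODS total
      IMPORTING it_items        TYPE ty_items
      RETURNING VALUE(rv_total) TYPE i.
ENDCLASS.

CLASS lcl_quantity_sum IMPLEMENTATION.
  METHOD total.
    " The loop body exists exactly once.
    LOOP AT it_items INTO DATA(ls_item).
      IF ls_item-quantity > 0.
        rv_total = rv_total + ls_item-quantity.
      ENDIF.
    ENDLOOP.
  ENDMETHOD.
ENDCLASS.

START-OF-SELECTION.
  DATA(gt_open_items)   = VALUE lcl_quantity_sum=>ty_items( ( quantity = 3 ) ( quantity = 4 ) ).
  DATA(gt_closed_items) = VALUE lcl_quantity_sum=>ty_items( ( quantity = 5 ) ).

  " Both former copies now call the same method.
  DATA(gv_open_total)   = lcl_quantity_sum=>total( gt_open_items ).
  DATA(gv_closed_total) = lcl_quantity_sum=>total( gt_closed_items ).

  WRITE: / gv_open_total, / gv_closed_total.
```

In a real project such a helper would more likely be a global class in a shared package than a local class in a report.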

Most IDEs for other languages provide support for these refactorings, e.g. method calls are updated automatically if a method was moved. The ABAP Workbench SE80 (which many developers still use) provides virtually no refactoring support. Even ADT (the newer Eclipse plug-ins for ABAP development) so far supports only limited refactorings that are local to one development object. This makes restructuring the code more difficult and more time-consuming, and it increases the risk of introducing errors. The last issue is especially relevant because not even syntax errors in non-edited objects are necessarily detected; such errors may only surface at runtime or during the next transport to another SAP system. All this makes duplicating ABAP code more »productive« during the initial development, but it will hinder maintenance as in any other programming language.

Constraints in the Development Process

The shortcomings of the ABAP IDEs are obvious reasons for duplicated code. More surprising, but with even more impact, are constraints in the development process. When we discuss duplicated ABAP code with developers, it is often justified by restrictions on the development scope: assume program Z_OLD was copied to Z_NEW instead of extracting the common functionality and re-using it from both programs. Sometimes the development team copied the program because they were not allowed to alter Z_OLD, since the change request was bound to specific development objects or packages. The reason for such restrictions is an organizational structure where the business departments »own« the respective programs and every department fears that changes initiated by others could influence their specific functionality.

A similar situation arises when changes to existing code are avoided to save manual test effort in the business departments. Especially if the change request for Z_NEW was issued by a different department, the owners of Z_OLD may refuse to test it. (Maybe they wouldn’t if tests were automated. Having only manual tests is not the best idea.)

Dependency Fear

Not specific to ABAP, but more widespread here, is the fear of introducing dependencies between different functionalities, especially if these are only loosely related. Independent code or programs are often seen as a benefit, since a modification of the code is always local to one instance and does not influence other parts. It is hard to say why this fear is more common in the ABAP world. One reason is the organization of the development process mentioned above. Another reason may be the lack of continuous integration, where the whole code base is built automatically. The lack of automated testing might be the major reason: whereas substantial suites of automated unit tests are the rule in Java or C# projects, ABAP Unit tests are not that widespread.

No matter what the reason for this fear of dependencies is, it rests on the assumption that future changes to one copy should not affect the other copies. But in many cases the opposite is true! Cloning makes the code independent, but not the functionality, which will still be a similar thing. Thus the independence is only apparent. Yes, there might be cases where a future change should affect only one of many copies. But very often a change should be applied to all occurrences of the related functionality. Consider bug fixes, for example: in general, these must be made in all copies. We’ve observed the same change being made in two copies under two different change requests (where the second change was made some time later). This almost doubles the maintenance effort without any need.

Can we Avoid Cloning in ABAP?

Yes, I’m sure cloning can be avoided, as in any other programming language. Although many ABAP projects show a strong tendency towards cloning, we’ve also seen counter-examples with only a few clones. It is possible to have a code base with many hundreds of thousands of lines of ABAP code and keep the clone coverage low. From the reasons for intensive ABAP cloning discussed above, we can derive these recommendations to avoid it:

  • Discourage copy-and-paste programming and encourage your developers to avoid duplication and restructure existing code instead. Accept that this is a bit more time-consuming in the beginning.
  • Make intensive use of common code and utilities, which are intended to be used by several programs. This code should be clustered in separate packages.
  • The development team should be the owner of the code, not the business departments, at least for common functionality. The developers should be free to restructure code if it is worthwhile for technical reasons. Keeping the code base maintainable is a software engineering task which can hardly be addressed by the business departments.
  • Make use of test automation, e.g. using ABAP Unit, and execute all of these tests at least once a day. Many regression errors could be detected this way.
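
To give an impression of what the last recommendation can look like, here is a minimal sketch of a unit test for a shared utility. The utility class lcl_price_util and its method gross are hypothetical and use integer amounts only to keep the example short; cl_abap_unit_assert is the standard ABAP Unit assertion class. How the daily execution is scheduled is not shown here.

```abap
REPORT z_abap_unit_demo.

" Hypothetical shared utility used by several programs.
CLASS lcl_price_util DEFINITION FINAL.
  PUBLIC SECTION.
    CLASS-METHODS gross
      IMPORTING iv_net          TYPE i
                iv_tax_percent  TYPE i
      RETURNING VALUE(rv_gross) TYPE i.
ENDCLASS.

CLASS lcl_price_util IMPLEMENTATION.
  METHOD gross.
    " Integer arithmetic, chosen only for brevity of the sketch.
    rv_gross = iv_net + iv_net * iv_tax_percent / 100.
  ENDMETHOD.
ENDCLASS.

" Local test class; the test method checks one known input/output pair.
CLASS ltc_price_util DEFINITION FINAL FOR TESTING
  DURATION SHORT
  RISK LEVEL HARMLESS.
  PRIVATE SECTION.
    METHODS adds_tax_to_net_amount FOR TESTING.
ENDCLASS.

CLASS ltc_price_util IMPLEMENTATION.
  METHOD adds_tax_to_net_amount.
    cl_abap_unit_assert=>assert_equals(
      act = lcl_price_util=>gross( iv_net = 100 iv_tax_percent = 19 )
      exp = 119 ).
  ENDMETHOD.
ENDCLASS.
```

With tests like this in place, a change to the shared utility made for one program can be checked against the expectations of all other callers without manual test effort.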

If these conditions are met, ABAP code can also be largely free of redundancies. Of course, you should additionally introduce appropriate quality assurance to keep your code base clean. This could be done either by code reviews or by static analysis. More about how to deal with clones can be found in part 2 of Benjamin’s posts on cloning.