Android Code Quality: Redundancy
Dr. Nils Göde
This is the second part of our quality audit of the Android core component’s source code. In my previous post, we looked at the structure of the code. In this post, we analyze the redundancy found in the code. Redundant code fragments, so-called clones, cause a variety of problems: the system is larger than it needs to be, defects are duplicated, changes have to be made multiple times, and individual copies may be overlooked when a bug is fixed (this is not a myth; many clone-related bugs have already been found in production software). Consequently, it is advisable to keep the redundancy as low as possible.
Teamscale implements a sophisticated clone detection algorithm that finds similar statement sequences in the source code. Similar statement sequences have to be at least 7 statements long to be considered clones and, therefore, quality deficits. The clone detection algorithm tolerates a certain amount of difference: for example, identifiers and literals do not have to be identical for statement sequences to be considered clones. For more information on the clone detection, please refer to the corresponding publication.
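To make the idea of tolerating different identifiers and literals more concrete, here is a minimal Python sketch. It is not Teamscale's actual algorithm, only an illustration under simplifying assumptions: statements are given as plain strings, a small hypothetical keyword set stands in for a real lexer, and only windows of exactly the minimum length are compared. The names normalize, find_clone_windows, KEYWORDS, and TOKEN_RE are made up for this example.

```python
import re
from collections import defaultdict

# Hypothetical keyword set; a real detector would use a language-specific lexer.
KEYWORDS = {"if", "else", "for", "while", "return", "int", "void", "const"}

# Tokens: identifiers/keywords, numbers, double-quoted strings, or any other single character.
TOKEN_RE = re.compile(r'[A-Za-z_]\w*|\d+(?:\.\d+)?|"[^"]*"|\S')


def normalize(statement):
    """Replace identifiers with ID and literals with LIT; keep keywords and punctuation."""
    out = []
    for tok in TOKEN_RE.findall(statement):
        if tok in KEYWORDS:
            out.append(tok)
        elif re.fullmatch(r"[A-Za-z_]\w*", tok):
            out.append("ID")
        elif tok[0].isdigit() or tok[0] == '"':
            out.append("LIT")
        else:
            out.append(tok)
    return " ".join(out)


def find_clone_windows(statements, min_length=7):
    """Group start indices of min_length-statement windows that are equal after normalization."""
    normalized = [normalize(s) for s in statements]
    windows = defaultdict(list)
    for start in range(len(normalized) - min_length + 1):
        key = tuple(normalized[start:start + min_length])
        windows[key].append(start)
    # Only windows that occur more than once are clone candidates.
    return [starts for starts in windows.values() if len(starts) > 1]
```

With this sketch, the sequences "int a = 1; a = a + 2; return a;" and "int b = 5; b = b + 7; return b;" normalize to the same token sequence and would be reported as a clone pair if the minimum length were lowered to 3. A production detector would additionally merge adjacent windows into maximal clones and handle overlapping matches.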
To measure the redundancy inside a system, we use the clone coverage, which is defined as the percentage of code that is part of at least one clone. The clone coverage can be interpreted as the probability that a line taken randomly from the system is cloned. This is a rough estimate of how often developers are confronted with cloned code, and the corresponding problems, during the maintenance of the code.
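The definition can be sketched in a few lines of Python. This is an illustration only, assuming that clones are reported as 1-based, inclusive line ranges; the function name clone_coverage and its parameters are hypothetical and not part of Teamscale.

```python
def clone_coverage(total_lines, clone_ranges):
    """Return the percentage of lines that belong to at least one clone.

    clone_ranges is an iterable of (first_line, last_line) pairs,
    1-based and inclusive. Overlapping ranges are counted only once.
    """
    if total_lines <= 0:
        return 0.0
    covered = set()
    for first, last in clone_ranges:
        # Clamp each range to the valid line numbers of the file.
        covered.update(range(max(first, 1), min(last, total_lines) + 1))
    return 100.0 * len(covered) / total_lines
```

For example, two overlapping clones covering lines 1 to 10 and 5 to 14 of a 100-line file cover 14 distinct lines, so the clone coverage is 14%. Using a set makes sure that a line belonging to several clones is counted only once, which matches the "part of at least one clone" wording of the definition.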
Clones in Android
The following screenshot taken from Teamscale gives you an impression of what clones look like at the code level. The compare view shows two identical code fragments (except for some spelling mistakes in the comments) from
Although most people would regard the above inconsistency as not problematic, it already demonstrates that clones tend to diverge over time. And this is not limited to comments. The next screenshot shows an inconsistency in cloned code fragments in two different versions of VectorImpl.cpp. The left version of the function replaceAt has an additional check for the validity of the index that is missing on the right side. Furthermore, an additional check for item != prototype exists, presumably a performance tweak. Without further knowledge of the code it appears that the additional statements should also be included on the right side because the functions (and large parts of the whole files) are otherwise identical.
All in all, the Android source code has a clone coverage of 14.1%. The following tree map shows how the clones are distributed across the system. Each rectangle corresponds to a single source file. The size of the rectangle depends on the lines of code in that file. The tree map is hierarchically organized—files close together in the directory tree are also close together in this map. The color is used to indicate the clone coverage of each individual file. White rectangles represent files with no clones at all, whereas the source code in red files is mostly covered with clones.
As an example, the two larger files with a lot of clones at the bottom center of the map are the files
libcorkscrew/arch-x86/backtrace-x86.c with a clone coverage of 78.8% and 87.4% respectively. In this case, the high redundancy is not surprising as the names already suggest a high level of similarity. Likewise, the files
include/utils/List.h are almost completely identical. Please note that although the similarity can be guessed from the name, the files still suffer from clone-related problems.
The following figure shows how Android compares to other systems in our CQSE benchmark with respect to the clone coverage. The other systems are also written in C++ and clone detection has been done using the same configuration.
As the figure shows, Android has an average clone coverage. It does significantly better than Mozilla Firefox but does not reach the level of TortoiseSVN, which we know from an earlier study does well in terms of clones. The systems A to D are systems from our customers. Two of them achieve a lower clone coverage than Android, whereas the other two have a considerably higher clone coverage.
All in all, the redundancy in Android is not critical, but there are certainly clones that one should be aware of and that could be removed. We have found some inconsistencies that appear to be unintentional and that should be reviewed. During our quality audits, we have seen a number of systems with higher redundancy than Android, but we have also seen systems that do better. Consequently, the Android code is of medium quality with respect to redundancy.
This was the second part of our Android code quality audit. So far we have looked at the structure and the redundancy, but there are other factors (e.g., code anomalies, error handling, and documentation) that also contribute to the overall quality assessment. So stay tuned for the continuation of this audit.