Apple's #gotofail bug: The Code Quality Perspective

On Friday, February 21st Apple published an update for iOS that fixed a serious security issue. What makes this issue interesting, is not only its severity but also the fact that the issue can be nicely pinned down two a single line of code. Conveniently, this code is open-source and available for analysis! In this post I’ll explain why this major security issue is, after all, the result of a number of quality issues, which are often undervalued as minor flaws.

The Bug

Since Apple’s update on Friday the bug has been discussed on various blogs, websites and tweets. A good discussion of the bug can be found in Adam Langley’s blog. Interestingly, even mainstream media websites like the (German) Süddeutsche Zeitung and Der Spiegel try to explain this bug to their readers on the code level.

As the figure below shows, the bug is essentially caused by a superfluous goto fail; statement (code from Apple). This statement is not part of the if-block and, hence, always executed. Consequently, the following checks are not executed and the function will return a successful value, as err contains a successful value from the previous call to SSLHashSHA1.update().

Source code

The Analysis

Several violations of good coding practices can be nicely demonstrated using this bug:

  1. Missing Braces: Many coding guidelines enforce the usage of braces for all statements following an if (or for or while or …) statement, even if it is just a single statement. If the two goto fail; statements where properly placed in braces, one would still be superfluous (and proper languages would even prohibit unreachable code like this) but it would create no harm. If one of the statements was within the braces and one was not, the code would still malfunction but the fault could be identified a lot easier.
  2. Formatting: If an automatic code formatter would have been used, the odd role of the second goto fail; would have been a lot more obvious through the different indentation of the the two statements. This is also a problem in other places in the affected source file. For example, in the code shown below, one would expect the switch statement to be executed only if err is true. In fact, it does not have any relationship to the if statement. Also, the case statements that are syntactically on the same level have varying indents. Source code

  3. Naming: The jump label has the highly misleading name fail. When skimming the code, one would expect that a jump to fail will cause the entire function to fail, i.e., return unsuccessfully. In reality, however, the code after the jump label is totally agnostic about the success or failure of the function and is executed in all cases.
  4. Goto: The use of the goto statement itself has long been considered problematic as its usage tends to create code that is hard to maintain, test and debug. Looking at the whole file, one can see that Apple uses the goto statement following a specific pattern; presumably to group a function’s clean-up code in a single place. This is not very elegant and probably could have been solved by introducing a cleanup() function. However, this would not have prevented the bug as two return cleanup(); statements would cause the same behavior as two goto fail; statements. To summarize, goto statements are often harmful but not particularly so in this case.

What is highly interesting about these aspects is that most of them are often considered issues that just affect software maintainability and not its functionality. Missing braces, bad formatting and naming, as well as the existence of goto statements, do not automatically make a system fail and are often considered minor flaws. The goto fail; example nicely illustrates how such a minor flaw (or a combination of many of them) can quickly turn into a major problem! Next time you are about to brush aside findings about the maintainability of your system, think twice.