Inspecting Maintainability of an Open-Source Fortran Library

In this blog post I’d like to give a few insights into how to inspect the maintainability of code written in Fortran, using a popular open-source library as an example.

As with other programming languages, our recommendations for a maintainable Fortran codebase include 1) use a modern IDE, e.g., Eclipse with Fortran Development Tools, 2) follow a set of consistent coding guidelines, and 3) use a linter that handles Fortran code to highlight code-styling issues, e.g., Fortlint. Apart from these general recommendations, this blog post describes how to set up our tool Teamscale to analyse the Fortran project Flibs. It also gives an overview of the functionality of this Fortran library and shows some examples of maintainability findings.

A bit of historical context on the Fortran language

In 1953, John Backus aimed to develop a practical alternative to assembly language for programming IBM mainframe computers. The first Fortran compiler was available in the late fifties. Even though the community was initially skeptical that high-level programming languages like Fortran could provide performance equivalent to manually optimized assembly code, it adopted Fortran with its set of operations that supported scientific computing. According to the fortran90.org website, »Fortran is built from the ground up to translate mathematics into simple, readable, and fast code – straightforwardly maintainable by the gamut of mathematicians, scientists, and engineers who actually produce/apply that mathematics. If, however, mathematics is not the main task, then almost certainly C/C++, or a host of other more general-purpose languages, will be much better.« [1]

Guidelines for writing Fortran programs that are readable and maintainable were available as early as 1982, when the Fortran language already had a history of more than 20 years [2]. This early document lists, for example, Fortran statements that »must not be used under any circumstance« and statements »which should be used only when necessary«. The statements discouraged in the 1982 document include some that were naturally useful in the late fifties, such as punching cards (keyword PUNCH) or giving branch probabilities to assist the compiler in generating optimized code (keyword FREQUENCY). In addition, the coding guidelines suggest a limit of »no more than 50 executable statements« per function and state that loops »should not be more than four deep«. Although these guidelines are more than 30 years old, they are still applicable today, both to Fortran and to newer programming languages.
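To make the loop-depth guideline concrete, here is a minimal Python sketch that measures the deepest DO-loop nesting in a piece of Fortran source. It is an illustration only, not how Teamscale or any linter implements the check. It assumes free-form source with block DO constructs closed by `end do` or `enddo`, and it treats everything after the first `!` on a line as a comment. The function name `max_loop_depth` and the constant `MAX_DEPTH` are my own.

```python
import re

# Start of a block DO construct, optionally with a construct name ("outer: do i = 1, n").
DO_START = re.compile(r"^\s*(?:\w+\s*:\s*)?do\b", re.IGNORECASE)
# End of a block DO construct: "end do" or "enddo".
DO_END = re.compile(r"^\s*end\s*do\b", re.IGNORECASE)

# Limit taken from the 1982 guideline: loops "should not be more than four deep".
MAX_DEPTH = 4


def max_loop_depth(source: str) -> int:
    """Return the deepest DO-loop nesting level found in the given source text."""
    depth = 0
    deepest = 0
    for line in source.splitlines():
        # Simplification: cut at the first '!' (this also cuts '!' inside string literals).
        code = line.split("!", 1)[0]
        if DO_END.match(code):
            depth = max(depth - 1, 0)
        elif DO_START.match(code):
            depth += 1
            deepest = max(deepest, depth)
    return deepest


def violates_depth_guideline(source: str) -> bool:
    """True if the source nests loops deeper than MAX_DEPTH."""
    return max_loop_depth(source) > MAX_DEPTH
```

Label-terminated loops (`do 10 i = 1, n` ... `10 continue`) are not handled by this sketch; a real checker would need a proper Fortran parser.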

How to configure the analysis of a Fortran project in Teamscale

First, download and install Teamscale. If you have not used Teamscale before, I encourage you to apply for an evaluation license.

Configure a project in Teamscale that analyzes your code. As a case study, this blog post uses Fortran-Machine / Flibs, a well-known collection of Fortran modules and a trending project on GitHub.

To add a project in Teamscale that analyzes the Flibs code, click Projects to open the project management view, then New project. The Add Source Code Repository dialog and the Git dialog help specify the analysis details. See the following screenshot. I selected the project name FortranMachineMaster, the default analysis profile Fortran (default), and master as the branch to be analysed.

Clicking the (+) button allows adding an account for downloading code from GitHub. See the following screenshot, which configures an account with the id github and the URL https://github.com/. No username and password are needed in this case, as the Fortran-Machine / Flibs project is publicly available on GitHub. Given the project URL (https://github.com/mapmeld/fortran-machine), I used the path suffix mapmeld/fortran-machine.git. I included in the analysis the files from the folder flibs-0.9/flibs/src that have the extension f90 and clicked Create Project.

The next step is to let Teamscale finish the analysis of the history of the project. On my laptop, the analysis took approximately 10 seconds, with the current revision comprising approximately 20 thousand SLOC of Fortran code.
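If you want to check in advance which files such an include rule would pick up from a local clone, a small Python sketch like the following can help. It is not part of Teamscale and does not reproduce Teamscale's pattern syntax or its SLOC metric. The helper names `is_included`, `count_sloc` and `summarize` are made up for this illustration, and "SLOC" here simply means lines that are neither blank nor comment-only.

```python
from pathlib import Path, PurePosixPath

# Folder and extension taken from the project configuration described above.
INCLUDE_DIR = PurePosixPath("flibs-0.9/flibs/src")
INCLUDE_SUFFIX = ".f90"


def is_included(relative_path: str) -> bool:
    """True if a repository-relative path lies under INCLUDE_DIR and ends in .f90."""
    path = PurePosixPath(relative_path)
    return INCLUDE_DIR in path.parents and path.suffix == INCLUDE_SUFFIX


def count_sloc(text: str) -> int:
    """Count lines that are neither blank nor pure '!' comments (a rough approximation)."""
    count = 0
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("!"):
            count += 1
    return count


def summarize(repo_root: str) -> tuple[int, int]:
    """Return (number of included files, total approximate SLOC) for a local clone."""
    root = Path(repo_root)
    files = [
        p for p in root.rglob("*")
        if p.is_file() and is_included(p.relative_to(root).as_posix())
    ]
    return len(files), sum(count_sloc(p.read_text(errors="replace")) for p in files)
```

Calling `summarize("path/to/fortran-machine")` on a local clone (the path is a placeholder) gives a rough idea of how much code the analysis will cover.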

Before giving concrete examples of maintainability findings, it is useful to know a bit more about the Fortran modules that constitute the Flibs library.

What is Flibs

Flibs is a collection of Fortran modules. It is well documented, and the author’s detailed description is available on GitHub. The library mainly provides functionality that complements the Fortran API, plus tools for manipulating Fortran source code.

  • Extensions to the Fortran API:
    • dynamic generation of HTML webpages (folder cgi)
    • implementation of various data structures like linked-lists or hashtables (datastructures)
    • file manipulation (filedir, stream)
    • other operating-system-like functionality (ipc, osutils, platform)
    • database connection functionality (sqlite)
    • functionality to handle strings of dynamic length (strings)
    • some form of exception handling (tools/exceptions)
    • timer operations, manipulating dates and decimal arithmetic (computing)
  • Tools for manipulating Fortran source code:
    • instrument source code to log various actions (checking)
    • unit testing (funit)
    • parser generator (lemon)
    • a reporting module (reporting)
    • a code generator based on simple specifications (specs)
    • a preprocessor module (tools)
    • a generator of C wrappers from Fortran code (wrapper)

Examples of maintainability findings in Fortran code

The Teamscale project configured as above shows some interesting findings with regard to cloning and code structure. To view findings, click the Findings view in Teamscale. Teamscale groups findings for the Fortran code into the categories Code Anomalies, Code Duplication, Documentation, Formatting and Structure. See the following screenshot.

  • An example of cloning between cgi and datastructures: The file used in the implementation of the CGI interface (left-hand side) implements a dictionary based on a linked list, which is perhaps not very efficient. The file from the data structures folder (right-hand side) is a newer implementation of a dictionary based on a hash table. This is apparent from the documentation comments shown in both the left-hand and right-hand files. One would expect the newer dictionary implementation to be used throughout the codebase, rather than different versions being duplicated across it.

[Screenshot: clone example, cgi]

  • An example of cloning between strings and specs: The file from the strings folder (left-hand side) handles token delimiters (see lines 199-213), while the file used in the specification-based code generator (right-hand side) does not (see lines 175-176 for a commented-out attempt to handle these delimiters). The left-hand file, which handles token delimiters, has a todo comment ! TODO: delimiters!. From reviewing the code, it is unclear whether the implementation of token delimiters is complete and the todo comment should simply be removed. The maintainability of the code could also be improved by refactoring the long function next_token_gaps, which has relatively complex control flow.

[Screenshot: clone example]

  • The Flibs codebase contains a significant number of comments containing the TODO tag. For more effective planning, it is recommended to replace such code comments with issues in an issue tracker. One example is the issue tracker integrated in the GitHub UI [3]. For the current state of the codebase, Teamscale groups the findings corresponding to these specially formatted comments under the Documentation category. From a quick review of the corresponding findings, it looks like the Flibs comments serve at least three different purposes:

    1) Marking bugs to be fixed, e.g., ! TODO : this will bug if the path to the temporary directory is more than 200 characters ! in filedir folder.

    2) Describing features planned to be implemented, e.g., ! TODO: delimiters! in strings folder, ! TODO: kind in wrapper folder, !TODO: 3D, circles, spheres, on spherical surface in computing folder.

    3) Marking code whose maintainability should be improved, e.g., ! TODO: in separate function in wrapper folder.

    [Screenshot: task tag example]
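As a starting point for moving such comments into an issue tracker, the TODO comments can be collected with a few lines of Python. The sketch below is a rough approximation and not Teamscale's task-tag detection: it treats any `!` on a line as the start of a comment, and the function name `find_todos` is my own.

```python
import re

# A Fortran comment containing a TODO tag, e.g. "! TODO: kind" or "!TODO : ...".
# Simplification: any '!' on the line is taken as the start of a comment.
TODO_COMMENT = re.compile(r"!.*?\bTODO\b\s*:?\s*(?P<text>.*)", re.IGNORECASE)


def find_todos(source: str) -> list[tuple[int, str]]:
    """Return (line number, TODO text) pairs for every TODO comment in the source."""
    found = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = TODO_COMMENT.search(line)
        if match:
            found.append((number, match.group("text").strip()))
    return found


if __name__ == "__main__":
    # Small made-up snippet that reuses TODO texts quoted in this post.
    sample = "x = 1\n! TODO: delimiters!\ncall foo() ! TODO: in separate function\n"
    for number, text in find_todos(sample):
        print(f"line {number}: {text}")
```

Each collected entry could then become the title of an issue, with the file name and line number in the description, after which the comment can be removed from the code.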

Summary

In this blog post, I explained how to configure Teamscale to analyse an open-source library from GitHub and walked through some of the maintainability findings that Teamscale reports for Flibs, a Fortran library that is popular on GitHub.

Feel free to share your opinions by commenting on this article: I’d be very interested to know how you review and improve your Fortran code. If you are working on an open-source project and want to use Teamscale for your development, contact us at support@teamscale.com!

References

[1] What is the advantage of using Fortran, Ondřej Čertík, Fortran90 1.0 documentation.

[2] Guidelines for Coding Fortran Programs, John J. Cornyn, Numerical Modeling Division, Ocean Science and Technology Laboratory, July 1982.

[3] Mastering Issues, GitHub Guides.