Save the Semicolon

These days, it seems that a modern programming language is required to either drop semicolons entirely or at least make them optional. While this might be a good trend from the perspective of a keyboard vendor (less wear on the single semicolon key), from a code quality perspective it does not look like progress at all.

Semicolon, anyone?

Looking at the programming languages we encounter these days, it seems that you can estimate the age of a language by looking at its statement separator. Slightly aged languages (some would call them business standards), such as C/C++, Java, and C#, require you to put a semicolon at the end of every statement. Very old languages, such as COBOL or ABAP, even require something different, such as a period.

Languages that feel more modern, like Python, discard the semicolon in favor of the line break, although semicolons still have to be used when writing multiple statements on the same line (which is not common in Python or most other languages). In JavaScript (which is not really that new), semicolons are also optional if the statement ends at the line break. Actually, the exact rules for ending a statement in JavaScript contain some special cases that are not easy for beginners to understand, and thus many guidelines recommend using semicolons consistently (and not breaking statements in the wrong places). For me, the most extreme case is Xtend, which basically ends the statement as soon as the parser is no longer able to continue parsing it. You can actually write entire programs on a single line without any semicolon. The compiler will handle this, but I’m pretty sure that I don’t want to read the following (slightly misformatted) example.

Interestingly, many new languages follow the optional-semicolon road. Apple’s language Swift makes the semicolon optional. And Kotlin advertises this on its front page as one of the prime features of the language: »Semicolons are optional«.

Semicolon, why not?

Actually, I do not see how adding a semicolon to the end of every statement can be harmful. Maybe I’m just old-fashioned here, but I like to tell the compiler that I expect the statement to end here. If the compiler thinks differently, it can tell me very clearly and I can fix it. In some of the languages with optional semicolons, such misunderstandings can only be found by testing, and testing should focus on functionality, not on language constructs.
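To illustrate the kind of misunderstanding meant here, the following sketch shows two well-known cases of JavaScript's automatic semicolon insertion. It is written in TypeScript and evaluates small JavaScript snippets from strings, so the line breaks are explicit; `runSnippet` is a hypothetical helper invented for this illustration, not part of any library.

```typescript
// Runs a string as the body of a JavaScript function and returns its result.
// `runSnippet` is a hypothetical helper for this illustration only.
function runSnippet(body: string): unknown {
  return new Function(body)();
}

// Case 1: a line break directly after `return` ends the statement,
// so the value on the next line is never returned.
const returnWithBreak = runSnippet("return\n  42");
const returnOnOneLine = runSnippet("return 42;");

// Case 2: a line starting with `[` continues the previous statement,
// so `list` followed by `[1]` is read as the index expression `list[1]`.
const withoutSemicolons = runSnippet(
  "const list = [10, 20, 30]\n" +
  "const copy = list\n" +
  "[1].toString()\n" +
  "return copy"
);
const withSemicolons = runSnippet(
  "const list = [10, 20, 30];\n" +
  "const copy = list;\n" +
  "[1].toString();\n" +
  "return copy;"
);

console.log(returnWithBreak);   // undefined, not 42
console.log(returnOnOneLine);   // 42
console.log(withoutSemicolons); // the string "20", not the array
console.log(withSemicolons);    // the array with 10, 20, 30
```

In both cases the code without semicolons is accepted without complaint, and only the result at runtime reveals that the statement boundaries were not where the author expected them.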

But I’m also a bit biased, as I develop code analysis solutions. When building a parser that has to deal with multiple versions of a language and should be robust enough not to choke on the first piece of non-compiling code, the semicolon is a blessing. Any time your parser is stuck, just find the next semicolon and continue from there. True, when building a compiler you are stuck in any case. But for all the tools we rely on so much these days, such as syntax-aware editors, refactoring tools, and quality analysis tools, being able to easily build a robust parser is a huge advantage.
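The "find the next semicolon and continue" strategy can be sketched in a few lines of TypeScript. This is a minimal illustration under stated assumptions, not how any particular analysis tool works: the toy grammar (`identifier = number;`), the function `parseWithRecovery`, and the `ParseResult` type are all invented for this example.

```typescript
interface ParseResult {
  statements: string[]; // successfully parsed statements, normalized
  errors: string[];     // one message per skipped statement
}

// Toy grammar (an assumption for this sketch):
//   statement := identifier "=" number ";"
function parseWithRecovery(source: string): ParseResult {
  // Split into identifiers, numbers, and single non-space characters.
  const tokens: string[] = source.match(/[A-Za-z_]\w*|\d+|\S/g) ?? [];
  const at = (i: number): string => tokens[i] ?? "";
  const result: ParseResult = { statements: [], errors: [] };

  let pos = 0;
  while (pos < tokens.length) {
    const wellFormed =
      /^[A-Za-z_]\w*$/.test(at(pos)) &&
      at(pos + 1) === "=" &&
      /^\d+$/.test(at(pos + 2)) &&
      at(pos + 3) === ";";

    if (wellFormed) {
      result.statements.push(`${at(pos)} = ${at(pos + 2)}`);
      pos += 4;
      continue;
    }

    // The parser is stuck: report the problem, then resynchronize by
    // skipping everything up to and including the next semicolon.
    result.errors.push(`malformed statement starting at token ${pos} ("${at(pos)}")`);
    const nextSemicolon = tokens.indexOf(";", pos);
    pos = nextSemicolon === -1 ? tokens.length : nextSemicolon + 1;
  }
  return result;
}

// The broken middle statement is reported, the other two are still parsed.
const recovered = parseWithRecovery("x = 1; y = = 2; z = 3;");
console.log(recovered.statements); // the entries "x = 1" and "z = 3"
console.log(recovered.errors);     // one message, for the statement starting at "y"
```

Without an explicit separator, the recovery step would have to guess where the broken statement ends, which is exactly where language-specific rules creep back in.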

Regarding code quality, I also strongly believe that the semicolon (or any other explicit statement separator) helps with readability, as it clearly communicates the end of the statement. Without it, I have to think about the implicit separation rules of the language, which becomes harder when I switch between languages a lot in my projects. The typical answer is to rely on nice formatting (ideally provided by your IDE), but if you require nice formatting, why not also require the semicolon (or let your IDE insert it, so that readers with simple editors can benefit from it)?

Semicolon, oh yes!

For me, the main problem with the optional semicolon is the inconsistency it leads to. Most larger projects involving multiple programmers attempt to enforce a consistent coding style. Rules for indentation and formatting, naming of variables and functions, and commenting are defined (and ideally enforced via corresponding tools). The goal is to make all code in the project look the same, so that every team member can easily jump between code files. Making semicolons optional just opens up one more variation that has to be compensated for by guidelines and tools. For example, almost every JavaScript checker contains a rule for finding “missing” semicolons. Why not require the semicolon in the language and make the compiler handle this?
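As one example of such a checker rule, ESLint ships a `semi` rule that reports statements without a terminating semicolon. The fragment below is a minimal sketch of how a project might switch it on, assuming ESLint's flat configuration format and a file named `eslint.config.js`; the variable name `eslintConfig` is only for illustration. Note that recent ESLint releases mark `semi` as deprecated in favor of a separate stylistic plugin, so the details depend on the version in use.

```typescript
// Sketch of an eslint.config.js (flat config); file name and layout are assumptions.
const eslintConfig = [
  {
    rules: {
      // Report every statement that is not terminated by an explicit semicolon.
      semi: ["error", "always"],
    },
  },
];

export default eslintConfig;
```

This is exactly the kind of extra guideline and tooling that a mandatory semicolon would make unnecessary.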

Given all these issues, and seeing no drawbacks to explicit statement separators, I strongly prefer languages that require the trusty old semicolon. But of course this is also a matter of personal taste, so I’m interested in hearing your opinion in the comments.