As we’re on the way to finally creating coding guidelines for GNU Radio (see GREP1, also see the original mailing list announcement) I had to stop and think about how useful and underappreciated coding guidelines are. When I first started learning Python, one of the things that people would often comment on was the fact that the formatting was built into the language, unlike languages like C/C++ where whitespace is technically speaking optional. Some considered it an infringement on their coding freedom, but what has happened is that Python has become one of the most readable programming languages out there. Since code is famously read more often than it is written, this is a remarkable achievement.
Many people have thought about coding guidelines. Wikipedia has its own page on coding conventions (linked to other pages for indent styles, programming style, etc.), and books have been written about them.
The following article aims to help with adding coding guidelines to your own (open source) project.
Why have coding guidelines?
So many people have spoken and written about it that it’s almost a waste of bandwidth to repeat the arguments here. Kate Gregory’s talk on the C++ Core Guidelines had some great thoughts that are worth mentioning here; one of her main points is that the reason to have and follow coding guidelines is to skip unnecessary discussion. Do we do X or Y? Well, do the coding guidelines have a recommendation? Yes? Well, end of discussion.
Ultimately, this comes down to efficiency. Code reviews for instance become much easier when certain things are simply already decided and taken care of. In free software projects, saving time on code reviews is especially important, because as a project maintainer, you’re always scrabbling for people’s time and you want it used as efficiently as possible.
What’s out there?
When developing coding guidelines for your own project, it helps to take a look at what has been developed in the past. First, it’s sensible to choose coding guidelines that are in line with other, common, coding guidelines since you want to encourage people to switch between projects. Second, there are many nuggets of wisdom in these guidelines that might well be applicable to your project.
The list here was mostly compiled by web searches, but the final list has an interesting mix of types of coding standards. They include formatting-related standards, revision-control-related standards, coding paradigms, and other types of guidelines.
PEP8 is probably one of the most famous coding guidelines out there. It is typically considered the coding guideline for Python, and most linters for Python (e.g., Pylint) use it as the source for their convention checks, as opposed to plain warnings or errors. However, Python has a few other PEPs that talk about conventions, such as PEP257 (docstring conventions) and even PEP20 (the Zen of Python).
As coding guidelines go, it’s one of my favourites. Here’s why:
- It’s concise. Too long, and coding guidelines become no longer useful.
- Rationales are provided.
- The tone is very friendly.
- It emphasizes that ultimately, the coder’s judgement is more important than a coding guideline.
- Because it’s a PEP, there’s a clear path forward for suggesting changes, and changes are not that uncommon.
- There are formatting guidelines, but also other coding guidelines (best practices).
Python has benefited a lot from coding guidelines, but they don’t feel forced upon the community.
The C++ Core Guidelines are a very different kind of beast. There are no formatting suggestions here, but a lot of good advice on which path to choose in the complexity that C++ provides. Kate Gregory’s talk is actually a good introduction to the Core Guidelines.
Again, these have a lot of good things:
- Every guideline is split into reason, example, and notes. I’ve never actually read the full document end to end, but so far, the reasons all make sense to me.
- There’s also a section called ‘Enforcement’ for every rule. This is pretty nice, because it makes all the guidelines immediately actionable.
- It’s also tracked on GitHub, and there is a clear way to suggest changes.
- There is an index, and it’s well sorted. Despite its length, I usually find items quickly.
There are a couple of things that I’m not a fan of:
- The document itself says it’s messy, and that takes away some of its authority.
- Many items are missing examples and have just a ??? marker instead. It doesn’t seem that hard to add examples, especially given the Core Guidelines’ attempt to keep things actionable.
C++ is so complex, and so many people use it in different ways, that this is exactly the kind of guidelines we need. Formatting rules wouldn’t be accepted in the C++ world (although that’s a shame), but reducing complexity simply by having everyone do the same thing is a great thing to have.
In this list, this is the first project-specific standard. Project guidelines are different from language-specific guidelines, because they can say “there are many ways of doing X, but in this project, we do it this way”.
The guidelines are detailed, but not too long – a great plus. There are examples and rationales provided. The document is well formatted and sorted, easy to browse, and provides plenty of links to external docs where necessary.
As an outsider to the project, it’s not clear how to bring up issues with the guidelines, other than to “email Chris” (who’s Chris?). I assume that the mailing list is an OK place to go and ask.
These guidelines are split into four sections: Style Guidelines, Git Workflow, Source Code Control, and Best Practices. This resonates a lot with me: All major questions for contributing code (other than legal issues) are covered this way. It’s not clear how to contribute to these guidelines, but they seem to get updated.
If you’re in the automotive industry, you’re using MISRA-C. If you haven’t heard of MISRA-C, it’s a set of rules for writing C code that came out of the automotive industry’s endeavour to minimize mistakes in C code.
The big minus for MISRA-C is that the standard itself is proprietary, which means I can’t link to it. An embedded.com article discusses it without any specifics, but mentions there are 141 rules. Given these rules as a constraint for the C language, you can think of MISRA-C as a subset of C.
For open source projects, MISRA-C would never be accepted by most communities. Aside from the fact that the standard is proprietary, MISRA-C is very restrictive. However, there are some upsides: most rules can be checked automatically by static code analysis tools, and can thus be enforced very easily.
When I first saw the Google coding guidelines, I thought ooof… this is a lot of text. However, the guidelines are fairly lean.
One thing to keep in mind is that Google has a lot of code. Even a heavily used project like LLVM is a teeny-tiny codebase compared to what Google has. Coming up with a single set of C++ guidelines for all of that is actually a remarkable feat.
The style guide itself has goals, and I just love that this is the first one: “Style rules should pull their weight”. This is a sign that the authors of the guidelines understand that a heavyweight document will simply be ignored. All the other goals are also really well thought out – as are the entire guidelines.
For adoption in other projects, I would not suggest using the Google guidelines. They have a lot more ground to cover, and if you can come up with a shorter document that suits your purpose, you should do that. But for inspiration, this is a great document.
The Linux kernel coding guidelines are very short, which seems unusual for a project of this magnitude. From what I can tell they grew organically from a short document that Linus wrote at some point in the past. That probably explains a lot of the language, which hasn’t aged well. Consider this statement: “Tabs are 8 characters, and thus indentations are also 8 characters. There are heretic movements that try to make indentations 4 (or even 2!) characters deep, and that is akin to trying to define the value of PI to be 3.” Yeah, it’s kind of funny, but this is a community where people already tend towards being zealous, and the wording could be interpreted as implying that the tab-indentation style is an objective truth instead of a choice the author made based on personal preference.
This is a somewhat interesting aspect: Consistency in formatting is very important, but it always goes back to preference. The original author of a software project will thus usually define the formatting guidelines, and now needs to convince new contributors to follow her lead. The trick is to provide rationales, involve the community, and use non-aggressive language, as much as you’d like to make your coding guidelines sound funny.
The kernel coding guidelines are not actually fully defined by this short document. Various subsystems have their own guidelines, and most of this is enforced by scripts. There are other guidelines for patch submission etc.
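A minimal sketch of what code in this style looks like, for readers who have not seen it. The function `count_digits` is made up for illustration and is not kernel code. The points to notice are the tab indentation (meant to be displayed 8 columns wide), the opening brace on the same line for loops and conditionals but on its own line for functions, and the absence of braces around a single-statement branch:

```c
/* Hypothetical example function; only the layout matters here. */
int count_digits(unsigned int n, unsigned int base)
{
	int digits = 1;

	/* Single-statement branch: no braces. */
	if (base < 2)
		return -1;

	/* Multi-statement body: opening brace stays on the same line. */
	while (n >= base) {
		n /= base;
		digits++;
	}

	return digits;
}
```

Whether you like the look or not, there is very little room for interpretation, which is the whole point of a formatting rule.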
The GNU coding standards are sometimes referred to as a bad example of coding guidelines, and I tend to agree. There’s a lot that’s wrong with them, so let’s use them as an example of what not to do in coding guidelines.
First, this document is pretty long. Not packed, like the Google guidelines, but more like someone started rambling. It’s definitely not the case that someone went through the guidelines and thought hard about what could be left out.
Next, it reads like it hasn’t been conceptually updated since the early nineties. Consider this: “C++ is ok [sic] too, but please don’t make heavy use of templates.” While it is true that heavy use of templates increases your compilation time, that’s simply not a good reason to avoid one of the main killer features of C++. It also includes a section on pre-standard C. If a project really needs that, it should include this section itself instead of having it in the general GNU coding standards.
Of course, the GNU coding standards are very particular about legal issues, and that’s what we expect from GNU standards. However, some of the language goes overboard, e.g.: “Please don’t use “win” as an abbreviation for Microsoft Windows in GNU software or documentation. In hacker terminology, calling something a “win” is a form of praise. If you wish to praise Microsoft Windows when speaking on your own, by all means do so, but not in GNU software.” What this means is you can’t call a variable win_version or something like that. That’s just micromanaging project maintainers.
The GNU indentation style is the only style where I think we’ve left the realm of personal preference. This is considered correct indentation:
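A minimal sketch of that style, written for illustration rather than quoted from the standards. The function `count_digits` is made up and only the layout matters: the return type sits on its own line, there is a space before every opening parenthesis, and braces get their own lines, indented two columns, with the body indented another two:

```c
#include <assert.h>

/* Hypothetical example function: counts the decimal digits of a
   non-negative integer.  Only the layout matters here.  */
int
count_digits (int n)
{
  int digits = 1;

  assert (n >= 0);

  if (n < 10)
    return digits;
  else
    {
      while (n >= 10)
        {
          n /= 10;
          digits++;
        }
      return digits;
    }
}
```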
First of all, this is impossible to consistently get right without crazy editor plugins. If you’ve ever had to edit code in nano over a slow ssh connection, you will simply not code in this style. Then, the placement of the brackets is mind-boggling, and the fact that the first if statement does not have brackets is simply inconsistent (note that the Linux kernel for example prefers no brackets on single-line if statements, but does prefer consistent bracing on the if and else sections).
A good reason to assume the GNU coding standards are problematic is that I don’t know many projects that actually follow them. Emacs, probably. But GNU Radio, gcc, and GNU MediaGoblin don’t seem to. Octave looks like it does.
On the plus side, the standards themselves explain how to suggest changes (you write an email to a sensible address), and they are nicely formatted, which makes searching easy.
OK, a C++ standard for air force planes. Makes you think of MISRA-C: Safety first. In fact, MISRA is mentioned in the introduction.
Let’s start with the obvious: This document is a 141-page PDF. This makes it great to print, put in a folder… and forget about it while it’s standing on the shelf behind you. Like MISRA, I doubt you could get an open source project to adopt those.
But this is Lockheed Martin that’s published these, and they can employ a code-guidelines supervisor for every software developer they have. I’m sure a lot of reports are written, signed, and counter-signed for standards compliance only.
Because I can actually link to this document, this is a good place to talk about things that are also in MISRA-C. An interesting thing is that it starts with rules on how to use this document: Before laying out the coding guidelines, it explains what kinds of rules there are, and how they must be treated, for example: “Shall rules are mandatory requirements. They must be followed and they require verification (either automatic or manual).” There are other types of rules (“should” and “will”). The first actual rules from the guidelines are about the situations in which you have to stick to the other rules and those in which you may break them, and what procedure needs to be followed for breaking them.
The actual coding rules include rationales, but these often don’t actually give reasons. See this:
AV Rule 8 All code shall conform to ISO/IEC 14882:2002(E) standard C++. Rationale: ISO/IEC 14882 is the international standard that defines the C++ programming language. Thus all code shall be well-defined with respect to ISO/IEC 14882. Any language extensions or variations from ISO/IEC 14882 shall not be allowed.
This isn’t really an explanation, it’s more of a longer version of the original rule. There are plenty of those.
A similar set of rules is the JPL Institutional Coding Standard for the C Programming Language. It’s much shorter, still a PDF, and has longer and better rationales. Note that both standards are meant for extremely safety-critical environments (air- and spacecraft), so being very conservative in your coding standards is to be expected.
C# is not on my typical list of languages, and I found these while looking up the Mono standards. This document is simply odd: First of all, it looks like a blog post (which I think it is) and it is not clear at all what I’m supposed to do with it. Are these rules, guidelines? Are they used at Microsoft? And why is there a comments section below it? Is this how you suggest changes? Does that even make sense?
Let’s be honest: I’m not a fan of m-script. After having used it extensively for many years, I just can’t stand the syntax any more. I do believe it encourages spaghetti code (it certainly does not encourage tidy code and good coding practices). Having guidelines for Matlab thus seems like a good idea.
I was actually surprised to find any Matlab coding guidelines. These seem somewhat popular, based on a limited web search.
Again, these seem to be published as a PDF only, and I really don’t think that’s a good idea. That might have to do with the fact that they’re published on the MathWorks File Exchange, which may not allow other file types; I just don’t know.
This particular document however probably won’t solve badly written Matlab code, even if it strives to do exactly that (“Avoid cryptic code”). It simply focuses on the wrong things. A lot of attention is put on naming (personally, I think you should be very, very conservative with naming rules in coding conventions). Nothing is said about typical Matlab constructs that can be used instead of loops (such as initializing arrays).
Involve the Community
Whatever you do for your project, involve the people you are addressing with your guidelines. Once again, PEP8 does a great job here: it declares how you can contribute to the guidelines themselves, and it takes in a lot of contributions on a regular basis.
Of course, it’s very likely that people will disagree on stylistic choices, and at that point, someone needs to step in and make a call. But that’s exactly why we need coding guidelines in the first place. Every sensible developer will accept a justified decision, and people who don’t are probably unreasonable anyway, and there’s no reason to cater to them. That said, it never hurts to be polite. Get input from the community, resolve conflicting opinions in a transparent fashion, ignore snarky comments, and publish the result in an easy-to-find spot.
Coding guidelines are great. Your project should have them. Take a look at what’s out there. It’ll be worth it.