The 10 Commandments of Writing Clean Code
Making your code production-ready one commandment at a time
We all want to write clean code, but there are many opinions on what constitutes “clean” code. Over the years, I have picked up a few rules of thumb that help me write cleaner, more organized code. Here are my 10 commandments of writing clean code.
Thou shalt write code for thy reader, not thyself.
The most important thing to remember when writing code is that you're writing it for other people to read. Sure, you can write code just for yourself if you want (and some people do this all the time), but if you're writing code that will be seen by others, then it is best to write it in a manner that makes sense to them.
Trust me. Your future self will also appreciate this.
Thou shalt use comments and docstrings when needed.
Comments and docstrings can be incorporated into the first commandment, but I want to emphasize them as they add a communication layer on top of your codebase. They let you document your code, they can help other people use it, and they might even help you understand it later on. But there’s one rule that should always be followed: comments and docstrings should be used sparingly.
There are times when a comment or a docstring is the best way to communicate something about your code. For example, if a function does something complicated and unusual, it may be worth documenting that in a comment or docstring so that other people will know how to use it correctly (and so that you can remember what it does when you come back to the project months later).
But for every comment or docstring you write, there are probably ten more that could have been written better using code instead of words.
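Here is a small sketch of what I mean. The first version needs a comment to explain a magic number, while the second lets the names carry the meaning and keeps a docstring only for what the code cannot say on its own. The function and constant names are made up for this example.

```python
# Before: the comment has to explain what the code does not say.
def check(u):
    # 18 is the minimum age to sign up
    return u >= 18


# After: names replace the comment; the docstring records the contract.
MINIMUM_SIGNUP_AGE = 18  # hypothetical business rule for this example


def is_old_enough_to_sign_up(age_in_years: int) -> bool:
    """Return True if a user of this age may create an account."""
    return age_in_years >= MINIMUM_SIGNUP_AGE
```

Both functions behave the same; only the second one can be understood at the call site without opening the function body.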
Thou shalt name variables descriptively so that anyone reading your code will know what they refer to.
The third commandment of clean code is to make your code readable, and that means naming variables descriptively. This is particularly important if you're working in a team, where others may have to read your code.
Variable names should be the clearest possible description of what the variable represents. For example, if you have a game where players can choose their characters' names, you might use a string called current_name to store the player's name, and another variable called current_score to store how many points the player has scored. It would be confusing if either of these names were reused for a different purpose.
Let’s take a look at a few bad variable name examples.
i, j, k - Unless you are writing a numerical algorithm, these are not descriptive enough.
x_test1, x_test2 - These are better than the above, but they still don't tell you what they hold. They could be test scores or some other kind of data entirely.
df, df1, df_1 - These names are also not great for data frames. Think about what data you are importing.
Instead, a great variable name would be something like customer_address. This clearly tells me that we will be looking at a customer’s address.
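To make the contrast concrete, here is a minimal sketch of the same function written twice. The data and every name in it are hypothetical; the point is only how much guessing the first version asks of the reader.

```python
# Hard to follow: the reader has to guess what d, x and i hold.
def f(d, x):
    return [i for i in d if i["city"] == x]


# Easier to follow: the names describe the data.
def find_customers_in_city(customers: list[dict], city: str) -> list[dict]:
    return [customer for customer in customers if customer["city"] == city]


customers = [
    {"name": "Ada", "city": "Utrecht", "customer_address": "1 Canal Street"},
    {"name": "Grace", "city": "Leiden", "customer_address": "2 Harbor Road"},
]
customers_in_utrecht = find_customers_in_city(customers, "Utrecht")
```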
Thou shalt use the proper naming conventions for thy language.
The point of naming conventions is to make your code easier to read and understand. This is especially important when your codebase is large or complex, as there are many different ways you could write the same thing. A good naming convention makes it easier for other developers to jump into your code and understand what it does without having to dig through the documentation.
Whichever convention you follow, I would recommend sticking with it throughout a project or project branch if possible. Consistency is key here!
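For Python specifically, the PEP 8 style guide suggests snake_case for functions and variables, CapWords for classes, and UPPER_CASE for constants. A minimal sketch of what that looks like, using made-up names:

```python
MAX_LOGIN_ATTEMPTS = 3  # constant: UPPER_CASE with underscores


class LoginSession:  # class: CapWords
    def __init__(self, user_name: str) -> None:
        self.user_name = user_name  # attribute: snake_case
        self.failed_attempts = 0

    def record_failed_attempt(self) -> None:  # method: snake_case
        self.failed_attempts += 1

    def is_locked(self) -> bool:
        return self.failed_attempts >= MAX_LOGIN_ATTEMPTS
```

Other languages have their own conventions, so check the style guide for the one you are working in.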
Thou shalt use Test-Driven Development.
Test-Driven Development (TDD) is a software development process that relies on the repetition of a very short development cycle: a requirement is turned into a very specific test case, the test is run and seen to fail, a small amount of code is written to make it pass, and then that code is refactored. Whenever a test fails unexpectedly, the developer only needs to examine the last few lines of code that were written to find out why. This process continues until all requirements are covered by passing tests.
TDD helps developers write better code by providing early feedback and by helping you verify that your design and implementation are correct before you get too far down the line. The goal of TDD is to create simple designs that can be easily tested and understood.
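One pass through that cycle might look like the sketch below, using Python's built-in unittest module. The slugify function and its requirements are hypothetical, invented only to show the order of the steps.

```python
import unittest


# Step 1 (red): write the tests first. They fail until slugify exists.
class SlugifyTests(unittest.TestCase):
    def test_lowercases_and_joins_words_with_hyphens(self):
        self.assertEqual(slugify("Clean Code"), "clean-code")

    def test_ignores_surrounding_whitespace(self):
        self.assertEqual(slugify("  Clean Code  "), "clean-code")


# Step 2 (green): write just enough code to make the tests pass.
def slugify(title: str) -> str:
    return "-".join(title.lower().split())


# Step 3 (refactor): tidy up, re-running the tests after each change.
```

If this lives in a file, running python -m unittest against it executes both tests. The next requirement starts the cycle again with a new failing test.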
Thou shalt not repeat thyself.
I am a big fan of the “Don’t Repeat Yourself” (DRY) principle, which states that every piece of knowledge must have a single, unambiguous, authoritative representation within a system. DRY is intended to reduce complexity and maintenance costs by keeping each piece of information in one place and reusing it wherever it is needed.
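As a rough illustration, the sketch below shows a tax rule written out twice and then pulled into one authoritative place. The 21% rate and all of the function names are assumptions for the example, not a recommendation for real invoicing code.

```python
# Before: the VAT rule is written out twice, so a rate change must be made twice.
def invoice_total_repeated(net_amount: float) -> float:
    return net_amount + net_amount * 0.21


def receipt_line_repeated(net_amount: float) -> str:
    return f"Total incl. VAT: {net_amount + net_amount * 0.21:.2f}"


# After: one authoritative place for the rate and the calculation.
VAT_RATE = 0.21  # assumed rate, for illustration only


def add_vat(net_amount: float) -> float:
    return net_amount * (1 + VAT_RATE)


def invoice_total(net_amount: float) -> float:
    return add_vat(net_amount)


def receipt_line(net_amount: float) -> str:
    return f"Total incl. VAT: {add_vat(net_amount):.2f}"
```

When the rate changes, the second version needs a single edit, and there is no risk of the invoice and the receipt drifting apart.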
Thou shalt use typing in functions.
Typing is where you declare the type of data that your variables are going to hold. In Python, this is done by putting a colon after the variable name, followed by the type.
You can also specify the types of a function's arguments and its return value. For example, we might want to write a function that accepts two integers and returns an integer:
def multiply(a: int, b: int) -> int:
    return a * b
You can use the type system to your advantage by declaring the types of your variables. This helps you avoid errors and makes it easier for other developers to understand what’s going on in your code.
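The same colon syntax works for plain variables and for containers. Here is a small sketch with made-up names; keep in mind that Python does not enforce these annotations at runtime, so you would typically run a separate type checker (mypy is one option) to catch mismatches.

```python
from typing import Optional

# Variable annotations: a colon after the name, followed by the type.
player_name: str = "Ada"
current_score: int = 0
high_scores: dict[str, int] = {"Ada": 120, "Grace": 95}


def best_score(scores: dict[str, int], player: str) -> Optional[int]:
    """Return the player's best score, or None if the player is unknown."""
    return scores.get(player)
```

The Optional[int] return type tells the caller up front that None is a possible result and has to be handled.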
Thou shalt use a linter and formatter.
A linter is a tool that analyzes code without running it, looking for stylistic problems and likely mistakes. Linters are often used to enforce a coding style guide or to ensure that the code follows the conventions of a particular language.
A formatter is a tool that automatically formats code according to some style guide. Instead of checking for errors, it will try to make the code look nicer by adding spaces, indentation and new lines.
The main difference between a linter and a formatter is what they do with the code. A linter is like a spell checker: it looks for problems in your code and points them out so that you can fix them before they cause trouble in production. A formatter does not report problems or depend on the linter's output; it rewrites the source file directly so that indentation, spacing, and line breaks are consistent and easy on the eyes.
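To give a feel for the "spell checker" side, here is a toy lint rule built on Python's standard ast module. It only flags bare except: clauses, and the sample source is invented for the example. Real linters are far more thorough; this is a sketch of the idea, not of how any particular tool is implemented.

```python
import ast

# Hypothetical code to inspect; note the bare "except:" clause.
SOURCE = '''
def load(path):
    try:
        return open(path).read()
    except:
        return None
'''


def find_bare_excepts(source: str) -> list[int]:
    """Return the line numbers of every bare `except:` clause."""
    tree = ast.parse(source)
    return [
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.ExceptHandler) and node.type is None
    ]


for line_number in find_bare_excepts(SOURCE):
    print(f"line {line_number}: bare 'except:' hides unexpected errors")
```

Notice that the rule only reports the problem and leaves the source untouched, which is exactly where a linter stops and a formatter would start.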
Thou shalt use info, error, and warning logging.
When you have a program with a lot of logic and conditional statements, it can be hard to trace the root cause of an error. This is where logging comes in handy.
Logging allows you to record events that happen during the execution of your program so that you can analyze them at a later time. This can be especially useful when debugging an issue that only manifests itself intermittently.
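A minimal sketch with Python's standard logging module might look like this. The logger name, the parse_age function, and the age threshold of 120 are assumptions made up for the example.

```python
import logging
from typing import Optional

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
logger = logging.getLogger("signup")  # hypothetical logger name


def parse_age(raw_value: str) -> Optional[int]:
    """Convert user input to an age, logging what happened along the way."""
    logger.info("Parsing age from %r", raw_value)
    try:
        age = int(raw_value)
    except ValueError:
        # Error: the operation failed and the caller gets no result.
        logger.error("Could not parse %r as an integer", raw_value)
        return None
    if age > 120:
        # Warning: the program continues, but something looks off.
        logger.warning("Age %d is unusually high", age)
    return age
```

Calling parse_age with "42", "abc", and "150" produces an info, an error, and a warning record respectively, which is usually enough to reconstruct what happened after the fact.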
Thou shalt understand and apply SOLID principles to all thy classes and methods.
The SOLID principles are a set of best practices for writing classes and methods. They were popularized by Robert C. Martin (also known as Uncle Bob, who has all sorts of goodies on his website), who has been programming for over 40 years.
SOLID stands for:
Single Responsibility Principle: Each class should have a single responsibility, and that responsibility should be entirely encapsulated by the class.
Open/Closed Principle: Software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification. In simple terms, this means that you should be able to add new behavior without modifying existing code (in most cases).
Liskov Substitution Principle: You should be able to replace an object of type T with an object of any subtype of T without breaking clients' code (and without having to change their source). This is also known as substitutability.
Interface Segregation Principle: Interfaces should be fine-grained and client-specific rather than bloated and all-inclusive. I like to think of this one as "Make lots of small interfaces with narrow scopes; not only does this promote reuse but it will keep your code easier to read and maintain."
Dependency Inversion Principle: You should depend on abstractions, not concretions. Put another way, you should depend on interfaces, not implementations (or concrete classes).
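Here is a compact sketch that touches several of these at once, assuming a hypothetical welcome-message feature. All class names are invented for illustration, and sending an email is stubbed out with print.

```python
from typing import Protocol


class MessageSender(Protocol):
    """Small, narrow abstraction that high-level code depends on."""

    def send(self, recipient: str, body: str) -> None: ...


class EmailSender:
    """One responsibility: deliver a message by email (stubbed with print)."""

    def send(self, recipient: str, body: str) -> None:
        print(f"Emailing {recipient}: {body}")


class InMemorySender:
    """Substitutable stand-in for tests: records messages instead of sending."""

    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def send(self, recipient: str, body: str) -> None:
        self.sent.append((recipient, body))


class WelcomeService:
    """One responsibility: decide what the welcome message says."""

    def __init__(self, sender: MessageSender) -> None:
        self.sender = sender  # depends on the abstraction, not on EmailSender

    def welcome(self, user_name: str, address: str) -> None:
        self.sender.send(address, f"Welcome, {user_name}!")


sender = InMemorySender()
WelcomeService(sender).welcome("Ada", "ada@example.com")
```

WelcomeService never names a concrete sender, so a new delivery channel can be added as a new class without editing it, and either sender can be swapped in without the service noticing.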
Final thoughts
If you want to see examples of these rules, they will be available in my course “Jupyter Notebook to Production”, coming out on dutchengineer.org in the upcoming weeks.
There may or may not be a surprise in this newsletter. Stay tuned!