There's a gap between learning the syntax of the Python programming language and being able to build a project from scratch. When you finish reading your first tutorial or book about Python, you're good to go for writing a Fibonacci sequence calculator, but that does not help you start your actual project.

There are a few questions that pop up in your mind, and that's normal. Let's take a stab at those!

Which Python version should I use?

It's no secret that Python has several versions that are supported at the same time. Each minor version of the interpreter gets bugfix support for 18 months and security support for 5 years. For example, Python 3.7, released on 27th June 2018, will be supported until Python 3.8 is released, around October 2019 (roughly 16 months later). Around December 2019, the last bugfix release of Python 3.7 will occur, and everyone is expected to switch to Python 3.8.

Current Python 3.7/3.8 release schedule

That's important to be aware of, as the version of the interpreter becomes an integral part of your software's lifecycle.

On top of that, we should take into consideration the Python 2 versus Python 3 question. That still might be an open question for people working with (very) old platforms.

In the end, the question of which version of Python one should use is well worth asking.

Here are some short answers:

  • Versions 2.6 and older are really obsolete by now, so you don't have to worry about supporting them at all. If you intend on supporting these older versions anyway, be warned that you'll have an even harder time ensuring that your program supports Python 3.x as well. You might still run into Python 2.6 on some older systems; if that's the case, I'm sorry for you!
  • Version 2.7 is and will remain the last version of Python 2.x. I don't think there is a system where Python 3 is not available one way or the other nowadays. So unless you're doing archeology once again, forget it. Python 2.7 will not be supported after the year 2020, so the last thing you want to do is build new software on top of it.
  • Version 3.7 is the most recent version of the Python 3 branch as of this writing, and that's the one that you should target. Most recent operating systems ship at least 3.6, so if you target those, make sure your application works with both 3.6 and 3.7.
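
If you settle on a minimum version, it can help to make the program say so early instead of failing later with an obscure syntax error. Here is a minimal sketch of such a guard; the `MINIMUM_PYTHON` constant and the `check_python_version` helper are hypothetical names picked for this illustration, and 3.6 is only an assumed minimum.

```python
import sys

# Assumption: the oldest interpreter this hypothetical project supports.
MINIMUM_PYTHON = (3, 6)


def check_python_version(current=None, minimum=MINIMUM_PYTHON):
    """Return the (major, minor) version, or raise if it is too old."""
    # Default to the running interpreter; accept an explicit tuple for testing.
    if current is None:
        current = sys.version_info[:2]
    current = tuple(current[:2])
    if current < minimum:
        raise RuntimeError(
            "Python %d.%d or newer is required, found %d.%d"
            % (minimum + current)
        )
    return current


if __name__ == "__main__":
    print("Running on Python %d.%d" % check_python_version())
```

Calling the guard at the top of your entry point keeps the requirement in one visible place.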

Project Layout

Starting a new project is always a puzzle. You never know how to organize your files. However, once you have a proper understanding of the best practices out there, it's pretty simple.

First, your project structure should be fairly basic. Use packages and hierarchy wisely: a deep hierarchy can be a nightmare to navigate, while a flat hierarchy tends to become bloated.

Then, avoid making a few common mistakes. Don't leave unit tests outside the package directory. These tests should be included in a sub-package of your software so that:

  • They don't get automatically installed as a tests top-level module by setuptools (or some other packaging library) by accident.
  • They can be installed and eventually used by other packages to build their unit tests.

The following diagram illustrates what a standard file hierarchy should look like:

A Python project's file and directory hierarchy

setup.py is the standard name for a Python installation script, along with its companion setup.cfg, which should contain the installation script configuration. When run, setup.py installs your package using the Python distribution utilities.

You can also provide valuable information to users in README.rst (or README.txt, or whatever filename suits your fancy). Finally, the docs directory should contain the package's documentation in reStructuredText format, which will be consumed by Sphinx.
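
To make the layout concrete, here is a small sketch that creates such a skeleton in a directory of your choice. The package name `foobar`, the module names, and the `create_skeleton` helper are all hypothetical; treat it as one possible arrangement that follows the advice above (tests living in a sub-package), not as a canonical tool.

```python
import tempfile
from pathlib import Path

# Hypothetical file list; "{pkg}" is replaced by the package name.
LAYOUT = [
    "setup.py",
    "setup.cfg",
    "README.rst",
    "docs/index.rst",
    "{pkg}/__init__.py",
    "{pkg}/storage.py",
    "{pkg}/tests/__init__.py",      # tests are a sub-package of the package
    "{pkg}/tests/test_storage.py",
]


def create_skeleton(root, package="foobar"):
    """Create empty files for the layout and return their relative paths."""
    created = []
    for template in LAYOUT:
        path = Path(root) / template.format(pkg=package)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
        created.append(path.relative_to(root).as_posix())
    return sorted(created)


if __name__ == "__main__":
    # Build the skeleton in a throwaway directory and list what was created.
    with tempfile.TemporaryDirectory() as tmp:
        for entry in create_skeleton(tmp):
            print(entry)
```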

Packages often have to provide extra data, such as images, shell scripts, and so forth. Unfortunately, there's no universally accepted standard for where these files should be stored. Just put them wherever makes the most sense for your project, depending on their function. For example, Web application templates could go in a templates directory in your package root directory.

The following top-level directories also frequently appear:

  • etc for sample configuration files.
  • tools for shell scripts or related tools.
  • bin for binary scripts you've written that will be installed by setup.py.

There's another design issue that I often encounter. When creating files or modules, some developers create them based on the type of code they will store. For example, they would create functions.py or exceptions.py files. This is a terrible approach. It doesn't help any developer when navigating the code. The code organization doesn't benefit from this, and it forces readers to jump between files for no good reason. There are a few exceptions, such as libraries, in some instances, because they do expose a complete API for consumers. However, other than that, think twice before doing that in your application.

Organize your code based on features, not based on types.

Creating a module directory with just an __init__.py file in it is also a bad idea. For example, don't create a directory named hooks with a single file named hooks/__init__.py in it where hooks.py would have been enough instead. If you create a directory, it should contain several other Python files that belong to the category the directory represents.

Also be very careful about the code that you put in the __init__.py files: it is going to be executed the first time that any of the modules contained in the directory is loaded. This can have unwanted side effects. Those __init__.py files should be empty most of the time unless you know what you're doing.
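
You can see this behavior for yourself with a throwaway package. The sketch below writes a hypothetical `hooks_demo` package to a temporary directory, imports only its `plugin` submodule, and shows that the package's `__init__.py` ran anyway. All the names here are made up for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_init_side_effect():
    """Import a submodule and report whether the package __init__.py ran."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "hooks_demo"
        pkg.mkdir()
        # The __init__.py has a visible side effect: it prints and sets a flag.
        (pkg / "__init__.py").write_text(
            "print('hooks_demo/__init__.py executed')\nLOADED = True\n"
        )
        (pkg / "plugin.py").write_text("NAME = 'plugin'\n")

        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            # We only ask for the submodule...
            plugin = importlib.import_module("hooks_demo.plugin")
            # ...but the parent package was imported (and executed) first.
            parent = sys.modules["hooks_demo"]
            return parent.LOADED, plugin.NAME
        finally:
            # Clean up so the demo leaves no trace in the interpreter.
            sys.path.remove(tmp)
            sys.modules.pop("hooks_demo.plugin", None)
            sys.modules.pop("hooks_demo", None)


if __name__ == "__main__":
    print(demo_init_side_effect())
```

Anything heavier than a flag in that file, such as opening a connection or reading configuration, would be triggered the same way.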

Version Numbering

Software versions need to be stamped so that you can tell which release is more recent than another. As every piece of code evolves, every project needs a way to organize its timeline.

There is an infinite number of ways to organize your version numbers, but PEP 440 introduces a version format that every Python package, and ideally every application, should follow. This way, programs and packages will be able to quickly and reliably identify which versions of your package they require.

PEP 440 defines the following format for version numbering (a pattern notation rather than a literal regular expression):

N[.N]+[{a|b|c|rc}N][.postN][.devN]

This allows for standard numbering like 1.2 or 1.2.3.

However, please do note that:

  • 1.2 is equivalent to 1.2.0; 1.3.4 is equivalent to 1.3.4.0, and so forth.
  • Versions matching N[.N]+ are considered final releases.
  • Date-based versions such as 2013.06.22 are permitted by the final version of PEP 440, although leading zeros are dropped when the version is normalized, so this one becomes 2013.6.22.

Final components can also use the following format:

  • N[.N]+aN (e.g. 1.2a1) denotes an alpha release, a version that might be unstable and missing features.
  • N[.N]+bN (e.g. 2.3.1b2) denotes a beta release, a version that might be feature-complete but still buggy.
  • N[.N]+cN or N[.N]+rcN (e.g. 0.4rc1) denotes a (release) candidate, a version that might be released as the final product unless significant bugs emerge. The rc and c suffixes have the same meaning: the final version of PEP 440 accepts c as an alternate spelling and normalizes it to rc, so prefer rc.

These suffixes can also be used:

  • .postN (e.g. 1.4.post2) indicates a post-release. These are typically used to address minor errors in the publication process (e.g. mistakes in release notes). You shouldn't use .postN when releasing a bugfix version; instead, you should increment the minor version number.
  • .devN (e.g. 2.3.4.dev3) indicates a developmental release. This suffix is discouraged because it is harder for humans to parse. It indicates a prerelease of the version that it qualifies: e.g. 2.3.4.dev3 indicates the third developmental version of the 2.3.4 release, before any alpha, beta, candidate or final release.
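
To get a feel for how these components order, here is a simplified sketch built only on the standard library. It handles the N[.N]+[{a|b|c|rc}N][.postN][.devN] format shown above and nothing more: epochs, local versions, and the alternate spellings accepted by the full PEP 440 grammar are deliberately left out, and `sort_key` is a hypothetical helper name. For real projects, a maintained PEP 440 parser is a safer choice than this illustration.

```python
import re

# Simplified pattern for N[.N]+[{a|b|c|rc}N][.postN][.devN].
# Assumption: illustration only, not the full PEP 440 grammar.
VERSION_RE = re.compile(
    r"^(?P<release>\d+(?:\.\d+)+)"
    r"(?:(?P<pre_kind>a|b|c|rc)(?P<pre_num>\d+))?"
    r"(?:\.post(?P<post>\d+))?"
    r"(?:\.dev(?P<dev>\d+))?$"
)

# "c" is treated as an alias of "rc", as the final PEP 440 does.
PRE_RANK = {"a": 0, "b": 1, "c": 2, "rc": 2}


def sort_key(version):
    """Return a tuple that orders version strings from oldest to newest."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError("unsupported version string: %r" % version)

    release = [int(part) for part in match["release"].split(".")]
    # Drop trailing zeros so that 1.2 and 1.2.0 compare equal.
    while len(release) > 1 and release[-1] == 0:
        release.pop()

    kind, post, dev = match["pre_kind"], match["post"], match["dev"]
    if kind is not None:
        pre = (PRE_RANK[kind], int(match["pre_num"]))
    elif post is None and dev is not None:
        pre = (-1, 0)  # a bare .devN sorts before any alpha
    else:
        pre = (3, 0)   # final and post-releases sort after any candidate

    post_key = -1 if post is None else int(post)
    # A missing .devN sorts after any .devN of the same version.
    dev_key = (1, 0) if dev is None else (0, int(dev))
    return (tuple(release), pre, post_key, dev_key)


if __name__ == "__main__":
    versions = ["1.2", "1.2rc1", "1.2a1", "1.2.post1", "1.2.dev3", "1.2b2"]
    print(sorted(versions, key=sort_key))
```

Sorting the sample list puts the developmental release first, then the alpha, beta, and candidate, then the final release, and the post-release last.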

This scheme should be sufficient for most common use cases.

You might have heard of Semantic Versioning, which provides its own guidelines for version numbering. This specification partially overlaps with PEP 440, but unfortunately, they're not entirely compatible. For example, Semantic Versioning's recommendation for prerelease versioning uses a scheme such as 1.0.0-alpha+001 that is not compliant with PEP 440.

Many DVCS tools, such as Git and Mercurial, can generate version numbers using an identifying hash (for Git, refer to git describe). Unfortunately, this system isn't compatible with the scheme defined by PEP 440: for one thing, identifying hashes aren't orderable.

Those are only some of the first questions you could have. If you have any others that you would like me to answer, feel free to write a comment below. Same goes if you have any other pieces of advice you'd like to share!