Logging is one of the most underrated features of an application. Often neglected by software engineers, it can save you hours of debugging once your application is running in production.

Most teams don't think about it until late in their development process. Only when things start to go wrong in deployment does somebody realize that logging is missing.

Guidelines

The Twelve-Factor App defines logs as a stream of aggregated, time-ordered events collected from the output streams of all running processes. It also describes how applications should handle their logging. We can summarize those guidelines as:

  • Logs have no fixed beginning or end.
  • Print logs to stdout.
  • Print logs unbuffered.
  • The environment is responsible for capturing the stream.

From my experience, this set of rules is a good trade-off. Logs have to be kept pretty simple to be efficient and reliable. Building complex logging systems might make it harder to get insight into a running application.

There's also no point in duplicating log-management effort (e.g., log file rotation, archival policy, etc.) across your different applications. An external workflow that can be shared by different programs is more efficient.

In Python

Python provides a logging subsystem with its logging module. This module provides a Logger object that allows you to emit messages with different levels of criticality. Those messages can then be filtered and sent to different handlers.

Let's look at an example:

import logging

logger = logging.getLogger("myapp")
logger.error("something wrong")

Depending on the version of Python you're running, you'll see either:

No handlers could be found for logger "myapp"

or:

something wrong

Python 2 had no logging handler set up by default, so it would print an error message about no handler being found. Since Python 3.2, a last-resort handler that writes bare messages to stderr is installed, which is close to what the Twelve-Factor App asks for.

However, this default setup is far from being perfect.

Shortcomings

The default format that Python uses does not embed any contextual information. There is no way to know the name of the logger (myapp in the previous example) nor the date and time of the logged message.

You must configure Python's logging subsystem to enhance its output format.

To do that, I advise using the daiquiri module. It provides an excellent default configuration and a simple API to configure logging, plus some exciting features.

Logging Setup

When using daiquiri, the first thing to do is to set up your logging correctly. This can be done with the daiquiri.setup function, like this:

import daiquiri

daiquiri.setup()

As simple as that. You can tweak the setup further by asking it to log to a file, change the default string formats, etc., but just calling daiquiri.setup is enough to get proper logging defaults.

See:

import daiquiri

daiquiri.setup()
daiquiri.getLogger("myapp").error("something wrong")

outputs:

2018-12-13 10:24:04,373 [38550] ERROR    myapp: something wrong

If your terminal supports colored text, the line will be printed in red since it's an error. The format provided by daiquiri is better than Python's default: it includes a timestamp, the process ID, the criticality level, and the logger's name. Needless to say, this format can also be customized.

Passing Contextual Information

Logging strings are boring. Most of the time, engineers end up writing code such as:

logger.error("Something wrong happened with %s when writing data at %d",
             myobject.myfield, myobject.mynumber)

The issue with this approach is that you have to think about every field of your object that is worth logging and make sure each one is inserted correctly in your sentence. If you forget an essential field describing your object and the problem, you're screwed.
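The standard library's partial answer to this is the extra argument accepted by the logging calls, which attaches arbitrary attributes to the LogRecord. The format string and field name below are illustrative, chosen for this sketch:

```python
import logging
import sys

# `extra` attaches arbitrary attributes to the LogRecord; they become
# available to the format string as %(user)s here.
logging.basicConfig(
    stream=sys.stdout,
    format="%(levelname)s %(name)s [user: %(user)s]: %(message)s",
)

logger = logging.getLogger("myapp")
logger.error("Something wrong happened during the request",
             extra={"user": "jd"})
```

The catch is that every record now has to carry that field, or formatting fails. That rigidity is one reason keyword-argument support like daiquiri's, shown next, is more convenient.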

A reliable alternative to this manual crafting of log strings is to pass the interesting objects as keyword arguments. Daiquiri supports this, and it works this way:

import attr
import daiquiri
import requests


daiquiri.setup()
logger = daiquiri.getLogger("myapp")


@attr.s
class Request:
    url = attr.ib()
    status_code = attr.ib(init=False, default=None)

    def get(self):
        r = requests.get(self.url)
        self.status_code = r.status_code
        r.raise_for_status()
        return r


user = "jd"
req = Request("https://google.com/not-this-page")
try:
    req.get()
except Exception:
    logger.error("Something wrong happened during the request",
                 request=req, user=user)

If anything goes wrong with the request, it will be logged like this:

2018-12-14 10:37:24,586 [43644] ERROR    myapp [request: Request(url='https://google.com/not-this-page', status_code=404)] [user: jd]: Something wrong happened during the request

As you can see, the call to logger.error is pretty straightforward: a line that explains what's wrong, and then the different interesting objects passed as keyword arguments.

Daiquiri logs those keyword arguments with a default format of [key: value], included as a prefix to the log string. The value is printed using its __format__ method, which is why I'm using the attr module here: it automatically generates this method for me and includes all fields by default. You can also customize daiquiri to use any other format.
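To illustrate the mechanics without attrs, here is a hypothetical plain class that implements __format__ by hand; the class and its fields are mine, mirroring the earlier example:

```python
class Request:
    """A stripped-down stand-in for the attrs-based class above."""

    def __init__(self, url, status_code=None):
        self.url = url
        self.status_code = status_code

    def __format__(self, format_spec):
        # This string is what would end up inside the [key: value] prefix.
        return f"Request(url={self.url!r}, status_code={self.status_code})"


req = Request("https://google.com/not-this-page", 404)
# format() and str.format both go through __format__:
print("[request: {}]".format(req))
# → [request: Request(url='https://google.com/not-this-page', status_code=404)]
```

Writing __format__ yourself gets tedious for classes with many fields, which is exactly the boilerplate attrs removes.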

Following those guidelines should give you a great start for logging correctly in Python!