I've recently been using Cloudflare as an HTTP frontend for some applications, and getting things working correctly with WSGI was not obvious.

In Python, WSGI is the standard protocol for writing Web applications. All the Web frameworks that I know of follow it, and many of them rely on variables in the request environment to learn how the request was made.

One of those environment variables is wsgi.url_scheme. It contains either http or https, depending on the protocol that was used to connect to your WSGI server.
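To make that concrete, here is a minimal sketch of a WSGI application that just reports the scheme it sees. The names show_scheme_app and call_app are mine, made up for illustration; only wsgiref from the standard library is used, and the app is called directly rather than through a real server.

```python
from wsgiref.util import setup_testing_defaults


def show_scheme_app(environ, start_response):
    """Tiny WSGI app that reports the scheme the server saw."""
    scheme = environ["wsgi.url_scheme"]  # "http" or "https"
    body = ("scheme=" + scheme).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_app(app, environ):
    """Call a WSGI app directly, without a server, and return its body."""
    # Fills in the required WSGI keys; keys already present are kept.
    setup_testing_defaults(environ)

    def start_response(status, headers, exc_info=None):
        pass  # status and headers are ignored in this sketch

    return b"".join(app(environ, start_response))
```

Calling call_app(show_scheme_app, {}) gives the plain HTTP case, and passing {"wsgi.url_scheme": "https"} simulates a server that terminated TLS itself.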

And that's where things can get messy. If you enable SSL at Cloudflare in "Flexible" mode, your visitor will connect to your Web site using HTTPS, but Cloudflare will connect to your backend using HTTP. That means that for your application, the traffic will appear to be over HTTP, and not HTTPS: wsgi.url_scheme will be set to http.

[Image: Cloudflare SSL setting]

That can lead to several problems with some frameworks. For example, Flask's url_for function relies on this variable to generate the scheme part of any URL. In this case, it would therefore generate URLs starting with http:// even though your visitors are using HTTPS.

The usual workaround is to leverage the X-Forwarded-Proto header, which Cloudflare sets. When Cloudflare proxies the request to your HTTP host, this header is set to https. By using the ProxyFix middleware from werkzeug.contrib.fixers (it lives in werkzeug.middleware.proxy_fix since Werkzeug 0.15), the variable wsgi.url_scheme will be set to the value of X-Forwarded-Proto.
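The idea behind that fix can be sketched with a few lines of plain WSGI. This is a simplified stand-in that I wrote for illustration, not Werkzeug's actual implementation: the class name ForwardedProtoFix is hypothetical, and it assumes the header carries a single value set by a proxy you trust.

```python
class ForwardedProtoFix:
    """Simplified stand-in for the X-Forwarded-Proto part of a proxy fixer."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # WSGI exposes the X-Forwarded-Proto request header under this key.
        proto = environ.get("HTTP_X_FORWARDED_PROTO", "").strip().lower()
        # Only accept the two schemes we expect; otherwise leave things alone.
        if proto in ("http", "https"):
            environ["wsgi.url_scheme"] = proto
        return self.app(environ, start_response)
```

Keep in mind that a client can send this header too, so it should only be trusted when the application is reachable solely through the proxy.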

That would work fine for any application that is directly behind Cloudflare, or any single HTTP reverse proxy.

But that does not work as soon as you have multiple reverse proxies. If your application runs on top of Heroku, for example, they already provide a reverse proxy and overwrite those headers. That gives the following: Visitor -HTTPS-> Cloudflare -HTTP-> Heroku proxy -HTTP-> Heroku dyno. Once your dyno is reached, X-Forwarded-Proto will be set to http.
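The effect of that chain can be modeled in a few lines. This is a toy simulation, not real proxy code: the forward function is hypothetical and assumes each hop discards the X-Forwarded-Proto value it received and writes the scheme of the connection it accepted.

```python
def forward(headers, incoming_scheme):
    """Model one proxy hop that overwrites X-Forwarded-Proto.

    Assumption: the proxy ignores any value it received and records only
    the scheme of the connection it accepted.
    """
    forwarded = dict(headers)
    forwarded["X-Forwarded-Proto"] = incoming_scheme
    return forwarded


# Visitor -HTTPS-> Cloudflare -HTTP-> Heroku proxy -HTTP-> Heroku dyno
after_cloudflare = forward({}, "https")            # the visitor used HTTPS
after_heroku = forward(after_cloudflare, "http")   # Cloudflare used HTTP
```

After the second hop, the https value written by the first proxy is gone, which is exactly what the dyno ends up seeing.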

Damn it!

The proper solution is, therefore, to have all your proxies implement RFC 7239. This RFC defines a new Forwarded header that can describe every hop that has forwarded the request, including the scheme and IP address seen at each one. Unfortunately, neither Cloudflare nor Heroku implements it. Bummer!
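For the curious, here is roughly what reading such a header could look like. This is a minimal sketch of a parser, not a complete RFC 7239 implementation: parse_forwarded is a name I made up, the example header is invented, and commas or semicolons inside quoted strings are not handled.

```python
def parse_forwarded(value):
    """Parse a Forwarded header into a list of dicts, one per hop.

    Simplified: does not handle commas or semicolons inside quoted strings.
    """
    hops = []
    for element in value.split(","):       # one element per proxy hop
        hop = {}
        for pair in element.split(";"):    # e.g. for=..., proto=..., by=...
            if "=" not in pair:
                continue
            name, _, val = pair.strip().partition("=")
            hop[name.strip().lower()] = val.strip().strip('"')
        if hop:
            hops.append(hop)
    return hops


# Hypothetical header value for a request that went through two proxies.
example = 'for=192.0.2.60;proto=https, for="[2001:db8:cafe::17]";proto=http'
hops = parse_forwarded(example)
original_scheme = hops[0].get("proto")  # the first element describes the visitor
```

Because each proxy appends its own element instead of overwriting the header, the scheme used by the visitor survives the whole chain.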

Finally, Cloudflare provides yet another custom header named Cf-Visitor. It contains a JSON payload with the original HTTP scheme used by the visitor: we can use that to solve our issue. Here's a WSGI middleware to do that:

import json


class CloudflareProxy(object):
    """This middleware sets the proto scheme based on the Cf-Visitor header."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # WSGI exposes the Cf-Visitor request header under this key.
        cf_visitor = environ.get("HTTP_CF_VISITOR")
        if cf_visitor:
            try:
                cf_visitor = json.loads(cf_visitor)
            except ValueError:
                # Not valid JSON: leave wsgi.url_scheme untouched.
                pass
            else:
                # Valid JSON is not necessarily an object, so check first.
                if isinstance(cf_visitor, dict):
                    proto = cf_visitor.get("scheme")
                    if proto is not None:
                        environ['wsgi.url_scheme'] = proto
        return self.app(environ, start_response)

You can then use it to encapsulate your WSGI application with app = CloudflareProxy(app).
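If you want to poke at the parsing step on its own, here is a standalone sketch of it. The helper name scheme_from_cf_visitor is hypothetical, and it assumes the header value is a small JSON object such as {"scheme":"https"}; it is also a bit stricter than the middleware, since it rejects anything other than http or https.

```python
import json


def scheme_from_cf_visitor(header_value):
    """Return the visitor's scheme from a Cf-Visitor value, or None."""
    try:
        payload = json.loads(header_value)
    except (TypeError, ValueError):
        # Missing header (None) or malformed JSON.
        return None
    if not isinstance(payload, dict):
        return None
    scheme = payload.get("scheme")
    # Only pass through the two schemes a WSGI app expects.
    return scheme if scheme in ("http", "https") else None
```

Returning None for anything unexpected means the caller can simply keep whatever wsgi.url_scheme the server already set.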

If you're using JavaScript, I noticed that the forwarded library provides that same support for Cloudflare along with all the other headers, even RFC 7239!