Simple Python concurrency

28 January 2019

Say you need to make some HTTP requests, and doing them one after another is taking too long. This is a great example of an I/O-bound task that can be sped up with concurrency.

There are many ways to achieve concurrency in Python, but one of the nicest frontends is the concurrent.futures module, part of the standard library since Python 3.2. You can get a better overview of the other options here and here. But if you want to do it fast and easy, with very few lines of code, here’s how:

import requests # for making http requests
import concurrent.futures

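# A pool of 20 threads, one per request; the with block waits for all of them to finish
with concurrent.futures.ThreadPoolExecutor(max_workers=20) as executor:
    # submit() schedules each call and immediately returns a Future
    futures = [executor.submit(requests.get, 'http://example.org') for _ in range(20)]
    # as_completed() yields each Future as soon as it finishes
    results = [future.result() for future in concurrent.futures.as_completed(futures)]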
# 398 ms ± 159 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

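The futures variable holds Future objects: placeholders for results that may not have been computed yet. That’s why we need a second list comprehension to collect the actual values into the results variable. Note that as_completed yields futures in the order they finish, not the order they were submitted, so results is not guaranteed to line up with the order of the requests. This is by far the easiest way I have found to do concurrency in Python.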
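Because results arrive in completion order, it often helps to remember which future belongs to which input, and to catch errors per request so one failure does not abort the rest. Here is a minimal sketch of that pattern. The fake_fetch function is a hypothetical stand-in for requests.get (it just sleeps) so the example runs without network access, and fetch_all is an illustrative helper name, not part of any library:

```python
import concurrent.futures
import time

def fake_fetch(url, delay=0.05):
    # Hypothetical stand-in for requests.get: sleeps to simulate network latency.
    time.sleep(delay)
    return f"response from {url}"

def fetch_all(urls, max_workers=20):
    """Fetch all urls concurrently; return a dict of url -> result or exception."""
    outcomes = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each Future back to the URL it was submitted with.
        future_to_url = {executor.submit(fake_fetch, url): url for url in urls}
        for future in concurrent.futures.as_completed(future_to_url):
            url = future_to_url[future]
            try:
                # result() returns the value, or re-raises the worker's exception.
                outcomes[url] = future.result()
            except Exception as exc:
                outcomes[url] = exc
    return outcomes

urls = [f"http://example.org/{i}" for i in range(5)]
outcomes = fetch_all(urls)
```

Swapping fake_fetch for requests.get (ideally with a timeout argument) should give the real behaviour; the dictionary lookup is what lets you tell which request each result came from.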

For the sake of comparison, here’s the non-concurrent way:

results = [requests.get('http://example.org') for _ in range(20)]
# 5.98 s ± 500 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Wow! From almost 6 whole seconds down to roughly 0.4 seconds. That’s a ~15x speedup.
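If you want to reproduce this kind of comparison without depending on a real server, you can simulate the latency. The sketch below assumes a made-up fake_request function that sleeps for 50 ms instead of calling requests.get, so the absolute numbers will differ from the ones above. It also uses executor.map, which, unlike as_completed, returns results in the same order as the inputs:

```python
import concurrent.futures
import time

def fake_request(i, delay=0.05):
    # Hypothetical stand-in for requests.get: sleep to mimic waiting on the network.
    time.sleep(delay)
    return i

def timed(fn):
    # Run fn once and return its result together with the elapsed wall-clock time.
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start

def run_sequential(n):
    return [fake_request(i) for i in range(n)]

def run_concurrent(n):
    with concurrent.futures.ThreadPoolExecutor(max_workers=n) as executor:
        # map() preserves input order, so results line up with range(n).
        return list(executor.map(fake_request, range(n)))

seq_results, seq_time = timed(lambda: run_sequential(10))
con_results, con_time = timed(lambda: run_concurrent(10))
print(f"sequential: {seq_time:.2f} s, concurrent: {con_time:.2f} s")
```

With one worker per task, the concurrent run should take about as long as a single simulated request, while the sequential run takes about ten times that.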

Keep in mind that if your task is CPU-intensive, threads won’t help nearly as much: CPython’s global interpreter lock (GIL) lets only one thread execute Python bytecode at a time. You are better off with process-based parallelism (multiprocessing), which I might cover in another post. Thanks for reading!