Good practice in Python: Part 1 Efficiency#

This notebook is the first (of two? of a series?) where we discuss the “best”, or most appropriate, ways to program in Python (and beyond). Some of the things covered here have been, or will be, discussed elsewhere in the module, and many will have been demonstrated to you, perhaps without you realising. Some habits you will already have formed; others you won’t have yet. It is therefore worth pointing out a few important aspects of good practice so that you produce “good” code. You may have already noticed that there is rarely a unique way to program a particular algorithm or method. The choices we make when writing a program affect both how quickly the code runs and how easy it is for others to understand or use. Good practice therefore falls into two categories:

  1. Efficiency

  2. Usability

Some features fall into both categories, but we will largely focus on the efficiency side in this notebook.

We should also say at this point that this is far from a comprehensive guide which will enable you to create the most efficient code possible. It is really a prompt to get you started thinking about these issues.

What do we mean when we describe a code as efficient?#

Efficiency, in this context, is mainly a question of time: does your code run as quickly as it could? The choices we make can have a significant impact on how long the code takes to run, for a variety of reasons.

Let us begin by looking at some simple examples of code that seemingly does the same thing but performs very differently.

Loops are a very useful way to traverse our data or iterate a method; however, in Python there are often faster ways to perform simple operations than looping over every element in a list or array. For example, suppose I wish to add two arrays together, element-wise. We know that it is possible to do

import numpy as np

# it really doesn't matter what the data is here

N = 100
x = np.linspace(1,10,N)
y = np.linspace(2,20,N)


z = x + y

There is really no reason to do anything other than this; however, it is worth comparing it with the equivalent loop over the elements, doing the additions individually:

z = np.zeros(N)
for i in range(N):
    z[i] = x[i] + y[i]

Clearly this works; however, it creates more code than necessary, which harms readability. More crucially, this addition is much slower than the first version:

%%timeit
z = x + y
333 ns ± 3.24 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
%%timeit
z = np.zeros(N)
for i in range(N):
    z[i] = x[i] + y[i]
25.9 µs ± 298 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

The reason for this is that NumPy recognises that there is no strict order in which the \(N\) additions must be carried out: they can be performed in any order, or even simultaneously, and modern computer architecture allows multiple floating-point operations to be executed concurrently. This feature is called vectorisation. On the other hand, as soon as you write a loop, you force Python to execute the operations one at a time, in the order laid out by the loop, paying the interpreter’s overhead at every iteration.
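As another small illustration of this (a minimal sketch; the array size is arbitrary), compare applying a function element-by-element in a loop with the equivalent single whole-array call:

import numpy as np

x = np.linspace(0, 2*np.pi, 100000)

# loop version: the interpreter processes one element at a time
y_loop = np.zeros_like(x)
for i in range(len(x)):
    y_loop[i] = np.sin(x[i])

# vectorised version: the whole array is handed to compiled NumPy code at once
y_vec = np.sin(x)

print(np.allclose(y_loop, y_vec))  # True: identical results, very different speed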


This feature extends to a great many situations where you might be tempted to use a loop. For instance, consider the implementation of Simpson’s rule for numerical integration:

\[ \int_{x_i}^{x_{i+1}} f(x) dx = \frac{h}{6}\left(f(x_i)+4f\left(\frac{x_{i+1}+x_i}{2}\right) + f(x_{i+1})\right) + E_i \]

For the composite version, for an integral \( \int_{a}^{b} f(x) dx,\) we break the interval \(x\in[a,b]\) into \(N\) sub-intervals and sum the contributions from each \([x_i,x_{i+1}]\) using the rule above. Here are two possible functions for this, for the case \(f(x)=\sin(x)\) with \( \int_{0}^{1}\sin(x)\,dx\) and \(N=100\): one with a loop and one without.

def f(x):
    return np.sin(x)

def SimpSum(f,xp):
    N = len(xp)
    h = xp[1]-xp[0]  # assumes regularly spaced points, should really include a catch if we want this code to be more general
    return (h/6)*np.sum(f(xp[0:N-1])+4*f(0.5*(xp[0:N-1]+xp[1:N]))+f(xp[1:N]))

def SimpLoop(f,xp):
    N = len(xp)
    h = xp[1]-xp[0]
    intSimp = 0
    for i in range(N-1):
        intSimp += f(xp[i]) + 4*f(0.5*(xp[i+1]+xp[i])) + f(xp[i+1])
    intSimp *= (h/6)
    return intSimp

x = np.linspace(0,1,101)

print(f'non-loop function = {SimpSum(f,x)}, loop function = {SimpLoop(f,x)}, exact = {np.cos(0)-np.cos(1)}')
non-loop function = 0.4596976941334565, loop function = 0.4596976941334565, exact = 0.45969769413186023

Both functions return the same value, and the total operation count looks to be the same, so you might think they are equivalent. However:

%timeit SimpSum(f,x)
%timeit SimpLoop(f,x)
11.3 µs ± 32.7 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
371 µs ± 7.11 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

A significant difference. Many online sources still demonstrate methods like this in Python using loops, so be cautious if you care about the efficiency of your code (good sources do not!).


Please note that there are a great many cases where sequential execution is a necessary part of the logic of the code and loops are either unavoidable or the most appropriate choice. Good examples are recurrence relations and numerical methods for initial value problems (e.g. Euler’s method or Runge-Kutta), where a starting state is provided and each step depends on the previous ones. Avoid loops where you can, but this is not a call to abandon them altogether!
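For example, here is a minimal sketch of Euler’s method for the illustrative initial value problem \(y' = -y,\) \(y(0)=1\) (the problem and step count are chosen purely for demonstration); each step depends on the previous one, so the loop cannot be vectorised away:

import numpy as np

def euler(f, y0, t0, t1, N):
    # Euler's method: y[i+1] depends on y[i], so sequential execution is part of the logic
    h = (t1 - t0)/N
    y = np.zeros(N+1)
    y[0] = y0
    t = t0
    for i in range(N):
        y[i+1] = y[i] + h*f(t, y[i])  # cannot compute y[i+1] before y[i]
        t += h
    return y

# y' = -y with y(0) = 1 has exact solution exp(-t)
y = euler(lambda t, y: -y, 1.0, 0.0, 1.0, 100)
print(y[-1], np.exp(-1.0))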


It is also worth mentioning that this is mostly a Python feature: some other high-level languages, e.g. Julia and R, also offer vectorised operations and functions, whereas compiled languages like C++ and Fortran leave this optimisation to the compiler, so programming with loops is the norm there.


Operation order#

There is another, less obvious, example of efficient programming in these functions. Notice that the multiplication by \(\frac{h}{6}\) is placed outside the sum/loop. This seems like a minor difference, but it saves roughly \(N\) multiplications!

def SimpSumSlow(f,xp):
    N = len(xp)
    h = xp[1]-xp[0]  # assumes regularly spaced points, should really include a catch if we want this code to be more general
    return np.sum((h/6)*(f(xp[0:N-1])+4*f(0.5*(xp[0:N-1]+xp[1:N]))+f(xp[1:N])))

def SimpLoopSlow(f,xp):
    N = len(xp)
    h = xp[1]-xp[0]
    intSimp = 0
    for i in range(N-1):
        intSimp += (h/6)*(f(xp[i]) + 4*f(0.5*(xp[i]+xp[i+1])) + f(xp[i+1]))  # h/6 applied N-1 times inside the loop
    return intSimp


%timeit SimpSumSlow(f,x)
%timeit SimpLoopSlow(f,x)
13 µs ± 241 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
392 µs ± 2.76 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

Relatively speaking this isn’t a huge advantage, but these kinds of small improvements can add up in a large code.


Don’t Repeat Yourself!#

To keep your code running fast, but also clean and tidy, we should keep an eye on whether we are repeating ourselves. It is usually more efficient to store a computed quantity in a variable if it is used repeatedly, rather than recompute it several times. Creating functions to encapsulate regularly repeated sets of operations also keeps our code readable and consistent. The mantra we have is:

Don’t Repeat Yourself

NOT

Write Everything Twice

We look for DRY code not WET code!

DRY code reduces the opportunity for mistakes and makes finding them easier. It also tends to follow better general usability and efficiency conventions.
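As a small illustration of storing rather than recomputing (the quantity and values here are made up purely for demonstration):

import numpy as np

theta = 0.7

# WET: np.sin(theta) is computed three times
wet = np.sin(theta) + np.sin(theta)**2 + np.sin(theta)**3

# DRY: compute it once, store it, reuse it
s = np.sin(theta)
dry = s + s**2 + s**3

print(wet, dry)  # identical results, but the DRY version calls sin() only once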

Consider the following two example codes for obtaining the five-term series expansion of \(e^x\) at \(x=0.4\):

\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \dots\]
### DRY, EFFICIENT & USABLE SERIES CODE

from scipy.special import factorial as fac  # this version of factorial is required for operating on arrays

def expSeries(x,N):
    n = np.arange(0,N)
    return np.sum(x**n/fac(n))

print(expSeries(0.4,5), np.exp(0.4))
1.4917333333333334 1.4918246976412703
### WET AND SLOPPY CODE

term1 = 1
term2 = 0.4/(1)
term3 = 0.4**2/(2*1)
term4 = 0.4**3/(3*2*1)
term5 = 0.4**4/(4*3*2*1)

final = term1 + term2 + term3 + term4 + term5

print(final, np.exp(0.4))
1.4917333333333334 1.4918246976412703

Hopefully it is obvious why the function implementation is preferable to writing out the individual terms…

Sustainability#

Computing is increasingly coming under the spotlight for its environmental impact; big calculations, like those performed to generate your weather forecast or climate predictions, use a great deal of energy. This is another excellent reason to develop good habits early and produce efficient code wherever possible, reducing the burden on computing resources.


Thoughts for the future#

Later in your career you may end up working on problems which are sufficiently large that you need to manage the computer’s memory more carefully, or even use more advanced programming functionality, for instance parallel computing, to solve the problem efficiently. For now we don’t consider these issues, but it is worth keeping in mind that some problems are so big they can fill the computer’s memory (or important parts of it) or occupy a single CPU for an impractical length of time. One piece of good practice, in terms of memory use, is to manage carefully the number of variables in your program: if variables can be reused, and your naming convention doesn’t confuse users, why not reduce the bytes your program occupies? There are ways to manage memory more explicitly in Python, e.g. clearing variables with del, but it is a bit fiddly and rarely necessary in our cases.
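For instance, here is a minimal sketch (the array size is arbitrary) of releasing a large intermediate array once only a summary value is needed:

import numpy as np

big = np.random.rand(10_000_000)   # roughly 80 MB of double-precision numbers
total = big.sum()                  # we only need this single summary value

del big   # release the large array so Python can reclaim the memory
print(total)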