Arrays#
Lists are very “general” in the sense that they can contain objects of multiple different types. However, this generality comes at the cost of efficiency. For many applications, we instead use arrays. These are a central feature of numpy, and can only contain elements of a single type (usually int or float, though numpy makes finer distinctions within both of these types). Arrays make operations on large amounts of numeric data much faster.
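As a small illustration of the single-type rule, the sketch below inspects the dtype attribute of a few arrays. The variable names are made up for this example, and the exact integer type reported depends on your platform.

```python
import numpy as np

ints = np.array([1, 2, 3])          # all integers -> an integer dtype
floats = np.array([1.0, 2.5])       # all floats -> float64
small = floats.astype(np.float32)   # one of numpy's narrower float types

print(ints.dtype)    # an integer type such as int64 (platform dependent)
print(floats.dtype)  # float64
print(small.dtype)   # float32
```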
Creating Arrays#
There are a number of ways to create arrays.
Conversion from list#
Given a list of objects, you can easily create an array from it:
import numpy as np
my_array = np.array([1, 3, 2, 5, 10]) # make an array from a list
my_array
array([ 1, 3, 2, 5, 10])
If you try to convert a list of mixed types, numpy does its best to convert it to a sensible array, but you may not get what you expect! In the example below, every element is converted to a string:
my_list = [1, 2, 45.3, "hello"]
my_array2 = np.array(my_list)
my_array2
array(['1', '2', '45.3', 'hello'], dtype='<U32')
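To see how the result depends on what the list contains, here is a short sketch with hypothetical variable names. A list of only numbers is promoted to floats, and passing dtype=object is one way to keep the original Python objects, at the cost of the speed advantage of arrays.

```python
import numpy as np

numbers_only = np.array([1, 2, 45.3])                        # ints are promoted to floats
as_objects = np.array([1, 2, 45.3, "hello"], dtype=object)   # keeps the Python objects as they are

print(numbers_only.dtype)    # float64
print(as_objects.dtype)      # object
print(type(as_objects[3]))   # <class 'str'>
```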
You can also create higher-dimensional arrays by converting lists of lists:
np.array([[1, 2], [3, 4]])
array([[1, 2],
[3, 4]])
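The dimensions of an array built this way can be checked with its ndim, shape and size attributes. A minimal sketch, using a made-up array called grid:

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])   # a list of two rows, each with three entries

print(grid.ndim)    # 2 (number of dimensions)
print(grid.shape)   # (2, 3) (rows, columns)
print(grid.size)    # 6 (total number of elements)
```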
arange and linspace#
When we are dealing with large arrays, it is impractical to type in each entry by hand. There are a number of helpful functions provided by numpy to create arrays with certain properties. To create an array of evenly-spaced values between a start and stop value, we can use the arange or linspace functions from numpy.
With both functions we specify the start and stop values (as with range), but with arange we specify the spacing between the values, and with linspace we specify the number of points we want in the array (with even spacing). As with range, arange excludes the stop value, whereas linspace includes it by default.
np.arange(0.0, 4.2, 0.2)
array([0. , 0.2, 0.4, 0.6, 0.8, 1. , 1.2, 1.4, 1.6, 1.8, 2. , 2.2, 2.4,
2.6, 2.8, 3. , 3.2, 3.4, 3.6, 3.8, 4. ])
np.linspace(0.0, 4.2, num = 10)
array([0. , 0.46666667, 0.93333333, 1.4 , 1.86666667,
2.33333333, 2.8 , 3.26666667, 3.73333333, 4.2 ])
You might have expected the last array to contain 0, 0.42, ..., 4.2 given the argument num=10. However, the first point 0 is included as one of these ten points; this means that the spacing is given by (b-a)/(num-1) where a is the start point and b is the end point. If we wanted a spacing of 0.42, we should use arange or give the argument num=11:
np.linspace(0.0, 4.2, num = 11)
array([0. , 0.42, 0.84, 1.26, 1.68, 2.1 , 2.52, 2.94, 3.36, 3.78, 4.2 ])
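The difference between the two functions, and the spacing formula (b-a)/(num-1), can be checked with a short sketch. The variable names are chosen for this example only. The retstep=True argument asks linspace to return the spacing alongside the array.

```python
import numpy as np

stepped = np.arange(0, 10, 2)         # spacing 2; the stop value 10 is excluded
counted = np.linspace(0, 10, num=6)   # 6 points; the stop value 10 is included

# retstep=True also returns the spacing between consecutive points
points, spacing = np.linspace(0.0, 4.2, num=10, retstep=True)

print(stepped.tolist())   # [0, 2, 4, 6, 8]
print(counted.tolist())   # [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
print(np.isclose(spacing, (4.2 - 0.0) / (10 - 1)))   # True
```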
All-zero arrays#
Another common way to create arrays is by creating an array with all entries 0, then setting the entries to the correct values. To create an all-zero array, use np.zeros:
np.zeros(10)
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
# For multi-dimensional arrays, the dimensions have to be given
# as a tuple (dim1, dim2,...) - hence the extra pair of brackets in the line below.
np.zeros((3, 5))
array([[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.]])
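By default np.zeros produces floats, as the trailing dots in the output above suggest. If integer entries are wanted, the dtype argument can be given. A small sketch with hypothetical variable names:

```python
import numpy as np

default_zeros = np.zeros(3)               # float64 entries by default
int_zeros = np.zeros((2, 3), dtype=int)   # integer zeros with shape (2, 3)

int_zeros[0, 1] = 7                       # entries can then be set individually

print(default_zeros.dtype)   # float64
print(int_zeros.tolist())    # [[0, 7, 0], [0, 0, 0]]
```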
In the following example, we create a \(3 \times 4\) matrix \(A\) with entries
\[a_{i,j} = \begin{cases} 2i + j & i, j > 1 \\ 0 & \text{otherwise} \end{cases}\]
for \(1 \leq i \leq 3\) and \(1 \leq j \leq 4\).
Notice that we index matrices from 1 in mathematics, but from 0 in programming - you have to be very careful with this, and it’s best to clarify which you mean.
# Create a 3x4 matrix of zeros; notice the argument is a pair (dim1, dim2)
dim1 = 3
dim2 = 4
mat = np.zeros((dim1, dim2))
# now set the entries
for i in range(1, dim1):
    for j in range(1, dim2):
        # We use (i + 1) and (j + 1) to agree with the mathematical definition.
        mat[i, j] = 2 * (i + 1) + (j + 1)
        # Note we don't need multiple square brackets with numpy arrays.
        # You could use mat[i][j] instead, but mat[i,j] is more efficient.
# view mat
mat
array([[ 0., 0., 0., 0.],
[ 0., 6., 7., 8.],
[ 0., 8., 9., 10.]])
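The same matrix can also be built without explicit loops. The sketch below is one possible alternative rather than the method used above: it relies on numpy combining a column of row indices with a row of column indices (broadcasting) and on np.where to choose between the two cases of the definition. The names rows, cols and mat_vectorised are made up for this example.

```python
import numpy as np

dim1 = 3
dim2 = 4

rows = np.arange(1, dim1 + 1).reshape(-1, 1)   # column of 1-based row indices i
cols = np.arange(1, dim2 + 1)                  # row of 1-based column indices j

# 2i + j where both i > 1 and j > 1, and 0 otherwise
mat_vectorised = np.where((rows > 1) & (cols > 1), 2 * rows + cols, 0).astype(float)

print(mat_vectorised.tolist())
# [[0.0, 0.0, 0.0, 0.0], [0.0, 6.0, 7.0, 8.0], [0.0, 8.0, 9.0, 10.0]]
```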
Diagonal arrays#
We often want to create arrays with entries only on the main diagonal. There are several ways to do this in numpy; here are some examples.
# create the N by N identity matrix with np.identity(N)
np.identity(3)
array([[1., 0., 0.],
[0., 1., 0.],
[0., 0., 1.]])
# You can also use np.eye(N).
# This is more flexible than np.identity - you can add extra columns
# containing zeros using the "M" argument
np.eye(3, M = 5)
array([[1., 0., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 0., 1., 0., 0.]])
# Create an array with specified diagonal entries.
# Note the extra brackets to make the argument a single tuple.
np.diag((1, 4, 2, 6))
array([[1, 0, 0, 0],
[0, 4, 0, 0],
[0, 0, 2, 0],
[0, 0, 0, 6]])
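np.diag also works in the other direction: given a two-dimensional array, it returns the entries on the diagonal. Its k argument shifts which diagonal is used. A brief sketch, with variable names chosen for illustration:

```python
import numpy as np

d = np.diag((1, 4, 2, 6))        # 4 by 4 array with the given diagonal
print(np.diag(d).tolist())       # [1, 4, 2, 6] (the diagonal is extracted again)

upper = np.diag([1, 2], k=1)     # entries placed one step above the main diagonal
print(upper.tolist())            # [[0, 1, 0], [0, 0, 2], [0, 0, 0]]
```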
Working with arrays#
Numpy arrays are extremely powerful and we will not attempt to discuss all (or even a small portion) of their features in this Notebook. One of the most useful features is that many numpy functions can be applied to all elements of an array at once:
np.sin(mat)
array([[ 0. , 0. , 0. , 0. ],
[ 0. , -0.2794155 , 0.6569866 , 0.98935825],
[ 0. , 0.98935825, 0.41211849, -0.54402111]])
This has many uses, like generating the \(y\)-coordinates for plots. Applying functions like this to arrays is not only more convenient than using lists and for-loops, but often much faster. Unlike with lists, we can also do element-wise arithmetic with arrays:
2 * mat + np.sin(mat)
array([[ 0. , 0. , 0. , 0. ],
[ 0. , 11.7205845 , 14.6569866 , 16.98935825],
[ 0. , 16.98935825, 18.41211849, 19.45597889]])
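To make the comparison with lists concrete, the sketch below computes \(y\)-coordinates once with a single numpy call and once with a for-loop over the elements, then contrasts how * behaves for a list and for an array. The names x, y and y_loop are chosen for this example.

```python
import math
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, num=5)    # x-coordinates
y = np.sin(x)                                # one call handles every element
y_loop = [math.sin(value) for value in x]    # the equivalent element-by-element loop

print(np.allclose(y, y_loop))                # True

print(2 * [1, 2, 3])                         # [1, 2, 3, 1, 2, 3] (the list is repeated)
print((2 * np.array([1, 2, 3])).tolist())    # [2, 4, 6] (each element is doubled)
```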
You can access elements in the same way as for lists:
# the second element of mat is the row [0, 6, 7, 8]
mat[1]
array([0., 6., 7., 8.])
# We could get the third element of the second row using mat[1][2],
# but it is more efficient to do mat[1, 2].
# This syntax does not work for lists, only for arrays.
mat[1,2]
7.0
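List-style slicing and negative indices carry over as well, and the comma syntax lets you slice rows and columns at the same time. A short sketch, using a hypothetical array grid with the same entries as mat above:

```python
import numpy as np

grid = np.array([[0., 0., 0., 0.],
                 [0., 6., 7., 8.],
                 [0., 8., 9., 10.]])

print(grid[-1].tolist())       # last row: [0.0, 8.0, 9.0, 10.0]
print(grid[:, 1].tolist())     # second column: [0.0, 6.0, 8.0]
print(grid[1:, 1:].tolist())   # lower-right block: [[6.0, 7.0, 8.0], [8.0, 9.0, 10.0]]
```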
Multiplication using * means component-wise multiplication, using the formula \(c_{ij} = a_{ij}b_{ij}\).
mat = np.array([[1, 2], [3, 4]])
mat
array([[1, 2],
[3, 4]])
mat * mat
array([[ 1, 4],
[ 9, 16]])
For standard matrix multiplication, use @ instead:
mat @ mat
array([[ 7, 10],
[15, 22]])
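To check what @ computes, the sketch below evaluates the usual matrix product formula \(c_{ij} = \sum_k a_{ik}b_{kj}\) with explicit loops and compares the result with @. The arrays a and b are made up for this example, and the loops are for illustration only; in practice @ is the better choice.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[0, 1], [1, 0]])

# Matrix product from the definition: c[i, j] is the sum over k of a[i, k] * b[k, j]
c = np.zeros((2, 2), dtype=int)
for i in range(2):
    for j in range(2):
        for k in range(2):
            c[i, j] += a[i, k] * b[k, j]

print(c.tolist())                  # [[2, 1], [4, 3]]
print(np.array_equal(c, a @ b))    # True
print((a * b).tolist())            # [[0, 2], [3, 0]] (component-wise, for comparison)
```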