Linear Algebra and Python Basics¶

In this chapter, I will be discussing some linear algebra basics that will provide sufficient linear algebra background for effective programming in Python for our purposes. We will be doing very basic linear algebra that by no means covers the full breadth of this topic. Why linear algebra? Linear algebra allows us to express relatively complex linear expressions in a very compact way.

Being comfortable with the rules for scalar and matrix addition, subtraction, multiplication, and division (known as inversion) is important for our class.

Before we can implement any of these ideas in code, we need to talk a bit about python and how data is stored.

Python Primer¶

There are numerous ways to run python code. I will show you two and both are easily accessible after installing Anaconda:

The Spyder integrated development environment. The major advantages of Spyder is that it provides a graphical way for viewing matrices, vectors, and other objects you want to check as you work on a problem. It also has the most intuitive way of debugging code.

Spyder looks like this: Code can be run by clicking the green arrow (runs the entire file) or by blocking a subset and running it. In Windows or Mac, you can launch the Spyder by looking for the icon in the newly installed Program Folder Anaconda.
The Ipython Notebook (now called Jupyter). The major advantages of this approach is that you use your web browser for all of your python work and you can mix code, videos, notes, graphics from the web, and mathematical notation to tell the whole story of your python project. In fact, I am using the ipython notebook for writing these notes. The Ipython Notebook looks like this: In Windows or Mac, you can launch the Ipython Notebook by looking in the newly installed Program Folder Anaconda.

In my work flow, I usually only use the Ipython Notebook, but for some coding problems where I need access to the easy debugging capabilities of Spyder, I use it. We will be using the Ipython Notebook interface (web browser) mostly in this class.

Loading libraries¶

The python universe has a huge number of libraries that extend the capabilities of python. Nearly all of these are open source, unlike packages like stata or matlab where some key libraries are proprietary (and can cost lots of money). In lots of my code, you will see this at the top:

In [1]:

%matplotlib inline
import sympy as sympy
import numpy as np
import matplotlib.pyplot as plt
from scipy import *

This code sets up Ipython Notebook environments (lines beginning with %), and loads several libraries and functions. The core scientific stack in python consists of a number of free libraries. The ones I have loaded above include:

sympy: provides for symbolic computation (solving algebra problems)
numpy: provides for linear algebra computations
matplotlib.pyplot: provides for the ability to graph functions and draw figures
scipy: scientific python provides a plethora of capabilities

Creating arrays, scalars, and matrices in Python¶

Scalars can be created easily like this:

In [2]:

x = .5
print(x)

0.5

Vectors and Lists¶

The numpy library (we will reference it by np) is the workhorse library for linear algebra in python. To creat a vector simply surround a python list ($[1,2,3]$) with the np.array function:

In [3]:

x_vector = np.array([1,2,3])
print(x_vector)

[1 2 3]

We could have done this by defining a python list and converting it to an array:

In [4]:

c_list = [1,2]
print("The list:",c_list)
print("Has length:", len(c_list))

c_vector = np.array(c_list)
print("The vector:", c_vector)
print("Has shape:",c_vector.shape)

The list: [1, 2]
Has length: 2
The vector: [1 2]
Has shape: (2,)

In [5]:

z = [5,6]
print("This is a list, not an array:",z)
print(type(z))

This is a list, not an array: [5, 6]
<class 'list'>

In [6]:

zarray = np.array(z)
print("This is an array, not a list",zarray)
print(type(zarray))

This is an array, not a list [5 6]
<class 'numpy.ndarray'>

Matrices¶

In [7]:

b = list(zip(z,c_vector))
print(b)
print("Note that the length of our combined list is 2 not (2 by 2):",len(b))

[(5, 1), (6, 2)]
Note that the length of our combined list is 2 not (2 by 2): 2

In [8]:

print("But we can convert the list to a matrix like this:")
A = np.array(b)
print(A)
print(type(A))
print("A has shape:",A.shape)

But we can convert the list to a matrix like this:
[[5 1]
 [6 2]]
<class 'numpy.ndarray'>
A has shape: (2, 2)

Matrix Addition and Subtraction¶

Adding or subtracting a scalar value to a matrix¶

To learn the basics, consider a small matrix of dimension $2 \times 2$, where $2 \times 2$ denotes the number of rows $\times$ the number of columns. Let $A=\bigl( \begin{smallmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{smallmatrix} \bigr)$. Consider adding a scalar value (e.g. 3) to the matrix $A$.

$$ A+3=\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}+3 =\begin{bmatrix} a_{11}+3 & a_{12}+3 \\ a_{21}+3 & a_{22}+3 \end{bmatrix} $$

The same basic principle holds true for $A-3$:

$$ A-3=\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}-3 =\begin{bmatrix} a_{11}-3 & a_{12}-3 \\ a_{21}-3 & a_{22}-3 \end{bmatrix} $$

Notice that we add (or subtract) the scalar value to each element in the matrix A. A can be of any dimension.

This is trivial to implement, now that we have defined our matrix A:

In [9]:

result = A + 3
#or
result = 3 + A
print(result)

[[8 4]
 [9 5]]

Adding or subtracting two matrices¶

Consider two small $2 \times 2$ matrices of dimension $2 \times 2$. As before let $A=\bigl( \begin{smallmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{smallmatrix} \bigr)$ and let $B=\bigl( \begin{smallmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{smallmatrix} \bigr)$. To find the result of $A-B$, simply subtract each element of $A$ with the corresponding element of $B$:

$$ A - B = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} - \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11}-b_{11} & a_{12}-b_{12} \\ a_{21}-b_{21} & a_{22}-b_{22} \end{bmatrix} $$

Addition works exactly the same way:

$$ A + B = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} + \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11}+b_{11} & a_{12}+b_{12} \\ a_{21}+b_{21} & a_{22}+b_{22} \end{bmatrix} $$

An important point to know about matrix addition and subtraction is that it is only defined when $A$ and $B$ are of the same size. Here, both are $2 \times 2$. Since operations are performed element by element, these two matrices must be conformable- and for addition and subtraction that means they must have the same numbers of rows and columns. I like to be explicit about the dimensions of matrices for checking conformability as I write the equations, so write

$$ A_{2 \times 2} + B_{2 \times 2}= \begin{bmatrix} a_{11}+b_{11} & a_{12}+b_{12} \\ a_{21}+b_{21} & a_{22}+b_{22} \end{bmatrix}_{2 \times 2} $$

Notice that the result of a matrix addition or subtraction operation is always of the same dimension as the two operands.

Let's define another matrix, B, that is also $2 \times 2$ and add it to A:

In [10]:

B = np.random.randn(2,2)
print(B)

[[-0.4089757  -0.03934539]
 [ 0.28474541 -0.47162285]]

In [11]:

result = A + B
result

Out[11]:

array([[4.5910243 , 0.96065461],
       [6.28474541, 1.52837715]])

Matrix Multiplication¶

Multiplying a scalar value times a matrix¶

As before, let $A$=$\bigl( \begin{smallmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{smallmatrix} \bigr)$. Suppose we want to multiply A times a scalar value (e.g. $3 \times A$)

$$ 3 \times A = 3 \times \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = \begin{bmatrix} 3a_{11} & 3a_{12} \\ 3a_{21} & 3a_{22} \end{bmatrix} $$

is of dimension (2,2). Scalar multiplication is commutative, so that $3 \times A$=$A \times 3$. Notice that the product is defined for a matrix A of any dimension.

Similar to scalar addition and subtration, the code is simple:

In [12]:

A * 3

Out[12]:

array([[15,  3],
       [18,  6]])

Multiplying two matricies¶

Now, consider the $2 \times 1$ vector $C=\bigl( \begin{smallmatrix} c_{11} \\ c_{21} \end{smallmatrix} \bigr)$

Consider multiplying matrix $A_{2 \times 2}$ and the vector $C_{2 \times 1}$. Unlike the addition and subtraction case, this product is defined. Here, conformability depends not on the row and column dimensions, but rather on the column dimensions of the first operand and the row dimensions of the second operand. We can write this operation as follows

$$ A_{2 \times 2} \times C_{2 \times 1} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}_{2 \times 2} \times \begin{bmatrix} c_{11} \\ c_{21} \end{bmatrix}_{2 \times 1} = \begin{bmatrix} a_{11}c_{11} + a_{12}c_{21} \\ a_{21}c_{11} + a_{22}c_{21} \end{bmatrix}_{2 \times 1} $$

Alternatively, consider a matrix C of dimension $2 \times 3$ and a matrix A of dimension $3 \times 2$

$$ A_{3 \times 2}=\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}_{3 \times 2} , C_{2 \times 3} = \begin{bmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \\ \end{bmatrix}_{2 \times 3} $$

Here, A $\times$ C is

$$ \begin{align} A_{3 \times 2} \times C_{2 \times 3}=& \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}_{3 \times 2} \times \begin{bmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \end{bmatrix}_{2 \times 3} \nonumber \\ =&\begin{bmatrix} a_{11} c_{11}+a_{12} c_{21} & a_{11} c_{12}+a_{12} c_{22} & a_{11} c_{13}+a_{12} c_{23} \\ a_{21} c_{11}+a_{22} c_{21} & a_{21} c_{12}+a_{22} c_{22} & a_{21} c_{13}+a_{22} c_{23} \\ a_{31} c_{11}+a_{32} c_{21} & a_{31} c_{12}+a_{32} c_{22} & a_{31} c_{13}+a_{32} c_{23} \end{bmatrix}_{3 \times 3}\nonumber \end{align} $$

So in general, $X_{r_x \times c_x} \times Y_{r_y \times c_y}$ we have two important things to remember:

For conformability in matrix multiplication, $c_x=r_y$, or the columns in the first operand must be equal to the rows of the second operand.
The result will be of dimension $r_x \times c_y$, or of dimensions equal to the rows of the first operand and columns equal to columns of the second operand.

Given these facts, you should convince yourself that matrix multiplication is not generally commutative, that the relationship $X \times Y = Y \times X$ does not hold in all cases. For this reason, we will always be very explicit about whether we are pre multiplying ($X \times Y$) or post multiplying ($Y \times X$) the vectors/matrices $X$ and $Y$.

For more information on this topic, see this http://en.wikipedia.org/wiki/Matrix_multiplication.

In [13]:

# Let's redefine A and C to demonstrate matrix multiplication:
A = np.arange(6).reshape((3,2))
C = np.random.randn(2,2)

print(A.shape)
print(C.shape)

(3, 2)
(2, 2)

We will use the numpy dot operator to perform the these multiplications. You can use it two ways to yield the same result:

In [14]:

print(A.dot(C))
print(np.dot(A,C))

[[-1.62960465 -0.66937954]
 [-4.71071009 -0.47809108]
 [-7.79181554 -0.28680261]]
[[-1.62960465 -0.66937954]
 [-4.71071009 -0.47809108]
 [-7.79181554 -0.28680261]]

Suppose instead of pre-multiplying C by A, we post-multiply. The product doesn't exist because we don't have conformability as described above:

In [16]:

C.dot(A)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-16-b4bc17848904> in <module>()
----> 1 C.dot(A)

ValueError: shapes (2,2) and (3,2) not aligned: 2 (dim 1) != 3 (dim 0)

Matrix Division¶

The term matrix division is actually a misnomer. To divide in a matrix algebra world we first need to invert the matrix. It is useful to consider the analog case in a scalar work. Suppose we want to divide the $f$ by $g$. We could do this in two different ways: $$ \frac{f}{g}=f \times g^{-1}. $$ In a scalar seeting, these are equivalent ways of solving the division problem. The second one requires two steps: first we invert g and then we multiply f times g. In a matrix world, we need to think about this second approach. First we have to invert the matrix g and then we will need to pre or post multiply depending on the exact situation we encounter (this is intended to be vague for now).

Inverting a Matrix¶

As before, consider the square $2 \times 2$ matrix $A$=$\bigl( \begin{smallmatrix} a_{11} & a_{12} \\ a_{21} & a_{22}\end{smallmatrix} \bigr)$. Let the inverse of matrix A (denoted as $A^{-1}$) be

$$ A^{-1}=\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}^{-1}=\frac{1}{a_{11}a_{22}-a_{12}a_{21}} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix} $$

The inverted matrix $A^{-1}$ has a useful property: $$ A \times A^{-1}=A^{-1} \times A=I $$ where I, the identity matrix (the matrix equivalent of the scalar value 1), is $$ I_{2 \times 2}=\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} $$ furthermore, $A \times I = A$ and $I \times A = A$.

An important feature about matrix inversion is that it is undefined if (in the $2 \times 2$ case) $a_{11}a_{22}-a_{12}a_{21}=0$. If this relationship is equal to zero the inverse of A does not exist. If this term is very close to zero, an inverse may exist but $A^{-1}$ may be poorly conditioned meaning it is prone to rounding error and is likely not well identified computationally. The term $a_{11}a_{22}-a_{12}a_{21}$ is the determinant of matrix A, and for square matrices of size greater than $2 \times 2$, if equal to zero indicates that you have a problem with your data matrix (columns are linearly dependent on other columns). The inverse of matrix A exists if A is square and is of full rank (ie. the columns of A are not linear combinations of other columns of A).

For more information on this topic, see this http://en.wikipedia.org/wiki/Matrix_inversion, for example, on inverting matrices.

In [17]:

# note, we need a square matrix (# rows = # cols), use C:
C_inverse = np.linalg.inv(C)
print(C_inverse)

[[-0.56388901 -0.64446024]
 [ 1.37278794  0.07501784]]

Check that $C\times C^{-1} = I$:

In [18]:

print(C.dot(C_inverse))
print("Is very close to:")
print(C_inverse.dot(C))

[[ 1.00000000e+00 -6.02293165e-18]
 [-7.37890427e-17  1.00000000e+00]]
Is very close to:
[[1.00000000e+00 4.09029871e-17]
 [1.37169701e-17 1.00000000e+00]]

Transposing a Matrix¶

At times it is useful to pivot a matrix for conformability- that is in order to matrix divide or multiply, we need to switch the rows and column dimensions of matrices. Consider the matrix $$ A_{3 \times 2}=\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}_{3 \times 2} $$ The transpose of A (denoted as $A^{\prime}$) is $$ A^{\prime}=\begin{bmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ \end{bmatrix}_{2 \times 3} $$

In [19]:

A = np.arange(6).reshape((3,2))
B = np.arange(8).reshape((2,4))
print("A is")
print(A)

print("The Transpose of A is")
print(A.T)

A is
[[0 1]
 [2 3]
 [4 5]]
The Transpose of A is
[[0 2 4]
 [1 3 5]]

One important property of transposing a matrix is the transpose of a product of two matrices. Let matrix A be of dimension $N \times M$ and let B of of dimension $M \times P$. Then $$ (AB)^{\prime}=B^{\prime}A^{\prime} $$ For more information, see this http://en.wikipedia.org/wiki/Matrix_transposition on matrix transposition. This is also easy to implement:

In [20]:

print(B.T.dot(A.T))
print("Is identical to:")
print((A.dot(B)).T)

[[ 4 12 20]
 [ 5 17 29]
 [ 6 22 38]
 [ 7 27 47]]
Is identical to:
[[ 4 12 20]
 [ 5 17 29]
 [ 6 22 38]
 [ 7 27 47]]

More python tools¶

Indexing¶

Python begins indexing at 0 (not 1), therefore the first row and first column is referenced by 0,0 not 1,1.

Slicing¶

Accessing elements of numpy matrices and arrays. This code grabs the first column of A:

In [21]:

print(A)
A[:,0]

[[0 1]
 [2 3]
 [4 5]]

Out[21]:

array([0, 2, 4])

or, we could grab a particular element (in this case, the second column, last row):

In [22]:

A[2,1]

Out[22]:

The last row of A is

In [23]:

A[-1,:]

Out[23]:

array([4, 5])

Excluding the first row:

In [24]:

A[1:,:]

Out[24]:

array([[2, 3],
       [4, 5]])

Logical Checks to extract values from matrices/arrays:¶

In [25]:

print(A)

[[0 1]
 [2 3]
 [4 5]]

In [26]:

print(A[:,1]>4)

A[A[:,1]>4]

[False False  True]

Out[26]:

array([[4, 5]])

For loops¶

Create a $12 \times 2$ matrix and print it out:

In [27]:

A = np.arange(24).reshape((12,2))
print(A)
print(A.shape)

[[ 0  1]
 [ 2  3]
 [ 4  5]
 [ 6  7]
 [ 8  9]
 [10 11]
 [12 13]
 [14 15]
 [16 17]
 [18 19]
 [20 21]
 [22 23]]
(12, 2)

The code below loops over the rows (12 of them) of our matrix A. For each row, it slices A and prints the row values across all columns. Notice the form of the for loop. The colon defines the statement we are looping over. For each iteration of the loop idented lines will be executed:

In [28]:

for rows in A:
    print(rows)

[0 1]
[2 3]
[4 5]
[6 7]
[8 9]
[10 11]
[12 13]
[14 15]
[16 17]
[18 19]
[20 21]
[22 23]

In [29]:

for cols in A.T:
    print(cols)

[ 0  2  4  6  8 10 12 14 16 18 20 22]
[ 1  3  5  7  9 11 13 15 17 19 21 23]

If/then/else¶

The code below checks the value of x and categorizes it into one of three values. Like the for loop, each logical if check is ended with a colon, and any commands to be applied to that particular if check (if true) must be indented.

In [30]:

x=.4

if x<.5:
    print("Heads")
    print(100)
elif x>.5:
    print("Tails")
    print(0)
else:
    print("Tie")
    print(50)

Heads
100

While loops¶

Again, we have the same basic form for the statement (note the colons and indents). Here we use the shorthand notation x+=1 for performing the calculation x = x + 1:

In [31]:

x=0
while x<10:
    x+=1 
    print(x<10)

print(x)

True
True
True
True
True
True
True
True
True
False
10