Matrices


A matrix is a two-dimensional rectangular array of expressions arranged in rows and columns.

An m \times n matrix has m rows and n columns. Given a matrix A , the notation a_{i,j} \in A refers to the element of A in the i th row and the j th column, counting from the top and from the left, respectively.

Binary Operations

Addition

\begin{bmatrix} x_1 & x_3 \\ x_2 & x_4 \end{bmatrix} + \begin{bmatrix} y_1 & y_3 \\ y_2 & y_4 \end{bmatrix} = \begin{bmatrix} x_1 + y_1 & x_3 + y_3 \\ x_2 + y_2 & x_4 + y_4 \end{bmatrix}

Subtraction

\begin{bmatrix} x_1 & x_3 \\ x_2 & x_4 \end{bmatrix} - \begin{bmatrix} y_1 & y_3 \\ y_2 & y_4 \end{bmatrix} = \begin{bmatrix} x_1 - y_1 & x_3 - y_3 \\ x_2 - y_2 & x_4 - y_4 \end{bmatrix}

Scaling (Scalar Multiplication)

Given constant \alpha :

\alpha \begin{bmatrix} x_1 & x_3 \\ x_2 & x_4 \end{bmatrix} = \begin{bmatrix} \alpha x_1 & \alpha x_3 \\ \alpha x_2 & \alpha x_4 \end{bmatrix}
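As a quick sanity check, these elementwise operations in NumPy (assuming a Python environment with numpy installed; the array values are arbitrary):

```python
import numpy as np

X = np.array([[1, 3],
              [2, 4]])
Y = np.array([[5, 7],
              [6, 8]])

print(X + Y)   # elementwise addition
print(X - Y)   # elementwise subtraction
print(2 * X)   # scalar multiplication scales every entry
```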

Vector Dot Product (or Inner Product)

Denoted A \thinspace . B or \langle A, B \rangle given two vectors A and B .

\begin{bmatrix} x_1 \\ y_1 \\ z_1 \end{bmatrix} . \begin{bmatrix} x_2 \\ y_2 \\ z_2 \end{bmatrix} = x_1 x_2 + y_1 y_2 + z_1 z_2

Note that the vectors must have the same number of rows, and that the result of a dot product is a scalar.

Two vectors v_1 and v_2 are orthogonal if v_1 \thinspace . v_2 = 0 .

The vector dot product of u and v is equivalent to the matrix product of u^{T} and v :

\begin{bmatrix} x_1 \\ y_1 \\ z_1 \end{bmatrix} . \begin{bmatrix} x_2 \\ y_2 \\ z_2 \end{bmatrix} = \begin{bmatrix} x_1 \\ y_1 \\ z_1 \end{bmatrix}^{T} \begin{bmatrix} x_2 \\ y_2 \\ z_2 \end{bmatrix} = \begin{bmatrix} x_1 & y_1 & z_1 \end{bmatrix} \begin{bmatrix} x_2 \\ y_2 \\ z_2 \end{bmatrix} = x_1 x_2 + y_1 y_2 + z_1 z_2
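The same computation in NumPy, including the orthogonality check (the example vectors are arbitrary):

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, -2, 0])

print(np.dot(u, v))       # 1*4 + 2*(-2) + 3*0 = 0
print(u.T @ v)            # same value via the matrix-product view
print(np.dot(u, v) == 0)  # True, so u and v are orthogonal
```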

Vector Outer Product

Given two vectors u and v with the same number of elements, the outer product between them is u \oplus v = u v^{T} , where the result is always a square matrix:

\begin{bmatrix} x_1 \\ y_1 \\ z_1 \end{bmatrix} \oplus \begin{bmatrix} x_2 \\ y_2 \\ z_2 \end{bmatrix} = \begin{bmatrix} x_1 \\ y_1 \\ z_1 \end{bmatrix} \begin{bmatrix} x_2 & y_2 & z_2 \end{bmatrix} = \begin{bmatrix} x_1 x_2 & x_1 y_2 & x_1 z_2 \\ y_1 x_2 & y_1 y_2 & y_1 z_2 \\ z_1 x_2 & z_1 y_2 & z_1 z_2 \end{bmatrix}
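A sketch of the outer product using NumPy's np.outer (arbitrary example vectors):

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

print(np.outer(u, v))        # 3x3 matrix with entries u[i] * v[j]
print(np.outer(u, v).shape)  # (3, 3), a square matrix
```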

Vector Product (or Cross Product)

\begin{bmatrix} x_1 \\ y_1 \\ z_1 \end{bmatrix} \times \begin{bmatrix} x_2 \\ y_2 \\ z_2 \end{bmatrix} = \begin{bmatrix} y_1 z_2 - z_1 y_2 \\ z_1 x_2 - x_1 z_2 \\ x_1 y_2 - y_1 x_2 \end{bmatrix}

Note that the cross product is only defined for three-dimensional vectors, and that its result is another three-dimensional vector.

The cross product is not commutative; it is anticommutative: A \times B = -(B \times A) .
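A quick NumPy check of the formula and the anticommutativity, using the standard basis vectors as an example:

```python
import numpy as np

a = np.array([1, 0, 0])
b = np.array([0, 1, 0])

print(np.cross(a, b))   # [0 0 1]
print(np.cross(b, a))   # [0 0 -1], i.e. -(a x b)
```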

Matrix-Vector Multiplication

Given matrix A and vector B , the number of columns in A must equal the number of rows in B :

\begin{bmatrix} x_1 & x_4 \\ x_2 & x_5 \\ x_3 & x_6 \end{bmatrix} \begin{bmatrix} p \\ q \end{bmatrix} = p \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + q \begin{bmatrix} x_4 \\ x_5 \\ x_6 \end{bmatrix} = \begin{bmatrix} p x_1 \\ p x_2 \\ p x_3 \end{bmatrix} + \begin{bmatrix} q x_4 \\ q x_5 \\ q x_6 \end{bmatrix} = \begin{bmatrix} p x_1 + q x_4 \\ p x_2 + q x_5 \\ p x_3 + q x_6 \end{bmatrix}

The resulting matrix has the same number of rows as A , but only 1 column.

Note that the following addition is a linear combination: p \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + q \begin{bmatrix} x_4 \\ x_5 \\ x_6 \end{bmatrix}

Notice that given a matrix A and vectors x and y , A^{T} x = y is equivalent to x^{T} A = y^{T} .
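The linear-combination view of matrix-vector multiplication, checked with NumPy (arbitrary values):

```python
import numpy as np

A = np.array([[1, 4],
              [2, 5],
              [3, 6]])
x = np.array([10, 100])

# A @ x is the linear combination 10 * A[:, 0] + 100 * A[:, 1]
print(A @ x)                         # [410 520 630]
print(10 * A[:, 0] + 100 * A[:, 1])  # same result
```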

Matrix Multiplication

Given matrices A and B , the number of columns in A must match the number of rows in B .

\begin{bmatrix} x_1 & x_4 \\ x_2 & x_5 \\ x_3 & x_6 \end{bmatrix} \begin{bmatrix} y_1 & y_3 \\ y_2 & y_4 \end{bmatrix} = \begin{bmatrix} x_1 y_1 + x_4 y_2 & x_1 y_3 + x_4 y_4 \\ x_2 y_1 + x_5 y_2 & x_2 y_3 + x_5 y_4 \\ x_3 y_1 + x_6 y_2 & x_3 y_3 + x_6 y_4 \end{bmatrix}

Note that matrix multiplication is associative: (AB)C = A(BC) = ABC , but it is not commutative: AB \neq BA in general.

Multiplying a 1 \times m matrix with an m \times m matrix looks like this:

\begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} x_1 & x_3 \\ x_2 & x_4 \end{bmatrix} = \begin{bmatrix} a x_1 + b x_2 & a x_3 + b x_4 \end{bmatrix}
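A NumPy check of associativity and non-commutativity (arbitrary example matrices):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

print(A @ B)  # [[2 1] [4 3]]
print(B @ A)  # [[3 4] [1 2]], generally different from A @ B
print(np.array_equal((A @ B) @ A, A @ (B @ A)))  # True: associativity
```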

Scalar Division

Given constant \alpha :

\begin{bmatrix} x_1 & x_3 \\ x_2 & x_4 \end{bmatrix} \div \alpha = \begin{bmatrix} \frac{x_1}{\alpha} & \frac{x_3}{\alpha} \\ \frac{x_2}{\alpha} & \frac{x_4}{\alpha} \end{bmatrix}

Matrix Division

Dividing A / B is the same as multiplying A by the inverse of B , provided B is invertible: A / B = A B^{-1} .
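A minimal sketch of this "matrix division" in NumPy, under the assumption that B is invertible:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[2.0, 0.0],
              [0.0, 4.0]])

print(A @ np.linalg.inv(B))  # A / B in the sense above: [[0.5 0.5] [1.5 1.0]]
```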

Unary Operations

Trace

The trace of a square matrix is the sum of its diagonal entries, and is defined as Tr(A) = \sum_{i = 1}^{n} A_{i,i} . For example:

Tr(\begin{bmatrix} 5 & 9 & -2 \\ 8 & 3 & 1 \\ -1 & 0 & 6 \end{bmatrix}) = 5 + 3 + 6

The trace is linear: given a constant \alpha , Tr(\alpha A + \alpha B) = \alpha Tr(A) + \alpha Tr(B) . The trace of a product is invariant under cyclic permutations of its factors: Tr(AB) = Tr(BA) , and Tr(ABC) = Tr(CAB) = Tr(BCA) . Also Tr(A^{T}) = Tr(A) .
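A NumPy check of the example above and of the cyclic property (B is a random matrix used only for the check):

```python
import numpy as np

A = np.array([[5, 9, -2],
              [8, 3, 1],
              [-1, 0, 6]])
B = np.random.rand(3, 3)

print(np.trace(A))                                   # 5 + 3 + 6 = 14
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))  # True: Tr(AB) = Tr(BA)
```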

Vector Norm (length)

Given vector X = \begin{bmatrix}x_1 \\ x_2 \\ x_3 \end{bmatrix} , the norm of X is its length: \parallel X \parallel = \sqrt{x_1^2 + x_2^2 + x_3^2} , which is also equal to the square root of the dot product of X with itself: \sqrt{\langle X, X \rangle} .

Unit Vector

The unit vector of vector v is v divided by its norm: \hat{v} = \frac{v}{\parallel v \parallel} .
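Computing a norm and a unit vector with NumPy (arbitrary example vector):

```python
import numpy as np

x = np.array([3.0, 4.0, 0.0])

norm = np.linalg.norm(x)  # sqrt(3^2 + 4^2 + 0^2) = 5.0
unit = x / norm           # [0.6, 0.8, 0.0]

print(norm, unit, np.linalg.norm(unit))  # the unit vector has norm 1
```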

Minor

The minor of an entry a_{i,j} of a square matrix A is the determinant of the square submatrix of A obtained by removing the i th row and the j th column (indexed from 1), and is denoted M_{i,j} . For example, given: \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} , its minor M_{2,3} is det(\begin{bmatrix} a & b \\ g & h \end{bmatrix}) .

Cofactor

The cofactor of an entry a_{i,j} of a square matrix A is denoted C_{i,j} or cofactor(a_{i,j}) , and is defined as the entry’s minor with alternating sign depending on the indexes: C_{i,j} = (-1)^{i+j} M_{i,j} .

Adjugate

The adjugate of an n \times n square matrix A , denoted adj(A) , is the transpose of the cofactor matrix: the matrix obtained by replacing every entry of A with its cofactor, then transposing the result.

For example, adj(\begin{bmatrix}2 & 3 \\ 2 & 2\end{bmatrix}) = \begin{bmatrix} C_{1,1} & C_{2,1} \\ C_{1,2} & C_{2,2} \end{bmatrix} = \begin{bmatrix}2 & -3 \\ -2 & 2\end{bmatrix} , as C_{1,1} = 2 , C_{1,2} = -2 , C_{2,1} = -3 and C_{2,2} = 2 .
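A small sketch of these three definitions in NumPy; the helper names minor, cofactor and adjugate are hypothetical, used only for illustration:

```python
import numpy as np

def minor(A, i, j):
    # Determinant of A with row i and column j removed (0-indexed here)
    sub = np.delete(np.delete(A, i, axis=0), j, axis=1)
    return np.linalg.det(sub)

def cofactor(A, i, j):
    return (-1) ** (i + j) * minor(A, i, j)

def adjugate(A):
    n = A.shape[0]
    cof = np.array([[cofactor(A, i, j) for j in range(n)] for i in range(n)])
    return cof.T  # the adjugate is the transpose of the cofactor matrix

A = np.array([[2.0, 3.0],
              [2.0, 2.0]])
print(adjugate(A))  # [[ 2. -3.] [-2.  2.]]
```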

Determinant

The determinant of a square matrix A is a scalar denoted det(A) or |A| .

The determinant of a 1 \times 1 matrix is the element itself: det(\begin{bmatrix}x\end{bmatrix}) = x . Given a 2 \times 2 matrix: det(\begin{bmatrix} a & b \\ c & d \end{bmatrix}) = ad - bc . For 3 \times 3 and larger matrices A , the determinant is defined recursively by cofactor expansion along the first row: det(A) = \sum_{j=1}^n (-1)^{1+j} A_{1,j} M_{1,j} where n is the number of columns in A .

The following laws hold given two square matrices A and B : det(AB) = det(A) det(B) = det(BA) , det(A^{T}) = det(A) , and, if A is invertible, det(A^{-1}) = \frac{1}{det(A)} .

The rows of a matrix A are linearly independent if det(A) \neq 0 . We can say det(A) = 0 if any of the rows of A is all zeroes. Also, matrix A is not invertible if det(A) = 0 . If det(A) = 0 then A is rank deficient, and full rank otherwise.

Row operations affect the determinant as follows: swapping two rows negates the determinant, multiplying a row by a constant \alpha multiplies the determinant by \alpha , and adding a multiple of one row to another row leaves the determinant unchanged.

Considering RREF, given square matrix A , then det(A) \neq 0 implies that rref(A) = \unicode{x1D7D9} . Also, if det(A) \neq 0 , then det(rref(A)) \neq 0 , and conversely, if det(A) = 0 , then det(rref(A)) = 0 .
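A sketch of the recursive cofactor expansion in Python, checked against NumPy's built-in determinant (the function name det is just for illustration):

```python
import numpy as np

def det(A):
    # Cofactor expansion along the first row (0-indexed), illustrative only
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        sub = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        total += (-1) ** j * A[0, j] * det(sub)
    return total

A = np.array([[5.0, 9.0, -2.0],
              [8.0, 3.0, 1.0],
              [-1.0, 0.0, 6.0]])
print(det(A), np.linalg.det(A))  # both agree (up to floating point error)
```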

Inverse

A matrix A^{-1} is the inverse of matrix A if A (A^{-1}) = \unicode{x1D7D9} = (A^{-1}) A (for square matrices, either condition implies the other).

The Invertible Matrix Theorem states that for any n \times n square matrix A , the following statements are either all true or all false: A is invertible, det(A) \neq 0 , rref(A) = \unicode{x1D7D9} , the columns of A are linearly independent, rank(A) = n , \mathcal{N}(A) contains only the zero vector, and Ax = b has exactly one solution for every b .

The following laws hold, given two invertible matrices A and B : (AB)^{-1} = B^{-1} A^{-1} , (A^{-1})^{-1} = A , and (A^{T})^{-1} = (A^{-1})^{T} .

Using Adjugates

We can calculate the inverse of an n \times n square matrix A using its adjugate and determinant as follows:

A^{-1} = \frac{1}{det(A)} \cdot adj(A)

For example, given \begin{bmatrix}2 & 3 \\ 2 & 2\end{bmatrix} , we know its adjugate is \begin{bmatrix}2 & -3 \\ -2 & 2\end{bmatrix} and its determinant is 2 \cdot 2 - 3 \cdot 2 = 4 - 6 = -2 , so A^{-1} = \begin{bmatrix}2 & -3 \\ -2 & 2\end{bmatrix} \div -2 = \begin{bmatrix}\frac{2}{-2} & \frac{-3}{-2} \\ \frac{-2}{-2} & \frac{2}{-2}\end{bmatrix} = \begin{bmatrix}-1 & \frac{3}{2} \\ 1 & -1\end{bmatrix} .

Which we can check as:

\begin{bmatrix} -1 & \frac{3}{2} \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 2 & 3 \\ 2 & 2 \end{bmatrix} = \begin{bmatrix} -1 \cdot 2 + \frac{3}{2} \cdot 2 & -1 \cdot 3 + \frac{3}{2} \cdot 2 \\ 1 \cdot 2 + -1 \cdot 2 & 1 \cdot 3 + -1 \cdot 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
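The adjugate-based formula checked numerically with NumPy, using the same matrix as above:

```python
import numpy as np

A = np.array([[2.0, 3.0],
              [2.0, 2.0]])
adj = np.array([[2.0, -3.0],
                [-2.0, 2.0]])

print(adj / np.linalg.det(A))  # [[-1.   1.5] [ 1.  -1. ]]
print(np.linalg.inv(A))        # same matrix
```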

Using Gauss-Jordan Elimination

We can calculate the inverse of an n \times n square matrix A by creating an n \times 2n matrix that contains A at the left and \unicode{x1D7D9} at the right:

Given \begin{bmatrix}2 & 3 \\ 2 & 2\end{bmatrix} , the matrix is then \begin{bmatrix} 2 & 3 & 1 & 0 \\ 2 & 2 & 0 & 1 \end{bmatrix} .

Calculate the RREF of the matrix:

rref(\begin{bmatrix} 2 & 3 & 1 & 0 \\ 2 & 2 & 0 & 1 \end{bmatrix}) = \begin{bmatrix} 1 & 0 & -1 & \frac{3}{2} \\ 0 & 1 & 1 & -1 \end{bmatrix}

The left side of the RREF should be the identity matrix (otherwise the matrix is not invertible) and the right side contains the inverse:

\begin{bmatrix} 2 & 3 \\ 2 & 2 \end{bmatrix}^{-1} = \begin{bmatrix} -1 & \frac{3}{2} \\ 1 & -1 \end{bmatrix}

Which we can check as:

\begin{bmatrix} -1 & \frac{3}{2} \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 2 & 3 \\ 2 & 2 \end{bmatrix} = \begin{bmatrix} -1 \cdot 2 + \frac{3}{2} \cdot 2 & -1 \cdot 3 + \frac{3}{2} \cdot 2 \\ 1 \cdot 2 + -1 \cdot 2 & 1 \cdot 3 + -1 \cdot 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
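The same Gauss-Jordan procedure sketched with SymPy (assuming sympy is installed); row_join augments A with the identity, and rref reduces the result:

```python
import sympy as sp

A = sp.Matrix([[2, 3],
               [2, 2]])

# Augment A with the identity and reduce: [A | I] -> [I | A^-1]
augmented = A.row_join(sp.eye(2))
reduced, pivots = augmented.rref()

print(reduced[:, 2:])  # Matrix([[-1, 3/2], [1, -1]])
print(A.inv())         # same result
```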

Transpose

Matrix transpose flips a matrix over its diagonal, and is denoted A^{T} for a matrix A : (A^{T})_{i,j} = A_{j,i} .

The following laws hold, given A and B : (A + B)^{T} = A^{T} + B^{T} , (AB)^{T} = B^{T} A^{T} , (A^{T})^{T} = A , and (\alpha A)^{T} = \alpha A^{T} for any constant \alpha .

Rank

The rank of a matrix A , denoted rank(A) , is a scalar that equals the number of pivots in the RREF of A . More formally, it is the dimension of either the row space or the column space of A : rank(A) = dim(\mathcal{R}(A)) = dim(\mathcal{C}(A)) . Basically, the rank describes the number of linearly independent rows or columns in a matrix.

Nullity

The nullity of a matrix A , denoted nullity(A) , is the dimension of the null space of A , that is, the number of vectors in a basis of the null space: nullity(A) = dim(\mathcal{N}(A)) .
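Rank and nullity computed with NumPy and SymPy (assuming both are installed); note that rank plus nullity equals the number of columns, which is the rank-nullity theorem:

```python
import numpy as np
import sympy as sp

rows = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]

rank = np.linalg.matrix_rank(np.array(rows))  # 2
nullity = len(sp.Matrix(rows).nullspace())    # 1

print(rank, nullity, rank + nullity)  # 2 1 3: rank + nullity = number of columns
```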

Row Echelon Form

The first non-zero element of a matrix row is the leading coefficient or pivot of the row. A matrix is in row echelon form (REF) if all rows consisting entirely of zeroes are at the bottom, and the leading coefficient of each non-zero row is strictly to the right of the leading coefficient of the row above it.

For example: \begin{bmatrix} 3 & 1 & 0 & 1 \\ 0 & 2 & 2 & 0 \\ 0 & 0 & 0 & 7 \\ 0 & 0 & 0 & 0 \end{bmatrix} .

The process of bringing a matrix to row echelon form is called Gaussian Elimination. Starting with the first row: optionally scale the row so that its leading coefficient is 1, subtract suitable multiples of the row from the rows below it so that every entry below its leading coefficient becomes zero, then repeat with the next row.

For example, given \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} , the leading coefficient of the first row is already 1, so we can move on. The value below the first leading coefficient is 4, so we can multiply the first row by 4 and subtract it from the second row: (4, 5, 6) - 4 (1, 2, 3) = (0, -3, -6) , so the matrix is now \begin{bmatrix} 1 & 2 & 3 \\ 0 & -3 & -6 \\ 7 & 8 & 9 \end{bmatrix} . The leading coefficient of the third row is 7, so we can multiply the first row by 7 and subtract it from the third row: (7, 8, 9) - 7 (1, 2, 3) = (0, -6, -12) , so the matrix is now: \begin{bmatrix} 1 & 2 & 3 \\ 0 & -3 & -6 \\ 0 & -6 & -12 \end{bmatrix} . The entries below the first row’s leading coefficient are now zero, so we can move on to the second row, which we can divide by -3 to make its leading coefficient 1: (0, -3, -6) \div -3 = (0, 1, 2) , so the matrix is now: \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & -6 & -12 \end{bmatrix} . The entry below the second row’s leading coefficient is -6, so we can add the second row multiplied by 6 to the third row: (0, -6, -12) + 6 (0, 1, 2) = (0, 0, 0) , so the matrix is now: \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix} , which is in row echelon form, as the third row is all zeroes.
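A minimal sketch of forward elimination in Python (no row swaps or pivoting, purely for illustration; the function name row_echelon is hypothetical):

```python
import numpy as np

def row_echelon(A):
    A = A.astype(float).copy()
    rows, cols = A.shape
    pivot_row = 0
    for col in range(cols):
        if pivot_row >= rows or A[pivot_row, col] == 0:
            continue
        A[pivot_row] = A[pivot_row] / A[pivot_row, col]  # make the pivot 1
        for r in range(pivot_row + 1, rows):
            A[r] = A[r] - A[r, col] * A[pivot_row]       # zero out entries below
        pivot_row += 1
    return A

print(row_echelon(np.array([[1, 2, 3],
                            [4, 5, 6],
                            [7, 8, 9]])))  # [[1. 2. 3.] [0. 1. 2.] [0. 0. 0.]]
```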

Reduced Row Echelon Form

A matrix is in reduced row echelon form (RREF) if it is in row echelon form, every leading coefficient is 1, and each leading coefficient is the only non-zero entry in its column.

The process of bringing a matrix to reduced row echelon form is called Gauss-Jordan Elimination. Starting with the last row with a pivot: subtract suitable multiples of the row from the rows above it so that every entry above its pivot becomes zero, then repeat with the previous pivot row.

For example, given \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix} , the last row with a pivot is the second row. The entry above the leading coefficient is 2, so we can multiply the second row by 2 and subtract it from the first row: (1, 2, 3) - 2 (0, 1, 2) = (1, 0, -1) , so the matrix is now: \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix} and is in reduced row echelon form. There is no pivot in the third column, so the last elements of the first and second rows don’t need to be zeroed out.
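SymPy can compute the RREF directly (assuming sympy is installed); starting from the original matrix of the Gaussian Elimination example gives the same result:

```python
import sympy as sp

A = sp.Matrix([[1, 2, 3],
               [4, 5, 6],
               [7, 8, 9]])

reduced, pivot_columns = A.rref()
print(reduced)        # Matrix([[1, 0, -1], [0, 1, 2], [0, 0, 0]])
print(pivot_columns)  # (0, 1): pivots in the first and second columns
```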

Vector Spaces

The following vector spaces are the fundamental vector spaces of a matrix. Assume an m \times n matrix M .

Left Space

The set of all vectors v that can multiply M from the left. Basically the vectors v where v M is a valid operation. Given an m \times n matrix M , its left space is m dimensional.

Any element v from the left space can be written as the sum of a vector from the column space and a vector from the left null space:

\forall v \bullet (\exists w \bullet w = v M) \implies (\exists c \in \mathcal{C}(M), n \in \mathcal{N}(M^{T}) \bullet v = c + n)

Right Space

The set of all vectors v that can multiply M from the right. Basically the vectors v where M v is a valid operation. Given an m \times n matrix M , its right space is n dimensional.

Any element v from the right space can be written as the sum of a vector from the row space and a vector from the null space:

\forall v \bullet (\exists w \bullet w = M v) \implies (\exists r \in \mathcal{R}(M), n \in \mathcal{N}(M) \bullet v = r + n)

Row Space

The span of the rows of matrix M : \mathcal{R}(M) , defined as \mathcal{R}(M) = \{ v \mid \exists w \bullet v = w^{T} M \} . Note that \mathcal{R}(M) = \mathcal{R}(rref(M)) .

Column Space

The span of the columns of matrix M : \mathcal{C}(M) . Defined as \mathcal{C}(M) = \{ w \mid \exists v \bullet w = Mv \} .
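SymPy can produce bases for the row and column spaces directly (assuming sympy is installed; arbitrary example matrix):

```python
import sympy as sp

M = sp.Matrix([[1, 2, 3],
               [4, 5, 6],
               [7, 8, 9]])

print(M.rowspace())     # a basis of the row space (two row vectors)
print(M.columnspace())  # a basis of the column space (two column vectors)
```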

(Right) Null Space

The set of vectors v where M . v is the zero vector: \mathcal{N}(M) = \{ v \mid Mv = 0 \} . It always contains the zero vector. Sometimes called the kernel of the matrix.

Given matrix M with a null space containing more than the zero vector, the equation Mx = y has infinitely many solutions (if it has any), as the columns of M are not linearly independent: given a solution x , we can add any member of the null space to it and the result is still a valid solution.

For example, consider M = \begin{bmatrix}1 & -2 \\ -2 & 4\end{bmatrix} . Its null space consists of all scalar multiples of \begin{bmatrix}2 \\ 1\end{bmatrix} , including the zero vector. Then consider the equation M \begin{bmatrix}x \\ y\end{bmatrix} = \begin{bmatrix}-1 \\ 2\end{bmatrix} . A valid solution is \begin{bmatrix}5 \\ 3\end{bmatrix} as 5 \cdot 1 + 3 \cdot (-2) = -1 and 5 \cdot (-2) + 3 \cdot 4 = 2 . But then another valid solution is \begin{bmatrix}5 \\ 3\end{bmatrix} + \begin{bmatrix}2 \\ 1\end{bmatrix} = \begin{bmatrix}7 \\ 4\end{bmatrix} as 7 \cdot 1 + 4 \cdot (-2) = -1 and 7 \cdot (-2) + 4 \cdot 4 = 2 . The same holds for \begin{bmatrix}5 \\ 3\end{bmatrix} + \alpha \begin{bmatrix}2 \\ 1\end{bmatrix} given any constant \alpha .

If the null space of M only contains the zero vector, then Mx = y has at most one solution: any two solutions differ by a member of the null space, which in this case can only be the zero vector.
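The example above, checked with NumPy:

```python
import numpy as np

M = np.array([[1, -2],
              [-2, 4]])
y = np.array([-1, 2])

x = np.array([5, 3])    # one particular solution
n = np.array([2, 1])    # spans the null space of M

print(M @ x)            # [-1  2]
print(M @ n)            # [0 0]
print(M @ (x + 7 * n))  # still [-1  2], for any multiple of n
```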

Left Null Space

The set of vectors v where M^{T} . v is the zero vector. It is denoted as the (right) null space of the transpose of the matrix: \mathcal{N}(M^{T}) = \{ v \mid M^{T}v = 0 \} , or similarly: \mathcal{N}(M^{T}) = \{ v \mid v^{T}M = 0 \} .

Similarity Transformations

We say that matrices M and N are related by a similarity transformation if there exists an invertible matrix P such that: M = (P)(N)(P^{-1}) .

If the above holds, then the following statements hold as well: M and N have the same determinant, the same trace, the same rank, the same eigenvalues, and the same characteristic polynomial.
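A numerical check of some of these invariants with NumPy, building M from an arbitrary N and invertible P :

```python
import numpy as np

N = np.array([[2.0, 1.0],
              [0.0, 3.0]])
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])

M = P @ N @ np.linalg.inv(P)  # similar to N by construction

print(np.trace(M), np.trace(N))                    # equal traces
print(np.linalg.det(M), np.linalg.det(N))          # equal determinants
print(np.linalg.eigvals(M), np.linalg.eigvals(N))  # same eigenvalues (2 and 3)
```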

Special Matrices

Identity Matrix

The identity matrix \unicode{x1D7D9} is a square matrix with 1’s in the diagonal and 0’s elsewhere. The 3 \times 3 identity matrix is:

\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} .

Given a square and invertible matrix M , then (M^{-1})(M) = \unicode{x1D7D9} = (M)(M^{-1}) . The identity matrix is symmetric and positive definite.

Multiplying the n \times n identity matrix with an n dimensional vector yields the same vector. Basically \unicode{x1D7D9} v = v for any vector v .

\unicode{x1D7D9}_3 \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 x_1 + 0 x_2 + 0 x_3 \\ 0 x_1 + 1 x_2 + 0 x_3 \\ 0 x_1 + 0 x_2 + 1 x_3 \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}

Elementary Matrices

Every row operation that can be performed on a matrix, such as a row swap, can be expressed as left multiplication by a special matrix called an elementary matrix (column operations correspond to right multiplication by elementary matrices).

For example, given a 2 \times 2 matrix \begin{bmatrix}1 & 2 \\ 3 & 4\end{bmatrix} , the elementary matrix to swap the first and second rows is \begin{bmatrix}0 & 1 \\ 1 & 0\end{bmatrix} as:

\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 0 \cdot 1 + 1 \cdot 3 & 0 \cdot 2 + 1 \cdot 4 \\ 1 \cdot 1 + 0 \cdot 3 & 1 \cdot 2 + 0 \cdot 4 \end{bmatrix} = \begin{bmatrix} 1 \cdot 3 & 1 \cdot 4 \\ 1 \cdot 1 & 1 \cdot 2 \end{bmatrix} = \begin{bmatrix} 3 & 4 \\ 1 & 2 \end{bmatrix}

In order to find elementary matrices, we can perform the desired operation on the identity matrix. In the above case, we can build a 2 \times 2 identity matrix \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} and then swap the rows: \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} .

Some more 2 \times 2 elementary matrix examples: \begin{bmatrix} \alpha & 0 \\ 0 & 1 \end{bmatrix} scales the first row by \alpha , and \begin{bmatrix} 1 & 0 \\ \alpha & 1 \end{bmatrix} adds \alpha times the first row to the second row.
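A NumPy check of the row swap above and of the two additional examples (with \alpha = 5 and \alpha = 2 respectively):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])

swap = np.array([[0, 1],
                 [1, 0]])   # swaps the two rows
scale = np.array([[5, 0],
                  [0, 1]])  # scales the first row by 5
add = np.array([[1, 0],
                [2, 1]])    # adds 2 times the first row to the second

print(swap @ A)   # [[3 4] [1 2]]
print(scale @ A)  # [[5 10] [3 4]]
print(add @ A)    # [[1 2] [5 8]]
```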