Matrices


A matrix is a two-dimensional rectangular array of expressions arranged in rows and columns.

An $m \times n$ matrix has $m$ rows and $n$ columns. Given a matrix $A$, the notation $a_{i,j} \in A$ refers to the element of $A$ in the $i$th row and the $j$th column, counting from the top and from the left, respectively.
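As a minimal illustration of the shape and indexing conventions (a sketch assuming NumPy, which indexes from 0 rather than from 1):

```python
import numpy as np

# A 2x3 matrix: m = 2 rows, n = 3 columns
A = np.array([[1, 2, 3],
              [4, 5, 6]])

print(A.shape)  # (2, 3)
# The entry a_{1,2} in the 1-based notation above is A[0, 1] here
print(A[0, 1])  # 2
```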

Binary Operations

Addition

$$\begin{bmatrix} x_1 & x_3 \\ x_2 & x_4 \end{bmatrix} + \begin{bmatrix} y_1 & y_3 \\ y_2 & y_4 \end{bmatrix} = \begin{bmatrix} x_1 + y_1 & x_3 + y_3 \\ x_2 + y_2 & x_4 + y_4 \end{bmatrix}$$

Subtraction

$$\begin{bmatrix} x_1 & x_3 \\ x_2 & x_4 \end{bmatrix} - \begin{bmatrix} y_1 & y_3 \\ y_2 & y_4 \end{bmatrix} = \begin{bmatrix} x_1 - y_1 & x_3 - y_3 \\ x_2 - y_2 & x_4 - y_4 \end{bmatrix}$$

Scaling (Scalar Multiplication)

Given a constant $\alpha$:

$$\alpha \begin{bmatrix} x_1 & x_3 \\ x_2 & x_4 \end{bmatrix} = \begin{bmatrix} \alpha x_1 & \alpha x_3 \\ \alpha x_2 & \alpha x_4 \end{bmatrix}$$

Vector Dot Product (or Inner Product)

Denoted $A \cdot B$ or $\langle A, B \rangle$ given two vectors $A$ and $B$.

$$\begin{bmatrix} x_1 \\ y_1 \\ z_1 \end{bmatrix} \cdot \begin{bmatrix} x_2 \\ y_2 \\ z_2 \end{bmatrix} = x_1 x_2 + y_1 y_2 + z_1 z_2$$

Note that the vectors must have the same number of rows, and that the result of a dot product is a scalar.

Two vectors $v_1$ and $v_2$ are orthogonal if $v_1 \cdot v_2 = 0$.

The dot product of $u$ and $v$ is equivalent to the matrix product of $u^{T}$ and $v$:

$$\begin{bmatrix} x_1 \\ y_1 \\ z_1 \end{bmatrix} \cdot \begin{bmatrix} x_2 \\ y_2 \\ z_2 \end{bmatrix} = \begin{bmatrix} x_1 \\ y_1 \\ z_1 \end{bmatrix}^{T} \begin{bmatrix} x_2 \\ y_2 \\ z_2 \end{bmatrix} = \begin{bmatrix} x_1 & y_1 & z_1 \end{bmatrix} \begin{bmatrix} x_2 \\ y_2 \\ z_2 \end{bmatrix} = x_1 x_2 + y_1 y_2 + z_1 z_2$$
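A quick numerical check of this equivalence, as a sketch assuming NumPy:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

# The dot product and the matrix product u^T v yield the same scalar
print(np.dot(u, v))                       # 32.0
print(u.reshape(1, 3) @ v.reshape(3, 1))  # [[32.]]
```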

Vector Outer Product

Given two vectors $u$ and $v$ with the same number of elements, the outer product between them is $u \oplus v = u v^{T}$, where the result is always a square matrix:

$$\begin{bmatrix} x_1 \\ y_1 \\ z_1 \end{bmatrix} \oplus \begin{bmatrix} x_2 \\ y_2 \\ z_2 \end{bmatrix} = \begin{bmatrix} x_1 \\ y_1 \\ z_1 \end{bmatrix} \begin{bmatrix} x_2 & y_2 & z_2 \end{bmatrix} = \begin{bmatrix} x_1 x_2 & x_1 y_2 & x_1 z_2 \\ y_1 x_2 & y_1 y_2 & y_1 z_2 \\ z_1 x_2 & z_1 y_2 & z_1 z_2 \end{bmatrix}$$
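A small check of the outer product, assuming NumPy (where it is exposed as `np.outer`):

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

# u v^T: the (i, j) entry is u[i] * v[j]
print(np.outer(u, v))
# [[ 4  5  6]
#  [ 8 10 12]
#  [12 15 18]]
```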

Vector Product (or Cross Product)

$$\begin{bmatrix} x_1 \\ y_1 \\ z_1 \end{bmatrix} \times \begin{bmatrix} x_2 \\ y_2 \\ z_2 \end{bmatrix} = \begin{bmatrix} y_1 z_2 - z_1 y_2 \\ z_1 x_2 - x_1 z_2 \\ x_1 y_2 - y_1 x_2 \end{bmatrix}$$

Note that the cross product is defined for three-dimensional vectors, and that its result is another three-dimensional vector.

The cross product is not commutative: $A \times B \neq B \times A$. In fact, it is anticommutative: $A \times B = -(B \times A)$.
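A small check of the cross product formula and its anticommutativity, assuming NumPy:

```python
import numpy as np

a = np.array([1, 0, 0])
b = np.array([0, 1, 0])

# The cross product of the x and y unit vectors is the z unit vector
print(np.cross(a, b))  # [0 0 1]
# Swapping the operands flips the sign
print(np.cross(b, a))  # [ 0  0 -1]
```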

Matrix-Vector Multiplication

Given a matrix $A$ and a vector $B$, the number of columns in $A$ must equal the number of rows in $B$:

$$\begin{bmatrix} x_1 & x_4 \\ x_2 & x_5 \\ x_3 & x_6 \end{bmatrix} \begin{bmatrix} p \\ q \end{bmatrix} = p \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + q \begin{bmatrix} x_4 \\ x_5 \\ x_6 \end{bmatrix} = \begin{bmatrix} p x_1 \\ p x_2 \\ p x_3 \end{bmatrix} + \begin{bmatrix} q x_4 \\ q x_5 \\ q x_6 \end{bmatrix} = \begin{bmatrix} p x_1 + q x_4 \\ p x_2 + q x_5 \\ p x_3 + q x_6 \end{bmatrix}$$

The resulting matrix has the same number of rows as $A$, but only one column.

Note that the sum $p \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + q \begin{bmatrix} x_4 \\ x_5 \\ x_6 \end{bmatrix}$ is a linear combination of the columns of the matrix.

Notice that given a matrix $A$ and vectors $x$ and $y$, $A^{T} x = y$ is equivalent to $x^{T} A = y^{T}$.
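A sketch of matrix-vector multiplication as a linear combination of the matrix's columns, assuming NumPy:

```python
import numpy as np

A = np.array([[1, 4],
              [2, 5],
              [3, 6]])
x = np.array([10, 20])

# A x equals 10 times the first column plus 20 times the second column
print(A @ x)                        # [ 90 120 150]
print(10 * A[:, 0] + 20 * A[:, 1])  # [ 90 120 150]
```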

Matrix Multiplication

Given matrices $A$ and $B$, the number of columns in $A$ must match the number of rows in $B$.

$$\begin{bmatrix} x_1 & x_4 \\ x_2 & x_5 \\ x_3 & x_6 \end{bmatrix} \begin{bmatrix} y_1 & y_3 \\ y_2 & y_4 \end{bmatrix} = \begin{bmatrix} x_1 y_1 + x_4 y_2 & x_1 y_3 + x_4 y_4 \\ x_2 y_1 + x_5 y_2 & x_2 y_3 + x_5 y_4 \\ x_3 y_1 + x_6 y_2 & x_3 y_3 + x_6 y_4 \end{bmatrix}$$

Note that matrix multiplication is associative: $(AB)C = A(BC) = ABC$, but it is not commutative: $AB \neq BA$ in general.

Multiplying a $1 \times m$ matrix by an $m \times m$ matrix looks like this:

$$\begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} x_1 & x_3 \\ x_2 & x_4 \end{bmatrix} = \begin{bmatrix} a x_1 + b x_2 & a x_3 + b x_4 \end{bmatrix}$$
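A quick check of associativity and non-commutativity, assuming NumPy:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

print(np.allclose((A @ B) @ A, A @ (B @ A)))  # True: associative
print(np.allclose(A @ B, B @ A))              # False: not commutative
```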

Scalar Division

Given a constant $\alpha$:

$$\begin{bmatrix} x_1 & x_3 \\ x_2 & x_4 \end{bmatrix} \div \alpha = \begin{bmatrix} \frac{x_1}{\alpha} & \frac{x_3}{\alpha} \\ \frac{x_2}{\alpha} & \frac{x_4}{\alpha} \end{bmatrix}$$

Matrix Division

Dividing $A / B$ is the same as multiplying $A$ by the inverse of $B$: $A / B = A B^{-1}$.

Unary Operations

Trace

The trace of a square matrix is the sum of its diagonal, and is defined as $Tr(A) = \sum_{i = 1}^{n} A_{i,i}$. For example:

$$Tr\left(\begin{bmatrix} 5 & 9 & -2 \\ 8 & 3 & 1 \\ -1 & 0 & 6 \end{bmatrix}\right) = 5 + 3 + 6 = 14$$

Given a constant $\alpha$, $Tr(\alpha A + \alpha B) = \alpha Tr(A) + \alpha Tr(B)$. The trace is invariant under cyclic permutations of a product: $Tr(AB) = Tr(BA)$, and $Tr(ABC) = Tr(CAB) = Tr(BCA)$. Also $Tr(A^{T}) = Tr(A)$.
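A numerical spot-check of these trace properties on random matrices, assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((3, 3))
B = rng.random((3, 3))
C = rng.random((3, 3))

print(np.isclose(np.trace(A @ B), np.trace(B @ A)))          # True
print(np.isclose(np.trace(A @ B @ C), np.trace(C @ A @ B)))  # True
print(np.isclose(np.trace(A.T), np.trace(A)))                # True
```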

Vector Norm (length)

Given a vector $X = \begin{bmatrix}x_1 \\ x_2 \\ x_3 \end{bmatrix}$, the norm of $X$ is its length: $\parallel X \parallel = \sqrt{x_1^2 + x_2^2 + x_3^2}$, which is also equal to the square root of the dot product of $X$ with itself: $\sqrt{\langle X, X \rangle}$.
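A small check that the norm matches the square root of the dot product, assuming NumPy:

```python
import numpy as np

x = np.array([3.0, 4.0, 12.0])

print(np.linalg.norm(x))      # 13.0
print(np.sqrt(np.dot(x, x)))  # 13.0
```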

Unit Vector

The unit vector of a vector $v$ is $v$ divided by its norm: $\hat{v} = \frac{v}{\parallel v \parallel}$.

Minor

The minor of an entry $a_{i,j}$ of a square matrix $A$ is the determinant of the square submatrix of $A$ obtained by removing the $i$th row and $j$th column (indexed from 1), and is denoted $M_{i,j}$. For example, given $\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}$, the minor $M_{2,3}$ is $det\left(\begin{bmatrix} a & b \\ g & h \end{bmatrix}\right)$.

Cofactor

The cofactor of an entry $a_{i,j}$ of a square matrix $A$ is denoted $C_{i,j}$ or $cofactor(a_{i,j})$, and is defined as the entry's minor with an alternating sign that depends on the indexes: $C_{i,j} = (-1)^{i+j} M_{i,j}$.

Adjugate

The adjugate of an $n \times n$ matrix $A$ is the transpose of its cofactor matrix: the $n \times n$ matrix obtained by replacing every entry of $A$ with its cofactor, and then transposing the result.

For example, $adj\left(\begin{bmatrix}2 & 3 \\ 2 & 2\end{bmatrix}\right) = \begin{bmatrix} C_{1,1} & C_{2,1} \\ C_{1,2} & C_{2,2} \end{bmatrix} = \begin{bmatrix}2 & -3 \\ -2 & 2\end{bmatrix}$ as $C_{1,1} = 2$, $C_{1,2} = -2$, $C_{2,1} = -3$, and $C_{2,2} = 2$.

Determinant

The determinant of a square matrix $A$ is a scalar denoted $det(A)$ or $|A|$.

The determinant of a $1 \times 1$ matrix is the element itself: $det(\begin{bmatrix}x\end{bmatrix}) = x$. Given a $2 \times 2$ matrix: $det\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right) = ad - bc$. For $3 \times 3$ and larger matrices $A$, the determinant is defined recursively as a cofactor expansion along the first row: $det(A) = \sum_{j=1}^n (-1)^{1+j} A_{1,j} M_{1,j} = \sum_{j=1}^n A_{1,j} C_{1,j}$, where $n$ is the number of columns in $A$.
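A minimal sketch of this recursive cofactor expansion (assuming NumPy; the function name `det_laplace` is just illustrative), compared against `np.linalg.det`:

```python
import numpy as np

def det_laplace(A):
    """Determinant via recursive cofactor expansion along the first row."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        # Minor M_{1, j+1}: delete the first row and the j-th column
        minor = np.delete(A[1:, :], j, axis=1)
        total += (-1) ** j * A[0, j] * det_laplace(minor)
    return total

A = np.array([[5.0, 9.0, -2.0],
              [8.0, 3.0, 1.0],
              [-1.0, 0.0, 6.0]])
print(det_laplace(A))    # same value as np.linalg.det, up to rounding
print(np.linalg.det(A))
```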

The following laws hold given two square matrices $A$ and $B$:

- $det(AB) = det(A) \, det(B)$
- $det(A^{T}) = det(A)$
- $det(A^{-1}) = \frac{1}{det(A)}$ when $A$ is invertible
- $det(\alpha A) = \alpha^{n} \, det(A)$ for an $n \times n$ matrix $A$ and a constant $\alpha$

The rows of a matrix $A$ are linearly independent if and only if $det(A) \neq 0$. We can say $det(A) = 0$ if any of the rows of $A$ is all zeroes. Also, a matrix $A$ is not invertible if $det(A) = 0$. If $det(A) = 0$ then $A$ is rank deficient, and full rank otherwise.

Given row operations:

- Swapping two rows negates the determinant.
- Multiplying a row by a constant $\alpha$ multiplies the determinant by $\alpha$.
- Adding a multiple of one row to another row does not change the determinant.

Considering RREF, given a square matrix $A$, $det(A) \neq 0$ implies that $rref(A) = \mathbb{1}$. Also, if $det(A) \neq 0$, then $det(rref(A)) \neq 0$, and conversely, if $det(A) = 0$, then $det(rref(A)) = 0$.

Inverse

A matrix $A^{-1}$ is the inverse of a matrix $A$ if $A A^{-1} = \mathbb{1}$ or, equivalently, $A^{-1} A = \mathbb{1}$.

The Invertible Matrix Theorem states that for any $n \times n$ square matrix $A$, the following statements are either all true or all false:

- $A$ is invertible.
- $det(A) \neq 0$.
- $rref(A) = \mathbb{1}$.
- $rank(A) = n$.
- The null space of $A$ contains only the zero vector.
- The rows (and columns) of $A$ are linearly independent.
- $Ax = b$ has exactly one solution for every $b$.

The following laws hold, given two invertible matrices $A$ and $B$:

- $(A^{-1})^{-1} = A$
- $(AB)^{-1} = B^{-1} A^{-1}$
- $(A^{T})^{-1} = (A^{-1})^{T}$
- $(\alpha A)^{-1} = \frac{1}{\alpha} A^{-1}$ for a non-zero constant $\alpha$

Using Adjugates

We can calculate the inverse of an $n \times n$ square matrix $A$ using its adjugate and determinant as follows:

$$A^{-1} = \frac{1}{det(A)} \cdot adj(A)$$

For example, given $\begin{bmatrix}2 & 3 \\ 2 & 2\end{bmatrix}$, we know its adjugate is $\begin{bmatrix}2 & -3 \\ -2 & 2\end{bmatrix}$ and its determinant is $2 \cdot 2 - 3 \cdot 2 = 4 - 6 = -2$, so $A^{-1} = \begin{bmatrix}2 & -3 \\ -2 & 2\end{bmatrix} \div -2 = \begin{bmatrix}\frac{2}{-2} & \frac{-3}{-2} \\ \frac{-2}{-2} & \frac{2}{-2}\end{bmatrix} = \begin{bmatrix}-1 & \frac{3}{2} \\ 1 & -1\end{bmatrix}$.

Which we can check as:

$$\begin{bmatrix} -1 & \frac{3}{2} \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 2 & 3 \\ 2 & 2 \end{bmatrix} = \begin{bmatrix} -1 \cdot 2 + \frac{3}{2} \cdot 2 & -1 \cdot 3 + \frac{3}{2} \cdot 2 \\ 1 \cdot 2 + -1 \cdot 2 & 1 \cdot 3 + -1 \cdot 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
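The same adjugate-based computation, sketched with NumPy for a quick check:

```python
import numpy as np

A = np.array([[2.0, 3.0],
              [2.0, 2.0]])
adjugate = np.array([[2.0, -3.0],
                     [-2.0, 2.0]])

A_inv = adjugate / np.linalg.det(A)       # det(A) = -2
print(A_inv)                              # [[-1.   1.5], [ 1.  -1. ]]
print(np.allclose(A_inv @ A, np.eye(2)))  # True
```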

Using Gauss-Jordan Elimination

We can calculate the inverse of an $n \times n$ square matrix $A$ by creating an $n \times 2n$ matrix that contains $A$ on the left and $\mathbb{1}$ on the right:

Given $\begin{bmatrix}2 & 3 \\ 2 & 2\end{bmatrix}$, the augmented matrix is $\begin{bmatrix} 2 & 3 & 1 & 0 \\ 2 & 2 & 0 & 1 \end{bmatrix}$.

Calculate the RREF of the matrix:

$$rref\left(\begin{bmatrix} 2 & 3 & 1 & 0 \\ 2 & 2 & 0 & 1 \end{bmatrix}\right) = \begin{bmatrix} 1 & 0 & -1 & \frac{3}{2} \\ 0 & 1 & 1 & -1 \end{bmatrix}$$

The left side of the RREF should be the identity matrix (otherwise the matrix is not invertible) and the right side contains the inverse:

$$\begin{bmatrix} 2 & 3 \\ 2 & 2 \end{bmatrix}^{-1} = \begin{bmatrix} -1 & \frac{3}{2} \\ 1 & -1 \end{bmatrix}$$

Which we can check as:

$$\begin{bmatrix} -1 & \frac{3}{2} \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 2 & 3 \\ 2 & 2 \end{bmatrix} = \begin{bmatrix} -1 \cdot 2 + \frac{3}{2} \cdot 2 & -1 \cdot 3 + \frac{3}{2} \cdot 2 \\ 1 \cdot 2 + -1 \cdot 2 & 1 \cdot 3 + -1 \cdot 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

Transpose

The matrix transpose flips a matrix over its diagonal, and is denoted $A^{T}$ for a matrix $A$.

The following laws hold, given matrices $A$ and $B$:

- $(A^{T})^{T} = A$
- $(A + B)^{T} = A^{T} + B^{T}$
- $(AB)^{T} = B^{T} A^{T}$
- $(\alpha A)^{T} = \alpha A^{T}$ for a constant $\alpha$

Rank

The rank of a matrix $A$, denoted $rank(A)$, is a scalar that equals the number of pivots in the RREF of $A$. More formally, it is the dimension of either the row or column space of $A$: $rank(A) = dim(\mathcal{R}(A)) = dim(\mathcal{C}(A))$. Basically, the rank describes the number of linearly independent rows or columns in a matrix.

Nullity

The nullity of a matrix $A$, denoted $nullity(A)$, is the number of linearly independent vectors in the null space of $A$: $nullity(A) = dim(\mathcal{N}(A))$.

Row Echelon Form

The first non-zero element of a matrix row is the leading coefficient or pivot of the row. A matrix is in row echelon form (REF) if:

- All rows made entirely of zeroes are at the bottom.
- The pivot of every non-zero row is strictly to the right of the pivot of the row above it.

For example: $\begin{bmatrix} 3 & 1 & 0 & 1 \\ 0 & 2 & 2 & 0 \\ 0 & 0 & 0 & 7 \\ 0 & 0 & 0 & 0 \end{bmatrix}$.

The process of bringing a matrix to row echelon form is called Gaussian Elimination. Starting with the first row:

- If the leading coefficient is not 1, divide the row by it (optional, but convenient).
- Subtract multiples of the row from the rows below it so that every entry below the leading coefficient becomes zero.
- Move on to the next row and repeat until the matrix is in row echelon form.

For example, given $\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}$, the leading coefficient of the first row is already 1, so we can move on. The value below the first leading coefficient is 4, so we can multiply the first row by 4 and subtract it from the second row: $(4, 5, 6) - 4 (1, 2, 3) = (0, -3, -6)$, so the matrix is now $\begin{bmatrix} 1 & 2 & 3 \\ 0 & -3 & -6 \\ 7 & 8 & 9 \end{bmatrix}$. The leading coefficient of the third row is 7, so we can multiply the first row by 7 and subtract it from the third row: $(7, 8, 9) - 7 (1, 2, 3) = (0, -6, -12)$, so the matrix is now $\begin{bmatrix} 1 & 2 & 3 \\ 0 & -3 & -6 \\ 0 & -6 & -12 \end{bmatrix}$. The entries below the first row's leading coefficient are zero, so we can move on to the second row, which we can divide by -3 to make its leading coefficient 1: $(0, -3, -6) \div -3 = (0, 1, 2)$, so the matrix is now $\begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & -6 & -12 \end{bmatrix}$. The coefficient below the second row's leading coefficient is -6, so we can add the second row multiplied by 6 to it: $(0, -6, -12) + 6 (0, 1, 2) = (0, 0, 0)$, so the matrix is now $\begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix}$ and is in row echelon form, as the third row is all zeroes.
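A minimal Gaussian elimination sketch that follows these steps, assuming NumPy (with a row swap in case a pivot entry is zero; not meant to be numerically robust):

```python
import numpy as np

def row_echelon(A):
    """Bring A to row echelon form with leading coefficients scaled to 1."""
    A = A.astype(float).copy()
    rows, cols = A.shape
    pivot_row = 0
    for col in range(cols):
        if pivot_row >= rows:
            break
        # Find a row at or below pivot_row with a non-zero entry in this column
        nonzero = np.nonzero(A[pivot_row:, col])[0]
        if nonzero.size == 0:
            continue
        swap = pivot_row + nonzero[0]
        A[[pivot_row, swap]] = A[[swap, pivot_row]]
        # Scale the pivot to 1, then eliminate the entries below it
        A[pivot_row] /= A[pivot_row, col]
        for r in range(pivot_row + 1, rows):
            A[r] -= A[r, col] * A[pivot_row]
        pivot_row += 1
    return A

M = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
print(row_echelon(M))
# [[1. 2. 3.]
#  [0. 1. 2.]
#  [0. 0. 0.]]
```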

Reduced Row Echelon Form

A matrix is in reduced row echelon form (RREF) if:

- It is in row echelon form.
- Every leading coefficient is 1.
- Every leading coefficient is the only non-zero entry in its column.

The process of bringing a matrix to reduced row echelon form is called Gauss-Jordan Elimination. Starting with the last row with a pivot:

- If the leading coefficient is not 1, divide the row by it.
- Subtract multiples of the row from the rows above it so that every entry above the pivot becomes zero.
- Move up to the previous row with a pivot and repeat.

For example, given $\begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix}$, the last row with a pivot is the second row. The entry above the leading coefficient is 2, so we can multiply the second row by 2 and subtract it from the first row: $(1, 2, 3) - 2 (0, 1, 2) = (1, 0, -1)$, so the matrix is now $\begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix}$ and is in reduced row echelon form. There is no pivot in the third column, so the last elements of the first and second rows don't need to be zeroed out.
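For a quick check, SymPy's `Matrix.rref()` returns the reduced row echelon form along with the pivot column indices (a sketch assuming SymPy is available):

```python
from sympy import Matrix

M = Matrix([[1, 2, 3],
            [4, 5, 6],
            [7, 8, 9]])

rref_M, pivot_cols = M.rref()
print(rref_M)      # Matrix([[1, 0, -1], [0, 1, 2], [0, 0, 0]])
print(pivot_cols)  # (0, 1)
```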

Vector Spaces

The following vector spaces are the fundamental vector spaces of a matrix. Assume an $m \times n$ matrix $M$.

Left Space

The set of all vectors $v$ that can multiply $M$ from the left; basically the vectors $v$ where $v M$ is a valid operation. Given an $m \times n$ matrix $M$, its left space is $m$-dimensional.

Any element vv from the left space can be written as the sum of a vector from the column space and a vector from the left null space:

$$\forall v \bullet (\exists w \bullet w = v M) \implies (\exists c \in \mathcal{C}(M), n \in \mathcal{N}(M^{T}) \bullet v = c + n)$$

Right Space

The set of all vectors $v$ that can multiply $M$ from the right; basically the vectors $v$ where $M v$ is a valid operation. Given an $m \times n$ matrix $M$, its right space is $n$-dimensional.

Any element vv from the right space can be written as the sum of a vector from the row space and a vector from the null space:

$$\forall v \bullet (\exists w \bullet w = M v) \implies (\exists r \in \mathcal{R}(M), n \in \mathcal{N}(M) \bullet v = r + n)$$

Row Space

The span of the rows of a matrix $M$: $\mathcal{R}(M)$. Note that $\mathcal{R}(M) = \mathcal{R}(rref(M))$. Defined as $\mathcal{R}(M) = \{ v \mid \exists w \bullet v = w^{T} M \}$.

Column Space

The span of the columns of a matrix $M$: $\mathcal{C}(M)$. Defined as $\mathcal{C}(M) = \{ w \mid \exists v \bullet w = Mv \}$.

(Right) Null Space

The set of vectors $v$ where $M v$ is the zero vector: $\mathcal{N}(M) = \{ v \mid Mv = 0 \}$. It always contains the zero vector. Sometimes called the kernel of the matrix.

Given a matrix $M$ with a null space containing more than the zero vector, the equation $Mx = y$ has infinitely many solutions (whenever it has any), as the columns of $M$ would not be linearly independent, and given a solution $x$, we can add any member of the null space and it would still be a valid solution.

For example, consider $M = \begin{bmatrix}1 & -2 \\ -2 & 4\end{bmatrix}$. Its null space consists of $\begin{bmatrix}2 \\ 1\end{bmatrix}$ and any linear combination of that vector, including the zero vector. Then consider the equation $M \begin{bmatrix}x \\ y\end{bmatrix} = \begin{bmatrix}-1 \\ 2\end{bmatrix}$. A valid solution is $\begin{bmatrix}5 \\ 3\end{bmatrix}$ as $5 \cdot 1 + 3 \cdot (-2) = -1$ and $5 \cdot (-2) + 3 \cdot 4 = 2$. But then another valid solution is $\begin{bmatrix}5 \\ 3\end{bmatrix} + \begin{bmatrix}2 \\ 1\end{bmatrix} = \begin{bmatrix}7 \\ 4\end{bmatrix}$ as $7 \cdot 1 + 4 \cdot (-2) = -1$ and $7 \cdot (-2) + 4 \cdot 4 = 2$. The same holds for any $\begin{bmatrix}5 \\ 3\end{bmatrix} + \alpha \begin{bmatrix}2 \\ 1\end{bmatrix}$ given any constant $\alpha$.

If the null space of $M$ contains only the zero vector, then $Mx = y$ has at most one solution: any solution $x$ plus a member of the null space is still a solution, but the only such member is the zero vector, and $x$ plus the zero vector is just $x$.
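A sketch of the same null space computed symbolically, assuming SymPy (`Matrix.nullspace()` returns a basis of the null space):

```python
from sympy import Matrix

M = Matrix([[1, -2],
            [-2, 4]])

# The null space of M is spanned by a single vector, a multiple of [2, 1]
print(M.nullspace())  # [Matrix([[2], [1]])]

# The identity matrix has a trivial null space: no basis vectors at all
print(Matrix([[1, 0], [0, 1]]).nullspace())  # []
```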

Left Null Space

The set of vectors $v$ where $M^{T} v$ is the zero vector. It is the (right) null space of the transpose of the matrix: $\mathcal{N}(M^{T}) = \{ v \mid M^{T}v = 0 \}$, or equivalently: $\mathcal{N}(M^{T}) = \{ v \mid v^{T}M = 0 \}$.

Similarity Transformations

We say that matrices $M$ and $N$ are related by a similarity transformation if there exists an invertible matrix $P$ such that $M = P N P^{-1}$.

If the above holds, then the following statements hold as well:

- $M$ and $N$ have the same eigenvalues.
- $det(M) = det(N)$.
- $Tr(M) = Tr(N)$.
- $rank(M) = rank(N)$.

Special Matrices

Identity Matrix

The identity matrix $\mathbb{1}$ is a square matrix with 1's on the diagonal and 0's elsewhere. The $3 \times 3$ identity matrix is:

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

Given a square and invertible matrix $M$, then $M^{-1} M = \mathbb{1} = M M^{-1}$. The identity matrix is symmetric and positive semidefinite.

Multiplying the $n \times n$ identity matrix by an $n$-dimensional vector yields the same vector; basically $\mathbb{1} v = v$ for any vector $v$:

$$\mathbb{1}_3 \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 x_1 + 0 x_2 + 0 x_3 \\ 0 x_1 + 1 x_2 + 0 x_3 \\ 0 x_1 + 0 x_2 + 1 x_3 \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$$

Elementary Matrices

Every row operation that can be performed on a matrix, such as a row swap, can be expressed as left multiplication by a special matrix called an elementary matrix (column operations correspond to right multiplication).

For example, given the $2 \times 2$ matrix $\begin{bmatrix}1 & 2 \\ 3 & 4\end{bmatrix}$, the elementary matrix that swaps the first and second rows is $\begin{bmatrix}0 & 1 \\ 1 & 0\end{bmatrix}$ as:

$$\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 0 \cdot 1 + 1 \cdot 3 & 0 \cdot 2 + 1 \cdot 4 \\ 1 \cdot 1 + 0 \cdot 3 & 1 \cdot 2 + 0 \cdot 4 \end{bmatrix} = \begin{bmatrix} 1 \cdot 3 & 1 \cdot 4 \\ 1 \cdot 1 & 1 \cdot 2 \end{bmatrix} = \begin{bmatrix} 3 & 4 \\ 1 & 2 \end{bmatrix}$$

In order to find elementary matrices, we can perform the desired operation on the identity matrix. In the above case, we can build a $2 \times 2$ identity matrix $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ and then swap the rows: $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$.
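A quick check of the row swap elementary matrix, assuming NumPy:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
E_swap = np.array([[0, 1],
                   [1, 0]])

# Left-multiplying by E_swap swaps the two rows of A
print(E_swap @ A)
# [[3 4]
#  [1 2]]
```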

Some more $2 \times 2$ elementary matrix examples:

- Scale the first row by a constant $\alpha$: $\begin{bmatrix} \alpha & 0 \\ 0 & 1 \end{bmatrix}$
- Add $\alpha$ times the first row to the second row: $\begin{bmatrix} 1 & 0 \\ \alpha & 1 \end{bmatrix}$

Diagonal Matrices

A diagonal matrix is a square matrix with values on the diagonal and zeroes everywhere else, such as: $A = \begin{bmatrix} x_1 & 0 & 0 & 0 \\ 0 & x_2 & 0 & 0 \\ 0 & 0 & x_3 & 0 \\ 0 & 0 & 0 & x_4 \end{bmatrix}$. The values on the diagonal are the eigenvalues of $A$: $eig(A) = \{ x_1, x_2, x_3, x_4 \}$.

An $n \times n$ matrix is diagonalizable if and only if it has $n$ linearly independent eigenvectors. All normal matrices are diagonalizable.

Properties

Normal

A matrix $A$ is normal if $A^{T} A = A A^{T}$.

Orthogonal

A matrix $A$ is orthogonal if $A^{T} A = A A^{T} = \mathbb{1}$, which means that $A^{-1} = A^{T}$. All orthogonal matrices are normal. The determinant of an orthogonal matrix is always $-1$ or $1$.

Symmetric

A matrix $A$ is symmetric if $A^{T} = A$. All symmetric matrices are normal. Notice that given any $n \times m$ matrix $A$, the matrix $A^{T} A$ is always symmetric.

Upper Triangular

A matrix $A$ is upper triangular if it contains zeroes below the diagonal, such as $\begin{bmatrix} a & b & c & d \\ 0 & e & f & g \\ 0 & 0 & h & i \\ 0 & 0 & 0 & j \end{bmatrix}$.

Square

An $n \times m$ matrix $A$ is a square matrix if $n = m$. A trick to convert a non-square matrix into a square matrix is to multiply it by its transpose: given an $n \times m$ matrix $A$, the product $A A^{T}$ is an $n \times n$ square matrix, and $A^{T} A$ is an $m \times m$ square matrix.
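A small shape check of this trick, assuming NumPy:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])  # 2x3, not square

print((A @ A.T).shape)  # (2, 2)
print((A.T @ A).shape)  # (3, 3)
```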

Positive Semidefinite

A matrix $A$ is positive semidefinite if $\forall \vec{v} \bullet (\vec{v}^{T} A \vec{v}) \geq 0$.

For example, consider $A = \begin{bmatrix}1 & 0 \\ 0 & 1\end{bmatrix}$ and let $\vec{v} = \begin{bmatrix} x \\ y \end{bmatrix}$, then:

$$\begin{bmatrix}x & y\end{bmatrix} \begin{bmatrix}1 & 0 \\ 0 & 1\end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix}x + 0y & 0x + y\end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = x^2 + y^2$$

Both $x^2 \geq 0$ and $y^2 \geq 0$, so $A$ is positive semidefinite.
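A spot-check of the definition on random vectors, assuming NumPy (an illustration, not a proof):

```python
import numpy as np

A = np.eye(2)
rng = np.random.default_rng(0)

# v^T A v is x^2 + y^2 here, so it is never negative
for _ in range(5):
    v = rng.standard_normal(2)
    print(v @ A @ v >= 0)  # True each time
```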

Positive Definite

A matrix $A$ is positive definite if $\forall \vec{v} \neq \vec{0} \bullet (\vec{v}^{T} A \vec{v}) > 0$.