Linear Algebra


Bases

A basis for a vector space V is a set of linearly independent vectors \{ \vec{v_1}, \vec{v_2}, ..., \vec{v_n} \} such that span(\{ \vec{v_1}, \vec{v_2}, ..., \vec{v_n} \}) = V . A vector space may have more than one basis, but all bases for a vector space have the same number of elements. Any vector in a vector space can be expressed as a linear combination of the vectors in any basis of that space.

The basis in which a vector is expressed can be denoted explicitly using subscript notation. For example, if \vec{v} is expressed in terms of basis B , then we write [\vec{v}]_{B} .

More formally, given vector space V , the set B is a basis for V if span(B) = V and if removing any non-empty subset of elements from B makes it stop spanning V : \forall b \in \wp(B) \setminus \{ \emptyset \} \bullet span(B \setminus b) \neq V . A basis B for V is also maximal, in the sense that adding any other vector from V to B makes it a linearly dependent set.

Orthogonal and Orthonormal Bases

Two vectors \vec{u} and \vec{v} are orthogonal if \langle \vec{u}, \vec{v} \rangle = 0 . Given a vector space V , a set of vectors \{ e_1, e_2, ..., e_n \} is an orthogonal basis for V if each e_i is orthogonal to all the other vectors in the basis: \forall e_i, e_j \in \{ e_1, e_2, ..., e_n \} \mid e_i \neq e_j \bullet \langle e_i, e_j \rangle = 0 .

The unit vector of a vector v is \hat{v} = \frac{v}{\parallel v \parallel} . The set \{ \hat{e}_1, \hat{e}_2, ..., \hat{e}_n \} is an orthonormal basis for V if every \hat{e}_i is the unit vector of a vector e_i in an orthogonal basis for V . A vector \vec{v} can be expressed in terms of an orthonormal basis \{ \hat{e}_1, \hat{e}_2, ..., \hat{e}_n \} as \vec{v} = \langle \vec{v}, \hat{e}_1 \rangle \hat{e}_1 + \langle \vec{v}, \hat{e}_2 \rangle \hat{e}_2 + ... + \langle \vec{v}, \hat{e}_n \rangle \hat{e}_n .

The Gram–Schmidt process allows us to obtain an orthogonal basis \{ e_1, e_2, ..., e_n \} given any basis \{ v_1, v_2, ..., v_n \} . If we have an orthogonal basis, it is trivial to obtain an orthonormal basis by taking the unit vector of each vector in it. The process is defined as e_n = v_n - \sum_{i=1}^{n - 1} (\langle \hat{e_i}, v_n \rangle \hat{e_i}) , that is, each new vector is the corresponding original vector minus its projections onto the unit vectors computed so far. A sketch of the process in code follows below.
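
As a rough illustration, here is a minimal NumPy sketch of the process; the function name and the example basis are made up for illustration:

```python
import numpy as np

def gram_schmidt(vectors):
    """Turn a basis (given as rows) into an orthonormal basis."""
    orthonormal = []
    for v in vectors:
        # Subtract the projections onto the unit vectors computed so far.
        for e in orthonormal:
            v = v - np.dot(e, v) * e
        # Normalize the result to obtain the next unit vector.
        orthonormal.append(v / np.linalg.norm(v))
    return np.array(orthonormal)

basis = np.array([[1.0, 1.0, 0.0],
                  [1.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0]])
Q = gram_schmidt(basis)
print(np.round(Q @ Q.T, 10))  # approximately the identity matrix
```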

Matrix Bases

Given a matrix A , we can find a basis for its null space \mathcal{N}(A) as follows.

First, we set up the system of equations rref(A) \cdot \vec{x} = \vec{0} . If A is an n \times m matrix, then \vec{x} contains m elements.

For example, given \begin{bmatrix}1 & 2 & 0 & 0 \\ 0 & 0 & 1 & -3 \\ 0 & 0 & 0 & 0\end{bmatrix} , then the equation is \begin{bmatrix}1 & 2 & 0 & 0 \\ 0 & 0 & 1 & -3 \\ 0 & 0 & 0 & 0\end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix}0 \\ 0 \\ 0 \end{bmatrix} .

The goal is to re-express \vec{x} so that the elements corresponding to columns in A with pivots are expressed in terms of the elements corresponding to columns without pivots.

In our example, the columns of A with pivots are the first and the third one, so we need to express x_1 and x_3 in terms of x_2 and x_4 . Expanding the equations gives us x_1 + 2x_2 = 0 and x_3 - 3x_4 = 0 , so x_1 = -2x_2 and x_3 = 3x_4 .

So we now know that \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix}-2x_2 \\ x_2 \\ 3x_4 \\ x_4\end{bmatrix} .

Finally, we can express \vec{x} as a linear combination over the terms corresponding to columns without pivots. The coefficient vectors of this linear combination are a basis for \mathcal{N}(A) .

In our example, \begin{bmatrix}-2x_2 \\ x_2 \\ 3x_4 \\ x_4\end{bmatrix} = \begin{bmatrix}-2 \\ 1 \\ 0 \\ 0\end{bmatrix}x_2 + \begin{bmatrix}0 \\ 0 \\ 3 \\ 1\end{bmatrix}x_4 , so the basis is \{ \begin{bmatrix}-2 \\ 1 \\ 0 \\ 0\end{bmatrix}, \begin{bmatrix}0 \\ 0 \\ 3 \\ 1\end{bmatrix} \} .
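
As a sanity check, SymPy can compute both the RREF and a null space basis directly; a minimal sketch using the matrix from the example above:

```python
from sympy import Matrix

A = Matrix([[1, 2, 0, 0],
            [0, 0, 1, -3],
            [0, 0, 0, 0]])

# rref() returns the reduced row echelon form and the pivot column indices.
R, pivots = A.rref()
print(pivots)           # (0, 2): the first and third columns have pivots

# nullspace() returns a list of basis vectors for N(A).
for v in A.nullspace():
    print(v.T)          # [-2, 1, 0, 0] and [0, 0, 3, 1]
```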

Change of Basis

The identity transformation matrix changes the representation of a vector from one basis to another basis of the same vector space. The identity transformation matrix that changes from basis B_1 to basis B_2 is {}_{B_2}[\unicode{x1D7D9}]_{B_1} . Notice that an identity transformation matrix is not equal to the identity matrix \unicode{x1D7D9} , even though they reuse the same symbol.

Given B_1 = \{ e_1, e_2, e_3 \} and B_2 = \{ t_1, t_2, t_3 \} , then {}_{B_2}[\unicode{x1D7D9}]_{B_1} consists of all the dot products t_i \cdot e_j :

{}_{B_2}[\unicode{x1D7D9}]_{B_1} = \begin{bmatrix} t_1 \cdot e_1 & t_1 \cdot e_2 & t_1 \cdot e_3 \\ t_2 \cdot e_1 & t_2 \cdot e_2 & t_2 \cdot e_3 \\ t_3 \cdot e_1 & t_3 \cdot e_2 & t_3 \cdot e_3 \end{bmatrix}

Notice that ({}_{B_2}[\unicode{x1D7D9}]_{B_1})^{-1} = {}_{B_1}[\unicode{x1D7D9}]_{B_2} .
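
A small NumPy sketch of this construction; the dot product formula above applies when the bases are orthonormal, so the example uses two arbitrary orthonormal bases of \mathbb{R}^2 :

```python
import numpy as np

theta = np.pi / 4
# Two orthonormal bases of R^2: the standard basis and a rotated copy of it.
B1 = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
B2 = [np.array([np.cos(theta), np.sin(theta)]),
      np.array([-np.sin(theta), np.cos(theta)])]

# Entry (i, j) of the identity transformation matrix is t_i . e_j.
M = np.array([[np.dot(t, e) for e in B1] for t in B2])

v_B1 = np.array([2.0, 1.0])   # coordinates of a vector with respect to B1
v_B2 = M @ v_B1               # coordinates of the same vector with respect to B2

# Reconstructing the vector from either coordinate list gives the same result.
print(v_B1[0] * B1[0] + v_B1[1] * B1[1])
print(v_B2[0] * B2[0] + v_B2[1] * B2[1])

# The inverse matrix changes coordinates back from B2 to B1.
print(np.linalg.inv(M) @ v_B2)  # [2. 1.]
```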

Linear Independence

A set of vectors \{ \vec{v_1}, ..., \vec{v_n} \} is linearly independent if the only solution to the equation \alpha_1 \vec{v_1} + ... + \alpha_n \vec{v_n} = \vec{0} is \alpha_i = 0 for all i (the trivial solution).

We can also say a set V = \{ \vec{v_1}, ..., \vec{v_n} \} where no v_i is the zero vector is linearly independent if no vector from the set is in the span of the other vectors: \forall v \in V \bullet v \notin span(V \setminus \{ v \}) .

Another way to express that the set of vectors in V are linearly independent is that every vector in span(V) has a unique expression as a linear combination of vectors in V .

The rows of a square matrix A are linearly independent if and only if det(A) \neq 0 .
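
A quick NumPy check of this criterion; the matrices below are arbitrary examples:

```python
import numpy as np

# A non-zero determinant means the rows of this square matrix are
# linearly independent.
A = np.array([[1.0, 2.0],
              [3.0, 9.0]])
print(np.linalg.det(A))    # ~3.0: non-zero, so the rows are independent

B = np.array([[1.0, 2.0],
              [2.0, 4.0]])  # the second row is twice the first
print(np.linalg.det(B))    # 0.0 (up to floating point): rows are dependent
```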

Linear Combinations

A linear combination is an algebraic expression consisting of a sum of terms multiplied by constants, of the form a_1 x_1 + a_2 x_2 + ... + a_n x_n , where x_1, ..., x_n are the terms (each with exponent 1) and a_1, ..., a_n are their corresponding constant coefficients. Linear combinations are degree-1 polynomials.

In a linear combination a_1 x_1 + a_2 x_2 + ... + a_n x_n , the terms x_1, ..., x_n correspond to the basis in which we are expressing the linear combination.

Span

The span of a set of vectors is the set of all vectors that can be constructed as linear combinations of those vectors.

Consider that given \vec{v_1} , \vec{v_2} , and \vec{v_3} = \vec{v_1} + \vec{v_2} , then span(\{ \vec{v_1}, \vec{v_2}, \vec{v_3} \}) = span(\{ \vec{v_1}, \vec{v_2} \}) , as \vec{v_3} can be expressed as a linear combination of the other two.

A set of vectors V is spanned by \{ \vec{v_1}, ..., \vec{v_n} \} if every vector in V can be expressed as a linear combination of the vectors in \{ \vec{v_1}, ..., \vec{v_n}\} .
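
One way to test span membership numerically is to check whether appending a vector increases the rank; a NumPy sketch with arbitrary example vectors:

```python
import numpy as np

v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0])
v3 = v1 + v2                    # in span({v1, v2}) by construction
w  = np.array([1.0, 1.0, 0.0])  # not a linear combination of v1 and v2

# A vector is in the span exactly when adding it does not increase the rank.
base_rank = np.linalg.matrix_rank(np.vstack([v1, v2]))
print(np.linalg.matrix_rank(np.vstack([v1, v2, v3])) == base_rank)  # True
print(np.linalg.matrix_rank(np.vstack([v1, v2, w])) == base_rank)   # False
```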

Vector Spaces

A vector space is a set that consists of a number of linearly independent vectors and all linear combinations of those vectors. Vector spaces must be closed under addition and scalar multiplication, which means that, given vector space V : \forall \vec{u}, \vec{v} \in V \bullet \vec{u} + \vec{v} \in V and \forall \vec{v} \in V \bullet \alpha \vec{v} \in V for any scalar \alpha .

An abstract vector space is defined as (V, F, +, \cdot) where V is a set of vectors, F is a field of scalars, + : V \times V \mapsto V is the vector addition function, and \cdot : F \times V \mapsto V is the scalar multiplication function.

The addition function + and the set V must have the following properties: addition is associative and commutative, V contains a zero vector \vec{0} such that \vec{v} + \vec{0} = \vec{v} , and every \vec{v} \in V has an additive inverse -\vec{v} such that \vec{v} + (-\vec{v}) = \vec{0} .

The scalar multiplication function \cdot and the set F must have the following properties: \alpha (\beta \vec{v}) = (\alpha \beta) \vec{v} , 1 \vec{v} = \vec{v} for the multiplicative identity 1 \in F , and scalar multiplication distributes over both vector addition ( \alpha (\vec{u} + \vec{v}) = \alpha \vec{u} + \alpha \vec{v} ) and field addition ( (\alpha + \beta) \vec{v} = \alpha \vec{v} + \beta \vec{v} ).

A vector space may additionally define an inner product function \langle \cdot, \cdot \rangle : V \times V \mapsto \mathbb{R} that is symmetric ( \langle \vec{u}, \vec{v} \rangle = \langle \vec{v}, \vec{u} \rangle ), linear in its first argument ( \langle \alpha \vec{u} + \beta \vec{v}, \vec{w} \rangle = \alpha \langle \vec{u}, \vec{w} \rangle + \beta \langle \vec{v}, \vec{w} \rangle ), and positive definite ( \langle \vec{v}, \vec{v} \rangle \geq 0 , with \langle \vec{v}, \vec{v} \rangle = 0 only when \vec{v} = \vec{0} ).

Defining an inner product automatically defines the length/norm operator \parallel \thinspace \vec{v} \parallel = \sqrt{\langle \vec{v}, \vec{v} \rangle} and the distance operator d(\vec{u}, \vec{v}) = \parallel \thinspace \vec{u} - \vec{v} \parallel = \sqrt{\langle (\vec{u} - \vec{v}), (\vec{u} - \vec{v}) \rangle} . Given a valid inner product definition, both operations are non-negative, equal zero only for the zero vector (respectively, only when \vec{u} = \vec{v} ), and satisfy the triangle inequality.
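
A short NumPy sketch that uses the standard dot product on \mathbb{R}^n as the inner product and derives the norm and distance operators from it:

```python
import numpy as np

u = np.array([3.0, 4.0])
v = np.array([0.0, 1.0])

# ||u|| = sqrt(<u, u>) and d(u, v) = ||u - v|| = sqrt(<u - v, u - v>).
norm_u = np.sqrt(np.dot(u, u))
dist = np.sqrt(np.dot(u - v, u - v))

print(norm_u, np.linalg.norm(u))    # both 5.0
print(dist, np.linalg.norm(u - v))  # both sqrt(18)
```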

Subspaces

A vector space W \subseteq V is a vector subspace of V if it is closed under the addition and scalar multiplication operations of V : \forall \vec{u}, \vec{w} \in W \bullet \vec{u} + \vec{w} \in W and \forall \vec{w} \in W \bullet \alpha \vec{w} \in W for any scalar \alpha .

Notice vector subspaces always contain the zero vector, as in order for a vector space to be closed under scalar multiplication, it must hold that \forall \vec{w} \in W \bullet 0\vec{w} \in W for the scalar zero, and multiplication with the scalar zero always yields the zero vector.

One way to define a vector subspace is to constrain a larger vector space. Given \mathbb{R}^3 , we can define a two-dimensional vector subspace as \{ (x, y, z) \in \mathbb{R}^3 \mid (0, 0, 1) \cdot (x, y, z) = 0 \} . Another way is to define vector subspaces using span . The same two-dimensional vector subspace of \mathbb{R}^3 is also defined as span(\{ (1, 0, 0), (0, 1, 0)\}) .
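
A small NumPy sketch comparing the two definitions of that subspace (the constraint form and the span form), using arbitrary coefficients:

```python
import numpy as np

# The plane z = 0 in R^3, defined by the constraint (0, 0, 1) . (x, y, z) = 0.
normal = np.array([0.0, 0.0, 1.0])

def in_subspace(v):
    return bool(np.isclose(np.dot(normal, v), 0.0))

# The same subspace as span({(1, 0, 0), (0, 1, 0)}): any linear combination
# of the spanning vectors satisfies the constraint.
a, b = 2.5, -1.0
v = a * np.array([1.0, 0.0, 0.0]) + b * np.array([0.0, 1.0, 0.0])
print(in_subspace(v))                          # True
print(in_subspace(np.array([1.0, 2.0, 3.0])))  # False
```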

Orthogonal Complement

Given vector space Q and a vector subspace P \subseteq Q , P^{\perp} is the orthogonal complement of P in vector space Q , defined as: P^{\perp} = \{ \vec{q} \in Q \mid \forall \vec{p} \in P \bullet \vec{q} \cdot \vec{p} = 0 \} .
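
Since every vector in P is a linear combination of a spanning set, P^{\perp} can be computed as the null space of the matrix whose rows are those spanning vectors; a sketch using SciPy's null_space, with an arbitrary example subspace:

```python
import numpy as np
from scipy.linalg import null_space

# P = span of the rows of A, a plane in R^3.
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

# P's orthogonal complement is the set of vectors orthogonal to every row
# of A, i.e. the null space of A.
P_perp = null_space(A)
print(P_perp)       # a single column proportional to (0, 0, 1)
print(A @ P_perp)   # all (approximately) zero dot products
```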

Dimension

The dimension of vector space S , denoted dim(S) , is the cardinality (number of elements) of a basis of S . Every possible basis of a vector space has the same number of elements, so the dimension is well defined.

The following laws hold given an n \times m dimensional matrix M :

Zero Vector

The zero vector \vec{0} of a vector space V is a vector such that \forall \vec{x} \in V \bullet \vec{x} + \vec{0} = \vec{0} + \vec{x} = \vec{x} .

Linear Transformations (or Maps)

A linear transformation (also called a linear map or linear function) is a function that maps vectors to vectors and preserves the following property, given a function f and a linear combination \alpha \vec{x_1} + \beta \vec{x_2} :

f(\alpha \vec{x_1} + \beta \vec{x_2}) = \alpha f(\vec{x_1}) + \beta f(\vec{x_2})

Which in turn implies that f(\vec{u} + \vec{v}) = f(\vec{u}) + f(\vec{v}) , f(\alpha \vec{x}) = \alpha f(\vec{x}) , and f(\vec{0}) = \vec{0} .

Given a linear transformation f that maps an n dimensional vector space to an m dimensional vector space, if f is a bijective function, then n = m , and f is a one-to-one correspondence between the two vector spaces.

Consider a linear transformation f : V \mapsto W and \vec{v_1}, \vec{v_2} \in V . If we know that f(\vec{v_1}) = \vec{w_1} and that f(\vec{v_2}) = \vec{w_2} , then f(\alpha \vec{v_1} + \beta \vec{v_2}) = \alpha \vec{w_1} + \beta \vec{w_2} , which means we know how f will behave for any linear combination of \vec{v_1} and \vec{v_2} . This is important: if we know how a linear transformation behaves on a basis of a vector space, then we know how it behaves on the whole vector space.
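
A quick NumPy illustration of this property, with an arbitrary matrix standing in for the linear transformation:

```python
import numpy as np

M_f = np.array([[2.0, 0.0],
                [1.0, 3.0]])
v1 = np.array([1.0, 0.0])
v2 = np.array([1.0, 1.0])
alpha, beta = 4.0, -2.0

# Knowing f(v1) and f(v2) determines f on any combination of v1 and v2.
lhs = M_f @ (alpha * v1 + beta * v2)
rhs = alpha * (M_f @ v1) + beta * (M_f @ v2)
print(np.allclose(lhs, rhs))  # True
```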

Kernel

The kernel of a linear transformation t : V \mapsto W is the set of vectors from V that map to the zero vector: Ker(t) = \{ \vec{v} \in V \mid t(\vec{v}) = \vec{0} \} . Notice that if Ker(t) = \{ \vec{0} \} , then t is an injective function.

Image Space

The image space of a linear transformation t : V \mapsto W , denoted Im(t) , is the range of t , which is the set of vectors from W that the function can produce. Notice that if Im(t) = W , then t is a surjective function.

Matrix Representation

If we have a linear transformation f : V \mapsto W and a basis B_v for V and a basis B_w for W , then we can express f as a matrix M_{f} such that applying the transformation to a vector is equivalent to multiplying the matrix with the vector: given \vec{v} \in V and \vec{w} \in W , then f(\vec{v}) = \vec{w} \iff M_{f} \vec{v} = \vec{w} . Notice the matrix is not the linear transformation, but a representation of the linear transformation with respect to certain bases.

Any linear transformation that maps an n dimensional vector space to an m dimensional vector space can be represented as an m \times n matrix.

Matrix M_f “takes” a vector in basis B_v and outputs a vector in basis B_w , so it is sometimes more explicitly denoted as {}_{B_w}[M_f]_{B_v} , writing the input basis on the right and the output basis on the left.

Correspondences between linear transformations and their matrix representations, given \vec{v} , f , s , M_f , and M_s : applying a transformation corresponds to matrix-vector multiplication ( f(\vec{v}) corresponds to M_f \vec{v} ), composing transformations corresponds to matrix multiplication ( f(s(\vec{v})) corresponds to M_f M_s \vec{v} ), and inverting a transformation corresponds to inverting its matrix ( f^{-1} corresponds to M_f^{-1} ).

In order to find the matrix representation of a linear transformation with respect to a basis, apply the linear transformation to all the vectors in the chosen basis and use the results (expressed in terms of the output basis) as the columns of the matrix representation.

For example, consider \mathbb{R}^3 , basis \{ (0, 0, 1), (0, 1, 0), (1, 0, 0) \} , and a linear transformation f((x, y, z)) = (x, y, 0) . Then M_f = \begin{bmatrix}0 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 0\end{bmatrix} as f((0, 0, 1)) = (0, 0, 0) , f((0, 1, 0)) = (0, 1, 0) , and f((1, 0, 0)) = (1, 0, 0) .
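
A NumPy sketch of the same construction: apply f to each basis vector and stack the results as columns:

```python
import numpy as np

# f((x, y, z)) = (x, y, 0): projection onto the xy-plane.
def f(v):
    x, y, z = v
    return np.array([x, y, 0.0])

basis = [np.array([0.0, 0.0, 1.0]),
         np.array([0.0, 1.0, 0.0]),
         np.array([1.0, 0.0, 0.0])]

# Apply f to every basis vector and use the results as the columns of M_f.
M_f = np.column_stack([f(b) for b in basis])
print(M_f)
# [[0. 0. 1.]
#  [0. 1. 0.]
#  [0. 0. 0.]]
```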

We can express a matrix transformation in terms of different bases by surrounding it with the corresponding identity transformation matrices. For example, given {}_{B_v}[M_f]_{B_v} , then {}_{B_w}[M_f]_{B_w} = {}_{B_w}[\unicode{x1D7D9}]_{B_v} {}_{B_v}[M_f]_{B_v} {}_{B_v}[\unicode{x1D7D9}]_{B_w} , which basically means that we convert the input vector from B_w to the original transformation’s basis B_v , apply the linear transformation, and then convert the result back to B_w .

Inverse

A linear transformation f : V \mapsto W is invertible if it is bijective, that is, both injective, which implies Ker(f) = \{ \vec{0} \} , and surjective, which implies Im(f) = W . Equivalently, it is invertible if \exists f^{-1} \bullet \forall \vec{v} \bullet f^{-1}(f(\vec{v})) = \vec{v} . If such a linear transformation is invertible, then its matrix representation M_f is invertible as well, and vice versa.

Given linear transformation f and its matrix representation M_f in terms of a certain basis, then M_f^{-1} corresponds to f^{-1} . Notice that given vector \vec{v} , if M_f is invertible, then M_f^{-1} M_f \vec{v} = \vec{v} .
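
A short NumPy check of this relationship, with an arbitrary invertible matrix:

```python
import numpy as np

M_f = np.array([[1.0, 2.0],
                [3.0, 4.0]])   # det = -2, so M_f is invertible
v = np.array([5.0, 6.0])

M_inv = np.linalg.inv(M_f)
print(M_inv @ (M_f @ v))                    # recovers [5. 6.]
print(np.allclose(M_inv @ M_f, np.eye(2)))  # True: the product is the identity
```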

Affine Transformations

An affine transformation is a function q : V \mapsto W between vector spaces that combines a linear transformation t with a translation by a fixed vector \vec{b} : q(\vec{x}) = t(\vec{x}) + \vec{b} , or, given the matrix representation M_t , q(\vec{x}) = M_t \vec{x} + \vec{b} .
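
A minimal NumPy sketch of an affine transformation (a rotation by 90 degrees followed by a translation, chosen arbitrarily):

```python
import numpy as np

theta = np.pi / 2
M_t = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])  # linear part: rotation
b = np.array([1.0, 0.0])                           # translation vector

def q(x):
    # q(x) = M_t x + b
    return M_t @ x + b

print(q(np.array([1.0, 0.0])))  # approximately [1. 1.]
```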

Systems of Linear Equations

Using RREFs

We can solve a system of n linear equations in m terms by constructing an n \times (m + 1) augmented matrix, where the last column corresponds to the constants on the right of the equals sign, and computing its RREF. The last column of the RREF contains the solution for each corresponding pivot term.

If the coefficient rows of the constructed matrix are linearly dependent, its RREF contains a row of zero coefficients; the system of equations has no solutions when such a row ends in a non-zero constant.

For example, consider 1x + 2y = 5 and 3x + 9y = 21 . The resulting matrix is \begin{bmatrix}1 & 2 & 5 \\ 3 & 9 & 21\end{bmatrix} . Then, rref(\begin{bmatrix}1 & 2 & 5 \\ 3 & 9 & 21\end{bmatrix}) = \begin{bmatrix}1 & 0 & 1 \\ 0 & 1 & 2\end{bmatrix} , so the solution set is x = 1 and y = 2 .
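
The same example, checked with SymPy's rref():

```python
from sympy import Matrix

# Augmented matrix for x + 2y = 5 and 3x + 9y = 21.
augmented = Matrix([[1, 2, 5],
                    [3, 9, 21]])

R, pivots = augmented.rref()
print(R)   # Matrix([[1, 0, 1], [0, 1, 2]]), i.e. x = 1 and y = 2
```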

Using Inverses

We can solve a system of n linear equations in n terms by expressing it as a matrix equation A\vec{x} = \vec{b} , where A is an n \times n square matrix (otherwise there is no inverse) containing the coefficients, \vec{x} is an n -element vector containing the terms, and \vec{b} is an n -element vector containing the right-hand side constants. Using the inverse of A , we can re-express A\vec{x} = \vec{b} as A^{-1} A \vec{x} = A^{-1} \vec{b} , which in turn equals \vec{x} = A^{-1} \vec{b} as A^{-1} A = \unicode{x1D7D9} .
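
The same system as in the RREF example, solved with NumPy via the inverse (in practice np.linalg.solve is preferred over explicitly forming the inverse):

```python
import numpy as np

# x + 2y = 5 and 3x + 9y = 21, written as A x = b.
A = np.array([[1.0, 2.0],
              [3.0, 9.0]])
b = np.array([5.0, 21.0])

x = np.linalg.inv(A) @ b
print(x)                      # [1. 2.]
print(np.linalg.solve(A, b))  # same result without forming the inverse
```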