A set of vectors $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n$ is said to be linearly independent provided the equation
$$\lambda_1 \mathbf{v}_1 + \lambda_2 \mathbf{v}_2 + \dots + \lambda_n \mathbf{v}_n = 0$$
has no solution except $\lambda_1 = \lambda_2 = \dots = \lambda_n = 0$.
This can be used to show that when you pick one of the vectors $\mathbf{v}_i$, it cannot be expressed as a linear combination of the rest. There is usually a largest possible number of independent vectors:
The dimension of a space is the largest possible number of linearly independent vectors which can be found in the space.
Any set of $n$ linearly independent vectors $\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_n$ in an $n$-dimensional space is said to form a complete set of basis vectors, since one can show that any vector $\mathbf{u}$ in the space can be expanded in the form
$$\mathbf{u} = \sum_{i=1}^{n} c_i \mathbf{e}_i, \tag{2.1}$$
where the numbers $c_i$ are called the components of $\mathbf{u}$ in the basis $\{\mathbf{e}_i\}$.
Example 2.4:
Show that the vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ are linearly independent. Find the components of a general vector $\mathbf{u}$ in this basis.
Solution:
Writing $\mathbf{u} = c_1 \mathbf{a} + c_2 \mathbf{b} + c_3 \mathbf{c}$ out in components, we get three coupled equations for $c_1$, $c_2$, and $c_3$. These have a unique solution if the determinant of the coefficient matrix (the matrix with the three vectors as its columns) is not zero. This is true here, since that determinant is non-zero. The components are found by solving the equations (Gaussian elimination is quite easy in this case).
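The same two steps (independence check via the determinant, components via the linear system) can be sketched in Python/NumPy. The three vectors below are illustrative stand-ins, not necessarily the ones used in the example:

```python
import numpy as np

# Hypothetical basis vectors (any three independent vectors of R^3
# work the same way; these are not taken from the example).
a = np.array([1.0, 1.0, 0.0])
b = np.array([0.0, 1.0, 1.0])
c = np.array([1.0, 0.0, 1.0])

# The columns of M are the basis vectors, so the components x of a
# vector u satisfy the coupled equations M @ x = u.
M = np.column_stack([a, b, c])

# Independence test: the determinant must be non-zero.
print(np.linalg.det(M))     # non-zero, so the vectors are independent

# Components of a particular vector u in this basis.
u = np.array([2.0, 3.0, 1.0])
x = np.linalg.solve(M, u)   # Gaussian elimination under the hood
print(x)
```

`np.linalg.solve` performs the elimination for us; the result `x` satisfies $c_1 \mathbf{a} + c_2 \mathbf{b} + c_3 \mathbf{c} = \mathbf{u}$ exactly (up to rounding).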
Theorem 2.1. The decomposition (2.1) is unique.
Proof. Suppose that there is a second decomposition, $\mathbf{u} = \sum_{i} c_i' \mathbf{e}_i$. Subtract the left- and right-hand sides of the two decompositions, collecting terms:
$$0 = \sum_{i=1}^{n} (c_i - c_i')\, \mathbf{e}_i .$$
Linear independence of the vectors $\{\mathbf{e}_i\}$ implies that $c_i - c_i' = 0$ for every $i$, which contradicts our assumption of a second (distinct) decomposition, and thus it is unique. □
Let us look at an example in an infinite-dimensional space:
Example 2.5:
The Fourier decomposition of a function $f(x)$ defined only on the interval $[-L, L]$ is given by
$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left[ a_n \cos\!\left(\frac{n\pi x}{L}\right) + b_n \sin\!\left(\frac{n\pi x}{L}\right) \right].$$
This means that for the Fourier series the basis functions are:
$$\frac{1}{2},\quad \cos\!\left(\frac{n\pi x}{L}\right),\quad \sin\!\left(\frac{n\pi x}{L}\right), \qquad n = 1, 2, \dots$$
It is highly non-trivial (i.e., quite hard) to show that this basis is complete! This is a general complication in infinite-dimensional spaces.
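While completeness is hard to prove, it is easy to check numerically for a particular function. A sketch, assuming the interval $[-L, L]$ and using the smooth test function $f(x) = x^2$ (the grid size and truncation order are arbitrary choices):

```python
import numpy as np

# Midpoint grid on [-L, L]; f(x) = x^2 is a smooth test function.
L = np.pi
n_pts = 20000
dx = 2 * L / n_pts
x = -L + (np.arange(n_pts) + 0.5) * dx
f = x**2

# Fourier coefficients, using orthogonality of the basis functions.
a0 = np.sum(f) * dx / L
partial = a0 / 2 * np.ones_like(x)
for n in range(1, 100):
    cos_n = np.cos(n * np.pi * x / L)
    sin_n = np.sin(n * np.pi * x / L)
    a_n = np.sum(f * cos_n) * dx / L
    b_n = np.sum(f * sin_n) * dx / L
    partial += a_n * cos_n + b_n * sin_n

# The truncated series reproduces f to within the neglected tail.
print(np.max(np.abs(partial - f)))
```

Here the coefficients fall off like $1/n^2$, so the truncated sum already matches $f$ to a few percent; functions with discontinuities converge far more slowly.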
For any two vectors $\mathbf{u}$, $\mathbf{v}$ we can define a scalar product $(\mathbf{u}, \mathbf{v})$ [the mathematicians' preferred notation for $\mathbf{u} \cdot \mathbf{v}$], which satisfies:
$$(\mathbf{u}, \mathbf{v}) = (\mathbf{v}, \mathbf{u})^*,$$
$$(\mathbf{u}, \lambda \mathbf{v} + \mu \mathbf{w}) = \lambda (\mathbf{u}, \mathbf{v}) + \mu (\mathbf{u}, \mathbf{w}),$$
together with
$$(\mathbf{u}, \mathbf{u}) \ge 0, \tag{2.4}$$
where the equality holds for $\mathbf{u} = 0$ only.
Note: Mathematically, the inner product is a mapping from $V \times V \to \mathbb{C}$ (or $\mathbb{R}$ for a real vector space)!
Note: In function spaces we physicists often use the Dirac bra-ket notation $\langle u | v \rangle$ for $(\mathbf{u}, \mathbf{v})$. We can use the inner product to define the norm (a more correct description of the length) of the vector $\mathbf{u}$,
$$\|\mathbf{u}\| = \sqrt{(\mathbf{u}, \mathbf{u})}.$$
One can define a length without an inner product. A good example is the so-called "1-norm" of a vector, $\|\mathbf{u}\|_1 = \sum_i |u_i|$, which is used quite commonly in the numerical analysis of linear algebra.
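Both norms are a one-liner in NumPy; a small sketch comparing the inner-product norm with the 1-norm:

```python
import numpy as np

u = np.array([3.0, -4.0])

# Euclidean norm from the inner product: ||u|| = sqrt((u, u)).
norm2 = np.sqrt(np.dot(u, u))
print(norm2)                 # 5.0

# The "1-norm": the sum of absolute values of the components.
norm1 = np.sum(np.abs(u))
print(norm1)                 # 7.0

# Both agree with NumPy's built-in norms (ord=2 is the default).
print(np.linalg.norm(u), np.linalg.norm(u, ord=1))
```

Note that the two norms generally differ; they only agree when a single component is non-zero.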
One of the important uses of an inner product is to test whether two vectors are at right angles: $\mathbf{u}$ and $\mathbf{v}$ are said to be orthogonal if and only if $(\mathbf{u}, \mathbf{v}) = 0$.
The triangle inequality is the intuitively obvious statement that the length of the sum of two vectors is no greater than the sum of their lengths,
$$\|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|.$$
Closely related to the triangle inequality is the Cauchy-Schwarz inequality,
$$|(\mathbf{u}, \mathbf{v})| \le \|\mathbf{u}\|\, \|\mathbf{v}\|.$$
Proof. The proof is simple. Consider $f(\lambda) = \|\mathbf{u} + \lambda \mathbf{v}\|^2 = \|\mathbf{u}\|^2 + 2\lambda (\mathbf{u}, \mathbf{v}) + \lambda^2 \|\mathbf{v}\|^2$ (taking everything real for simplicity), which is $\ge 0$ for every $\lambda$. Minimise this with respect to $\lambda$, and we find a minimum for $\lambda = -(\mathbf{u}, \mathbf{v}) / \|\mathbf{v}\|^2$. At that point the function takes on the value $\|\mathbf{u}\|^2 - (\mathbf{u}, \mathbf{v})^2 / \|\mathbf{v}\|^2 \ge 0$. Multiply this by $\|\mathbf{v}\|^2$ and take the square root at both sides for the desired result. □
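The steps of the proof can be traced numerically for a pair of random real vectors (a sketch; the dimension and random seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.standard_normal(5)
v = rng.standard_normal(5)

# ||u + lam*v||^2 is minimised at lam = -(u, v)/||v||^2 ...
lam = -np.dot(u, v) / np.dot(v, v)
minimum = np.dot(u + lam * v, u + lam * v)

# ... where it equals ||u||^2 - (u, v)^2/||v||^2, which is >= 0;
# rearranging gives the Cauchy-Schwarz inequality.
assert minimum >= 0.0
assert np.isclose(minimum, np.dot(u, u) - np.dot(u, v)**2 / np.dot(v, v))
assert abs(np.dot(u, v)) <= np.linalg.norm(u) * np.linalg.norm(v)
print("Cauchy-Schwarz verified")
```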
There are two common ways to turn an arbitrary set of vectors into an orthogonal set (one where every pair of vectors is orthogonal), or even better an orthonormal set (an orthogonal set where each vector has length one).
The most traditional approach is the Gram-Schmidt procedure. This procedure is defined recursively: in the $k$th step of the algorithm one defines the vector $\mathbf{e}_k$ that is orthogonal to the $k-1$ orthonormal vectors defined in previous steps. Thus
$$\mathbf{w}_k = \mathbf{v}_k - \sum_{j=1}^{k-1} (\mathbf{e}_j, \mathbf{v}_k)\, \mathbf{e}_j, \qquad \mathbf{e}_k = \frac{\mathbf{w}_k}{\|\mathbf{w}_k\|}.$$
The first line above removes all components parallel to the previously normalised and orthogonal vectors (check!); the second step normalises the result, so that $\mathbf{e}_k$ has unit length.
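The recursion translates almost line for line into NumPy (a sketch; the input vectors are an illustrative choice and are assumed linearly independent):

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalise a list of linearly independent real vectors."""
    basis = []
    for v in vectors:
        w = v.astype(float).copy()
        # Remove all components parallel to the previous basis vectors.
        for e in basis:
            w -= np.dot(e, w) * e
        # Normalise the remainder so e_k has unit length.
        basis.append(w / np.linalg.norm(w))
    return basis

e = gram_schmidt([np.array([1.0, 1.0, 0.0]),
                  np.array([0.0, 1.0, 1.0]),
                  np.array([1.0, 0.0, 1.0])])

# Orthonormality check: the matrix of inner products (e_i, e_j)
# should be the identity.
E = np.array(e)
print(np.allclose(E @ E.T, np.eye(3)))   # True
```

In floating-point arithmetic this "classical" Gram-Schmidt can lose orthogonality for nearly dependent vectors; numerical libraries prefer the modified variant or a QR factorisation.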
A more modern approach (based on extensive use of numerical linear algebra) is based on the construction of the "overlap matrix" (also called the norm matrix, which is why we use the symbol $N$), with entries $N_{ij} = (\mathbf{v}_i, \mathbf{v}_j)$. This matrix is Hermitian (or symmetric if the basis is real), and we can now define a matrix $N^{-1/2}$, such that $N^{-1/2} N N^{-1/2} = I$. This can then be used to define the orthonormal basis
$$\mathbf{e}_i = \sum_j \left(N^{-1/2}\right)_{ji} \mathbf{v}_j .$$
For a real symmetric matrix $N$ (and similarly for a Hermitian one, but we shall concentrate on the first case here) we can define matrix powers in a simple and unique way by requiring that the powers are also symmetric matrices. The easiest way to get the result is first to diagonalise the matrix $N$, i.e., to find its eigenvalues $\lambda_i$ and eigenvectors $\mathbf{o}_i$. We can then write $N = O \Lambda O^T$, with $O$ the matrix with the normalised eigenvectors as columns, $O = (\mathbf{o}_1, \dots, \mathbf{o}_n)$. The eigenvectors are orthonormal, and thus $O^T O = O O^T = I$. The matrix $\Lambda$ has the eigenvalues $\lambda_i$ on the diagonal, and is zero elsewhere. (Convince yourself that $N O = O \Lambda$ and $O^T N O = \Lambda$.) We then define arbitrary powers of $N$ by
$$N^p = O \Lambda^p O^T, \qquad \left(\Lambda^p\right)_{ii} = \lambda_i^p .$$
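A sketch of this construction in NumPy, using `np.linalg.eigh` for the diagonalisation (the basis vectors are an illustrative choice, not taken from the text):

```python
import numpy as np

# Columns of V: three independent but non-orthogonal basis vectors.
V = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])

# Overlap (norm) matrix N_ij = (v_i, v_j); real and symmetric here.
N = V.T @ V

# Diagonalise N = O Lambda O^T; eigh returns orthonormal eigenvectors
# as the columns of O.
lam, O = np.linalg.eigh(N)

# An arbitrary power (here p = -1/2) acts only on the eigenvalues.
N_inv_half = O @ np.diag(lam**-0.5) @ O.T

# New basis e_i = sum_j (N^(-1/2))_ji v_j: the columns of V @ N^(-1/2).
E = V @ N_inv_half
print(np.allclose(E.T @ E, np.eye(3)))   # True: the basis is orthonormal
```

The check works because $E^T E = N^{-1/2}\, V^T V\, N^{-1/2} = N^{-1/2} N N^{-1/2} = I$; the construction requires all eigenvalues of $N$ to be positive, which holds exactly when the original vectors are independent.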
Orthonormal basis functions: For discrete vector spaces one can always choose an orthonormal set of basis functions satisfying
$$(\mathbf{e}_i, \mathbf{e}_j) = \delta_{ij}. \tag{2.5}$$
Here we have introduced the Kronecker delta $\delta_{ij}$, defined for integer $i, j$. This object is zero unless $i = j$, when it is 1.
In such a basis the components in the expansion (2.1) follow directly from the inner product,
$$c_i = (\mathbf{e}_i, \mathbf{u}). \tag{2.6}$$
Use the definition of independence to show that Eq. ( 2.1 ) holds for any set of independent functions.
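The payoff of an orthonormal basis can be checked numerically: each component comes from a single inner product, with no linear system to solve. A sketch with a rotated orthonormal basis of $\mathbb{R}^3$ (an illustrative choice):

```python
import numpy as np

s = 1 / np.sqrt(2)
# The rows of E form an orthonormal basis of R^3 (E @ E.T is the identity).
E = np.array([[s,    s,   0.0],
              [s,   -s,   0.0],
              [0.0,  0.0, 1.0]])

u = np.array([2.0, 4.0, 1.0])

# Components by inner products alone: c_i = (e_i, u).
c = E @ u
print(c)

# Reconstruct u = sum_i c_i e_i.
print(np.allclose(c @ E, u))   # True
```

Compare with the non-orthogonal case earlier in the section, where finding the components required solving three coupled equations.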