2.2 Linear independence and basis vectors

A set of vectors a, b, \dots, u is said to be linearly independent provided the equation

\lambda a + \mu b + \dots + \sigma u = 0

has no solution except \lambda = \mu = \dots = \sigma = 0.

Equivalently, none of the vectors a, b, \dots, u can be expressed as a linear combination of the others. There is usually a largest possible number of independent vectors:

The dimension n of a space is the largest possible number of linearly independent vectors which can be found in the space.

Any set of n linearly independent vectors e_1, e_2, \dots, e_n in an n-dimensional space is said to form a complete set of basis vectors, since one can show that any vector x in the space can be expanded in the form

x = x_1 e_1 + x_2 e_2 + \dots + x_n e_n,
(2.1)

where the numbers x_i are called the components of x in the basis e_1, e_2, \dots, e_n.

Example 2.4: 

Show that the vectors (1,1,0), (1,0,1) and (0,1,1) are linearly independent. Find the components of a general vector (x,y,z) in this basis.

Solution: 

Writing (x,y,z) = x_1 (1,1,0) + x_2 (1,0,1) + x_3 (0,1,1) and comparing components, we get the three coupled equations

x = x_1 + x_2,
y = x_1 + x_3,
z = x_2 + x_3.

These have a unique solution if the determinant of the coefficient matrix is not zero,

\det \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix} \neq 0.

This is true, since the determinant equals -2. The components are found by solving the equations (Gaussian elimination is quite easy in this case):

x_1 = \frac{1}{2}(x + y - z), \quad x_2 = \frac{1}{2}(x + z - y), \quad x_3 = \frac{1}{2}(y + z - x).
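
As a quick numerical check, here is a minimal Python sketch (using numpy) of the determinant test and the component formulas; the values chosen for (x, y, z) are arbitrary illustration values:

    import numpy as np

    # Coefficient matrix of the coupled equations above (rows: x, y, z).
    E = np.array([[1, 1, 0],
                  [1, 0, 1],
                  [0, 1, 1]], dtype=float)

    print(np.linalg.det(E))        # approximately -2, so the vectors are independent

    v = np.array([2.0, 3.0, 5.0])  # example values for (x, y, z)
    c = np.linalg.solve(E, v)      # (x1, x2, x3)
    x, y, z = v
    print(c, [(x + y - z) / 2, (x + z - y) / 2, (y + z - x) / 2])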

Theorem 2.1. The decomposition (2.1) is unique.

Proof. Suppose that there is a second decomposition, x = y_1 e_1 + y_2 e_2 + \dots + y_n e_n. Subtracting the two decompositions and collecting terms, we find

0 = (x_1 - y_1) e_1 + (x_2 - y_2) e_2 + \dots + (x_n - y_n) e_n.

Linear independence of the vectors \{e_i\} implies that x_i = y_i for every i, so the second decomposition coincides with the first, and the decomposition is unique. □

Let us look at an example in an infinite-dimensional space:

Example 2.5: 

The Fourier decomposition of a function defined only on the interval [-\pi, \pi] is given by

f(x) = a_0 + \sum_{n=1}^{\infty} \left( a_n \cos nx + b_n \sin nx \right).

This means that for the Fourier series the basis functions are:

1, \{\sin nx, \cos nx\}, \quad n = 1, \dots, \infty.

It is highly non-trivial (i.e., quite hard) to show that this basis is complete! This is a general complication in infinite-dimensional spaces.
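
As a minimal numerical sketch of the expansion (assuming the standard coefficient formulas a_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} f\,dx, a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos nx\,dx, b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx\,dx, which are not derived in these notes), one can watch a truncated series approach a sample function:

    import numpy as np

    # Grid on [-pi, pi] and a sample function; the coefficient formulas below
    # are the standard ones for the convention f = a0 + sum(an cos + bn sin).
    x = np.linspace(-np.pi, np.pi, 2001)
    dx = x[1] - x[0]
    f = x**2

    a0 = np.sum(f) * dx / (2 * np.pi)
    approx = np.full_like(x, a0)
    for n in range(1, 6):                      # keep only a few terms
        an = np.sum(f * np.cos(n * x)) * dx / np.pi
        bn = np.sum(f * np.sin(n * x)) * dx / np.pi
        approx += an * np.cos(n * x) + bn * np.sin(n * x)

    print(np.max(np.abs(f - approx)))          # error shrinks as more terms are kept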

2.2.1 The scalar product

For any two vectors a, b we can define a scalar product (a,b) [the mathematicians' preferred notation for a \cdot b], which satisfies

(a,b) = (b,a)^*,
(2.2)

(a, \lambda b + \mu c) = \lambda (a,b) + \mu (a,c),
(2.3)

together with

(a,a) ≥ 0,
(2.4)

where the equality holds for a = 0 only.

Note: Mathematically, the inner product is a mapping from V \times V to the scalars S!

Note: In function spaces we physicists often use the Dirac bra-ket notation \langle \psi | \phi \rangle for (\psi, \phi).

We can use the inner product to define the norm (a more precise notion of the length) of the vector a,

\|a\| \equiv (a,a)^{1/2}.

One can define a length without an inner product. A good example is the so-called "1-norm" of a vector, \|x\|_1 = \sum_n |x_n|, which is used quite commonly in numerical linear algebra.
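
A minimal sketch of these definitions for vectors of complex components, assuming the familiar component form (a,b) = \sum_i a_i^* b_i (not stated explicitly above):

    import numpy as np

    a = np.array([1.0 + 2.0j, 0.5, -1.0j])
    b = np.array([2.0, 1.0j, 3.0])

    # Component form of the inner product: np.vdot conjugates its first argument.
    ip = np.vdot(a, b)
    print(ip, np.conj(np.vdot(b, a)))      # (a,b) = (b,a)^*

    norm_a = np.sqrt(np.vdot(a, a).real)   # ||a|| = (a,a)^{1/2}
    one_norm_a = np.sum(np.abs(a))         # "1-norm": sum of |a_i|
    print(norm_a, one_norm_a)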

One of the important uses of an inner product is to test whether two vectors are at right angles:

The vectors a and b are said to be orthogonal if (a,b) = 0.
Triangle and Cauchy-Schwarz inequality

The triangle inequality is the statement that the length of the sum of two vectors is no greater than the sum of their lengths,

\|a + b\| \leq \|a\| + \|b\|.

Closely related is the Cauchy-Schwarz inequality (from which the triangle inequality can in fact be derived):

Theorem 2.2. For any two vectors a, b, we have

|(a,b)| \leq \|a\| \, \|b\|.

Proof. The proof is simple (we give it here for real vectors). Consider (a + xb, a + xb), which is \geq 0 for every real x. Minimise this with respect to x, and we find a minimum at x = -(a,b)/\|b\|^2. At that point the function takes the value \|a\|^2 - (a,b)^2/\|b\|^2 \geq 0. Multiply this by \|b\|^2 and take the square root on both sides for the desired result. □
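
To spell out the intermediate step (an added illustration, for real vectors): expanding the quadratic in x gives

(a + xb, a + xb) = \|a\|^2 + 2x\,(a,b) + x^2 \|b\|^2 \geq 0,

and setting the derivative with respect to x to zero, 2(a,b) + 2x\|b\|^2 = 0, reproduces x = -(a,b)/\|b\|^2. Substituting this back in gives

\|a\|^2 - \frac{(a,b)^2}{\|b\|^2} \geq 0,

which is the value quoted in the proof.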

Orthogonalisation and orthonormalisation

There are two common ways to turn an arbitrary set of vectors into an orthogonal set (one where every pair of vectors is orthogonal), or, even better, an orthonormal set (an orthogonal set where each vector has length one).

The most traditional approach is the Gram-Schmidt procedure. This procedure is defined recursively: in the mth step of the algorithm one defines the vector e'_m that is orthogonal to the m - 1 orthonormal vectors defined in the previous steps, and then normalises it. Thus

1:\quad e''_m = e_m - \sum_{i=1}^{m-1} (e'_i, e_m)\, e'_i;
2:\quad e'_m = e''_m / \|e''_m\|.

The first line removes all components parallel to the m - 1 previously constructed orthonormal vectors (check!); the second step normalises the result, so that e'_m has unit length.
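
A minimal Python sketch of this procedure, for real vectors with the ordinary dot product:

    import numpy as np

    def gram_schmidt(vectors):
        """Return an orthonormal set spanning the same space as `vectors`."""
        basis = []
        for e in vectors:
            # Step 1: remove components along the previously found vectors.
            for ep in basis:
                e = e - np.dot(ep, e) * ep
            # Step 2: normalise (skip vectors that are linearly dependent).
            norm = np.linalg.norm(e)
            if norm > 1e-12:
                basis.append(e / norm)
        return basis

    vecs = [np.array([1.0, 1.0, 0.0]),
            np.array([1.0, 0.0, 1.0]),
            np.array([0.0, 1.0, 1.0])]
    B = np.array(gram_schmidt(vecs))
    print(np.round(B @ B.T, 10))   # identity matrix: the set is orthonormal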

A more modern approach (which makes extensive use of numerical linear algebra) is based on the construction of the "overlap matrix" (also called the norm matrix, which is why we use the symbol N), with entries N_{ij} = (e_i, e_j). This matrix is Hermitian (or symmetric if the basis is real), and we can now define a matrix N^{-1/2} such that N^{-1/2} N N^{-1/2} = I. This can then be used to define the orthonormal basis

e'_k = \sum_l (N^{-1/2})_{kl}\, e_l.
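
A minimal numerical sketch of this construction, building N^{-1/2} with the eigenvalue method described just below:

    import numpy as np

    # Three non-orthogonal basis vectors stored as rows.
    E = np.array([[1.0, 1.0, 0.0],
                  [1.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0]])

    N = E @ E.T                       # overlap matrix, N_ij = (e_i, e_j)
    lam, O = np.linalg.eigh(N)        # eigenvalues and orthonormal eigenvectors
    N_inv_sqrt = O @ np.diag(lam**-0.5) @ O.T

    E_orth = N_inv_sqrt @ E           # e'_k = sum_l (N^{-1/2})_{kl} e_l
    print(np.round(E_orth @ E_orth.T, 10))   # identity: the new set is orthonormal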

For a real symmetric matrix M (and similarly for a Hermitian one, but we shall concentrate on the first case here) we can define matrix powers in a simple and unique way by requiring that the powers are also symmetric matrices.

The easiest way to get the result is first to diagonalise the matrix M, i.e., to find its eigenvalues \lambda_i and eigenvectors e^{(i)}. We can then write M = O \,\mathrm{diag}(\lambda)\, O^T, with O the matrix whose columns are the normalised eigenvectors, O_{ij} = e_i^{(j)}. The eigenvectors are orthonormal, and thus O^T O = I. The matrix \mathrm{diag}(\lambda) has the eigenvalues \lambda_i on the diagonal and is zero elsewhere. (Convince yourself that O^T O = I and O^T M O = \mathrm{diag}(\lambda).)

We then define arbitrary powers of M by

M^a = O \,\mathrm{diag}(\lambda^a)\, O^T.

Question: Show that M^a is a symmetric matrix.
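
A short sketch of this definition in code, which also checks numerically the symmetry asked about in the question (fractional powers assume positive eigenvalues):

    import numpy as np

    def sym_power(M, a):
        """Power M^a of a real symmetric matrix via its eigendecomposition."""
        lam, O = np.linalg.eigh(M)            # M = O diag(lam) O^T
        return O @ np.diag(lam**a) @ O.T

    M = np.array([[2.0, 1.0],
                  [1.0, 3.0]])                # positive definite example

    half = sym_power(M, 0.5)
    print(np.allclose(half @ half, M))        # True: (M^{1/2})^2 = M
    print(np.allclose(half, half.T))          # True: M^a is again symmetric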

Orthonormal basis functions: For discrete vector spaces one can always choose an orthonormal set of basis functions satisfying

(e_i, e_j) = \delta_{ij}.
(2.5)

Here we have introduced the Kronecker delta \delta_{ij}, defined for integer i, j. This object is zero unless i = j, when it is 1.

For such an orthonormal basis the completeness relation can be written as

\sum_i (e_i)_a (e_i)_b = \delta_{ab}.
(2.6)
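
As a small numerical illustration of (2.6): for any orthonormal basis of R^3 the sum of the outer products of the basis vectors gives the identity matrix,

    import numpy as np

    # Columns of Q from a QR decomposition of a random matrix form an
    # orthonormal basis; its rows after transposing are the basis vectors e_i.
    basis = np.linalg.qr(np.random.default_rng(0).normal(size=(3, 3)))[0].T

    completeness = sum(np.outer(e, e) for e in basis)   # sum_i (e_i)_a (e_i)_b
    print(np.round(completeness, 10))                   # identity matrix, delta_ab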

2.2.2 Questions

2.

Use the definition of independence to show that Eq. (2.1) holds for any set of independent functions.