Final Exam Solutions, Math. 221 (s. 02), 11 May 2004

[This may now be the finished product -- more details will be coming if required.]


1. The system of equations

  x + y + 2z = a
  x         + z = b
2x + y + 3z = c

reduces, straightforwardly, to (among other things) the system

  x         + z = b
          y + z = a − b
               0 = c − (a + b).

This system is inconsistent -- i.e., has NO solutions, answering (iv) -- when c − (a + b) ≠ 0, i.e., when c ≠ a + b.
But when the system IS consistent, i.e., when c = a + b, there are infinitely many solutions (answering (i)), all with

z arbitrary,
y = (a − b) − z ( = a − (b + z)), and
x = b − z.

As for (iii) and (ii), there are no values of c left for which the system can have exactly one solution; and it's never the case that a system of linear equations can (as some exams hoped) have more than one solution, but not infinitely many -- cf. page 4 (reddish display announcement) or page 59 (Theorem 1.6.1).
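
(If you'd like to see that reduction confirmed by machine, here is a small sympy sketch -- my own addition, nothing the exam required:)

    import sympy as sp

    a, b, c = sp.symbols('a b c')
    # Augmented matrix [coefficients | right-hand side] of the original system.
    M = sp.Matrix([[1, 1, 2, a],
                   [1, 0, 1, b],
                   [2, 1, 3, c]])
    M[2, :] = M[2, :] - M[0, :] - M[1, :]   # R3 <- R3 - R1 - R2 :  0 = c - (a + b)
    M[0, :] = M[0, :] - M[1, :]             # R1 <- R1 - R2      :  y + z = a - b
    print(M)   # rows now read:  y + z = a - b;  x + z = b;  0 = c - a - b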

2. Don't let the reciprocals of the variables throw you: just treat 1/x, 1/y, and 1/z as unknowns in their own right, and solve

  3/x + 2/y − 1/z = −15
  5/x + 3/y + 2/z = 0
  3/x + 1/y + 3/z = 11

for 1/x, 1/y, and 1/z. Then invert once more, to find x, y, and z. Standard techniques yield

1/x = −4, 1/y = 2, and 1/z = 7,           whence           x = −1/4, y = 1/2, and z = 1/7.
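
(Again a machine check, if you like -- solve for the reciprocals first, then invert; a sketch, not part of the required solution:)

    import numpy as np

    # The system, with u = 1/x, v = 1/y, w = 1/z as the unknowns.
    A = np.array([[3.0, 2.0, -1.0],
                  [5.0, 3.0,  2.0],
                  [3.0, 1.0,  3.0]])
    rhs = np.array([-15.0, 0.0, 11.0])
    u = np.linalg.solve(A, rhs)
    print(u)       # [-4.  2.  7.]   i.e. 1/x, 1/y, 1/z
    print(1 / u)   # [-0.25  0.5  0.14285714...]   i.e. x, y, z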

3. (i) Trace(A ± B) = Trace(A) ± Trace(B): always true, cf. Ex. 1.3.23.

(ii) Trace(AB) = Trace(A)Trace(B): not true in general. Fails, for example, when A = B = the 2×2 matrix with top row (0, 1) and bottom row (1, 0). Here AB = the 2×2 identity matrix I, which has trace 2; but A and B each have trace 0.
(iii) Trace(AB) = Trace(BA): always true. Here's why.
Both those traces turn out to be equal to the sum of ALL the entries in the matrix C mentioned in the suggestion.

For example, Trace(AB) is the sum of all the entries on the principal diagonal of AB. Now the first such entry is the sum of all possible products a_1j b_j1 (right? top row of A "zipped" against the first column of B), i.e., the sum of the entries in the top row of the matrix C mentioned in the suggestion; the second such entry is the sum of all the products a_2j b_j2, i.e., the sum of all the entries in the second row of that matrix C; and so on, down the line.

Thus the trace Trace(AB) of AB turns out to be the sum got by adding together all the various sums of all the entries in any given row of C, i.e., it's just the sum of ALL the entries (in all the rows) in C.

Trace(BA) admits an analogous reinterpretation: each entry along the diagonal of BA is equal to the sum of all the entries in the corresponding column of C; and so Trace(BA) = the sum of all those column-sums, i.e., again, the sum of all the entries in C. QED
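
(A numerical illustration of (i) and (iii); I'm reading "the matrix C mentioned in the suggestion" as the matrix with entries c_ij = a_ij b_ji, which is what the argument above uses:)

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.integers(-5, 6, size=(4, 4))
    B = rng.integers(-5, 6, size=(4, 4))

    C = A * B.T    # entrywise product, so C[i, j] = a_ij * b_ji

    print(np.trace(A + B) == np.trace(A) + np.trace(B))   # (i): True
    print(np.trace(A @ B), np.trace(B @ A), C.sum())      # (iii): all three agree
    print(np.diag(A @ B), C.sum(axis=1))                  # diagonal of AB = row sums of C
    print(np.diag(B @ A), C.sum(axis=0))                  # diagonal of BA = column sums of C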

4. Matrices of type A will (i) always have determinant zero, and hence (ii) never be invertible. To understand why, notice for example that the row space of such an A can never have dimension more than 4: the first three rows have all-zero entries in their last three positions, so they span a space of dimension at most 2, and elementary row manipulations just among those three rows alone will surely already lead to a row of zeros. Thus rank(A) ≤ 4 < 5, and A cannot be invertible.

Or notice that each of the 5! = 120 different products of 5 selected entries (selecting one from each row and from each column) will have to include as a factor one of the 0's from the last three columns of at least one of the first three rows: the three entries selected from the first three rows lie in three different columns, so they can't all come from just the two leftmost columns.

Or, yet again, try evaluating the determinant by blocks: subdivide the matrix by separating column 2 from column 3 and row 2 from row 3: because of the 2×3 block of zeroes at the upper right, the determinant of a type A matrix will be the product of the determinant of the upper left 2×2 block with the determinant of the lower right 3×3 block; but the latter block, because its top row is all-zero, has determinant zero, hence so does A.

Matrices of type B will likewise (i) always have determinant zero, and hence (ii) never be invertible. To understand why, notice for example that the only way one of the 5! = 120 different products of 5 selected entries (selecting one from each row and from each column) can try to be non-zero is by selecting a non-zero entry from each row. But once the only non-zero entry in the second row has been selected, it's no longer possible to use a non-zero entry from the fourth row, as that lies in the same column as the second-row choice. So all the 5! products contributing to the determinant are zero, and hence so is the determinant itself.

Or again, by means of a suitable row operation, involving only subtracting from one of rows 2 or 4 an appropriate multiple of the other, a matrix of type B can be reduced to one having an all-zero row, hence has determinant zero.
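
(The exam's exact definitions of types A and B aren't reproduced here, but the zero patterns described above are easy to test numerically; the two matrices below follow those patterns, with otherwise arbitrary entries:)

    import numpy as np

    rng = np.random.default_rng(1)

    # "Type A" pattern: zeros throughout columns 3-5 of the first three rows.
    A = rng.integers(-9, 10, size=(5, 5)).astype(float)
    A[:3, 2:] = 0.0
    print(np.linalg.det(A))   # 0.0, up to round-off

    # "Type B" pattern: rows 2 and 4 each have their single non-zero entry
    # in one and the same column (column 3 here).
    B = rng.integers(-9, 10, size=(5, 5)).astype(float)
    B[1, :] = 0.0
    B[3, :] = 0.0
    B[1, 2] = 5.0
    B[3, 2] = -7.0
    print(np.linalg.det(B))   # 0.0, exactly or up to round-off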

Matrices of type C, on the other hand, being upper-triangular, will have as determinant whatever the number is at the intersection of row 3 with column 3, and will be invertible precisely when that number ≠ 0.

5. To say the columns of A are all mutually perpendicular and of length one is to say that A^T A = I.
But for square matrices like A, any "left inverse" (like A^T) must be a "right inverse" as well -- A A^T = I -- and that equation, spelled out entry by entry, lets you read off that the various rows of A all have length 1 and are mutually perpendicular.
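
(One concrete instance -- a rotation matrix, whose columns are visibly orthonormal; an illustration, of course, not a proof:)

    import numpy as np

    t = 0.7
    A = np.array([[np.cos(t), -np.sin(t)],
                  [np.sin(t),  np.cos(t)]])
    print(np.allclose(A.T @ A, np.eye(2)))   # True: columns orthonormal
    print(np.allclose(A @ A.T, np.eye(2)))   # True: so rows are orthonormal too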

6. (i) If λ is an eigenvalue for A, with corresponding eigenvector x,
so that Ax = λx, then AAx = A·λx = λ·Ax = λ·λx = λ^2 x.
On the other hand, if A is idempotent, then AAx = Ax = λx.
Combining both relations, we obtain λ^2 x = λx, from which,
because x ≠ 0, we may infer λ^2 = λ.
And from this it follows that λ must = 0 or = 1.

(ii) (I − A)(I − A) = I(I − A) − A(I − A) = (I·I − I·A) − (A·I − A·A) = I − A − A + AA = I − A − A + A = I − A.

(iii) If A is idempotent and not 0, then A has at least one non-zero column. Claim: such a column -- call it c -- must satisfy Ac = c. Reason: if column # j, for example, is the non-zero column we're calling c, and if we write (as usual) e_j for the column vector with a 1 in position # j and 0's in all remaining positions, then c = Ae_j, and so Ac = A(Ae_j) = (AA)e_j = Ae_j = c, as claimed.
But then c is an eigenvector for A linked with eigenvalue 1.

If instead A is not I, then I − A is a non-zero idempotent and, by the same reasoning as for A, also has eigenvector/eigenvalue pair (v, 1), i.e., (I − A)v = v. But this means v − Av = v, or Av = 0, i.e., v is an eigenvector for A linked with eigenvalue zero.

(iv) Where x = Az and y = (I − A)z, we have:
Ax = AAz = Az = x.
Ay = A(I − A)z = Az − AAz = Az − Az = 0.
Similarly, (I − A)x = 0 and (I − A)y = y, while x + y = z.
Send e-mail if you can't discover why x and y, if both non-zero, must then be linearly independent. (Or see Theorem 7.2.2.)
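
(Here is (iv) happening before your eyes, using as A the first idempotent displayed in the Feedback below -- any idempotent would do:)

    import numpy as np

    A = np.array([[1.0, 0.0, 0.0],
                  [0.0, 0.5, 0.5],
                  [0.0, 0.5, 0.5]])   # idempotent: A @ A equals A
    z = np.array([3.0, 1.0, 5.0])     # an arbitrary vector
    x = A @ z
    y = (np.eye(3) - A) @ z
    print(A @ x - x)   # [0. 0. 0.] :  Ax = x, eigenvalue 1
    print(A @ y)       # [0. 0. 0.] :  Ay = 0, eigenvalue 0
    print(x + y - z)   # [0. 0. 0.] :  x + y = z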

(v) The column space of A, the eigenspace for A corresponding to eigenvalue 1, the null space of I − A, the eigenspace for I − A corresponding to eigenvalue 0, and even the range space of A, are all the same. The column space of I − A, the eigenspace for I − A corresponding to eigenvalue 1, the null space of A, the eigenspace for A corresponding to eigenvalue 0, and even the range space of I − A, are likewise all the same.

(vi) For an idempotent n×n matrix A, either A = I, or A = 0, or every vector in R^n can (as in (iv)) be expressed as the sum of a vector in the eigenspace for 0 and a vector in the eigenspace for 1. In the first two (special) cases, A is already diagonal. In the general case, the eigenvectors for A clearly span R^n (it's (iv) that tells you this), and so A can be diagonalized by means of any matrix P whose columns are all the vectors in a basis for the eigenspace for eigenvalue 1, followed by all the vectors in a basis for the eigenspace for eigenvalue 0. [The diagonalized form of A is then a matrix with a 1 in each of the first rank(A) positions along the main diagonal, and a 0 everywhere else.]

Feedback. Too many thought that identity matrices and all-zero matrices were the only idempotent matrices possible. Just to illustrate how many and varied idempotents can be, here are a few (add your own matrix brackets, please):

1   0   0       1   0   0       1  17   0       1/3 1/3 1/3
0  1/2 1/2      0   1   0       0   0   0       1/3 1/3 1/3
0  1/2 1/2      0   0   0       0   0   1       1/3 1/3 1/3
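
(All four can be checked at one go, and their eigenvalues display conclusion (i) -- only 0's and 1's:)

    import numpy as np

    examples = [
        np.array([[1, 0, 0], [0, 0.5, 0.5], [0, 0.5, 0.5]]),
        np.array([[1, 0, 0], [0, 1, 0], [0, 0, 0]]),
        np.array([[1, 17, 0], [0, 0, 0], [0, 0, 1]]),
        np.full((3, 3), 1.0 / 3.0),
    ]
    for A in examples:
        print(np.allclose(A @ A, A),          # True: idempotent
              np.linalg.eigvals(A).round(8))  # eigenvalues: only 0's and 1's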

7. (i) If λ is an eigenvalue for M, with (non-zero) eigenvector x, we have Mx = λx, whence also
M^2 x = M(Mx) = M(λx) = λ(Mx) = λ·λx = λ^2 x, and, continuing the same reasoning, M^m x = λ^m x.
But since M^m = 0, this means λ^m x = 0. But since x is non-zero, this in turn means λ^m = 0, whence λ = 0.

(ii) Remember that the column space ( = range space) of a product matrix AB is always a subspace of the column space ( = range space) of the first factor A. Consequently, the column space of M^(k+1) = M^k M is always a subspace of the column space of the first factor M^k.

(iii) Suppose M^k = M^(k+1). Of course we don't know how k compares with m, so we have to take all possibilities into account. The possibility k > m is an easy case, for then k − m > 0 and M^k = M^m M^(k−m) = 0·M^(k−m) = 0, as desired.
Even easier are the cases k = m or k = m − 1: in the former, we have M^k = M^m = 0 directly, while in the latter, we have almost as simply: M^k = M^(k+1) = M^((m−1)+1) = M^m = 0.
There still remains only the possibility k < m − 1. Here, after observing, for use in a moment, that (m − 1) − k > 0, we notice that, after multiplying both sides of the hypothesis M^k = M^(k+1) first by M, then by M^2, ..., and finally by M^((m−1)−k), we can obtain the additional equalities
M^(k+1) = M^(k+2), M^(k+2) = M^(k+3), ..., M^(k+((m−1)−k)) = M^(k+((m−1)−k)+1) = M^m = 0, which, when combined, yield
M^k = M^(k+1) = M^(k+2) = M^(k+3) = ... = M^(k+((m−1)−k)) = M^(k+((m−1)−k)+1) = M^m = 0. This completes the proof.

(iv) We know from (ii) that, for each integer k > 0, the column space of M^(k+1) is always a subspace of the column space of M^k, and that the column space of the matrix M^m = 0 is the trivial space {0}. Let, say, κ be the first of the integers k ≥ 1 for which the column space of M^(k+1) coincides with the whole column space of M^k (we know there are such integers because m is an example of one, the column spaces both of M^m = 0 and of M^(m+1) being the trivial space {0}). Thus the column spaces of M^κ and M^(κ+1) are the same. Moreover, the column space of M^(κ+2), being the range space of M^(κ+2) = M·M^(κ+1), is the collection of all (column) vectors Mx obtained as x ranges over the column space of M^(κ+1), and this, of course, is the same as the collection of all (column) vectors Mx obtained as x ranges over the column space of M^κ, i.e., is the same as the range space (column space) of M^(κ+1). Thus the column spaces not only of M^κ and M^(κ+1) but also of M^(κ+2) are all the same. Continuing in the same way, we can see that all the matrix powers M^(κ+i) (i = 0, 1, 2, ..., m) have the same column space. And since the last of these column spaces must be a subspace of the trivial column space of M^m = 0, hence itself trivial, we deduce that already the column space of M^κ must have been trivial, as well, which can only signify that M^κ = 0.
It remains only to explain why κ ≤ n. For this, notice that each time we move from the column space of M^i, for 1 ≤ i < κ, to that of M^(i+1), the dimension must drop by at least 1 (for otherwise those two column spaces would have to coincide, which, by the way κ was selected, does not happen for i < κ). Moreover, the column space of M itself already has dimension < n (remember, M has 0 as an eigenvalue, hence has non-trivial null-space, hence has rank < n; but the rank of M is the dimension of the column space of M; this dimension being < n is another reflection of the fact that, with rank < n, M has range ≠ R^n). Putting this together, we see that the number κ of steps needed to reach the trivial column space of M^κ cannot exceed n, i.e., κ ≤ n, as desired.
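
(A picture of (ii) and (iv) in numbers: take for M the 5×5 matrix with 1's just above the diagonal -- nilpotent, with m = 5 -- and watch the ranks, i.e. the column-space dimensions, of its powers march down to 0:)

    import numpy as np

    M = np.diag(np.ones(4), k=1)   # 5x5; 1's on the superdiagonal, 0's elsewhere
    P = np.eye(5)
    for k in range(1, 7):
        P = P @ M                            # P is now M^k
        print(k, np.linalg.matrix_rank(P))   # ranks: 4, 3, 2, 1, 0, 0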

Feedback. Too many thought that the only way a matrix could be nilpotent would be by having all its entries zero. Just to illustrate how many and varied (and non-zero) nilpotent matrices can be, here are a few (add your own matrix brackets, please):

0   0   0       0   0   0       2   0  −2
0 −1/2 1/2      1   0   0       0   0   0
0 −1/2 1/2      9   5   0       2   0  −2
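
(Checking these three, and conclusion 7(i) along with them:)

    import numpy as np

    examples = [
        np.array([[0, 0, 0], [0, -0.5, 0.5], [0, -0.5, 0.5]]),
        np.array([[0, 0, 0], [1, 0, 0], [9, 5, 0]]),
        np.array([[2, 0, -2], [0, 0, 0], [2, 0, -2]]),
    ]
    for M in examples:
        print(np.allclose(M @ M @ M, 0),                  # True: M^3 = 0 in each case
              np.abs(np.linalg.eigvals(M)).max() < 1e-4)  # True: eigenvalues all (numerically) 0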

8. (i) The lengths are (all of them): √(1 + 1 + 1) = √3.
(ii) The dot product of any one of these with any other is: −1 + 1 − 1 = −1,
and so the angle between any two of them has cosine (−1)/(√3)^2 = −1/3.
(That's π/2 radians (or 90°) wider than the smaller acute angle in a (1, √8, 3) right triangle,
which is to say, an obtuse angle of roughly 109° or about 1.9 radians.)
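
(The vectors themselves aren't reproduced above, but, for instance, (1, 1, 1), (1, −1, −1), (−1, 1, −1), (−1, −1, 1) have exactly these lengths and dot products:)

    import numpy as np

    V = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]], dtype=float)
    print(V @ V.T)                    # Gram matrix: 3's on the diagonal, -1's elsewhere
    angle = np.arccos(-1.0 / 3.0)
    print(np.degrees(angle), angle)   # 109.47... degrees,  1.9106... radians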

9. When z is a zero vector in the sense specified here, z + z' = z'.
Likewise, when z' is such a zero vector, z + z' = z.
Thus when both z and z' are zero vectors, z' = z + z' = z, as desired.

10. Some of the various topics regretted as absent from the problems were:
• cross products in R3;
• geometric transformations in low dimensions and/or their relevance to computer graphic display methodology;
• linear transformations in general;
• linear (in)dependence, spanning, basis;
• least squares approximations;
• explicit diagonalizations;
• orthonormalizing a basis;
• complex numbers;
• finding inverse matrices; and
• explicit row-reductions.



Last revised: 17 May 2004.