Physics 213    9-12-03    Friday

READING FOR THE NEXT FEW LECTURES:

- Chapter 5 of Strogatz. We will then be covering some material in Chapter 6, so you can go ahead and read that when you finish Chapter 5. Our discussion will be more general and somewhat more advanced than Strogatz, e.g., we will consider the general case of dynamical systems with N variables while Strogatz discusses mainly the case N=2 in Chapter 5.

- Chapter 2 of Ott's book has a lot of great information about the properties of 1d maps but goes into greater depth and with more advanced analysis than what I want to cover in this course. You should at least browse through this chapter.

- Chapter 1, especially Section 1.5, of the book "Applied Mathematics" by Gilbert Strang is likely a good place for many in the class to review succinctly, and without too many mathematical details, the linear algebra associated with linear dynamical systems.

- For pure entertainment, track down the article "The Mountains of Pi" by R. Preston in the New Yorker, March 2, 1992, pages 36-67. The article is about two top Russian mathematicians, the Chudnovsky brothers, living in New York City, who became obsessed with whether the digits of the number pi = 3.141592653589793238462643383279502884197... were NORMAL, i.e., occurred with equal frequency. They spent a small fortune building a large parallel computer in their tiny NYC apartment and made much progress, but were not able to settle this still-open question by their heroic numerical study. The relevance of this article to our course is our discovery last lecture that the chaos of the Bernoulli shift map arose directly from the "randomness" of "typical" initial conditions. Mathematicians in the area of number theory have been able to prove that the digits of most numbers are normal, but the proof is indirect and it is exceedingly hard to prove normality for a particular irrational number like pi.

MATHEMATICA DEMO OF THE BERNOULLI SHIFT MAP

- From the Miscellaneous Files section of the course web page, http://www.phy.duke.edu/~hsg/213/misc you can download and play with the Mathematica notebook plotshiftmap.nb, which is a modification of the plotlogisticmap notebook except that I have replaced the logistic map with the map 2x mod 1. In Mathematica, this can be simply implemented with an If statement:

    shiftmap[ x_ ] := If[ x < 0.5 , 2 x , 2 x - 1 ]

which returns the value 2x if x < 1/2 and otherwise returns the value 2x - 1.

- If life were only that simple. If you make only the above change in the plotlogisticmap notebook and try some iterations for various initial values in the interval [0,1], you will make the discovery that all initial conditions iterate to the constant sequence 0, 0, 0, ... after about 50 to 60 iterations. The problem is that Mathematica treats all floating-point numbers as doubles, i.e., roughly 64-bit binary numbers (with a 53-bit mantissa), and so after about 60 binary shifts there are no binary digits left to use.

- I will show you briefly in lecture, using the plotshiftmap.nb notebook, how to tell Mathematica to use an arbitrarily large number of digits in all of its calculations. You will then see the expected behavior of periodicity for certain initial conditions and chaos for others, and you will have gained wisdom about how to be careful about the number of significant digits used in a Mathematica calculation.
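To connect this to something you can try yourself, here is a minimal sketch (my own illustrative fragment, not the actual plotshiftmap.nb notebook) of how to iterate the shift map with arbitrary precision: N[expr, n] requests n significant digits, so a 200-digit initial condition survives roughly 200/Log[10,2] ~= 660 binary shifts before the digit-exhaustion problem reappears. The starting value Sqrt[2] - 1 is just an illustrative choice of irrational number.

    (* iterate the shift map starting from a 200-digit initial condition *)
    shiftmap[ x_ ] := If[ x < 1/2 , 2 x , 2 x - 1 ]
    orbit = NestList[ shiftmap, N[ Sqrt[2] - 1, 200 ], 100 ];

Because the map multiplies by the exact integer 2 and subtracts the exact integer 1, Mathematica's significance arithmetic loses about one binary digit of precision per iterate, which is exactly the Bernoulli-shift picture from last lecture.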
For calculations related to nonlinear dynamics, it is extremely rare that one needs to use more than the default 16 decimal digits, but because of the sensitivity to initial conditions, you should know how to take advantage of Mathematica's ability to calculate with many digits.

LORENZ EQUATIONS JAVA DEMOS

- At the beginning of the previous lecture, I had meant to take some time out to show you some solutions of the Lorenz equations and also to show you directly how the local maxima of the z(t) variable lead to the cusp-like 1d map that Lorenz then used to argue for the existence of chaos in his model. We could certainly do this using Mathematica, but my colleague Mike Cross at Caltech has written some nice Java demos that make it easier to see these points. So I encourage you to go to these links:

    http://www.cmp.caltech.edu/~mcc/Chaos_Course/Lesson1/Demo3.html
    http://www.cmp.caltech.edu/~mcc/Chaos_Course/Lesson1/Demo7.html

The first plots the x(t) and z(t) time series for given parameter values and given initial conditions of the Lorenz equations. The default values are the ones that lead to the chaos discovered by Lorenz. Demo7 plots successive pairs of points (zmax[i], zmax[i+1]) and you can see the cusp-like curve form in real time. If you change the value of the parameter a (the main bifurcation or control parameter for the Lorenz equations) to a new value, say 24, the map develops parts that have slopes less than one in magnitude, and indeed the nontransient dynamics is no longer chaotic for this parameter value.

LINEARIZING ABOUT FIXED POINTS OF N-DIMENSIONAL FLOWS AND MAPS

- I argued in class a fact that I hope you are familiar with from an undergraduate ode course: that any finite system of odes of any order can be rewritten in first-order form:

    dU/dt = F(U) ,

where:

  - U(t) = ( u_1(t), ..., u_N(t) ) is an N-dimensional vector of the variables u_i(t) that are evolving over time;

  - F(U) = ( f_1(U), ..., f_N(U) ) is an N-dimensional vector function, each of whose components f_i(U) depends on the vector U;

  - the function F(U) does not depend explicitly on time, i.e., the variable t does not show up in any of the components f_i(U). Odes that do not depend explicitly on time are called _autonomous_; otherwise they are called _non-autonomous_.

- Then if U* is a fixed point of this flow, so that:

    F(U*) = 0 ,

a sufficiently small perturbation V(t) of U*, such that U* + V(t) is a solution of the above flow, will evolve according to the following constant-coefficient linear flow:

    dV/dt = J[U*] . V ,

where J[U*] is the NxN Jacobian matrix of the vector function F(U), evaluated at the particular fixed point U*. I am using the Mathematica notation here: M.V means multiplication of a matrix M into a vector V, to give another vector.

- The key fact you need to use here is that the lowest-order linear term in a multivariable Taylor expansion of a vector function F(U) involves the Jacobian matrix of F:

    F(U + E) ~= F(U) + J[U].E + higher-order terms ,

where E is a "tiny" vector in the sense that we can ignore all higher-order terms beyond the linear one.

- Similarly, you should be able to show that a small perturbation V[i] of a fixed point U* of an N-dimensional map

    U[i+1] = G( U[i] )

will evolve according to the constant-coefficient _linear_ map

    V[i+1] = J[U*] . V[i] ,

where again we see the NxN Jacobian matrix J[U*] of the vector function G(U), evaluated at the fixed point U*.
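To make the Jacobian matrix concrete, here is a small sketch (an illustrative example of my own, not from the notes) that computes J[U] for the undamped pendulum written in first-order form, u1' = u2, u2' = -Sin[u1], and evaluates it at the fixed point (0,0):

    (* Jacobian of a vector field via Outer[D, ...]:
       element [[i,j]] is the partial of f_i with respect to u_j *)
    pendulumF = { u2, -Sin[u1] };
    jac = Outer[ D, pendulumF, {u1, u2} ]
    jac /. { u1 -> 0, u2 -> 0 }     (* -> {{0, 1}, {-1, 0}} *)

The matrix {{0,1},{-1,0}} at the fixed point is exactly the harmonic oscillator matrix that appears in the next section, as it should be, since the pendulum linearized about its bottom equilibrium is a harmonic oscillator.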
SOLUTIONS OF CONSTANT-COEFFICIENT LINEAR DYNAMICAL SYSTEMS

- The linear stability of fixed points is so important from a practical point of view that I would like to give you the big picture of how to find and then study the linear stability of fixed points of arbitrary maps and flows, that is, for evolution equations that describe N separate variables all evolving at the same time. This will kill two birds with one stone, since it will also give us a chance to review some math that we will need many times through the rest of the semester, and it will prepare us for thinking about dynamics in an N-dimensional phase space, which is the modern and valuable global way of thinking about dynamical systems.

- As a first step, let's discuss one large class of linear problems whose dynamics can be worked out completely by analytical means. These are the constant-coefficient flows and maps of the form:

    dX/dt = M X(t) ,    and    X[n+1] = M X[n] .

Here X(t) or X[n] is an N-dimensional vector of real numbers, and M is a constant NxN matrix of real numbers.

EXAMPLE: The familiar harmonic oscillator x'' = -x can be written in this form:

    d/dt ( x1 )   (  0  1 ) ( x1 )
         ( x2 ) = ( -1  0 ) ( x2 ) ,

where X(t) = (x1(t), x2(t))^T, x1 = x(t), and x2 = x'(t). Please verify this for yourself.

AFFINE DYNAMICAL SYSTEMS IN TERMS OF LINEAR DYNAMICAL SYSTEMS

- A technical aside: if we can solve the purely linear problem, we can also solve the affine problem in which a constant vector B is added to the right side:

    X[n+1] = M X[n] + B ,    X' = M X + B .

For a map, consider a constant solution X[n] = Y that is independent of the index n. Then:

    Y = M Y + B    or    (I - M) Y = B ,

where I is the NxN identity matrix. This is an NxN set of linear equations for the unknown vector Y, given a right-side vector B, and it can be solved provided the vector B lies in the column space of the matrix I - M. If we now add the constant solution Y to a solution X[n] of the purely linear map:

    Y      = M Y    + B
    X[n+1] = M X[n]      ,

we see by linearity that:

    ( X[n+1] + Y ) = M ( X[n] + Y ) + B ,

and so X[n] + Y is a solution of the affine system whenever X[n] is a solution of the linear system.

- Similarly, we can solve the affine flow X' = M X + B by first trying to find a time-independent solution Y:

    0 = M Y + B ,

which is given by solving a set of linear equations:

    M Y = - B .

Given this solution, superposition gives the general solution: since Y satisfies 0 = M Y + B and X satisfies X' = M X, we have

    (X + Y)' = M (X + Y) + B ,

so the general solution of the affine flow is X(t) + Y, where Y = -M^{-1} B (assuming M is invertible) and X(t) is the general solution of the purely linear flow. A short Mathematica sketch of this affine calculation appears after the justification remarks below.

JUSTIFICATION FOR STUDYING ANALYTICS OF LINEAR DYNAMICAL SYSTEMS

- Although the world is clearly not completely described by constant-coefficient linear dynamical systems, we can learn a lot by studying their analytical solutions:

  - We will see that this class of models is already capable of quite complex behavior in time. What behavior do nonlinear systems have that this class of models does not?

  - We can also identify what kinds of changes will take place in time as parameters are varied (bifurcations).

  - The mathematics will be essential for our discussion of the linear stability of constant solutions of maps and flows, which, in turn, is essential for our discussion of the transition to more complex dynamical behavior such as chaos.
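Here is the promised sketch of the affine calculation (the matrix m and vector b are illustrative values of my own, not from the notes): LinearSolve finds the constant solution Y of M Y = -B, which is then added to the general solution of the purely linear problem.

    (* constant solution Y of the affine flow X' = M X + B,
       i.e., the solution of the linear system M Y = -B *)
    m = { {-2, 1}, {0, -3} };
    b = { 1, 1 };
    y = LinearSolve[ m, -b ]     (* -> {2/3, 1/3} *)

For an affine map one would instead solve (I - M) Y = B, e.g., LinearSolve[ IdentityMatrix[2] - m, b ].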
GENERAL PICTURE FOR SOLUTIONS OF LINEAR DYNAMICAL SYSTEMS

- The qualitative behavior of the general solutions to constant-coefficient maps or flows is easily summarized: any particular variable x(t) that is part of a linear dynamical system is just a superposition of solutions of one- or two-dimensional linear dynamical systems, each corresponding to behavior that we have already derived for driven damped oscillators or one-dimensional maps:

  - exponential growth or decay, like dx/dt = -(1/t0) x;

  - exponentially growing or decaying oscillations, like the damped harmonic oscillator, i.e., Exp[-t/t0] Cos[w t];

  - pure oscillations with no growth or decay, just like the harmonic oscillator x'' + w^2 x = 0, with solution Cos[w t + phi].

Each part of the linear superposition can have its own separate decay constant t0 and angular frequency w, so such superpositions can look quite complicated!

ANALYTIC SOLUTION OF CONSTANT-COEFFICIENT LINEAR SYSTEMS:

- I would like to justify for you now the above qualitative statements, that the solution of any linear dynamical map or flow is just a superposition of terms of the form:

    Exp[a t] Cos[b t + p]    (flows, at continuous time t)

or

    d^k Cos[b k + p]         (maps, at discrete time t = k)

for different "growth" constants a and d, angular frequencies b, phases p, and integers k. The question is: where do the numbers a, b, d, and p come from, given the flow dX/dt = M X or the map X[i+1] = M X[i]? The answer is: from the eigenvalues and eigenvectors of the matrix M.

- Besides providing a valuable example, constant-coefficient linear systems will reappear in our discussion of LINEAR STABILITY THEORY of constant and periodic states, one of the most practical tools in nonlinear dynamics. It is very important to understand under which conditions ALL superpositions will decay to the zero state.

- We have already discussed the possible dynamical behavior of the simplest one-dimensional constant-coefficient linear maps and flows,

    x[i+1] = c x[i] ,    and    x' = c x .

For the map, we get geometric growth or decay, oscillatory geometric growth or decay, or pure oscillations (|c| = 1). For the flow, we get exponential growth or decay, or constant behavior (c = 0).

GENERAL SOLUTION FOR N-DIMENSIONAL MAPS

- What happens in higher dimensions, when there are more and more coupled linear equations? We can work out the behavior by looking at the eigenvalues and eigenvectors of the real matrix M. Let us first consider the case of N-dimensional linear maps:

    X[i+1] = M X[i] ,

where X[i] is the vector with N components:

    X[i] = ( x1[i], x2[i], ..., xN[i] ) ,

and where M is an NxN real matrix with elements M_{ij}. Whenever one sees a matrix M multiplying any vector X, a natural strategy (if you have really appreciated the essence of linear algebra) is to expand the vector X in terms of the N eigenvectors E[i] of the matrix M. This is a good strategy because matrices have a particularly simple behavior when acting on an eigenvector:

    M E[i] = e[i] E[i] ,    (*)

we simply get the same vector back, but scaled by the eigenvalue e[i]. Note that the eigenvalues e[i] can be complex numbers, since real matrices like M can have complex eigenvalues; e.g., the 2x2 matrix

    (  0  1 )
    ( -1  0 )

associated with the harmonic oscillator has eigenvalues +/- I, where I is a square root of -1.

Technical aside: if e[i] = a[i] + I b[i] is a complex eigenvalue of a real matrix M, then

    e[i]* = a[i] - I b[i] ,

the complex conjugate of e[i], must also be an eigenvalue. So complex eigenvalues of real matrices come in conjugate pairs.
You can see this pretty easily as follows: each eigenvalue e of the matrix M must be a root of the characteristic polynomial:

    p(e) = det( M - e I ) .

But p(e) is a polynomial of degree N with REAL COEFFICIENTS, since its coefficients are just sums of products of matrix elements of M. If you take the complex conjugate of p(e) = 0 and use the fact that the operation of complex conjugation distributes across addition and multiplication, you should easily see that:

    0 = Conjugate[0] = Conjugate[ p(e) ] = p( Conjugate[e] ) ,

so e and Conjugate[e] must be roots together.

- If we can make the following crucial assumption, that the N eigenvectors of the NxN matrix M are linearly independent, then we can expand the initial vector X[0] (in fact, we can expand ANY vector) as a linear combination of eigenvectors of the matrix M as follows:

    X[0] = Sum[ c[i] E[i] , {i, 1, N} ] .

A matrix whose eigenvectors are linearly independent is called a NON-DEFECTIVE matrix. Any matrix whose eigenvalues are all distinct is non-defective, and this is the most typical (probable) situation. An example of a defective matrix is the 2x2 matrix

    ( 1  1 )
    ( 0  1 ) ,

which has only a one-dimensional space of eigenvectors (all multiples of (1,0)^T), so its eigenvectors can't span 2-space. In the above superposition, the coefficients c[i] are either real numbers or one member of a pair of complex conjugate numbers. Similarly, the corresponding eigenvectors E[i] either have real components or complex components, with another eigenvector being a complex conjugate partner.

- Assuming that the matrix M is non-defective (or, more restrictively, that its eigenvalues e[i] are all distinct), we can write the vector X[i] at time step i as a linear combination of the eigenvectors E[j] of the matrix M:

    X[i] = Sum[ c_{i,j} E[j] , {j, 1, N} ] ,

where the c_{i,j} are coefficients, possibly complex valued. Using this superposition, we can rewrite the dynamical equation X[i+1] = M X[i] in terms of the eigenvectors:

    X[i+1] = Sum[ c_{i+1,j} E[j] ]                  (**)
           = M X[i] = M Sum[ c_{i,j} E[j] ]
           = Sum[ e[j] c_{i,j} E[j] ] .             (***)

Since the eigenvectors E[j] are linearly independent, the expressions (**) and (***) can be equal if and only if each coefficient is equal, and so we deduce:

    c_{i+1,j} = e[j] c_{i,j} ,

which is a ONE-dimensional map of the form y[i+1] = c y[i], for which we immediately know the analytical solution:

    c_{i,j} = e[j]^i c_{0,j} .

So the general analytical solution for the dynamical system X[i+1] = M X[i] is a superposition:

    X[i] = Sum[ c_{0,j} e[j]^i E[j] , {j, 1, N} ] ,

with coefficients c_{0,j} that are obtained by calculating the superposition that represents the initial state X[0]:

    X[0] = Sum[ c_{0,j} E[j] , {j, 1, N} ] .

- What does a single component of the real vector X[i] look like? Such a component would be a typical time series measured in an experiment or computed in some simulation. Since the components of the N-dimensional vectors E[j] are just numbers (possibly complex), the kth component x[i;k] of X[i] looks like this:

    x[i;k] = Sum[ c_{0,j} E[j;k] e[j]^i , {j, 1, N} ] ,

where E[j;k] denotes the kth component of the jth eigenvector. The number c_{0,j} E[j;k] can be treated as a single number d[j], so a typical component has the form:

    x[i;k] = Sum[ d[j] e[j]^i , {j, 1, N} ] .

Each component can have a quite complicated time dependence, since the eigenvalues can have magnitude less than one (decaying), magnitude greater than one (growing), or magnitude equal to one but complex, so that the phase advances in a periodic way.
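Before treating the real and complex eigenvalues separately, here is a short sketch (illustrative matrix and initial condition of my own choosing) that implements the eigenvector solution just derived and checks it against direct iteration of the map:

    (* solve X[i+1] = M X[i] by eigenvector expansion; check with MatrixPower *)
    m  = { {0.9, 0.5}, {-0.3, 0.8} };
    x0 = { 1., 2. };
    { vals, vecs } = Eigensystem[ m ];          (* vecs[[j]] is the jth eigenvector *)
    c0 = LinearSolve[ Transpose[vecs], x0 ];    (* coefficients c_{0,j} of X[0] *)
    xMap[ i_ ] := Chop[ Sum[ c0[[j]] vals[[j]]^i vecs[[j]] , {j, 2} ] ]
    { xMap[5], MatrixPower[ m, 5 ] . x0 }       (* the two vectors should agree *)

This particular matrix has a complex conjugate pair of eigenvalues, so vals and c0 are complex, yet the final components are real up to round-off, which is why the Chop (which discards tiny imaginary parts) is included.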
- If the eigenvalue e[j] is real, then d[j] is real and we get the same behavior as in the 1d map x[i+1] = e[j] x[i].

- If the eigenvalue e[j] is complex, we need to add a complex conjugate partner to get a real quantity:

    d[j] e[j]^k + Conjugate[ d[j+1] e[j+1]^k ] ,

where I have assumed for simplicity that the pair (j, j+1) is the complex conjugate pair. To get a simple expression, it is useful to work with amplitudes and phases rather than with complex numbers directly. We can write the complex number d[j] as

    d[j] = d Exp[ I p1 ] ,

in terms of a real amplitude d and real phase p1, and similarly we can write the complex eigenvalue e[j] as

    e[j] = e Exp[ I p2 ] .

Then:

    d[j] e[j]^k + Conjugate[ d[j+1] e[j+1]^k ]
      = ( d Exp[I p1] ) ( e Exp[I p2] )^k + c.c.
      = 2 Re[ d e^k Exp[ I (p1 + k p2) ] ]
      = 2 d e^k Cos[ p2 k + p1 ] .

The notation "c.c." means "complex conjugate" and is heavily used in the physics literature.

- CONCLUSION SO FAR: the jth term that contributes to the time series x[i;k] grows as e^i, where e is the MAGNITUDE |e[j]| of the eigenvalue, and oscillates with angular frequency p2, which is the PHASE of the eigenvalue e[j]. This justifies my claim above about what a general term in a component of a linear map looks like.

- We can see geometric growth, geometric decay, growing oscillation, decaying oscillation, and pure oscillation, all with different frequencies involved, determined by the SPECTRUM (set of eigenvalues) of the matrix M. Decay or growth along a particular eigenvector direction is determined by whether the corresponding eigenvalue e[j] has magnitude less than or greater than one:

    | e[j] | < 1 ,    decay criterion for maps.

Pure oscillation occurs for eigenvalues with magnitude exactly equal to one, rather special values.

CONDITION FOR ZERO NON-TRANSIENT IN LINEAR MAPS X[i+1] = M X[i]:

- To summarize so far: if M is an NxN non-defective matrix and the e[i] are the N eigenvalues of M, then if the following condition holds:

    | e[i] | < 1 ,    for 1 <= i <= N ,

every initial condition leads to an orbit X[i] that decays to zero, i.e., only the zero non-transient solution exists. This condition pops up so often when discussing matrices that mathematicians and numerical analysts have introduced a name for the maximum magnitude of the eigenvalues of a matrix: the SPECTRAL RADIUS rho(M) of the matrix M. (Recall that the set of eigenvalues of a matrix or of some linear operator is called its "spectrum".) So by definition:

    rho(M) = Max[ | e[i] | ] .

Then the condition for all initial conditions to decay to zero is that the spectral radius be smaller than unity:

    rho(M) < 1 .

It is easy to calculate the spectral radius of an NxN numerical matrix using Mathematica; it is given by the following expression:

    Max[ Abs[ Eigenvalues[ M ] ] ]

To test this, consider the 2x2 diagonal matrix

    mx = { {2, 0}, {0, -3} } ,

which has eigenvalues e[1] = 2 and e[2] = -3, since the diagonal elements of any diagonal or triangular matrix are directly the eigenvalues of the matrix (I hope you can prove this!). Then:

    Max[ Abs[ Eigenvalues[ mx ] ] ]

does indeed give the value rho(mx) = 3.

CONDITION rho(M) > 1 DOES NOT IMPLY DIVERGENCE OF AN ORBIT X[i]

- Does the converse hold: if rho(M) > 1, is it necessarily the case that all nonzero initial values will lead to an orbit that blows up, diverging to infinity? The answer is NO, for a reason that is quite important for the course later on. To see what is going on, let's consider a 2x2 matrix with one eigenvalue larger than one and one eigenvalue smaller than one in magnitude, e.g.:

    M = { {2, 0}, {0, 1/2} } ,

which has eigenvalues e[1] = 2 and e[2] = 1/2.
Then the general solution of the linear map X[i+1] = M X[i] will have the form:

    X[i] = c1 2^i E1 + c2 (1/2)^i E2 ,

where the eigenvectors are, in fact, just the unit vectors in the x and y directions, E1 = (1,0) and E2 = (0,1). But now you see that if one chooses an initial vector X[0] parallel to E2 and orthogonal to E1 (so that c1 = 0), the initial condition will decay even though rho(M) > 1. However, if you choose a "typical" vector X[0], without special a priori knowledge of E1 and E2, then it will generally have a component along both E1 and E2 (c1 != 0) and will diverge as the powers 2^i get larger and larger.

More abstractly, for a square matrix M with N distinct eigenvalues, one can define a STABLE linear subspace S as the subspace spanned by all eigenvectors E[i] whose corresponding eigenvalues cause decay. One can also define an UNSTABLE linear subspace U, which is spanned by all eigenvectors E[i] whose corresponding eigenvalues cause growth. Then any initial vector in S will decay.

GENERAL SOLUTION FOR AN N-DIMENSIONAL FIRST-ORDER FLOW

- The analysis for flows is actually somewhat simpler than that for maps, and it is insightful to see the differences that arise. To derive a similar result for a linear flow

    dX/dt = M X ,

we again observe that the general vector solution X(t) at time t can be expanded in terms of the linearly independent eigenvectors of the matrix M (again assuming that M is non-defective!):

    X(t) = Sum[ c_i(t) E[i] , {i, 1, N} ] ,

where the coefficients c_i(t) are now functions of time. Substituting this expression into both sides of the linear flow, we get:

    dX/dt = Sum[ c_i'(t) E[i] , {i, 1, N} ]        (' = time derivative here)
          = M X = Sum[ c_i(t) e[i] E[i] , {i, 1, N} ] .

But again, two linear combinations of linearly independent vectors E[i] can be equal only if their coefficients are equal (this is basically the definition of linear independence). This implies:

    c_i'(t) = e[i] c_i(t) ,

which is a simple 1d flow that has the familiar solution:

    c_i(t) = c_i(0) Exp[ e[i] t ] .

Putting this all together, we see that the general solution of X' = M X is given by:

    X(t) = Sum[ c_i(0) Exp[ e[i] t ] E[i] , {i, 1, N} ] .

Note the qualitative difference between the linear map and the linear flow: here we now have an exponential of the eigenvalue times time, rather than an algebraic power e[i]^k of the eigenvalue.

- The eigenvalues e[i] can be complex, say of the form a + I b, where a = Re[e[i]] is the real part and b = Im[e[i]] is the imaginary part. Writing:

    Exp[ e[i] t ] = Exp[ (a + I b) t ] = Exp[ a t ] Exp[ I b t ] ,

we see that the real part of the eigenvalue acts as a decay (or growth) constant, while the imaginary part of the eigenvalue acts as an angular frequency.

- From our general solution, a criterion that, for a flow, all initial conditions decay asymptotically to the zero vector is then:

    Max[ Re[ e[i] ] ] < 0 ,

i.e., that the real parts of all eigenvalues be negative, in which case each component independently decays to zero exponentially, although perhaps with oscillations if there are complex eigenvalues.

- It is again straightforward to evaluate this criterion using Mathematica:

    Max[ Re[ Eigenvalues[ mx ] ] ] < 0

where Re[] is the Mathematica function that extracts the real part of a number.
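The flow solution can be checked the same way the map solution was (matrix and initial condition again illustrative choices of mine), this time against Mathematica's matrix exponential, since X(t) = MatrixExp[M t] . X(0) for a constant-coefficient linear flow:

    (* solve X' = M X by eigenvector expansion; check with MatrixExp *)
    m  = { {-1., 2.}, {-2., -1.} };             (* eigenvalues -1 +/- 2 I *)
    x0 = { 1., 0. };
    { vals, vecs } = Eigensystem[ m ];
    c0 = LinearSolve[ Transpose[vecs], x0 ];
    xFlow[ t_ ] := Chop[ Sum[ c0[[j]] Exp[ vals[[j]] t ] vecs[[j]] , {j, 2} ] ]
    { xFlow[1.0], MatrixExp[ 1.0 m ] . x0 }     (* the two vectors should agree *)

Since Max[Re[Eigenvalues[m]]] = -1 < 0 here, every orbit spirals in to the zero vector, oscillating with angular frequency 2 as it decays.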
- A good discussion for those who want some review is given in Section 38 of "Differential Equations with Applications and Historical Notes", George F. Simmons, McGraw-Hill, 1972. There is also an elementary and readable discussion in Leah Edelstein-Keshet's book "Mathematical Models in Biology", where much of the above material is reviewed.

NON-TRANSIENT NON-ZERO BOUNDED BEHAVIOR OF LINEAR MAPS AND FLOWS: PERIODICITY AND QUASIPERIODICITY

- Our analytical expressions for the general solution X(t) of a linear flow or X[i] of a linear map have some interesting implications:

  - All initial conditions converge to zero if the spectral radius is less than one (maps) or if Re[e[i]] < 0 for all i (flows).

  - If these conditions are not satisfied, and at least one eigenvalue has magnitude greater than one (maps) or real part greater than zero (flows), then most initial conditions grow without bound in magnitude and diverge. This is unphysical and sick, since most diverging quantities imply infinite amounts of energy in a finite volume. This explains why linear dynamical systems make lousy models of nature: for most parameter values, solutions either die out to zero or explode.

  - If the spectral radius is exactly 1 (maps), or if Max[Re[e[i]]] = 0 (flows), a rare marginal case, then the non-transient dynamics will consist of purely oscillatory terms. There are then two kinds of behavior:

    - PERIODIC: X(t + T) = X(t) for some period T > 0;

    - QUASIPERIODIC: X(t) = Sum[ A[i] Cos[ w[i] t + phi[i] ] ], where at least two of the angular frequencies w[i] have an irrational ratio: w[i]/w[j] != p/q for any integers p and q.

These two kinds of dynamical behavior are also found as solutions of general nonlinear dynamical systems and so are important categories of dynamical behavior.

- A more rigorous definition of quasiperiodicity: a real-valued function f(t) is quasiperiodic if it can be written in the form:

    f(t) = g( w1 t, w2 t, ..., wn t ) ,

where g(x1, x2, ..., xn) is a function that is periodic in each of the variables x1, x2, ..., xn with period 2 Pi, and where at least two of the frequencies are irrationally related.

- The frequencies w1, w2, ..., as imaginary parts of eigenvalues, are functions of the matrix elements and so vary continuously as any matrix element is varied. The ratio of two frequencies w1/w2 is then also a continuous function of the matrix elements. This seems obvious, but as we will see, this does NOT hold for nonlinear dynamical systems, for which the ratio w1/w2 can lock at a rational value over a finite interval of some parameter; i.e., rather extraordinarily, w1(p) and w2(p) can be smooth functions of some parameter p and yet their ratio can be locked to a rational value! This is called ENTRAINMENT or MODE LOCKING and is one of several fascinating, non-obvious, and common features of nonlinear dynamical systems. Mode locking is commonly observed in astronomy; e.g., the rotation frequency of a moon about its axis often mode locks to the orbital frequency of the moon around its planet. Some of the structure in Saturn's rings also arises from nonlinear mode locking.

- We will study a bit later how to carry out a numerical Fourier analysis to determine QUANTITATIVELY from an empirical time series whether the time series is periodic, quasiperiodic, or nonperiodic. The tools that we will use are the Fast Fourier Transform and the POWER SPECTRUM.

STABILITY OF FIXED POINTS FOR NONLINEAR MAPS AND FLOWS

- I might as well tell you that you now know the general and quite important criterion for the linear stability of a fixed point that arises for a general map X[i+1] = G(X[i]) or a general flow X' = F(X). Let's discuss the case for flows, and I will let you work out the case for maps.
LINEAR STABILITY CRITERION FOR FLOWS: Consider an N-dimensional flow dX/dt = F(X), with F(X) a differentiable N-dimensional vector function (field) of an N-dimensional vector argument X. Let X* be a fixed point of this flow, so that F(X*) = 0, and let Jacobian[F](X*) be the NxN Jacobian matrix of the vector field F(X) evaluated at the fixed point X*. Finally, let's assume that the Jacobian matrix at X* is non-defective, and let the e[i] be its N eigenvalues (not necessarily distinct). Then the fixed point X* is linearly stable to perturbations if:

    Max[ Re[ e[i] ] ] < 0 ,

i.e., if all the eigenvalues have negative real parts. The fixed point is unstable if at least one eigenvalue has positive real part, and its stability is indeterminate at linear order if the maximum real part is exactly zero (in which case a nonlinear analysis of higher-order terms is needed to determine stability).

Let's derive this result to get further experience in learning how to analyze N-dimensional mathematics. The arguments will also be quite important when we start discussing the more interesting and general case of more complicated dynamics besides fixed points. Let's consider an arbitrary flow dX/dt = F(X) (remember that all sets of odes of arbitrary order can be brought into this first-order autonomous form by introducing sufficiently many "dummy" variables) and assume that we know a fixed point X* such that:

    F(X*) = 0 .

Then dX*/dt = 0, and so X* is a constant solution, i.e., "fixed" in time, and so a fixed point. Let's ask whether the constant fixed point X* is stable w.r.t. infinitesimal perturbations. We can proceed conceptually just as we did for the one-dimensional flow x' = f(x). We choose an initial condition

    X0 = X* + EPS

that is a tiny perturbation of the fixed point X*, where the vector EPS = ( eps1, ..., epsN ) contains tiny numbers eps1, ..., epsN representing an arbitrary tiny perturbation of X*. We can say mathematically that EPS has a small magnitude, i.e., that its norm is small:

    || EPS || << 1 ,

where I use some norm ||.|| for measuring the size of a vector.

=================================================================
NOTE: This would be a really good time to review your knowledge of vector norms. A norm is any function ||X|| from some vector space V of vectors X to the real numbers that acts just like a length or size:

    || X || = 0  iff  X = 0            only the zero vector has zero length
    || X || > 0  if  X != 0            nonzero vectors have positive length
    || c X || = |c| || X ||            scaling law for scaling a vector by a number c
    || X + Y || <= || X || + || Y ||   triangle inequality

A commonly used vector norm for vectors of real numbers is the Euclidean norm, also called the 2-norm:

    || X || = Sqrt[ Sum[ X[i]^2 ] ]

(for complex vectors one uses Sqrt[ Sum[ |X[i]|^2 ] ]), which can be calculated in Mathematica as follows:

    twonorm[ x_ ] := Sqrt[ x . x ]

For numerical data, especially in computer codes, it is often more convenient to use the maximum absolute value norm, since this tells one directly the maximum absolute error in some expression:

    || X || = Max[ | X[i] | ] .

On finite-dimensional vector spaces, all norms are equivalent, in that convergence w.r.t. one norm implies convergence w.r.t. any other norm. For infinite-dimensional vector spaces, e.g., the vector space of continuous functions on the interval [0,1], this is NOT true, and one has to be careful which norm one uses to measure convergence of a sequence (basically, whether the ith element has a size that goes to zero with increasing i).
The max abs norm can be calculated conveniently in Mathematica as follows:

    infnorm[ x_ ] := Max[ Abs[ x ] ]

CHALLENGE: Prove that the maximum absolute value norm (also called the infinity norm) is in fact a norm.

CHALLENGE: If M is an NxN nonsingular matrix and ||X|| is some norm of N-dimensional vectors X, is the function g(X) = || M X || a norm? If X is an N-dimensional vector, is the function f(X) = |x_1|, the absolute value of the first component of X, a norm?

NOTE: Norms are nonlinear functions from the vector space V to the real numbers.
=================================================================

Back to our discussion: we were starting to ask how an infinitesimal vector perturbation EPS around a given fixed point X* of the flow dX/dt = F(X) might evolve. The fixed point will be stable if ALL sufficiently tiny perturbations decay with time, and it is unstable if at least one sufficiently tiny perturbation grows with time.

Let's try to get an equation for the orbit X(t) growing out of the perturbed fixed point X* + EPS. By continuity, we can assume that over short times the orbit X(t) remains close to X* + EPS, and so it is convenient to express the general orbit X(t) in terms of the distance Y(t) from the fixed point. So let's define:

    X(t) = X* + Y(t) ,    Y(0) = EPS ,

where over short times, by continuity and smallness of EPS, we expect:

    || Y(t) || << 1 .

Since X(t) is an orbit satisfying dX/dt = F(X), we have:

    dX/dt = F(X)    or    d(X* + Y)/dt = F( X* + Y(t) ) .

But Y(t) is tiny over sufficiently short times, and so we should explore whether we can approximate the general expression F(X* + Y(t)) in terms of some Taylor expansion around the fixed point X*. But how do we do this?

=================================================================
USEFUL TRICK: The lowest-order VECTOR Taylor expansion of a function F(X + H) around the vector X, for some tiny vector H (||H|| << 1), is given in terms of the Jacobian matrix of F:

    F(X + H) = F(X) + Jacobian[F](X) . H + O( ||H||^2 ) ,

where the ij-th element of the NxN Jacobian matrix is given by:

    Jacobian[F](X)[i,j] = partial(F_i) / partial(X_j) ,

i.e., it is the partial derivative of the ith component of the vector function F(X) w.r.t. the jth component of the argument vector X. These are elementary but extremely important formulas for many areas of applied mathematics and physics.

Note: the big-oh notation O( ||H||^2 ) means terms that are second order in the magnitude of the small vector H, in the limit that this magnitude goes to zero. You should be able to work out what the quadratic term looks like; it will involve a tensor

    partial^2(F_i) / ( partial(X_j) partial(X_k) )

with three indices, and one folds this into a summation over H_j and H_k. Please also review your knowledge of big-oh notation, which was invented to give some crude knowledge about limits. A function f(x) is big-oh of a function g(x) in the limit x -> x0:

    f(x) = O( g(x) )    as x -> x0 ,

if the ratio |f(x)/g(x)| is bounded in that limit. It is crucial that you realize that big-oh notation ONLY makes sense if some specific limit is indicated. Thus O(||H||^2) has the implied limit of H going to zero.
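You can also check this vector Taylor expansion numerically; in the following sketch (the two-variable field f is an illustrative choice of mine), the exact value F(X+H) and the linear approximation F(X) + J(X).H agree to within terms of order ||H||^2:

    (* numerical check of F(X+H) ~= F(X) + Jacobian[F](X) . H *)
    f[ {u_, v_} ]   := { u v - 1, u^2 - v };
    jac[ {u_, v_} ]  = Outer[ D, f[{u, v}], {u, v} ];   (* symbolic Jacobian *)
    x0 = { 1., 2. };  h = { 0.001, -0.002 };
    { f[ x0 + h ], f[x0] + jac[x0] . h }    (* agree to about one part in 10^6 *)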
=================================================================

Using the vector Taylor expansion, we can write:

    d(X* + Y)/dt = F( X* + Y(t) )
                ~= F(X*) + Jacobian[F](X*) . Y + O( ||Y||^2 ) .

Since F(X*) = 0 and d(X*)/dt = 0 by definition of X* being a fixed point, the vector Taylor expansion reduces to:

    dY/dt = M Y ,    where    M = Jacobian[F](X*)

is the NxN Jacobian matrix of the vector field F(X) evaluated at the fixed point X*. But THIS IS A CONSTANT-COEFFICIENT LINEAR DYNAMICAL SYSTEM FOR THE PERTURBATION VECTOR Y(t), and so we know an analytical criterion for when Y(t) will decay to zero:

    Max[ Re[ e[i] ] ] < 0 ,

which is a condition on the eigenvalues of the Jacobian matrix.

CHALLENGE: LINEAR STABILITY CONDITION FOR MAPS

- Using the vector Taylor series to lowest linear order, derive the criterion for the linear stability of a fixed point X* = G(X*) of the map X[i+1] = G(X[i]), which turns out to be:

    rho( Jacobian[G](X*) ) < 1 ,

i.e., the spectral radius (largest magnitude of all eigenvalues) of the Jacobian matrix of the vector field G evaluated at the fixed point X* must be strictly less than 1.

ILLUSTRATION OF LINEAR STABILITY ANALYSIS FOR THE LORENZ EQUATIONS

- As an example of the kinds of insights one can get from the linear stability of fixed points, let's consider a 3-dimensional flow called the Lorenz equations (named after Edward Lorenz, one of the world's great theoretical meteorologists):

    dX/dt = F(X) ,    X = ( x(t), y(t), z(t) ) ,

where the components are given by:

    x' = - s x + s y ,
    y' = - y + r x - x z ,
    z' = - b z + x y .

We will discuss later in the course where these equations come from and what they mean, but for now we just treat them as a particular example to work with. The fixed points X* of the Lorenz equations can be found by trying to solve:

    F(X*) = 0 ,

and you can easily see that there are three fixed points:

    X1* = ( 0, 0, 0 ) ,
    X2* = (  Sqrt[b(r-1)],  Sqrt[b(r-1)], r - 1 ) ,
    X3* = ( -Sqrt[b(r-1)], -Sqrt[b(r-1)], r - 1 ) .

We can study the stability of each fixed point analytically as a function of the three parameters s, r, b by evaluating the 3x3 Jacobian matrix Jacobian[F](X):

    ( -s      s    0  )
    ( r - z  -1   -x  )
    (  y      x   -b  )

at each fixed point in turn and studying when the maximum real part changes sign from negative to positive. You do this in the current homework assignment. There is a symmetry of the equations under (x,y) -> (-x,-y), so it is actually ok just to look at one of the nonzero fixed points.

The Lorenz equations have chaotic solutions for r = 28, s = 10, b = 8/3. Let's look at the eigenvalues and eigenvectors of each fixed point for these parameter values.

- ZERO FIXED POINT: the Jacobian matrix is:

    {{-10., 10., 0}, {28., -1, 0}, {0, 0, -2.66667}}

which has the following approximate eigenvalue, eigenvector pairs, as found by using Eigensystem:

    e1 = -22.8 ,    E1 = ( -0.61,  0.79, 0. ) ,
    e2 =  -2.7 ,    E2 = (  0.,    0.,   1. ) ,
    e3 =  11.8 ,    E3 = (  0.42, -0.91, 0. ) .

Since Max[Re[e[i]]] = 11.8 > 0, the fixed point is strongly unstable. The stable subspace is the plane consisting of all linear combinations of E1 and E2, while the unstable subspace is the one-dimensional line consisting of multiples of E3.

- NONZERO FIXED POINT: the Jacobian matrix is:

    {{-10., 10., 0}, {1, -1, -8.5}, {8.5, 8.5, -2.7}}

when evaluated at the fixed point X2* ~= ( 8.5, 8.5, 27. ).
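Here is a hedged sketch of how these Jacobian matrices and their eigensystems can be generated (the variable names are my own; the parameter values are the chaotic ones quoted above):

    (* Jacobian of the Lorenz vector field, evaluated at the zero and
       nonzero fixed points for r = 28, s = 10, b = 8/3 *)
    lorenzF = { -s x + s y, -y + r x - x z, -b z + x y };
    jac     = Outer[ D, lorenzF, {x, y, z} ];
    params  = { s -> 10., r -> 28., b -> 8/3 };
    Eigensystem[ jac /. params /. { x -> 0, y -> 0, z -> 0 } ]
    xs = Sqrt[ b (r - 1) ] /. params;     (* ~= 8.485, i.e., the 8.5 above *)
    Eigensystem[ jac /. params /. { x -> xs, y -> xs, z -> 27. } ]   (* z* = r - 1 *)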
Eigensystem applied to this matrix yields the following approximate eigenvalue, eigenvector pairs:

    e1 = -13.9 ,            E1 = ( 0.86, -0.33, -0.40 ) ,
    e2 = 0.094 + 10.2 I ,   E2 = ( 0.27 + 0.30 I, -0.032 + 0.57 I, 0.72 ) ,
    e3 = 0.094 - 10.2 I ,   E3 = ( 0.27 - 0.30 I, -0.032 - 0.57 I, 0.72 ) .

The stable subspace around this fixed point lies along the direction of eigenvector E1, and anything aligned along this direction gets pulled in towards the fixed point. There is a two-dimensional unstable subspace U consisting of linear combinations of the eigenvectors E2 and E3. The imaginary part of the complex eigenvalues is strongly nonzero, with a magnitude of 10.2, corresponding to an oscillation with a period of about 2 Pi / 10.2 ~= 0.62. This is, in fact, the approximate peak-to-peak distance of a time-series trace of any one of the Lorenz variables when plotted as a function of time.

These eigenvalues and eigenvectors explain part of the butterfly wings of the Lorenz attractor as commonly shown in textbooks. Initial conditions get pulled in towards the fixed point along the axis E1 until the orbit enters the plane defined by the vectors E2, E3, at which point the orbit spirals out with angular frequency 10.2 until the lowest-order Taylor-series description fails and the orbit jumps from one fixed point to the other.

BIFURCATIONS OF FIXED POINTS OF FLOWS AND OF MAPS

- From our analysis of the linear stability of fixed points, we can predict when a fixed point will bifurcate to some other kind of dynamics, and even some details of the transient behavior as an instability turns on. Everything boils down to the behavior of the eigenvalues e[i] of the Jacobian matrix Jacobian[F](X*) evaluated at some particular fixed point of the flow or map involving the vector function F. The eigenvalues are functions of the matrix elements of Jacobian[F](X*), which, in turn, are functions of any parameters that appear in the equations. For flows, something interesting happens when Re[e[i]] changes sign. For maps, something interesting happens when |e[i]| becomes larger than one.

- A complementary question to identifying states is how the states qualitatively change their behavior as some parameter is varied, e.g., between exponentially growing, exponentially decaying, and purely periodic. Most mathematical equations have parameters (like length L, damping d, and acceleration g for the damped pendulum x'' + d x' + (g/L) Sin[x] = 0), which relate the equations to the "real world". Scientists are often more interested in HOW solutions qualitatively change as parameters are varied than in specific details of the solutions themselves. E.g., we may be interested in when there is periodic behavior, with less interest in the actual value of the period.

- I will just give a non-mathematical, descriptive summary of what happens to a constant-coefficient linear flow as ANY matrix element is varied. You can verify these conclusions by looking at analytical solutions, or by numerical integrations (which you will learn how to do later in this course):

  - For most changes of parameters, one gets the same qualitative behavior but with quantitatively different values. For example, a decaying oscillatory state will typically continue to be decaying and oscillatory, but both the decay rate and the oscillation frequency may change slowly with the parameter. Recall for the damped linear oscillator:

    x'' + d x' + w^2 x = 0 ,

the analytic solution:

    x(t) = Exp[ -(d/2) t ] Cos[ w_h t + phase ] ,    w_h = Sqrt[ w^2 - (d/2)^2 ] .
So as the damping constant d is varied, most of the time nothing qualitative happens; one continues to get decaying oscillations. For d/2 = w (critical damping), there is no overshoot of the x = 0 solution, and the solution is given by:

    x(t) = Exp[ -(d/2) t ] ( c1 + c2 t ) ,

damping with a linearly growing coefficient. The linear term involving t, known as a SECULAR TERM, is typical of solutions of linear dynamical systems whose matrices do not have linearly independent eigenvectors. For overdamping, d/2 > w, there are no oscillations, just pure exponential decay with two different real decay constants, given by:

    (d/2) ( -1 +/- Sqrt[ 1 - (2w/d)^2 ] ) .

This is standard material in most introductory mathematics books on odes, or physics books on mechanics.

- As some parameter is varied, the ratio of any two frequencies (imaginary parts of different eigenvalues) will change smoothly, passing back and forth between commensurate and incommensurate values. There is nothing special in linear systems that can force two given frequencies to stay in a rational ratio. So the dynamics will switch smoothly between periodic and quasiperiodic behavior.

- For SPECIAL (quite rare) values of the matrix elements, there is a BIG change in the behavior; e.g., an exponential decay may become purely oscillatory and then start exponentially growing as some parameter is varied, or some purely oscillatory behavior will start decaying or growing. Just think of x' = a x for a < 0, a = 0, and a > 0.

- This qualitative summary makes an important prediction for how complexity increases or decreases in a constant-coefficient linear dynamical system as some parameter is varied: TYPICALLY one will see only ONE frequency change its behavior at a time; complexity can NOT change arbitrarily as any knob on the system is varied. A particular frequency may stop decaying, or a particular frequency may suddenly appear. What is very unlikely to occur is for five or ten frequencies to change their behavior when a single parameter knob is turned.

HYPERBOLIC MATRICES (FIXED POINTS) FOR FLOWS

- To help understand the kinds of behavior that occur as parameters are varied, it is often useful to draw a geometric plot in the complex plane of the N eigenvalues e[i] of the NxN matrix M. More specifically, one treats each eigenvalue

    e[i] = x[i] + I y[i]

as defining a point (x[i], y[i]) in the complex plane with axes x and y. One then gets N points, and the possible dynamics depend on where those points are located (a small Mathematica sketch of such a plot appears below). Let's discuss the cases of flows and maps separately.

- For a flow X' = M X, any initial condition will die out over time (the only non-transient behavior is the zero vector) if all real parts of the eigenvalues are negative:

    Re[ e[i] ] < 0    for all i .

- A matrix M associated with a flow is called HYPERBOLIC if none of its eigenvalues e[i] has exactly a zero real part, i.e., all components in the eigenvector superposition are either decaying or growing; none are purely oscillating. Thus:

DEFINITION: A matrix M is hyperbolic if Re[e] != 0 for all eigenvalues e belonging to M.

DEFINITION: A fixed point X* of a flow is hyperbolic if the Jacobian matrix Jacobian[F](X*) is hyperbolic when evaluated at X*.

- One can then try to classify the kinds of hyperbolic matrices by considering the locations of the different kinds of eigenvalues. This helps to explain the many different ways in which a fixed point can go unstable, leading to different kinds of mathematical behavior.
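Here is the promised sketch of such an eigenvalue plot (the matrix is an illustrative one of mine, chosen to have a complex conjugate pair and one real eigenvalue):

    (* plot the eigenvalues of a matrix as points in the complex plane *)
    mx  = { {-1., 5., 0.}, {-5., -1., 0.}, {0., 0., 2.} };
    pts = { Re[#], Im[#] } & /@ Eigenvalues[ mx ];    (* -1 +/- 5 I and 2 *)
    ListPlot[ pts, PlotStyle -> PointSize[0.02], AxesLabel -> {"Re[e]", "Im[e]"} ]

For a flow, hyperbolicity means that no point falls exactly on the imaginary axis, and stability means that all points lie strictly in the left half plane.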
EXAMPLE 1: ONE-DIMENSIONAL CASE

The 1d linear flow is x' = a x, with a a real number. The matrix M is a 1x1 matrix whose single element a is also its eigenvalue, e1 = a. So there are only TWO different kinds of hyperbolic matrices:

    e1 < 0    all states decay ,
    e1 > 0    all states grow exponentially.

Pretty simple.

EXAMPLE 2: TWO-DIMENSIONAL FLOW CASE

For the 2d linear flow X' = M X with M a 2x2 real matrix, there are two eigenvalues e1 and e2. They are either both real or, if complex, they are complex conjugates of each other. There are then FIVE kinds of hyperbolic matrices and five kinds of fixed points, corresponding to the five kinds of dynamical solutions that one will get for any given initial condition:

    Case 1: e1, e2 < 0                   all solutions decay exponentially
    Case 2: e1 < 0 < e2                  one component decays, the other grows
    Case 3: 0 < e1, e2                   both components grow
    Case 4: complex pair, Re[e1] < 0     oscillatory decay
    Case 5: complex pair, Re[e1] > 0     oscillatory growth

These different kinds of hyperbolic fixed points have been given colorful names. Case 1 is called a "SINK" since all nearby initial conditions are sucked into the fixed point, just like water going down a drain. Case 2 is called a SADDLE because one direction is attracting and the other repelling, just like the motion of a drop of water on the saddle of a horse. Case 3 is called a REPELLOR since all initial conditions are repelled away from the fixed point. Finally, Cases 4 and 5 are called stable and unstable SPIRALS since there is a characteristic spiral oscillatory motion of the dynamics towards or away from the fixed point. Strogatz gives a similar discussion in Section 5.2, summarizing the different kinds of hyperbolic fixed points in Fig 5.2.8 on page 137, classified as a function of the matrix elements of the 2x2 matrix.

For higher-dimensional cases, the possibilities proliferate to the point that it is not so convenient to assign names to all the different kinds of hyperbolic fixed points. One would still identify sinks (all eigenvalues decaying) and repellors (all eigenvalues growing), but there are different kinds now depending on whether various eigenvalues are complex or not.

EXAMPLE 3: THREE-DIMENSIONAL CASE

For a 3d flow X' = M X with M a 3x3 real matrix, there are now three eigenvalues e1, e2, and e3. As you learn in the present homework assignment, any real matrix of odd size has to have at least one real eigenvalue, and so there are two possibilities:

    e1, e2, e3 all real ;
    e1 real, with e2, e3 complex and conjugates of each other.

There then turn out to be EIGHT distinct kinds of hyperbolic matrices and fixed points, which are the following (in Cases 5-8, e2 and e3 = Conjugate[e2] are the complex pair):

    Case 1: e1, e2, e3 < 0       all initial states decay (sink)
    Case 2: e1, e2 < 0 < e3      two components decay, one grows (saddle)
    Case 3: e1 < 0 < e2, e3      one component decays, two grow (saddle)
    Case 4: 0 < e1, e2, e3       all components grow (repellor)
    Case 5: e1, Re[e2] < 0       all components decay, the complex pair with oscillatory decay (sink)
    Case 6: e1 < 0 < Re[e2]      one component decays, the complex pair grows with oscillation (saddle)
    Case 7: Re[e2] < 0 < e1      oscillatory decay, exponential growth (saddle)
    Case 8: 0 < e1, Re[e2]       all components grow, the complex pair with oscillation (repellor)

Each one of these cases illustrates how the solution growing out of some initial condition will evolve in time. For the Lorenz equations with parameter values r = 28, s = 10, b = 8/3, the fixed point X* = (0,0,0) at the origin is Case 2 (a saddle with a 2-dimensional stable manifold), while the two nonzero fixed points are of type Case 6, with a two-dimensional unstable manifold with oscillatory growth.
CHALLENGE: How many distinct kinds of 4x4 hyperbolic matrices are there for flows?

HYPERBOLIC MATRICES (FIXED POINTS) FOR MAPS

- Given a linear map X[i+1] = M X[i], we define the matrix M to be hyperbolic if all of its eigenvalues have magnitude distinct from one, i.e., |e[i]| != 1 for all i. It is then again useful to plot all the eigenvalues of M in the complex plane, which aids in identifying the possible dynamical behavior of a linear map; for maps, the dividing set is the unit circle |e| = 1 rather than the imaginary axis. Let's consider a few cases:

CASE 1: ONE-DIMENSIONAL LINEAR MAP

Such a map has the form x[i+1] = e x[i] with e real. The 1x1 matrix M consists of a single real element e, which is also its only eigenvalue. There are then FOUR kinds of dynamics corresponding to four kinds of hyperbolic matrices M (or corresponding fixed points x*):

    e < -1         oscillatory growth
    -1 < e < 0     oscillatory decay
    0 < e < 1      monotonic decay
    1 < e          monotonic growth

CASE 2: TWO-DIMENSIONAL LINEAR MAP

The 2x2 real matrix of a 2d linear map has two eigenvalues e1 and e2, which can either be both real or both complex and complex conjugate to each other. We then get the following TWELVE possible kinds of hyperbolic matrices, ten with real eigenvalues:

    e1, e2 < -1                both components grow with oscillation
    e1 < -1 < e2 < 0           one component grows with oscillation, one decays with oscillation
    e1 < -1 < 0 < e2 < 1       oscillatory growth and non-oscillatory decay
    e1 < -1 , 1 < e2           oscillatory growth and non-oscillatory growth
    -1 < e1, e2 < 0            both components decay with oscillation
    -1 < e1 < 0 < e2 < 1       both components decay, only one with oscillation
    -1 < e1 < 0 , 1 < e2       one decays with oscillation, one grows without oscillation
    0 < e1, e2 < 1             both decay without oscillation
    0 < e1 < 1 < e2            one decays, the other grows, neither with oscillation
    1 < e1, e2                 both grow, neither with oscillation

and two with a complex conjugate pair:

    | e1 | < 1 , Im[e1] != 0   decay with oscillation
    | e1 | > 1 , Im[e1] != 0   growth with oscillation

CHALLENGE: How many distinct kinds of hyperbolic matrices occur for a 3x3 matrix associated with a 3-variable map?

THE LANDAU HYPOTHESIS FOR THE TRANSITION TO TURBULENCE

- Until about 1970, our conclusions based on constant-coefficient linear dynamical systems were also thought to be correct for NONLINEAR dynamical systems, especially the Navier-Stokes equations describing the flow of viscous fluids. The idea was roughly that a linear instability would introduce oscillatory or exponential behavior, and that nonlinearities would serve mainly to turn off the exponential growth but not to stop oscillations if present. The oscillations could then again become unstable to further oscillations as some parameter was varied, with the new frequencies being incommensurate with previous frequencies. This would continue further and further until one had an extremely complex signal built up of many incommensurate frequencies all oscillating at the same time. For technical reasons that I will explain later in the course, we should consider all modes with rationally related frequencies to be only one dynamical mode, so that only the number of incommensurate frequencies counts in determining the complexity of a time signal.

- This scenario for the transition to turbulence is known as the LANDAU HYPOTHESIS, after the famous theoretical physicist Lev Davidovich Landau (1908-1968). Specifically, he postulated that fluid turbulence, and complex dynamics in general, was the result of a LARGE number of successive bifurcations, each introducing more incommensurate frequencies. You can read this hypothesis in Landau's own words in the classic book "Fluid Mechanics", L. D. Landau and E. M. Lifshitz, Pergamon Press, 1975, Section 27.
Although a reasonable guess, and partially supported by plausible mathematical arguments, this hypothesis is experimentally found not to hold in any known laboratory system!!!

- Relatively recently, in 1972, two pure mathematicians, David Ruelle of France and Floris Takens of the Netherlands, gave a surprising argument that predicted that the Landau hypothesis was fundamentally wrong. Their arguments suggested that there should be a sharp, abrupt transition to chaos (bounded nonperiodicity) after at most THREE successive bifurcations that introduced three successive incommensurate frequencies. Such an abrupt transition is, in fact, what is often observed in many experiments, with certain qualifications. The theorem that predicted this novel behavior is known as the Ruelle-Takens theorem. Warning: the Ruelle-Takens paper is hard to read for non-pure-mathematicians.

- The Landau hypothesis and the Ruelle-Takens theorem raise the important practical issue of identifying the frequencies present in an empirical time series. This can be done routinely, up to some accuracy, using POWER SPECTRA, which we will talk about soon when we begin our discussion of the Gollub/Benson paper.

----------------------------------------------------------------
The following comments are supplemental: useful but not central to our discussion, so read them for your pleasure and insight.

WHY "TYPICAL" MATRICES ARE NON-DEFECTIVE

- If you choose a "typical" NxN real matrix, in the sense that its matrix elements are all chosen randomly and uniformly from the interval [-1,1], then:

  - the matrix is almost certainly nonsingular;

  - the eigenvalues are almost certainly all different, and so the matrix is almost certainly non-defective.

One can see these results intuitively by observing that the eigenvalues of an NxN matrix satisfy the Nth-order characteristic polynomial, whose real coefficients are complicated functions of the matrix elements. In order to have a singular matrix, at least one of the roots of that polynomial must be zero, which imposes a single condition on the coefficients of the polynomial. But this corresponds to a lower-dimensional set of points in the space of all matrix coefficients, which has measure zero (probability zero) in that space. So typical matrices are nonsingular. Similarly, the condition that some eigenvalues coincide also imposes one or a few algebraic conditions, which can be satisfied by only a measure-zero set of matrix coefficients. Thus "most" matrices have distinct eigenvalues and so have the desired property of possessing a complete set of N linearly independent eigenvectors. This result is, of course, completely useless if you have a specific matrix to worry about, e.g., a matrix that arises from a specific physics or engineering problem. If you are mathematically inclined and want to see a rigorous argument, check out M. W. Hirsch and S. Smale, "Differential Equations, Dynamical Systems, and Linear Algebra" (Academic Press, New York, 1974).

HOW DOES ONE FIND EIGENVALUES?

- Analytically, there is only one way to find the eigenvalues of an NxN matrix M, which is to calculate the characteristic polynomial:

    p(e) = det( M - e I ) ,

and to find (somehow) the N roots of this Nth-order polynomial p(e).
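For a small matrix, Mathematica can carry out this characteristic-polynomial calculation explicitly; here is a sketch with an illustrative 2x2 matrix of my own choosing:

    (* eigenvalues as roots of the characteristic polynomial, 2x2 case *)
    m = { {2, 1}, {1, 2} };
    p[ e_ ] = Det[ m - e IdentityMatrix[2] ];
    Solve[ p[e] == 0, e ]     (* -> {{e -> 1}, {e -> 3}} *)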
This unfortunately only works in practice for 2x2 matrices, because of two difficulties:

- Although the roots of a 3rd- or 4th-order polynomial can be written down in closed form as algebraic functions of the polynomial coefficients (by analogy to the quadratic formula), these formulas are exceedingly complicated and give little insight.

- Rather deep mathematical theory (Galois theory) says that for 5th-degree or higher polynomials, it is generally impossible to write down the roots as algebraic functions of the coefficients, and so analytical progress is simply not possible.

- If you can't proceed analytically, you might be tempted to proceed by finding the roots of the polynomial p(e) numerically, e.g., by using numerical algorithms such as bisection, Newton's method, or the secant method. However, it has been well known since the 1940s (due to pioneering work of Wilkinson, a giant in numerical analysis) that the roots of polynomials are generally ill-conditioned functions of their coefficients, i.e., small relative errors such as round-off error in a polynomial coefficient lead to large relative errors in the values of the roots. So finding eigenvalues numerically as roots of the characteristic polynomial is a BAD THING, and no one does it this way.

- Fortunately and impressively, numerical analysts have worked out iterative algorithms that generate a sequence of matrices M -> M1 -> M2 -> ... that converges to a matrix containing the eigenvalues explicitly, WITHOUT EVER CALCULATING THE CHARACTERISTIC POLYNOMIAL. (See the book "Matrix Computations" by Golub and Van Loan for a mature discussion of these algorithms, although there are good discussions in almost any graduate-level numerical analysis text.) These matrix-based algorithms are "robust" in the sense that you get roughly the same numerical answers even if there are small errors in the matrix elements or round-off errors. They are also fairly efficient. These matrix-based iterative algorithms have been programmed into Mathematica, and you need only type a single word to find the eigenvalues of a given matrix; e.g., you can type

    Eigenvalues[ Table[ Random[], {2}, {2} ] ]

which will give the eigenvalues of a 2x2 matrix with elements chosen randomly in the interval [0,1].

- A key thing that you have to know about the numerical algorithm on which Eigenvalues is based is that the time it takes to compute the eigenvalues grows as the CUBE of the size of the matrix (as N^3 for an NxN matrix, for N sufficiently large; one would say that the computational effort is O(N^3) for large N, pronounced "big oh of N cubed"). This means that it takes more and more time to find the eigenvalues of larger and larger matrices. Mathematica on a DEC or NeXT workstation is probably good for eigenvalues of 500x500 matrices, which will take several minutes.

- You can establish the cubic scaling for yourself with the Mathematica function Timing, which calculates and prints out the CPU time used to evaluate a given Mathematica expression. (Please look up the Timing function in the Mathematica 3.0 book.) E.g., try the following code:

    times = {} ;
    For[ i = 50, i < 351, i += 50,
      AppendTo[ times,
        Timing[ Eigenvalues[ Table[ Random[], {i}, {i} ] ] ][[1]] ] ]
    ListPlot[ ( times / Second )^(1/3.) ]

You should get an approximately straight line, since the time raised to the 1/3 power should be roughly linear in the size i of the matrix.