Physics 213 Lecture Notes 10-22-03

REVIEW FROM LAST LECTURE

- Introduced and illustrated the definitions of dissipative, conservative, and expansive orbits X(t) of a flow X'(t)=F(X) in terms of the time-averaged divergence of the Jacobian matrix:

   < Div[F(X)] > = < Trace[Jacobian[F](X)] > ,

which is negative, zero, and positive respectively. Historically, the concept of conservative (energy-conserving) systems came first as the giants of classical mechanics such as Lagrange and Hamilton worked out deep ways to generalize Newton's laws of motion. Later, their formalism was extended to non-conserving systems and it was quickly appreciated that a key distinguishing feature was that infinitesimal phase space volumes were no longer conserved. As with many important mathematical definitions, the concept did not appear out of thin air but grew out of several centuries of theoretical reasoning.

- We applied this definition to several systems, e.g., to a damped harmonic oscillator and to the Lorenz equations. For problems which have a natural concept of friction, < Div[F(X)] > is negative because of that friction, but our definition is much more general and can be applied to problems for which there is no natural concept of energy or friction or forces.

- We derived the result that

   (1/dV) d(dV)/dt = Div[F(X)] ,

that the divergence of the vector field F(X) driving the flow gives the relative rate of change of infinitesimal volumes in phase space. (A short Mathematica sketch of this divergence computation for the Lorenz equations is given below, just before the statement of the dissipation criterion for maps.)

DISSIPATION CRITERION FOR MAPS:

- Just as we defined phase space for a first-order autonomous flow, we can define a phase space for any N-dimensional map X[i+1] = G( X[i] ). Provided G(X) is a smooth vector field as a function of the vector X, we automatically get existence and uniqueness of orbits through any initial condition X[0] in phase space. Existence just follows from the fact that we can evaluate G(X) explicitly. Uniqueness follows from the fact that G(X) is a single-valued function, hence can produce only a single output for any vector input.

- The main difference of orbits from maps, compared to orbits from flows, is that the map values jump from one point in phase space to another with successive iterations. Periodic orbits of maps are really just cycles in which a finite number of points are visited in some order. Quasiperiodic or more complex (chaotic) orbits correspond to continuous curves or sets of points that are filled out in the limit of infinitely many iterations. Note that although one jumps from point to point in phase space under a map, any given point is still a continuous and differentiable function of the initial data, just as for flows. This follows since

   X[i+1] = G( X[i] ) = G(G( X[i-1] )) = G(G(...G(X[0])...)) ,

the "i+1" term can be written explicitly as i+1 applications of the vector function G to the initial data X[0].

- Just as we defined dissipation for flows to be average contraction of phase space volumes, we can introduce a similar definition for maps. We consider some small ball of points in phase space centered on a given point of an orbit X[i] and take each point in the ball to be an initial condition for a new orbit. The whole ball then jumps around in phase space (remaining centered on X[i+1], then X[i+2], etc.) but deforms because of the different directions and magnitudes of the vector field G(X).
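Before writing down the analogous criterion for maps, here is a minimal Mathematica sketch (added here, not part of the original notes) that makes the flow criterion from the review concrete for the Lorenz equations mentioned above; the standard parameter names sigma, r, and b are kept symbolic.

   (* minimal sketch: the divergence of the Lorenz vector field is the trace of
      its Jacobian; sigma, r, b are the standard parameters, kept symbolic *)
   f[{x_, y_, z_}] := {sigma (y - x), r x - y - x z, x y - b z};
   vars = {x, y, z};
   jac = D[f[vars], {vars}];      (* 3x3 Jacobian matrix of the vector field *)
   Tr[jac]                        (* gives -sigma - 1 - b *)

Because the trace is a negative constant (for positive sigma and b), no orbit averaging is even needed here: every orbit of the Lorenz flow is dissipative.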
- The analogous dissipation definition for maps X[i+1] = G(X[i]) is that the average-over-orbit magnitude of the determinant of the Jacobian of G(X) is less than one:

   < | Det[ Jacobian[ G(X[i]) ] ] | >  <  1   for dissipative orbits
   < | Det[ Jacobian[ G(X[i]) ] ] | >  =  1   for conservative orbits
   < | Det[ Jacobian[ G(X[i]) ] ] | >  >  1   for expansive orbits

where again explicitly

   < A(X) > = (1/n) Sum[ A(X[i]) , {i,1,n} ] ,

the average over n iterations of the orbit. If the average is larger than 1, the map is unphysical, although still possibly interesting as a mathematical problem.

- For maps, the infinitesimal phase space volumes do not evolve continuously with time so we must have a different relation giving the infinitesimal volume dV[i+1] at iteration i+1 in terms of dV[i]. I claim that

   dV[i+1] / dV[i] = | Det[ Jacobian[ G(X[i]) ] ] | ,

the ratio of successive infinitesimal volumes is determined by the magnitude of the determinant of the Jacobian matrix; this involves the product of the eigenvalues of the Jacobian, as opposed to their sum as for flows. Dissipative orbits then again correspond to an average shrinking of phase space volumes along the orbit. Note the central role of the Jacobian matrix Jacobian[G](X) in the determination of dissipation: for flows, the average sum of the eigenvalues occurs while for maps, the average product of the eigenvalues is used.

EXAMPLES OF DISSIPATIVE MAPS

- Example 1. The logistic map, x[i+1] = a x[i] (1 - x[i]), is a 1d map with

   | Det[ Jacobian[G] ] | = a | 1 - 2 x | ,

where 0 <= x[i] <= 1 for all i, and 0 <= a <= 4 for orbits to be bounded. To be dissipative, the parameter a needs to satisfy the inequality

   < | 1 - 2 x[i] | >  <  1 / a .

For small a, a < 1, the map is dissipative for all initial conditions since the left side can not be bigger than 1 if x[i] lives in the interval [0,1]. For larger a, there are dissipative and non-dissipative orbits and the criterion is harder to evaluate except numerically (a short numerical sketch of this average is given after the Henon example below). The logistic map has two fixed points, x=0 and x=1-1/a. At the fixed point x=0, the determinant of the Jacobian has the value "a" and so this fixed point is dissipative provided a < 1. At the other fixed point, x=1-1/a, the dissipation criterion becomes "|2-a| < 1" and the fixed point is dissipative only for 1 < a < 3. So the two fixed points can't be simultaneously dissipative.

- Example 2. The Henon map lives in a two-dimensional phase space and is defined by

   x[i+1] = y[i] + 1 - a x[i]^2 ,   y[i+1] = b x[i] ,

see Section 12.2 (pages 429-434) of Strogatz. The Jacobian matrix is

   Jacobian[G] = { {- 2 a x[i] , 1}, {b, 0} }

and this has determinant

   Det[ Jacobian[G] ] = - b ,

which has the constant magnitude |b|. If this is less than one, |b| < 1, the Henon map is dissipative and contracts phase space volumes. From |dV[i] / dV[i-1]| = |b|, we deduce that phase space volumes contract geometrically as |b|^i with successive iterations.

- If |b| = 1, the map is conservative and phase space volumes are conserved along orbits. If |b| > 1, the map is unphysical and most orbits are unbounded. Pictures of the iterations of this map are given in Strogatz, Section 12.2, pages 429-434. One finds a complex orbit (believed but not known yet rigorously to be a strange attractor) for the parameter values (a=1.2, b=0.3) and for most initial conditions.
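As a minimal numerical sketch of the criterion for Example 1 (an illustration added here, not from the notes), the following Mathematica fragment iterates the logistic map at the illustrative parameter value a = 2.5 with initial condition 0.4 and compares the orbit average < |1 - 2 x[i]| > with 1/a:

   (* minimal sketch: numerically test < |1 - 2 x[i]| > < 1/a for the logistic map;
      the parameter a = 2.5 and the initial condition 0.4 are illustrative choices *)
   a = 2.5;
   orbit = NestList[a # (1 - #) &, 0.4, 10000];   (* x[0], x[1], ..., x[10000] *)
   avg = Mean[Abs[1 - 2 orbit]];                  (* orbit average of |1 - 2 x[i]| *)
   {avg, 1./a, avg < 1./a}

For a = 2.5 the orbit quickly settles onto the fixed point x = 1 - 1/a = 0.6, where |1 - 2x| = 0.2 < 1/a = 0.4, so the test returns True, consistent with the fixed-point criterion |2 - a| < 1 quoted above. Rerunning with a value of a that gives a chaotic orbit (e.g., a = 4) shows the criterion failing.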
- Example 3: Four-stage Runge-Kutta numerical algorithm for integrating flows:

   K1 = dt F[ X[t] ]
   K2 = dt F[ X[t] + (1/2) K1 ]
   K3 = dt F[ X[t] + (1/2) K2 ]
   K4 = dt F[ X[t] + K3 ]
   Xnew = X[t] + (1/6) ( K1 + 2 K2 + 2 K3 + K4 )

This famous algorithm (invented around 1900) takes an initial vector X(t) to a new vector Xnew that approximates the analytical solution X(t+dt) at time t+dt, the value at the next time step. The weights of the vectors K1, K2, K3, and K4 are chosen so that the Taylor series of each component of the vector Xnew matches the Taylor series of the vector X(t+dt;X(t)) to the highest possible order. The intermediate function evaluations K1,...,K4 are called "stages" by numerical analysts. With four stages, one can match Taylor series up to fourth order but not better, so that the local truncation error behaves as

   || X(t+dt) - Xnew || = O(dt^5) ,

which goes to zero with the FIFTH power of the time step. This is pretty good accuracy, e.g., if all quantities are of O(1) and dt=0.1, then the numerical error is of order dt^5 = 10^-5.

Note: The coefficients of the vectors K1, K2, K3, and K4 are derived from a set of nonlinear equations whose number is smaller than the number of coefficients, and so there is an infinity of ways to choose these coefficients to achieve O(dt^5) accuracy. This implies that there is much flexibility in how to select a Runge-Kutta method, e.g., one that satisfies some extra constraint such as energy conservation.

- You can see that the 4-stage Runge-Kutta algorithm for integrating a set of odes is a complicated but well-defined map, X[i+1] = G( X[i] ), where we label Xnew = X[i+1] and X(t) = X[i]. So numerical integration of a flow dX/dt=F(X) is accomplished by evaluating some cleverly chosen map. We can then ask whether this map is conservative or dissipative for the harmonic oscillator, a conservative flow (the damped harmonic oscillator of the last lecture with damping coefficient d=0). Here is the calculation in Mathematica:

   f[ x_List ] := { x[[2]], - x[[1]] }                 (* oscillator vector field *)
   x0 := {x1, x2}                                      (* initial data *)
   k1 := dt f[ x0 ]                                    (* first stage *)
   k2 := dt f[ x0 + (1/2) k1 ]                         (* second stage *)
   k3 := dt f[ x0 + (1/2) k2 ]                         (* third stage *)
   k4 := dt f[ x0 + k3 ]                               (* fourth stage *)
   xnew := Expand[ x0 + (1/6) ( k1 + 2 k2 + 2 k3 + k4 ) ]
   jacobian = Transpose[ { D[xnew, x1], D[xnew, x2] } ]
   Det[ jacobian ]

which gives the following simple result when the dust clears:

   Det = 1 - (1/72) dt^6 + (1/576) dt^8 .

The map is dissipative if the determinant has magnitude less than one, conservative if equal to one, and expanding if the magnitude is greater than one.

- For small time steps, the negative dt^6 term dominates the smaller positive dt^8 term and the Runge-Kutta map is weakly dissipative, even though the flow it is trying to integrate is conservative. This is bad news: the numerical approximation effectively introduces a small but finite numerical friction and the numerical integration corresponds to a weakly damped oscillator that slowly loses energy. This explains your earlier homework problem, that the global error grows slowly but eventually stops growing when the numerical oscillator has decayed to zero. The global error does not decay for the driven damped oscillator of the homework, x'' + 4 x' + x = 4 Cos[t], since the numerical dissipation of the Runge-Kutta algorithm is very small compared to the already existing dissipation of the flow, and the driving force on the right side continuously overcomes the dissipation.
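Here is a minimal numerical sketch (not in the original notes) that checks this numerical friction directly: it iterates the Runge-Kutta map for the same oscillator vector field f used above and compares the decay of the "energy" x1^2 + x2^2 with the decay predicted by the determinant; the step size dt = 0.5 and the 1000 iterations are illustrative choices.

   (* minimal sketch: iterate the RK4 map for the harmonic oscillator and compare
      the measured energy decay with the factor predicted by the determinant *)
   f[x_List] := { x[[2]], - x[[1]] };               (* same oscillator field as above *)
   rk4[x_, dt_] := Module[{k1, k2, k3, k4},
      k1 = dt f[x]; k2 = dt f[x + k1/2];
      k3 = dt f[x + k2/2]; k4 = dt f[x + k3];
      x + (k1 + 2 k2 + 2 k3 + k4)/6];
   dt = 0.5; n = 1000;
   orbit = NestList[rk4[#, dt] &, {1., 0.}, n];
   measured = (Last[orbit].Last[orbit])/(First[orbit].First[orbit]);
   predicted = (1 - dt^6/72 + dt^8/576)^n;
   {measured, predicted}                            (* both are about 0.81 *)

For this linear problem the energy is multiplied by exactly the determinant at each step, so the two numbers agree to rounding error; the slow decay is the numerical friction described above.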
- For the harmonic oscillator, there is a special time step dt = 2 Sqrt[2] = 2.82843... (a LARGE value comparable to the period of the oscillator) for which the map is conservative, while for dt > 2.82843 the map will increase phase space volumes and the iterations should diverge. So we get a specific prediction for the largest time step with which divergence is avoided. You might be tempted to choose dt=2.82843 to make the algorithm conservative, but this time step is so large that one ends up with no numerical accuracy (the Taylor series does not converge).

- The fact that numerical integration algorithms become conservative for magical time steps is well known and important to scientists who study satellites and planetary motion. These scientists often study numerical energy conservation as a function of the time step, and find numerous values of the time step for which dE/dt is approximately zero. They then do their numerical integration at one of these special values. See papers of Jack Wisdom and collaborators at MIT, e.g., "Chaotic Evolution of the Solar System," Gerald Sussman and Jack Wisdom, Science 257, 3 July 1992.

- People have discovered numerical methods for integrating flows of interacting particles that not only conserve energy exactly, but also momentum and angular momentum (which should also be preserved for a closed system of particles).

DERIVATION OF PHASE SPACE VOLUME CHANGES: MAPS

- Consider a map X[i+1] = G( X[i] ), where the vector field G(X) is smooth. Consider an infinitesimal parallelepiped in N-dimensional phase space defined by N tiny vectors V1,...,VN with base at X[i]. The infinitesimal volume dV[i] of the parallelepiped spanned by the vectors V1, ..., VN is then given by the determinant of the matrix whose columns are the vectors:

   dV[i] = Det[ {V1, ..., VN} ] ,

where the subscript i denotes that this is at the ith iteration of the map G(X). I will let you figure out how to prove this result by induction but hopefully you have seen this before.

CHALLENGE: Assume that you have K < N N-dimensional vectors in an N-dimensional phase space. Together with the origin as base point, these K vectors span a K-dimensional parallelepiped. Prove that the K-dimensional volume of this parallelepiped is given by

   volume_K = Sqrt[ Det[ M^T M ] ] ,

where M = {V1,...,VK} is an NxK rectangular matrix whose columns are the given vectors, M^T denotes the KxN matrix transpose of M, and where M^T M denotes the matrix product of M^T with M and so is a KxK matrix. Neat!

- Now under the map G(X), X[i] goes to X[i+1] and X[i] + V1 goes to

   G(X[i] + V1) approx G(X[i]) + Jacobian[G](X[i]) . V1 ,

where I have used the vector form of Taylor's theorem to lowest (linear) order in the small quantity V1. We see that each vector Vi is simply stretched and twisted by the Jacobian matrix to lowest order in a Taylor series. We have used the assumption that the vectors Vi are small to invoke the Taylor expansion.

- The volume of the new infinitesimal parallelepiped is given by the same formula:

   dV[i+1] = Det[ {M.V1, ..., M.VN} ] ,

where I have used the matrix M to abbreviate the notation for the Jacobian matrix Jacobian[G](X[i]). But elementary linear algebra gives

   {M.V1, ..., M.VN} = M . {V1, ..., VN} ,

the matrix with column vectors M.Vi is the same as the matrix product of M with the matrix {V1,...,VN}. And since the determinant of a product of matrices is the product of the determinants, we see that

   dV[i+1] = Det[M] dV[i]   or   | dV[i+1] / dV[i] | = | Det[ Jacobian[G](X[i]) ] | .

Quod erat demonstrandum.
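As a quick numerical illustration of this result (a sketch added here, not part of the notes), one can map a tiny parallelogram forward under the Henon map of Example 2 and check that its area shrinks by the factor |Det[Jacobian]| = |b|; the base point, the edge vectors, and the classic parameter values a = 1.4, b = 0.3 are illustrative choices.

   (* minimal sketch: check dV[i+1]/dV[i] = |Det[Jacobian]| for the Henon map by
      mapping a tiny parallelogram forward one iteration *)
   a = 1.4; b = 0.3;
   g[{x_, y_}] := {y + 1 - a x^2, b x};
   p = {0.3, 0.2}; v1 = {10.^-6, 0.}; v2 = {0., 10.^-6};
   area[u_, w_] := Abs[ Det[ {u, w} ] ];            (* area spanned by two 2d vectors *)
   ratio = area[g[p + v1] - g[p], g[p + v2] - g[p]] / area[v1, v2];
   {ratio, Abs[b]}                                   (* both are 0.3 to high accuracy *)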
CONCEPTS OF ATTRACTORS, INVARIANT SETS, BASINS OF ATTRACTION:

- We now come to a key insight for the entire course: dissipation is a crucial concept in the mathematical theory of nonequilibrium systems because it greatly reduces the effective size of the space of initial conditions X0 in solutions X(t;X0;P):

DISSIPATIVE SYSTEMS HAVE TRANSIENT BEHAVIOR THAT DECAYS, LEADING TO A STATISTICALLY STATIONARY STATE WHOSE STATISTICAL PROPERTIES ARE INDEPENDENT OF MOST INITIAL CONDITIONS.

The transient behavior is an immediate consequence of volume contraction: any small cluster of initial conditions defines a volume that shrinks to zero along an orbit. The transient is over when the volume is effectively zero (or below any observable limit). Thus the long-time properties of dissipative systems are weakly dependent on the choice of initial condition. This is an enormous and useful mathematical simplification compared to conservative systems, for which there are no transients and for which small changes in the initial conditions can lead to statistically different solutions.

- Note the crucial point that it is the STATISTICAL properties, not the detailed properties, that are independent of the initial condition. Thus all initial conditions except x1=x2=0 lead to a limit cycle in the Van der Pol oscillator x'' + (x^2 - eps) x' + x = 0, so the final statistical properties are those of a periodic state (a limit cycle). The period, amplitude, power spectrum, etc. are all the same. But the PHASE of the limit cycle, i.e., when you fall onto this attractor, does depend on the initial condition, so the final solutions are not identical. More precisely, it is the asymptotic set of points traced out by the orbit X(t) in the limit of infinite time that is the same.

- With this definition of dissipative systems, dissipative dynamical systems have ATTRACTORS in bounded regions of phase space. I.e., initial conditions get "pulled" or "sucked" into certain regions of phase space of zero volume and settle on some set of points in phase space that is INVARIANT under the equation. An invariant set is one in which any initial condition in that set leads to an orbit that remains in that set, i.e., the set is unchanged under the dynamics.

- The set of all points in phase space that are attracted to a given attractor is called the BASIN OF ATTRACTION. This is an invariant set since any point in the basin moves onto another point that is also in the basin, because all such points eventually fall into the attractor. Basins of attraction often have infinite volume in phase space, even when the resulting attractor has zero volume. For some simple systems, e.g., the damped harmonic oscillator or the Van der Pol equation, all of phase space except for a few points (e.g., any fixed points) constitutes the basin of attraction. More generally, dynamical systems typically have many attractors in different parts of phase space and the phase space is broken up into different basins, each of which leads to a given attractor. This intricate structure is often quite important for the understanding of some given physical or engineering system.

EXAMPLES OF INVARIANT SETS

- Any fixed point is a dynamical invariant since if you start on this set, you stay on this set for all time.

- Any limit cycle (isolated periodic orbit or loop in phase space) is an invariant set. If you choose any initial condition on the limit cycle, you can't leave it: the orbit simply cycles around the loop periodically.
- Consider the 3d flow of the recent homework assignment for three interacting bird species, which was of the form

   dx/dt = x f1(x,y,z) ,   dy/dt = y f2(x,y,z) ,   dz/dt = z f3(x,y,z) .

Then the three planes x=0, y=0, and z=0 are invariant sets such that any initial condition lying in these sets generates an orbit under this flow that never escapes these planes. This is a simple consequence of the existence and uniqueness theorem: the vector field F(X) is smooth everywhere and if z=0 for some initial condition (x0,y0,0), then z stays zero forever by the third equation. Thus through each initial condition in the z=0 plane there is an orbit that lives within that plane.

Note: These invariant planes are NOT attractors since initial conditions outside of these planes are not pulled into these planes in the limit of infinite time. The fact that these planes are invariant answers one of the recent homework questions: an initial condition (n1(t),n2(t),n3(t)) in the first positive octant of the phase space ( ni(t) > 0 for i=1,2,3 ) can never get out of this octant. This follows since the invariant planes are already filled with orbits that lie in those planes forever. Thus any orbit outside of those planes would have to intersect one of these in-plane orbits in finite time in order to cross a plane, which is not allowed by the uniqueness theorem. You now see how this idea generalizes: if you have ANY invariant set S that forms an (N-1)-dimensional surface in an N-dimensional phase space, then it is impossible for orbits outside of that surface to cross that surface.

- Mathematicians often add a further restriction to the assumption that an attractor is an invariant set. They also assume that the set is TOPOLOGICALLY TRANSITIVE (ergodic in the language of physicists). This means that there is at least one orbit on the attractor that comes arbitrarily close to all other points on the attractor. This property is obvious for fixed points, limit cycles, and N-tori, but rather subtle and non-obvious for strange attractors. There are many other mathematical subtleties about how to define these concepts but it turns out that the technical details and subtleties rarely seem to be important on a day-to-day basis for scientists and engineers working on nonlinear problems.

EXAMPLES OF ATTRACTORS:

- If you type any number into a pocket calculator and hit the COS key repeatedly (make sure that you are using radians, not degrees!), you always end up with the same number, approximately 0.73908513... Mathematically, hitting the COS key repeatedly is the same as iterating the map x[i+1] = Cos[ x[i] ]. I will leave it to you as a moderate challenge to show that this fixed point x* = 0.73908513... is a GLOBAL attractor: all values x0 on the real line lead to a sequence that converges to x*. The basin of attraction is then the entire real line and the only attractor is x*.

- For the logistic flow, x' = x (1 - x), there are two fixed points that are invariant under the flow. The point x=1 is an attractor since there is a basin of attraction containing many points, which lead to orbits that terminate at x=1. The basin of attraction consists of all points with x > 0. Points with x < 0 lead to orbits that diverge to minus infinity.

- For the logistic map x[i+1] = 4 x[i] (1 - x[i]), most orbits are topologically transitive in the interval [0,1], i.e., they come arbitrarily close to all points in [0,1].
Strictly speaking, [0,1] is NOT an attractor since its basin of attraction is no larger than the set itself: there is no bigger set of initial conditions outside [0,1] that is pulled onto it. This is an example of a strange but not fractal invariant set. ("Strange" here means that there is exponential magnification of perturbations to initial data, i.e., there is a positive Lyapunov exponent lambda_1.)

- The Van der Pol oscillator x'' + (x^2 - eps) x' + x = 0 has two invariant sets for eps > 0: the fixed point x1=x2=0 at the origin and some limit cycle C that encloses the origin. C is an attractor with a basin of attraction consisting of all points inside C except the fixed point at the origin, together with all points outside of C.

- The Lorenz equations have a "strange" attractor for r >approx 25. Many points end up on the same geometric set, which has zero volume but infinite area, and on which the dynamics is sensitive to the choice of initial conditions. The basin of attraction is somewhat complex, but has infinite volume in the 3d phase space. There are three fixed points which are invariant but are not attracting for the classical value r=28.

EXAMPLE OF ATTRACTORS: CHAOS AND FLIPPING A COIN

- A natural question that might have occurred to some of you is whether the randomness of getting heads or tails from flipping a coin has anything to do with chaos. The answer is NO! A simple model for flipping a coin is that you give it an initial torque (flicking it with your thumbnail) to get it spinning and then send it up into the air. The center of mass of the coin then follows a parabolic path while the coin spins at a constant angular frequency about its axis. If you flip the coin into a pail of sand (for simplicity), then whether you get heads or tails just depends on the orientation of the coin when it first hits the sand. The equation of motion of the spinning falling coin is a LINEAR ode:

   d(theta)/dt = omega ,

where omega is the constant angular rotation rate determined by how hard you flicked it. I.e., you can think of the coin as simply spinning at a constant rate from the time it leaves your hand until it hits the floor. The angle simply grows linearly with time,

   theta = omega t + theta0 ,

and whether you get heads or tails simply depends on the initial angle theta0 (zero if you hold the coin flat) and the initial strength of the flick omega (which is very hard to control and is where all the randomness comes from). In fact, flipping a coin is a conservative dynamical system to an excellent approximation (except for the final collision with the floor when the coin comes to a halt) and so there can not be any attractors. The randomness then arises from the fact that there are many tiny closely spaced sets of initial conditions that lead to heads or tails. There is no exponential stretching of nearby initial points in phase space and so chaos is irrelevant here. For a detailed and interesting discussion of this point, you might like to read the following paper: Joseph B. Keller, "The probability of heads," Am. Math. Monthly 93, 191-197 (March 1986).

NEWTON'S METHOD FOR ROOTS AS A DYNAMICAL MAP

- Isaac Newton's famous and important method for finding roots of a function of a single variable f(x) is given by the following iterative map,

   x[i+1] = x[i] - f(x[i]) / f'(x[i]) = g( x[i] ) ,   with g(x) = x - f(x)/f'(x) .

This gives a dissipative map that converges to an attractor (a root of the function f(x)) for many but not all initial conditions.
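As a minimal sketch of Newton's method viewed as a map (an illustration added here, not from the original notes), the following Mathematica fragment applies the Newton map to f(x) = Cos[x] - x, whose root is the same number 0.7390851... found earlier by pressing the COS key repeatedly; the starting value 0.5 is an arbitrary choice.

   (* minimal sketch: the Newton map g(x) = x - f(x)/f'(x) for f(x) = Cos[x] - x *)
   f[x_] := Cos[x] - x;
   g[x_] := x - f[x]/(-Sin[x] - 1);        (* f'(x) = -Sin[x] - 1 *)
   NestList[g, 0.5, 5]                     (* converges to 0.7390851332... in a few steps *)

Compare this with the dozens of iterations that the plain COS map needs to reach the same accuracy; the reason for the speed is explained just below.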
This generalizes to N-dimensional systems of equations F(X)=0 as follows:

   X[i+1] = X[i] - Inverse[Jacobian[F](X[i])] . F(X[i]) ,

which involves solving an NxN set of linear equations associated with the NxN Jacobian matrix of F(X). With your expert knowledge of multivariate Taylor series, you can derive this N-dimensional case in a flash. Assume that you have an initial vector X[i] that approximates a solution of the vector equation F(X)=0, and assume that a tiny correction vector H, when added to the initial guess X[i], will give you exactly zero:

   F(X[i] + H) = 0 ,   || H || << 1 .

Taylor expanding to lowest order in the small quantity H around the point X[i] gives

   F(X[i]) + Jacobian[F](X[i]) . H = 0 ,

which is precisely the Newton iteration formula: you solve for the small vector H and then add H to X[i] to get a better approximation.

You can easily see that any fixed point of the Newton map is indeed a zero of f(x) provided that f'(x) is not also zero at the same time, i.e., x* is a simple root. Indeed, assuming x[i]=x* is a fixed point we have

   x* = x* - f(x*) / f'(x*) ,   f'(x*) != 0 ,

which implies f(x*)=0. You can also see that the Newton map g(x) = x - f/f' is strongly dissipative near a root for any function f(x) since its Jacobian (the derivative g'(x)),

   g'(x) = f(x) f''(x) / (f'(x))^2 ,

evaluated at a simple zero of f(x), gives the value zero and so definitely has magnitude less than one. The vanishing of the Jacobian matrix at a fixed point is a rather rare and special occurrence for dissipative maps and is called SUPERCONVERGENCE. This vanishing explains a remarkable property of Newton's method: when it converges, it converges with lightning speed; each successive iteration approximately doubles the number of significant digits in the answer!

CHALLENGE: Check this rapid quadratic convergence by iterating the Newton map for the quadratic equation x^2 - 2 = 0 to obtain Sqrt[2].

- We can now put on our nonlinear dynamical hats and think of Newton's method as a dynamical map. The roots are then attractors and it is then quite interesting to figure out what the basins of attraction are: what initial conditions flow to a given zero of F(X)=0 under the Newton iteration? The bad news is that typically the basins of attraction for the roots of a Newton method (and in fact for nearly any nonlinear dynamical system) can be QUITE complex and therefore nearly impossible to characterize. The good news is that there is much unexpected geometric beauty associated with the basins of attraction.

Let's try an explicit example by trying to find the three cube roots of unity satisfying

   z^3 - 1 = 0

by Newton's method. The three roots are

   z1 = 1 ;
   z2 = Exp[2 Pi I / 3] = Cos[2Pi/3] + I Sin[2Pi/3] = - (1/2) + I Sqrt[3]/2 ;
   z3 = Exp[4 Pi I / 3] = Cos[4Pi/3] + I Sin[4Pi/3] = - (1/2) - I Sqrt[3]/2 ;

which live at 120 degrees w.r.t. each other in the complex z plane. We can try to locate these roots by using Newton's method as follows (the method generalizes straightforwardly to complex-valued maps):

   z[n+1] = z[n] - ( z[n]^3 - 1 ) / (3 z[n]^2) .

The initial condition must now have a nonzero imaginary part to converge to any root other than z1=1. By symmetry and common sense, one would expect that the set of all initial values z0 that are closer to some given root z* than to the other two roots will be in the basin of attraction of z*. If true, then the basins of attraction would be three symmetric wedge-shaped regions bounded by straight lines.
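You can test this expectation with a minimal numerical sketch like the following (an added illustration, not part of the original notes): it iterates the complex Newton map from a few arbitrary starting points and reports which of the three roots each one ends up nearest to.

   (* minimal sketch: which cube root of unity does the Newton map converge to? *)
   newton[z_] := z - (z^3 - 1)/(3 z^2);
   roots = {1, -1/2 + I Sqrt[3]/2, -1/2 - I Sqrt[3]/2};
   whichRoot[z0_] := Module[{z = Nest[newton, N[z0], 50]},
      First[ Ordering[ Abs[z - #] & /@ roots, 1 ] ]];   (* index of the nearest root *)
   whichRoot /@ {2., -1. + I, -1. - I, 0.1 + 0.8 I}     (* arbitrary starting points *)

Classifying a fine grid of starting points this way and coloring each point by the returned index is how pictures of these basins of attraction are generated.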
In fact, the basins of attraction for each of the three roots are intricately woven objects of stunning complexity: they are fractals. These basin boundaries have the bewildering WADA PROPERTY (named after a Japanese mathematician who first pointed out this possibility): each point on the boundary of one basin of attraction is also on the boundary of the other two basins of attraction, a difficult situation to imagine indeed but rather common in nonlinear dynamics.

Note: I can't show the figures easily in my notes but I show two slides of basins of attraction of roots in class. There is also a color picture in Gleick's Chaos book for the basins of the fourth roots of unity.

ATTRACTORS ARE ZERO VOLUME SETS:

- Attractors have zero N-dimensional volume in N-dimensional phase space. This is simple to see: enclose the bounded attractor in some finite union of N-dimensional boxes. Then follow where all the points in these boxes go. Since the flow is dissipative, the boxes shrink to zero volume, and yet these shrinking boxes always contain the attractor. Thus

   D < N ,

where D is a dimension of the attractor and N is the phase space dimension.

- The decreased dimensionality means generally that it takes FEWER numbers than N to describe dynamics on the attractor: the state vector is effectively reduced in complexity. Thus a fixed point has no degrees of freedom, no matter what size phase space it lives in. A limit cycle has one degree of freedom: one can parameterize the cycle by a single parameter that is 0 at some starting point and increases smoothly to 1 once around the loop. Thus any limit cycle is effectively one dimensional since a single parameter can characterize all its points, e.g., arc length. Similarly, a torus is two-dimensional since two parameters (poloidal and toroidal angles) specify all its points. Things are a bit more complicated for strange attractors, as we shall soon see.

- Decreasing volume does NOT mean attractors are necessarily simple objects like points or loops. Volumes can shrink to zero in many ways, including rapid expansion in some directions compensated by stronger contraction in others. We will see that this typically leads to fractal objects of non-integer dimension, e.g., intermediate between that of an area and that of a volume. Volumes in phase space are stretched, twisted, and folded over and over again (boundedness!), leading to strange attractors.

BASIN BOUNDARIES, STABLE AND UNSTABLE MANIFOLDS, AND HOMOCLINIC TANGLES

- Consider an unstable periodic orbit that corresponds to an unstable fixed point of some Poincare map G(X). If the stable and unstable manifolds W^s and W^u of this fixed point intersect in a homoclinic point, then there must be an infinity of such intersection points. Further, the manifolds must wind back and forth furiously to pass through successive homoclinic points as the points converge towards the fixed point in forward and backward time.

CHALLENGE: Show that stable manifolds can not intersect themselves, nor can unstable manifolds intersect themselves. Show that the stable manifold of one fixed point can not intersect the stable manifold associated with a different fixed point but can intersect the other fixed point's unstable manifold.

- You can find a brief discussion of homoclinic tangles in Section 4.3 of Ott's book "Chaos in Dynamical Systems" and a more mathematical and detailed discussion in the advanced book "Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields" by J. Guckenheimer and P. Holmes.
REVIEW OF POINTS SO FAR

- Dissipation of orbits greatly simplifies the analysis of dynamical systems by greatly decreasing the importance of initial conditions.

- A set A in the phase space of a dynamical system X'=F(X) or X[i+1]=G(X[i]) (for a flow and map respectively) is called an ATTRACTOR if:
  - There are orbits X(t;X0) outside of the set A which end up in the set A in the limit of infinite time, t -> Infinity.
  - The set A is INVARIANT, meaning that it is unchanged by evolving the points of the set as initial conditions. More casually, any point in A leads to an orbit that stays in A for all time.
  - The set A is minimal, i.e., no strict subset of A has the above properties.

- The set of all initial conditions X0 that produce an orbit X(t;X0) that ends up on a particular attractor A in the limit of infinite time is called the BASIN OF ATTRACTION of the attractor.

- An attractor is called chaotic if there is average exponential separation of orbits starting from initial conditions on the attractor that are infinitesimally close to one another.

Note: An attractor is called "strange" if the attractor is fractal with a non-integer dimension D. A strange attractor is NOT necessarily chaotic and a chaotic attractor is not necessarily fractal. E.g., the logistic map on the phase space [0,1],

   x[i+1] = r x[i] (1 - x[i]) ,

has a strange but non-chaotic attractor at the special value r* that is the accumulation point of all the period-doubling bifurcations. For the value r=4, the map is chaotic with Lyapunov exponent lambda_1 = Log[2] > 0 but the chaotic orbits fill the interval [0,1] densely and so there is no attractor (or, if one regards the full unit interval as the attractor, it is not strange).

- Attractors have zero volume in the N-dimensional phase space of the dynamical system (average contraction of phase-space volumes).

- Attractors and basin boundaries are often complicated wrinkled fractal sets. In particular, it is essentially impossible to characterize the set of initial conditions that will approach a given attractor, e.g., because of the Wada property.

- Discussed briefly the concepts of stable and unstable manifolds, denoted by W^s and W^u respectively, for a fixed point. Something truly extraordinary happens if W^s intersects W^u at a point for a map: there is an infinity of intersections and the stable and unstable manifolds must wind back and forth in an extremely complex way. The resulting mess is called a homoclinic tangle and helps to explain how complex chaotic dynamics can arise from smooth innocent-looking flows.

NEURAL NETWORKS: CONCEPTUAL APPLICATION OF ATTRACTORS TO NEUROBIOLOGY

- The concepts of attractors and basins of attraction are already sufficient to lead to many interesting new insights. The power of these concepts is already enough to justify the study of nonlinear dynamics, whether or not exotic subjects like chaos and fractals prove important to you.

- A growing and important application of nonlinear dynamics is understanding how the brain works, how the 10^10 neurons with 10^13 interconnections of the human brain lead to its remarkable information processing and its motor ("robotic") capabilities (e.g., drawing a picture, playing a musical instrument). One traditional approach to understanding the brain has been "artificial intelligence" (or "artificial stupidity" to its cynics), which guesses that insights obtained from the theory and design of digital computers will help to explain and to design biological-like brains.
A strong motivation for this approach is the Turing-Church hypothesis that all digital computers (i.e., computers that manipulate a finite number of bits with each operation but can have an infinite amount of memory) are equivalent, and so the human brain, as a complex device that is still a computer, must be understandable by the same rigorous logic as digital computers. It is a deep and somewhat ill-defined problem whether physical systems like a human brain actually are equivalent to Turing machines. Certainly, our pattern recognition ability and high-level cognitive abilities far outshine any existing computer and any computer currently being designed (10 teraflops or so).

- A different approach to brain design is biological and Darwinian: rather than worry about logical subtleties of Turing machines and computability, we ask how it is possible that evolution could lead to a brain-like structure out of cellular components. That the brain arose from evolution would imply that there were numerous little steps of incremental competitiveness, and so the giant human brain evolved out of simple components without any particular guiding principles or deep mathematical principles.

- In particular, a biological theory of the brain has to take several facts into account:

  - Brains are made up of unreliable components. Neurons die at a high rate (hundreds of thousands per day for people over the age of 20 or so), are strongly influenced by drugs like alcohol, and yet the brain continues to function effectively. Large amounts of neural tissue can be excised and yet brain functions such as memory or pattern recognition are retained.

  - The brain is massively parallel: many computations of the brain, e.g., the fight-or-flight response, are carried out in about 500 msec, while the fastest neurons operate on a time scale only about 100 times shorter. Given the slow speed of neuronal communication, this implies that only about 100 sequential neuron-to-neuron communication steps can take place by the time such a thought has been carried out. This is possible only if massively parallel computations are carried out.

  - The brain (and biological organisms in general) have to deal with a NOISY environment: one never has clean well-defined objects. E.g., a predator may have camouflage and move stealthily, it may be raining, and it may be near dusk when there is not much light, so discriminating between the background and a predator is difficult.

CONTENT-ADDRESSABLE MEMORY

- One of the most interesting and fairly recent conceptual applications of nonlinear dynamics is the HOPFIELD MODEL OF ASSOCIATIVE MEMORY. Psychologists and computer scientists define associative memory (also called CONTENT-ADDRESSABLE MEMORY) as information that is stored by content, rather than by an index. In simple computer codes, one typically stores numbers or strings in arrays, and accesses this information by looking up an array index, e.g., a[3] = "Mary's lamb". Given the biological structure of brains, it seems rather unlikely that animals store information in this manner since an index like 3 has nothing to do with what is stored, and accessing a particular memory would then be difficult. Instead, people have speculated that organisms store memory by content, so that given partial information (e.g., the word "lamb"), the entire memory can be recovered. This is clearly more useful from an evolutionary point of view, since external stimuli can trigger the appropriate response.
- People (presumably all mammals) have superb associative memories: given a smell, a sound, a touch, an odor, part of a word, part of a sentence, part of a figure, five notes from a song, etc., entire memories can be triggered in great detail. It is not yet known how this is accomplished in terms of actual neurophysiology. John Hopfield (who was at Bell Labs in New Jersey when he wrote his famous 1982 paper and who is presently at Cal Tech) had the insight that associative memory might be a nonlinear dynamical effect that could spontaneously arise by evolutionary selection from interconnected nonlinear neurons. Associations were simply different initial conditions in the same basin of attraction, and the memory (usually a fixed point) was the attractor towards which all points in the basin flowed. Hopfield rightfully became quite famous by showing how to CONSTRUCT nonlinear models whose attractors were given memories, and by being able to prove several important theorems about his model. His paper proved to be enormously stimulating and has led to thousands of other research papers in many different disciplines such as engineering, robotics, computer science, psychology, and physics. There is still much active research about Hopfield models to this day, as you can attest for yourself if you browse through journals in the Vesic Engineering Library or browse the world wide web.

- Hopfield constructed his model from N mathematical neurons that were highly abstracted from actual biological neurons. Each "neuron" was a simple device whose state was discrete, with just two possible values, -1 or 1. The ith neuron was connected to all the N-1 other neurons j with N-1 wires, each of which was weighted by some numerical connection strength t[i,j] (so there are "N take 2" or N(N-1)/2 wires in all for a Hopfield net of N neurons). Hopfield called these connection strengths the t[i,j] matrix and assumed, unrealistically but insightfully and productively, that this matrix was symmetric:

   t[i,j] = t[j,i] ,

that neuron i talks to neuron j with the same strength as neuron j talks to neuron i. In actual neurological tissue, a neuron is a complex electrical-chemical cellular blob with many dendrites ("wires") feeding into the main body and a single axon coming out. The connections of the dendrites to the neuronal body are themselves quite complex. When enough excitatory signals come in, the axon fires, sending action potentials of fixed magnitude but variable frequency down the axon to other neurons. (The outgoing single axon eventually ramifies and touches upon many other neurons.) A single neuron is itself very complicated, so Hopfield's model is a great simplification. This simplification is the essence of good modeling and good theoretical insight and is an important lesson for those of you who are budding theorists: the best progress, in the sense of the deepest insights, often comes from simplifications of a physical system, rarely from direct frontal attacks on the full equations using all available details. We saw this previously in the course, e.g., Lorenz's discovery of his three equations as a caricature of the complicated dynamics of the five three-dimensional nonlinear partial differential equations (the so-called Boussinesq equations) that experiments show to give a quantitatively accurate description over many ranges of parameter values.

- A memory then was defined by some fixed point of the dynamics, i.e., an assignment of -1 or 1 to each neuron.
The associations of that memory are all points in the basin of attraction. There must also be associations that give spurious memories, e.g., the points on boundaries between basins of attraction, but these are of measure zero in phase space.

- The dynamics of the model is remarkably simple. At a random time, each neuron, say the ith neuron, "wakes up" and forms a weighted sum of the states of all other neurons except itself:

   h[i] = Sum[ t[i,j] s[j] , {j != i, 1, N} ] .

The neuron then changes its internal state according to the simple rule:

   If h[i] > 0, change state i to 1
   If h[i] <= 0, change state i to -1

Using the language of neurobiology, this rule implies that positive matrix elements t[i,j] > 0 are "excitatory" and encourage a change to an active state (value 1), while negative matrix elements are "inhibitory" and push the cell towards being inactive (value -1). This random waking up and possibly changing a state repeats over and over again until some fixed point is approximately reached. Thus the rule generates an orbit in a discrete N-dimensional phase space, where N is the total number of bits used to store any memory in the net. Unlike other phase spaces we have discussed previously in the course, this phase space has a finite number of states, namely 2^N patterns, since each neuron has a 1 or -1 state.

- This algorithm is ASYNCHRONOUS and PARALLEL, just as biological networks of neurons seem to be. Furthermore, the stored states are NON-LOCAL: partial destruction of neurons or links has only a modest effect on memory. This nonlocality has been shown in numerous experiments, e.g., Lashley's famous experiments of the 1940's, in which rats could still run mazes correctly with substantial portions of their brains removed. Nonlocality is not universal in the brain: there are special centers for speech, breathing, etc., for which local damage removes the associated capability. Anatomists have identified over 200 specialized regions of brain tissue, and these are conjectured to be fairly independent processing systems.

- Hopfield was able to show that this model had to settle down to one of some finite number of fixed points as long as t[i,j] was a symmetric matrix. The proof is rather simple, based on the idea that if a physical dynamical system is dissipative, one can sometimes find an energy-like quantity E[S], A LYAPUNOV FUNCTIONAL, that decreases along each orbit. If this quantity is bounded from below, all states must eventually reach a stationary state when the minimum value is attained. Thus for the damped harmonic oscillator m x'' + a x' + m w^2 x = 0, there is the total energy

   E = (1/2) m (x')^2 + (1/2) m w^2 x^2 ,

and one easily verifies that E[x',x] decreases monotonically along any orbit except a fixed point:

   dE/dt = m x' x'' + m w^2 x x' = x' ( m x'' + m w^2 x ) = - a (x')^2 ,

with a > 0. But E is bounded from below by zero, since (x')^2 and x^2 can't become smaller than 0, so all dynamics tend towards the fixed point x1=x2=0. This trick works in more sophisticated contexts. Thus the more complicated two-dimensional pde, the Swift-Hohenberg model

   Dt[ psi ] = ( eps - (DEL + 1)^2 ) psi - psi^3 ,

where DEL denotes the Laplacian, has an associated Lyapunov functional

   E[ psi ] = Integrate[ dx dy ( - (1/2) eps psi^2 + (1/4) psi^4 + (1/2) [ (DEL + 1) psi ]^2 ) ]

and using integration by parts, one can show that

   dE/dt = - Integrate[ dx dy ( d(psi)/dt )^2 ] .

Thus the right side is never positive, so E keeps decreasing until a minimum is reached, at which point psi becomes stationary in time.
This proves the remarkable result that Swift-Hohenberg can only have complicated transients leading to stationary states. Again, Swift and Hohenberg did this backwards: they CONSTRUCTED a model to have an associated Lyapunov functional, to aid their physical theory.

- Similarly, Hopfield constructed his model to have a Lyapunov function. Thus he defined the quantity

   E = - (1/2) Sum[ t[i,j] s[i] s[j] , {i,1,N}, {j,1,N} ] .

Although reasonable in hindsight, Hopfield was led to this expression by his background as a condensed matter theorist. This expression corresponds to the energy (Hamiltonian) of a physical material called a "spin glass", which is a magnetically disordered metal, e.g., a dilute concentration of iron atoms randomly dissolved in a copper matrix. Spin glasses form strange physical states that are intermediate between ferromagnets and ordinary metals: the magnetic spins are frozen in place but point in random directions, and there are strange phase transitions as temperature or external magnetic fields are varied. Spin glass physics has been an important branch of theoretical physics in the last 25 years, with implications for mathematical optimization, computer science, and biology.

- The change in energy when the ith neuron wakes up and changes its state is given by

   dE = - d(s[i]) Sum[ t[i,j] s[j] , {j != i} ] ,

where d(s[i]) is the change in the state of the ith neuron. But by construction of the algorithm, if the sum is positive, the change in state d(s[i]) is either zero (1 to 1) or positive (-1 to 1), and if the sum is negative or zero, the change is either zero or negative (1 to -1); in either case the energy must stay the same or decrease. After some finite time, E must reach its minimal value, implying a fixed point. The symmetry of the matrix t[i,j] is essential for this argument. Hopfield showed empirically in his paper that small asymmetric perturbations of a symmetric matrix did not change his conclusions of convergence towards a fixed point.

- Another important contribution of Hopfield was to show how to construct the connection matrix t[i,j] explicitly so that desired memories or patterns occur at the fixed points. If we denote by V^s_i the n desired patterns, s=1,...,n, each consisting of bits -1 and 1 at the sites i=1,...,N of the N neurons, then Hopfield's prescription is:

   t[i,j] = Sum[ V^s_i V^s_j , {s, 1, n} ]   for i != j
   t[i,i] = 0                                for i = j

You can think of these bits as pixels (black or white) on a screen, or as some quantization of data into certain kinds of objects.

- In order to work, Hopfield's memories V^s_i have to be approximately ORTHOGONAL, another non-biological requirement. To see this, we can ask whether the above matrix and algorithm do indeed preserve any of the given states V^s_i if one starts in that state. Thus assume that the entire network is in the mth memory, V^m, and assume that the ith neuron wakes up to form its input sum and evaluation. Then this neuron constructs the sum

   h[i] = Sum[ t[i,j] V^m_j , {j, 1, N, j != i} ]
        = Sum[ V^s_i Sum[ V^s_j V^m_j , {j,1,N} ] , {s, 1, n} ] .

If the bit vectors V^s are approximately orthogonal (their bits being roughly randomly chosen to be -1 and 1), we can approximate

   Sum[ V^s_j V^m_j , {j,1,N} ] approx N delta[s,m] .

Substituting this, we see that h[i] approx N V^m_i, which has the same sign as V^m_i, and so the state V^m is approximately a fixed point of the dynamics.

- The requirement that the states be roughly orthogonal is not too bad for a few states, but as one tries to store more and more memories, new memories are less and less orthogonal to existing memories and behave less and less well as fixed points.
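Putting the prescription for t[i,j] and the asynchronous update rule together, here is a minimal Mathematica sketch of a toy Hopfield network (an added illustration, not from the notes); the network size, the random patterns, and the number of corrupted bits are all arbitrary choices.

   (* minimal sketch of a toy Hopfield net: store two random patterns with the
      prescription above, then recover one of them from a corrupted version *)
   n = 25;
   patterns = Table[RandomChoice[{-1, 1}, n], {2}];
   t = Sum[Outer[Times, v, v], {v, patterns}];
   Do[t[[i, i]] = 0, {i, n}];                       (* t[i,i] = 0 as prescribed *)
   update[s_] := Module[{i = RandomInteger[{1, n}], snew = s},
      snew[[i]] = If[t[[i]].s > 0, 1, -1];          (* the asynchronous update rule *)
      snew];
   corrupted = patterns[[1]];
   corrupted[[1 ;; 5]] = -corrupted[[1 ;; 5]];      (* flip the first 5 bits *)
   recalled = Nest[update, corrupted, 1000];
   recalled == patterns[[1]]                        (* typically True for 2 stored patterns *)

Because t[i,i] = 0, the self-term in the dot product t[[i]].s vanishes, so it is just the weighted sum h[i] over the other neurons. With only two stored patterns the corrupted state almost always falls back into the correct basin of attraction, which is the content-addressable recall described above.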
There are technical ways to eliminate orthogonality altogether as a requirement (see various books by Kohonen, a Finnish theorist who has made many important contributions to neural networks). Biologically, one may conjecture that all important memories are COMPRESSED, i.e., redundant bits are removed (e.g., 100 1's would be replaced by the number of 1's, a much smaller amount of storage). Compression tends to make bits more random and hence the memories more orthogonal.

CHALLENGE: Try working out a compression scheme and verify that states of -1's and 1's become more orthogonal under compression (look at the binary values of the compressed data as stored on the computer).

REVIEW OF RECENT POINTS:

- Discussed background and context of John Hopfield's neural network model of a content-addressable (associative) memory. Some key points:

  - Biological brains are massively parallel and asynchronous, and work with slow, faulty components.

  - Biological organisms need to process information that is often noisy; precise logical algorithms are not going to work. As one example, one can not use gradient methods to optimize some feature of sensor input since noise prevents local derivatives from being well-defined.

  - Hopfield's key insight was to store memories as stable fixed points and to use the basins of attraction as the "associations" of the fixed points. He figured out a clever parallel asynchronous scheme with simple computing elements that stored patterns (memories) in the "N take 2" interconnection strengths t[i,j] of N neurons whose states could be 1 or -1. Given P patterns of N pixels (i.e., P Boolean vectors of length N), Hopfield showed how to calculate the t[i,j] connection elements so that the given patterns were stable fixed points of the asynchronous parallel algorithm.

- I strongly recommend that you try playing with a Java or other code example to get a direct feeling for how a Hopfield network performs. Try storing several letters (e.g., A, B, C, and D) in a 5x5 network and study how various starting values on the 25 pixels converge or don't converge towards these letters. For four memories, can you find spurious fixed points that are stable but shouldn't be there?

STORAGE CAPACITY OF A HOPFIELD NETWORK

- Given N neurons, Hopfield found that his model could only store a finite number of memories, empirically about 0.15 N. This is MUCH smaller than the total number of 2^N possible states. As the number of memories stored starts to exceed about 0.15 N, the efficiency and accuracy of the associative memory start to degrade: it takes longer to reach a given fixed point and more spurious fixed points appear (spurious means that the neural network settles into fixed points other than the intended ones V^s used to construct the t[i,j] matrix).

- REM sleep: Francis Crick, Hopfield, and other researchers conjectured that, if human memory were based on attractors, it would be important to forget information in order to clarify the most important memories, and they speculated that this was the reason for some kinds of dreaming (REM sleep). In the context of the Hopfield model, "clarify" means making memories more orthogonal so that convergence is more rapid and spurious memories are reduced.

SCALE INVARIANCE: A WEAKNESS OF THE HOPFIELD MODEL

- A weakness of Hopfield models: in both vision and hearing, people can extract information from an object that is greatly transformed from some standard shape.
Thus a pencil may be rotated, translated, and rescaled to some arbitrary point in space, the ambient light may be bright or dim and of different colors, yet people have no trouble identifying the pencil (unless it is viewed head on, in which case it looks like a dot). Similarly, people understand language even when spoken loudly or softly or when badly slurred because of colds or drugs. This invariance of understanding under continuous transformations of rotation, translation, or dilation is a remarkable feat of human and animal senses, and is HARD to build into a Hopfield or similar mathematical network. Thus a Hopfield network, if exposed to a single image of a pencil (say a line segment), may converge to the wrong memory if the line segment is rotated 45 degrees or shifted over by a few pixels.

- Scientists conjecture that animals use sophisticated specialized PREPROCESSORS to map an input into some canonical form, and THIS processed input is what is then fed into an associative neural network. Hubel and Wiesel, two famous experimental neurophysiologists, showed that some neurons in the brain often perform edge or motion detection and nothing else, which are certainly some of the properties that a preprocessor would have.

SCALABILITY: HOW TO ACHIEVE IT?

- Another difficulty of Hopfield networks, and in fact of all current approaches to AI, either logical or dynamical, is how to scale from plausible models of a few million neurons and a few thousand memories to human levels of computation, which involve 10^10 neurons and hundreds of megabytes of sound, sight, touch, and taste. Presently, no one knows how to scale up a Hopfield network to perform with that much data.

HOW MUCH MEMORY IS STORED IN AN ADULT HUMAN BRAIN? 200 MB?

- Interjection: How much information do people actually have in their head, i.e., how many memories might be stored in the human brain if we considered the brain to be a giant Hopfield network with 10^10 neurons and 10^13 interconnections? Using several different methods, the best guesses by psychologists approximately agree and give of order 10^9 bits, i.e., about a gigabit or on the order of a hundred megabytes. This is a pretty large amount but still less than the 10^10 neurons in your head and the 10^14 connections between neurons. This order of magnitude estimate is roughly compatible with Hopfield's estimate of 0.15 N, although no one presently believes that the Hopfield model is even close to being biologically correct.

- THOUGHT QUESTION: Does biology use dynamics along the lines of Hopfield, or does it just calculate nearest-neighbor fixed points in phase space according to some metric? The latter seems more straightforward conceptually and mathematically but is awkward to achieve as a spontaneous algorithm arising from evolution: how would you keep track of all the fixed points, from which you would then try to look up the nearest neighbor? Basins of attraction may also have more complicated and interesting shapes, which may be important. Perhaps some of you will succeed in answering this question with future research.

- References:

  - "Neural networks and physical systems with emergent collective computational abilities" by J. J. Hopfield, Proc. Natl. Acad. Sci. USA 79, 2554-2558 (1982).

  - "Unlearning has a stabilizing effect in collective memories" by J. J. Hopfield, D. I. Feinstein, and R. G. Palmer, Nature 304, 158-159 (1983).

  - "How Much Do People Remember? Some Estimates of the Quantity of Learned Information in Long-Term Memory" by Thomas K. Landauer, J. of Cognitive Psychology (1988).
- "Neural Networks" by Simon Haykin (Macmillan, New York, 1994); reasonably up-to-date and readable textbook about mathematical neural networks: REVIEW OF IMPORTANT RECENT POINTS: - Finished our discussion about Hopfield networks. Discussed fact that N-neuron network can store about 0.15N memories before convergence becomes slow or many spurious memories start to appear. - For a N-dimensional flow X'=F(X), a (N-1)-dimensional Poincare map is obtained by using the flow to take one point to a successive point on some (N-1)-dimensional surface (e.g., a hyperplane) that intersects some bounded orbit of the flow. There is an infinity of Poincare maps corresponding to many different ways of choosing the Poincare section or surface but they all have the same dynamics: they are all periodic, quasiperiodic, or chaotic together. FRACTALS AND FRACTAL DIMENSIONS - I now turn to the next two topics of the course: fractals and fractal dimensions. Fractals themselves could (and often do) take up an entire semester of its own because of the wide-spread appearance of fractals in nature and in many artificial phenomena. Here, I address mainly the parts of the subject that help to understand the complexity of a dynamical flow. - The big picture is this. A bounded orbit of some dynamical dissipative system traces out some kind of attractor in phase space, after transients have decayed. It is possible both theoretically and empirically to assign a number to the attractor---its fractal dimension D---that indicates its "complexity". The dimension D is often not an integer, hence the term "fractal" meaning a dimension with non-zero fractional part. A simple mathematical theorem says that any set of fractal dimension D must be contained in a phase space of at least dimension Ceiling[D] (smallest integer larger than D), and so gives a lower bound for the smallest possible phase space that could generate the observed dynamics. Chaotic systems are typically defined to be deterministic systems with D <= 5, i.e., with modest fractal dimension. This number D is a natural way to estimate the complexity of a dynamical system via time series measurements. - A final piece in the puzzle is that one can recover fractal dimensions of attractors from empirical time series WITHOUT KNOWING THE DYNAMICAL SYSTEM. This amazing result holds under general and weak mathematical assumptions, e.g., no experimental noise, a deterministic origin of the signal, and smooth nonlinear transformations of a given signal. As usual with mathematical theorems, there are many delicacies in extracting fractal dimensions from actual time series. You explore some of these subtleties in the last homework assignment. - Our discussion of fractal dimensions of sets and of attractors will complete an important issue raised earlier in the course: how to distinguish time series that have broad-band power spectra. We saw before that power-spectra lose their usefulness when there are broad-band features, nothing substantial seems to change in the spectrum as parameters are varied. Unlike power spectra, the fractal dimension (provided that one can calculate it reliably) does vary with parameters and often in an interesting way. WHAT IS A FRACTAL SET? - Fractals are sets of points that are SELF-SIMILAR or SCALE-INVARIANT (physicist's language). Little pieces of a fractal set, when magnified, look identical under rotation and translation to the original set, i.e., they have no characteristic length scale. 
- A circle is NOT fractal since any small piece, when magnified, is an arc and not a circle. Atoms and people are not fractal: they have a characteristic length scale. A finite set of points can not be fractal since magnifying the vicinity of any one point does not reproduce the original set. A single point, a line, and a plane are self-similar but have a boring geometric structure (integer dimension).
- The essence of the quantitative definition of a fractal is the concept of SCALING, a term given a specific technical meaning by physicists developing the theory of equilibrium phase transitions, but understood intuitively by many people over the centuries. A fractal is a set such that some scalar attribute, e.g., the volume or mass contained in some ball centered on a point of the fractal, varies as a POWER LAW as the radius of the ball shrinks to zero. Nearly any time a power-law behavior is observed physically between two quantities, say y = c x^b for constants c and b, there is likely a fractal involved in the physics.

EXAMPLES:
- BOX-COUNTING DIMENSION of a set S: carve all of space up into non-overlapping cubes of size eps and let N(eps) be the number of boxes that contain at least one point of the set. Then the box-counting dimension D is defined by the scaling relation N(eps) propto eps^{-D} as eps -> 0. (A minimal numerical sketch of this box-counting procedure is given a few paragraphs below.)
- CAPACITY C: somewhat harder to calculate is the capacity of a set S. Instead of carving space into preexisting boxes, independent of the set S under consideration, now let N(eps) denote the MINIMUM number of boxes of radius eps needed to cover the set. Then the scaling relation N(eps) propto eps^{-C} as eps -> 0 defines the capacity, if this limit exists. For many simple sets, the capacity equals the box-counting dimension, but this is rarely true for strange attractors and natural fractals.
- CORRELATION EXPONENT nu: the box-counting and capacity definitions do not take into account the frequency with which different parts of phase space are visited. It turns out that strange attractors are typically quite singular sets from a dynamical point of view, in that nearly all of the time of a given orbit is spent in a small fraction of the attractor, i.e., the different parts of the attractor are not visited with equal probability. A still different dimension---historically called the "correlation exponent" and denoted by the lower-case Greek letter nu---comes from asking how many points of a given set are within distance eps of each other: N(eps) = "# of pairs of points closer than eps". This is a measure-based estimate of the fractal dimension and is often the easiest and most robust to compute numerically, since it automatically takes into account the frequency of visitation of the set.
- There is an uncountable infinity of other fractal dimensions that can be defined. It turns out that all the fractal dimensions are related to a special function D_q (D subscript q), where q is a real number. Thus D_0 is the capacity of a set, D_1 is the so-called information dimension of a set, and D_2 is the correlation exponent of the set. One can prove the inequality D_q >= D_q' for q' >= q, i.e., D_q is a non-increasing function of q. Sets for which D_q is not constant are called MULTIFRACTALS, and this is the most common case in nature.
- True mathematical fractals can not exist in nature, for the simple reason that quantum mechanics provides a cut-off in length scales of order the Bohr radius (about one Angstrom). The best one can say of a natural system is that it exhibits approximate fractal scaling over some RANGE of length or time scales.
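- To make the box-counting definition above concrete, here is a minimal numerical sketch (Python with NumPy is assumed; the function and variable names are purely illustrative, not part of these notes). It counts occupied boxes at several sizes eps and fits the slope of log N(eps) against log(1/eps); applied to a sample of points on a circle, it should return a value close to 1, consistent with a circle not being fractal.

    import numpy as np

    def box_counting_dimension(points, epsilons):
        """Estimate the capacity (box-counting dimension) of a finite sample
        of a set: count the boxes of size eps containing at least one point,
        then fit the slope of log N(eps) versus log(1/eps)."""
        counts = []
        for eps in epsilons:
            # Label each point by the integer coordinates of the box it falls in.
            boxes = np.floor(points / eps).astype(np.int64)
            counts.append(len(np.unique(boxes, axis=0)))
        return np.polyfit(np.log(1.0 / np.asarray(epsilons)), np.log(counts), 1)[0]

    # Sanity check on a non-fractal set: points on a circle should give D near 1.
    theta = np.random.uniform(0.0, 2.0 * np.pi, 20000)
    circle = np.column_stack([np.cos(theta), np.sin(theta)])
    print(box_counting_dimension(circle, [0.1, 0.05, 0.02, 0.01, 0.005]))

The fitted slope depends on the chosen range of eps, which is exactly the scaling-region ambiguity discussed later in these notes.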
- Fractals can also be statistically but not geometrically self-similar, in that magnified pieces have statistical properties identical to those of the original piece. This is the case for the path traced by a particle undergoing Brownian diffusion. The dollar-yen exchange rates were found by you to be statistically fractal, an extremely interesting result.

TWO MATHEMATICAL EXAMPLES OF FRACTALS
- Decimation construction: the middle-thirds Cantor set is obtained by taking the unit interval and recursively wiping out the open middle-third interval of every remaining piece. Each step reduces the remaining length by a factor of 2/3, so the length of the limiting set must be zero. Yet there are uncountably many points in the Cantor set, since these points can be put in one-to-one correspondence with all real numbers in the unit interval whose base-three expansions contain only 0's and 2's (no 1's). This set is self-similar: any small segment of the set can be magnified to obtain a set identical to the original one. We will see that one of the fractal dimensions for this set, the capacity, has the value D = ln(2)/ln(3) approx 0.63093, so this set is closer to being a line (D=1) than a point (D=0). CHALLENGE: to test your understanding of the Cantor set geometry, find a point in the middle-thirds Cantor set that is NOT a boundary point of one of the closed intervals used in taking the limit that constructs the set.
- Similarly, the Koch snowflake is a limiting set obtained by inserting line segments recursively into the sides of an equilateral triangle. This leads to a bounded object of finite area but infinite length, with fractal dimension (capacity) D = ln(4)/ln(3) approx 1.262. You should be able to show that the set obtained from the unit triangle (sides of length 1) must lie in a square of side (2/3) Sqrt[3], so its total area must be less than 4/3.

FRACTALS IN NONLINEAR DYNAMICS
- Attractors of bounded flows and maps are often fractal. NOTE: strange attractors, defined to have sensitive dependence on initial conditions, are NOT necessarily fractal, and conversely a fractal attractor need not be strange. Thus the logistic map has a fractal but non-strange attractor at the accumulation point of period doublings near a=3.57.
- Basins of attraction are often fractal, e.g., the basins of the cubic roots of unity under Newton's numerical root-finding algorithm. For an example from the engineering literature, a periodically forced particle in a two-well potential, see Moon and Li, Physical Review Letters 55, 1439-1442 (1985).
- Sets of parameter values that lead to a given behavior (e.g., a periodic limit cycle) are often fractal.
- These three common occurrences of fractal sets explain why traditional perturbative analytical approaches are doomed to fail in developing theories of nonlinear equations: one can not capture the infinitely fine details of fractals with smooth expressions. This does NOT mean that analytical analysis is hopeless, just that a completely different and nontraditional approach is required.

PHYSICAL EXAMPLES OF FRACTALS
- It is a marvelous and not fully understood fact that fractals are common in many aspects of nature, a point most forcefully put forward by Benoit Mandelbrot of IBM. Some examples:
- mixing of fluids (ink in glycerine)
- clouds (D approx 2.5)
- turbulent fluids (constant-velocity-component surfaces)
- surfaces of mountains
- seashores
- clusters of stars in the sky, clusters of galaxies
- branching in biological systems (arteries, lungs, brain, trees), which clearly arises from the need to maximize the ratio of surface area to volume, i.e., transport.
- earthquakes (Gutenberg-Richter law of intensities)
- Fractal star distributions: check out the MPEG movie available at the site
http://oposite.stsci.edu/pubinfo/education/amazing-space/hdf/intro13.htm
which shows successive magnifications of a part of the sky, made possible by the power of the Hubble space telescope. This movie gives a keen sense of the remarkable distribution of stars throughout the universe.

MECHANISM FOR MAKING FRACTALS: MIXING BY STRETCHING AND FOLDING
- A very important and applied example of fractals arises from the literal MIXING of two fluids by mechanical stirring. If the fluids are immiscible (e.g., oily ink and water), one gets intricate folded sheets with a fractal structure. What is extremely important is that the mixing is not uniform: no matter how long one stirs, there remain many thin regions where no mixing occurs. The fractal dimension gives a measure of the efficiency of mixing.
- Strange attractors are fractal orbits that arise from mixing in phase space, a consequence of boundedness, dissipation, and sensitive dependence on initial conditions. Dissipation implies that phase-space volumes shrink to zero volume, while sensitive dependence implies that the volume contraction is not uniform: some directions stretch greatly while others contract greatly. Boundedness implies folding of phase-space volumes over and over again.

PHYSICAL MECHANISM FOR MAKING FRACTALS: DIFFUSION LIMITED AGGREGATION (DLA)
- A particularly elegant example of a fractal, since the physical mechanism is known and is trivial: random-walk a point on a lattice (of any kind) and let it 'stick' if it comes in contact with a seed point or the growing cluster. Repeat for a large number of points. One obtains a wispy structure of dimension about 1.7, independent of the lattice geometry (e.g., square or hexagonal) and of the random number generator (a minimal code sketch of this growth rule is given below). This is the same dimension observed in many experiments, e.g., in electrochemical deposition used to grow metal crystals at an electrode immersed in a solution.
- DLA fractals have the odd property that their densities DECREASE as more mass is added. Most objects have constant density as they get bigger, e.g., a glass of water or a bar of steel. To see this:
- Each particle has mass 1, so the total mass of N particles is N.
- Fractality implies that N is proportional to L^D, i.e., the number of particles grows with the length L with fractal dimension D.
- The actual volume is L^E, where E is the dimension of space, with E > D.
- Then density = L^D / L^E = L^(D-E), which decreases with increasing size L of the system.
- DLA fractals are relevant to:
- electrochemical deposition
- thin-film morphology
- dendritic solidification
- dielectric breakdown
- viscous fingering.

THE BIG MYSTERY: WHY FRACTALS?
- The mystery is why fractal structures occur at all. In most cases the fractal structure has never been predicted a priori, and is not understood a posteriori. The presence of fractals means that the physics is scale invariant over many length or time scales, but why this property should hold is not understood in any deep way. This is a crucial unsolved mathematical problem. It is irritating that we can't prove that a fractal set exists even for the simplest case of DLA.

HOW TO CHARACTERIZE FRACTAL SETS? USE DYNAMICAL INVARIANTS
- Visual inspection of fractal sets is limited to phase spaces of dimension E <= 3, which is too restrictive. We need more quantitative tools for extracting properties of fractal sets.
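- The DLA growth rule described above is simple enough to state directly in code. The following is a minimal sketch in plain Python (the particle number, launch radius, and kill radius are arbitrary illustrative choices, not values from these notes): each walker is released on a circle just outside the cluster, performs a random walk on a square lattice, and freezes on the first site it visits that touches the existing cluster.

    import math
    import random

    def grow_dla_cluster(num_particles=400, kill_radius=60.0):
        """Grow a diffusion-limited aggregate on a square lattice: release
        random walkers outside the cluster and freeze each walker on the
        first site it visits that is adjacent to the existing cluster."""
        occupied = {(0, 0)}                       # central seed particle
        steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
        cluster_radius = 1.0
        while len(occupied) < num_particles:
            # Launch each walker on a circle just outside the current cluster.
            angle = random.uniform(0.0, 2.0 * math.pi)
            r0 = cluster_radius + 5.0
            x = int(round(r0 * math.cos(angle)))
            y = int(round(r0 * math.sin(angle)))
            while True:
                dx, dy = random.choice(steps)
                x, y = x + dx, y + dy
                if x * x + y * y > kill_radius * kill_radius:
                    break                         # wandered too far away; relaunch
                if any((x + ex, y + ey) in occupied for ex, ey in steps):
                    occupied.add((x, y))          # stick next to the cluster
                    cluster_radius = max(cluster_radius, math.hypot(x, y))
                    break
        return occupied

    cluster = grow_dla_cluster()
    print(len(cluster), "particles, cluster radius about",
          round(max(math.hypot(x, y) for x, y in cluster), 1))

Feeding the resulting (x, y) points into a box-counting estimate like the one sketched earlier should give a dimension of very roughly 1.7, although a few hundred particles is far too few for a clean scaling region.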
- The ability to recover some information about attractors by embedding a time series (or several time series) raises the mathematical question of what information is preserved. Experiments generally measure complicated functions of the original observables, e.g., a light beam may enter a fluid, be scattered, be focused, and then be digitized to get a time series. This means that any quantity useful for characterizing a time series or the reconstructed attractor should be INVARIANT under general smooth NONLINEAR transformations.
- This rules out common concepts like the positions or heights of peaks in a power spectrum, the size of a set, or the period of an orbit, since these are all distorted substantially by measurement.
- Instead, one must study more subtle properties of sets. One simple property that should be preserved is the topological structure of Poincare sections. Thus general smooth mappings should not change a periodic orbit (consisting of a finite number of points) into a nonperiodic orbit, e.g., a continuous curve as occurs for a torus.
- The simplest invariant is the DIMENSION of the set of points constituting the attractor in phase space. There are many ways of defining the dimension of a set, but they all involve studying how some LOCAL property of the set varies as a function of the "diameter" of some ball. Most dimensions are clearly invariant since they are based on counting how many points are in a given ball: this number does not change when the ball and the points inside it are mapped to new points by arbitrary smooth nonlinear transformations.
- Another property is how nearby points separate over time. If the nonlinear transformation is continuous, then points that stay close to each other in the original coordinates clearly stay close in the transformed coordinates. Further, the (exponential) rate at which points separate should be independent of the coordinate system. This observation leads to the idea of LYAPUNOV EXPONENTS, which characterize dynamical properties of flows.
- We have already discussed the first and largest Lyapunov exponent, the average rate of separation of nearby orbits. One can also consider groups of three points spanning a small plane element, four points spanning a small volume, etc., where the points start very close to each other. This leads us to define the average growth rates of lines, areas, volumes, etc., from which we can define further exponents. A system living in an N-dimensional phase space has N Lyapunov exponents lambda[i]. There is a rather extraordinary and beautiful formula that gives a fractal dimension in terms of the Lyapunov exponents (a sketch of this formula is given below). Because of this formula, the Lyapunov exponents are considered the more fundamental invariants of a set.

FRACTAL DIMENSIONS OF A SET
- The common concept of dimensionality, as the number of components of a vector needed to locate a point in space, can be generalized to non-integer values. A set is mathematically called FRACTAL if this generalized dimension is not an integer, i.e., if it exceeds the set's usual (topological) dimension.
- More precisely, your familiar concept of dimension is what mathematicians call the TOPOLOGICAL DIMENSION, which is defined recursively. A point has dimension zero by definition. A line or any one-dimensional set is such that removing a point (a zero-dimensional set) breaks the set into two distinct pieces. A surface or any two-dimensional set is broken into two pieces by removing some one-dimensional set, e.g., a line, and so on. The topological dimension can be tricky to apply, e.g., what is the topological dimension of a set of lines, or of a line together with a square?
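- The formula alluded to in the Lyapunov-exponent paragraph above is usually called the Kaplan-Yorke or Lyapunov dimension (it reappears later in these notes simply as the "Lyapunov dimension"). Here is a minimal sketch, assuming the exponents have already been computed from known equations; the Lorenz exponents used in the example are approximate literature values, not numbers from these notes.

    def lyapunov_dimension(exponents):
        """Kaplan-Yorke estimate of a fractal dimension from Lyapunov
        exponents: with the exponents sorted in decreasing order, find the
        largest j for which lambda_1 + ... + lambda_j >= 0, and return
        D = j + (lambda_1 + ... + lambda_j) / |lambda_(j+1)|."""
        lam = sorted(exponents, reverse=True)
        partial_sum = 0.0
        for j, value in enumerate(lam):
            if partial_sum + value < 0.0:
                return j + partial_sum / abs(value)
            partial_sum += value
        return float(len(lam))  # volumes never contract: dimension of the whole space

    # Approximate published exponents for the Lorenz attractor at its standard parameters.
    print(lyapunov_dimension([0.906, 0.0, -14.57]))   # roughly 2.06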
- CAPACITY, also called the BOX DIMENSION, is obtained by estimating how the number of boxes of size eps in phase space (or Euclidean space) needed to cover all points in the set increases (scales) as the size of the boxes decreases. One defines:
C = Limit[ Log[N[eps]] / Log[1/eps] , eps -> 0 ]
if this limit exists. More intuitively, for many sets N[eps] has a POWER-LAW SCALING form as eps -> 0:
N[eps] approx a eps^{-C}
where a is a constant or a slowly varying function of eps. The exponent C is the usual Euclidean dimension for familiar sets (a point, a line, a plane, a circle, a sphere) but generalizes to more interesting sets.
- The capacity is a weak summary of the properties of a set. Just as lines can have all sorts of different shapes and curvatures yet all have dimension 1, there is an infinity of different fractal sets that all have the same fractal dimension.
- Analytical examples of the capacity: the Cantor set, the Koch snowflake, and the set of points {1/i : i >= 1} have capacities of ln(2)/ln(3), ln(4)/ln(3), and 1/2 respectively. As an example, consider how many boxes of size eps are needed to cover the Cantor set up to M iterations of the constructive algorithm. The first step creates N(eps)=2 pieces of size eps=1/3. The second step creates four pieces, N(eps)=4, of size eps=(1/3)^2. After M recursions, we have N(eps)=2^M and eps=(1/3)^M. We then take the limit M -> Infinity in the definition of the capacity:
D = Limit[ eps -> 0 , Log[N[eps]] / Log[1/eps] ] = Limit[ M -> Infinity , M Log[2] / (M Log[3]) ] = Log[2] / Log[3] approx 0.63093...
You hopefully see the general trick. We have not been rigorous in this argument in that we have not shown that this limit is the same as the limit for the MINIMUM number of boxes of size eps needed to cover the Cantor set; some more work is needed, but the answer remains correct.
- The crucial defining property of any fractal is that some quantity varies as a POWER LAW under some scaling of size. Power laws of any kind are the essential characteristic of scale-invariant, hence fractal, behavior. For example, we can define another fractal dimension, the CORRELATION DIMENSION, by defining a new quantity C(eps) to be the total number of pairs of points in a set that are closer than eps. Then if C(eps) behaves as a power law for small eps, i.e., if
D = Limit[ Log[C(eps)] / Log[eps] , eps -> 0 ]
exists, we again get a fractal dimension, which differs in general from the capacity.
- Similarly, you could try looking at the smallest distance between points in a set as a function of the number of points in the set.
- In the homework, I use still another variation of a fractal dimension, the CLUSTER dimension of a set. This is a kind of inverse of the correlation dimension: instead of counting how many pairs of points lie within distance eps (fixing a distance and then counting), one determines the average radius R(N) of a ball (centered on a given point of the set) that contains N points (fixing a number and then finding a typical distance). Instead of C(eps), we end up with R(N). This again behaves as a power law as the ball radius becomes small, and the scaling exponent gives a dimension that is generally different from both the capacity and the correlation dimension.
- Mathematically, there is no deep significance to the differences among the various (infinitely many) definitions of dimension. HOWEVER, numerically there can be quite large and practical differences in the effort needed to estimate a particular dimension.
The capacity and other geometric dimensions (Hausdorff, potential, etc.) do not converge rapidly enough for time series associated with strange attractors and are not practical for numerical studies. A discussion is given in H. S. Greenside et al., Physical Review A 25, 3453 (1982). A brief explanation is as follows. An orbit fills out a set of points (the attractor) in the limit of infinite time. However, the orbit does not fill out this set uniformly: it visits some parts of the set rarely and other parts more frequently. One can show mathematically (see the analytical example in the Farmer, Ott, and Yorke paper) that attractors are in fact singular in the sense that the orbit spends most of its time visiting a vanishingly small part of the set. For this reason, dimensions such as the capacity are difficult to apply, since one must integrate for extremely long times to count the boxes containing parts of the attractor that are visited quite rarely. The correlation dimension takes this probability of visitation into account automatically and converges faster.

NUMERICAL ESTIMATES OF FRACTAL DIMENSIONS
- The power-law functional form suggests how to identify fractal dimensions numerically. One makes a log-log plot such as
Log[10,N[eps]] versus Log[10,1/eps] , for the capacity, or
Log[10,C[eps]] versus Log[10,eps] , for the correlation dimension,
and then looks for linear behavior in the limit eps -> 0. This leads to all kinds of numerical subtleties, since this limit can not be taken with a finite amount of data. In practice, one plots the local slope of this log-log plot:
D[log(N(eps))]/D(log(1/eps)) versus log(1/eps) ,
where the local slope can be estimated by simple finite differences:
D[log(N(eps))]/D(log(1/eps)) approx ( log(N(eps2)) - log(N(eps1)) ) / ( log(1/eps2) - log(1/eps1) ) ,
where eps1 and eps2 are successive values of the box size for which the box count N(eps) is determined. Wherever a power-law form is obeyed, this local slope becomes a horizontal plateau, and one can read the capacity of the set directly off the vertical axis.
- Dimensions based on invariant measures of flows and maps: the correlation, cluster, and information dimensions, and an infinity of others. If these measure-based dimensions are not all the same, the fractal is called a MULTIFRACTAL. Multifractals have different scaling exponents in different parts of the attractor; they are not homogeneous like the Cantor set or the Koch snowflake.
- Lyapunov dimension, based on the Lyapunov exponents. This is the best way to calculate fractal dimensions when the dynamical equations are known, but it can't be used with just experimental data.

ESTIMATING FRACTAL DIMENSIONS OF TIME SERIES:
- So far, we have assumed that we know a flow or map and that we know the set of points filled out by an orbit in the limit of infinite time. In an experiment or numerical simulation, we may never know the underlying equations or phase space, and we can only measure time series representing some complex combination (projection) of the phase-space orbit.
- Key point of this whole business: attractors can be reconstructed from a single time series, and the phase space dimension of the original flow or map is given approximately by the fractal dimension of this reconstructed attractor. Thus one can estimate how many variables must be used in modeling an experiment or computer simulation WITHOUT KNOWING THE ORIGINAL DYNAMICAL SYSTEM. This provides a powerful empirical test for chaos, i.e., for whether a given time series is produced by a low-dimensional attractor.
- What makes this work are two simple observations. If a set B contains a set A, then dim(A) <= dim(B); thus any phase space B must have at least the dimension of any attractor lying in that space. The other observation is that dim(A) = dim(F(A)) for arbitrary smooth nonlinear maps F(X) of E-dimensional space to E-dimensional space; thus dimensions are preserved under general experiments and changes of coordinates.
- This result has stimulated hundreds of papers in different disciplines that have tried to estimate the fractal dimension of various time series, with the goal of discovering whether there is substantially less complexity than originally assumed.
- The procedure to extract a fractal dimension is as follows (a minimal code sketch of this embedding procedure is given below). First, one embeds the one-dimensional time series in spaces of increasing dimension E, using some choice of delay time. The dimension E is called the embedding dimension and must be increased systematically.
- For each choice of embedding dimension E, estimate a fractal dimension D[E]. It is best not to use the definition of the capacity but rather a definition based on a measure, e.g., the pointwise dimension or some average such as the correlation exponent or the cluster dimension.
- The fractal dimension D[E] is obtained by making a log-log plot and looking for straight-line behavior. Alternatively, and better, it is obtained by plotting the derivative (local slope) of the log-log plot directly, which shows the fine structure. This curve generally shows a lot of structure, which makes the determination of a scaling (power-law) region quite difficult or ambiguous. Even for a 'simple' low-dimensional set like the one generated by the Henon map, one finds that the slope-versus-log plot does not become perfectly flat in the limit of infinitely many data points, because of the singular measure of the Henon set.
- Assuming the ambiguities of the scaling region are resolved (or accepted) in some way, one then plots D[E] versus E. If Df is the actual fractal dimension, one expects:
D[E] = E for E < Df ,
D[E] = Df for E > Df .
Thus it is easy to read off the fractal dimension from this plot.

FINITE-DIMENSIONAL STOCHASTIC TIME SERIES:
- WARNING. Unfortunately, a finite fractal dimension does NOT by itself indicate that the dynamics is deterministic, although it is true that truly random (e.g., independently distributed) sequences have infinitely large fractal dimensions. See the discussion by A. R. Osborne and A. Provenzale, "Finite correlation dimension for stochastic systems with power-law spectra", Physica D 35, 357-381 (1989).
- Stochastic time series can have finite fractal dimensions, e.g., Brownian motion (D=2) or, more generally, fractional Brownian motion, which can have arbitrarily large fractal dimensions with D>1. See Mandelbrot's book or Feder's book for elementary discussions.
- It is possible to have short-time fractal (nondifferentiable) structure that fools algorithms based on long-time recurrence, but the only known examples have power-law power spectra and so are easily identified.

AMOUNT OF DATA NEEDED GROWS EXPONENTIALLY WITH D ==> D <= 7
- For any definition of dimension, the amount of data needed for a numerical dimension estimate grows EXPONENTIALLY with the dimension of the attractor. This is easily seen for the capacity, where the number of boxes of a given size eps needed to cover a set grows as eps^{-C} = Exp[ C log(1/eps) ], i.e., exponentially with increasing C. This means that as a set becomes higher and higher dimensional, it takes more and more points to determine this high-dimensional structure.
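- As a concrete illustration of the embed-and-estimate procedure described above, here is a minimal sketch (Python with NumPy; the delay, the pair of eps values, and the noiseless sine test signal are arbitrary illustrative choices). It builds delay vectors for several embedding dimensions E and estimates a local slope of the correlation sum between two values of eps; for this curve-like "attractor" the slope should stay near 1 for every E.

    import numpy as np

    def delay_embed(series, E, delay):
        """Build E-dimensional delay vectors (x[i], x[i+delay], ..., x[i+(E-1)*delay])."""
        n = len(series) - (E - 1) * delay
        return np.column_stack([series[k * delay : k * delay + n] for k in range(E)])

    def correlation_sum(points, eps):
        """Fraction of distinct pairs of points separated by less than eps."""
        diff = points[:, None, :] - points[None, :, :]
        dist = np.sqrt((diff ** 2).sum(axis=-1))
        n = len(points)
        return (np.count_nonzero(dist < eps) - n) / float(n * (n - 1))

    # Illustrative test signal: a pure sine, whose reconstructed attractor is a closed curve.
    t = np.arange(0.0, 200.0, 0.05)
    x = np.sin(t)
    eps1, eps2 = 0.05, 0.1
    for E in (1, 2, 3, 4):
        Y = delay_embed(x, E, delay=10)[:1000]        # keep the number of pairs manageable
        c1, c2 = correlation_sum(Y, eps1), correlation_sum(Y, eps2)
        slope = (np.log(c2) - np.log(c1)) / (np.log(eps2) - np.log(eps1))
        print(E, round(float(slope), 2))              # crude two-point estimate of D[E]

A real analysis would use many values of eps and plot the local slope as discussed above, since a two-point finite difference hides all of the scaling-region structure.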
- In all studies to date, there is a casual consensus that a fractal dimension D >= 7 can not be estimated accurately, since the amount of data needed in a time series would be tens or hundreds of millions of points, which is too much. This restriction on dimension estimates does not hold for the so-called Lyapunov dimension, but that can ONLY be calculated if the underlying equations are known.
- This should not be too surprising. A given orbit traces out an attractor in the limit of infinite time. If the attractor has a high dimension, one has to follow the orbit longer and longer (more and more points) to generate the points that effectively flesh out the high-dimensional structure. For example, think of an embedded quasiperiodic time series tracing out a donut-shaped attractor. For short times, the orbit is just a line. For longer times, the orbit starts coming arbitrarily close to points that have been generated before, and it is precisely these near repeats, separated by large intervals of time, that slowly flesh out the two-dimensional structure of the attractor.
- What happens if one does not have enough data, e.g., if one tries to estimate the fractal dimension of a 10-dimensional attractor given just 500 points? Unfortunately, one still often observes reasonably unambiguous scaling regions that are invariant under increasing embedding dimension. This is a simple artifact: a small set of points at some resolution will always look approximately like a line or a plane, etc. Thus one has to work very hard and systematically to verify that a dimension estimate is reasonable.

NUMERICAL RESOLUTION GROWS RAPIDLY WITH REYNOLDS NUMBER:
- The Kolmogorov theory of homogeneous turbulence makes an interesting prediction about the effective fractal dimension, or size of phase space, needed to simulate fluid turbulence. A nice summary is given on page 408 of the Manneville book, with further references.
- In many fluid experiments, there is a flow with a characteristic velocity V (e.g., the speed of fluid entering a pipe or the mean flow of air past an airplane wing). The fluid flows past an object with characteristic length L (e.g., the diameter of the pipe or of the wing). The fluid is generally viscous, with a kinematic viscosity nu having dimensions of meter^2/sec. From these three variables, one can form a single dimensionless combination known as the Reynolds number:
Re = V L / nu .
This ratio measures the degree to which the system is driven out of equilibrium, with the numerator proportional to the driving (velocity) and the denominator giving the dissipation.
- The Kolmogorov theory predicts that the fractal dimension of the flow is related to the Reynolds number Re in the following way:
D = a Re^(9/4) ,
i.e., it grows as the 2.25 power of the Reynolds number. This is bad news for fundamental fluid simulations, since typical Reynolds numbers may be of order 10^5, so D may be of order 10^11 (to within the unknown prefactor a), a huge number. In fact, the above relation is an overestimate, but the exact relation is not understood.
- Numerical simulations of pdes such as the Navier-Stokes equations begin by discretizing the spatial domain. In a finite-difference method, e.g., one assumes that all quantities (pressure, temperature, velocity fields) are known only at discrete mesh points. The numerical simulation thus introduces a phase space whose dimension is the total number of spatial degrees of freedom, e.g., proportional to Nx Ny Nz, where Nx, Ny, and Nz are the numbers of spatial mesh points in the three-dimensional domain.
Since any attractor of the numerical simulation lives in this phase space, one must have roughly
Nx Ny Nz >= Re^(9/4) ,
which provides a lower bound on the numerical resolution (a small numerical illustration of this bound is given at the end of these notes). In fact, one can not really choose the spatial resolution close to the minimal number of modes indicated by the fractal dimension, so a numerical simulation must have even more degrees of freedom to resolve a turbulent flow correctly.
- This leads to an interesting but relatively unexplored use of fractal dimensions: as a way of verifying that adequate numerical resolution is used. Just as one increases the embedding dimension E of a time series until the fractal dimension becomes independent of E, one should increase the spatial resolution in a pde solver until the fractal dimension of the dynamics becomes independent of that resolution. At that point, the dynamics are correctly resolved. An extremely interesting but unsolved question is how to design numerical simulations that have the smallest numerical mesh consistent with the fractal dimension of the dynamics. A solution to this problem could increase computational capabilities by orders of magnitude.

UNSOLVED QUESTIONS CONCERNING FRACTALS:
- It is not understood why fractals occur so often in nature; e.g., there is not yet a good theory that explains why diffusion limited aggregation leads to fractals or, even assuming fractal behavior, why the fractal dimension is about 1.7. Similarly, it is poorly understood why galaxies are organized in a fractal pattern. Simple mechanisms are known, but there are no proofs of why these mechanisms work.
- It is not yet understood how to extract the actual few degrees of freedom implied by a small fractal dimension, i.e., how to obtain dynamical equations. The fractal dimension is only a very simple way to summarize a complex situation.
- It is not yet understood how the fractal dimension varies generally with bifurcation parameters, e.g., can it jump rapidly from small to large values?
- It is very expensive and difficult to calculate fractal dimensions for higher-dimensional fractals, since the number of points needed must grow exponentially with the dimension D. Are there better ways to study the data? What other invariants are experimentally accessible?
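- Finally, to give a feeling for how fast the Re^(9/4) bound quoted in the turbulence discussion above grows, here is a tiny numerical illustration (the Reynolds numbers are arbitrary example values and the unknown prefactor a is set to one):

    # Kolmogorov estimate quoted above: the number of degrees of freedom needed
    # to resolve a turbulent flow grows at least as fast as Re^(9/4).
    for Re in (1.0e3, 1.0e5, 1.0e7):
        print("Re = %.0e  -->  at least about %.1e mesh points" % (Re, Re ** 2.25))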