9-29-03 Physics 213

HIGH PRECISION STUDY OF NONLINEAR DYNAMICS: GOLLUB AND BENSON

- For several lectures, we will discuss an historically important and elegant experimental paper that illustrates some non-classical nonlinear behavior. This paper will serve three functions:

  - Give an idea of the quality of data that can be obtained under the most ideal of circumstances for a nonlinear dynamical system. This sets a standard against which more difficult subjects such as ecology, economics, meteorology, and neurobiology can be compared.

  - Give you an exposure to just how unexpectedly rich and subtle nonlinear dynamics can be experimentally.

  - Introduce you to a widely used and important tool for evaluating complexity in time series, the power spectrum.

WHAT SYSTEMS TO STUDY EXPERIMENTALLY?

- We should discuss some philosophy first. Theory (like the Landau hypothesis and the Ruelle-Takens theorem), curiosity, and a plain desire to look at nature more carefully than anyone has done previously are all part of the motivation for studying nature experimentally. Since the world is a big and complex place, composed of thousands or millions of interacting nonlinear dynamical systems, one has to think very carefully about what kinds of experiments will produce clear insights that may generalize to other nonlinear systems.

- Physicists have thought roughly along these lines: chemical, biological, engineering, economic, and meteorological systems are mostly too complex to be worthy of high-precision study. For example, most chemists still cannot determine all the reactions that take place in solution because of the transience of the short-lived intermediate states that typically occur. This means that chemists have a hard time making truly quantitative statements about their time-dependent experiments. For the same reason, biologists have an incredibly hard time finding even simple "in vitro" systems that reproduce interesting biological features (like reproduction or protein folding) in such a way that the in vitro system itself can be characterized fairly thoroughly.

- In the language of an earlier lecture, chemical and biological systems are difficult to work with because an adequate set of states is not known.

- If we give up chemical and biological systems, we are left with simple mechanical or electronic systems (e.g., pendulums and springs, or resistors, capacitors, inductors, diodes) or with homogeneous continuous media such as plasmas, gases, fluids, and solids. Mechanical or electronic systems, and their corresponding equations such as the damped driven nonlinear pendulum, were a major early source of insight into nonlinear dynamics. They continue to be of importance, especially pedagogically, but they have one possible drawback: by construction, simple mechanical and electronic systems have only a few degrees of freedom (a finite number of odes), and so it is impossible to develop a truly great complexity, no matter how the various parameters are varied. Mechanical systems leave out the possibility that the number of degrees of freedom can itself change as part of the dynamics.

- So we could study just mechanical systems, but we would have to be wary that such systems may not be representative of some of the most interesting cases. Note: the book "Chaotic Dynamics, 2nd edition" by Gregory Baker and Jerry Gollub (Cambridge U. Press, 1996) is based mainly on considerations of the driven damped nonlinear pendulum, and is one example of a discussion completely based on a mechanical system.
- A remaining alternative to mechanical and electronic systems is approximately continuous homogeneous media such as solids, plasmas, gases, and fluids. Solids do not undergo interesting dynamics, and we still cannot solve the Schrodinger equation in its entirety to get a complete quantum description. Plasmas (ionized collections of atoms) are the most common state of matter in the universe but are hard to create cleanly in the laboratory since they are contaminated or modified by confining surfaces like glass walls.

- We are then left with gases or fluids which, fortunately, do undergo transitions (bifurcations) to complex behavior under many conditions. Further, under a wide set of conditions, we know a highly accurate mathematical description of a flowing gas or liquid, namely the Navier-Stokes equations. Further, gas and fluid flow is of great technological significance. So a first guess for probing fundamental properties of nonlinear dynamical systems would be to work with gases or fluids and with their mathematical description, the Navier-Stokes equations. Will show in class how pdes should be thought of as infinite-dimensional dynamical systems since they can be written in 1st-order autonomous form dU/dt = F(U) with U an infinitely long vector, which one can think of as the time-dependent Fourier coefficients of a field u(t,x) expanded in a Fourier basis of spatial modes Exp[I m k x], where m ranges over all integers and the wavenumber k = 2 Pi / L is associated with the length L of the domain over which the field u lives. Specific example: the Kuramoto-Sivashinsky equation

      u_t = - u_{xx} - u_{xxxx} - u u_x ,

  on the interval x in [0,L] with periodic boundary conditions, u(t,x) = u(t,x+L) for all x. Substitute the Fourier expansion:

      u(t,x) = Sum[ u_m(t) Exp[I m k x], {m, -Infinity, Infinity} ]

  and obtain coupled ode equations for the complex coefficients u_m(t); a small Mathematica sketch of this reduction follows.
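  To make the reduction concrete, here is a minimal sketch (mine, not from the notes) that truncates the Kuramoto-Sivashinsky expansion to modes |m| <= M and integrates the resulting odes; substituting the expansion gives du_m/dt = (m^2 k^2 - m^4 k^4) u_m - (I m k/2) Sum_n u_n u_{m-n}. The domain size L, truncation level M, and initial condition are illustrative choices.

      (* Galerkin truncation of Kuramoto-Sivashinsky to modes |m| <= M *)
      L = 22.; k = 2. Pi/L; M = 8;
      rhs[m_] := (m^2 k^2 - m^4 k^4) u[m][t] -
        (I m k/2) Sum[If[Abs[m - n] <= M, u[n][t] u[m - n][t], 0], {n, -M, M}];
      eqns = Table[D[u[m][t], t] == rhs[m], {m, -M, M}];
      (* reality condition u[-m] = Conjugate[u[m]] is preserved by the
         dynamics if the initial data satisfies it, as it does here *)
      ics = Table[u[m][0] == If[Abs[m] == 1, 0.1, 0.] + 0. I, {m, -M, M}];
      vars = Table[u[m], {m, -M, M}];
      sol = NDSolve[Join[eqns, ics], vars, {t, 0, 100}];
      (* each u_m(t) is one coordinate of a (2M+1)-dimensional dynamical system *)

  Raising M brings the truncated system closer to the full pde, which is the sense in which the pde is an infinite-dimensional dynamical system.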
- History has led to two successful choices in fluid experiments for exploring basic nonlinear dynamics. One is called Rayleigh-Benard convection, named after a leading theorist (Lord Rayleigh) and a leading experimentalist (Henri Benard of France) who did some of the pioneering work around 1915. (Note: Benard's name is not Bernard, and Rayleigh is different from the city of Raleigh.) This consists of a fluid in a box, heated from below and cooled from above. The second fluid experiment is Taylor-Couette flow, named again for a leading theorist (Taylor) who made and confirmed the first theoretical analysis of an experimental apparatus invented by Couette for measuring the viscosity of a fluid. In this case, one has two concentric cylinders with the gap between them filled with a fluid like water. The inner cylinder is then spun at a constant angular frequency Omega (e.g., by turning the cylinder with a motor) while the outer cylinder is held fixed. The experiment then consists of observing the state of the fluid as the angular velocity is increased from zero (fluid in equilibrium) to larger and larger values, driving the fluid out of equilibrium into a sustained nonequilibrium state.

- In Rayleigh-Benard convection, a box is carefully constructed and then filled with a gas or fluid, often water, but sometimes more exotic fluids like oils (Benard used spermaceti, a clear oil obtained from whales) or liquid helium near absolute zero. The top and bottom of the box are made of flat parallel plates that are extremely good conductors of heat. Gold-plated copper is frequently used, but experimentalists also use sapphire plates, which are among the best optically-transparent thermal conductors. The fluid is then heated from below and cooled from the top by applying heat uniformly to the bottom plate and withdrawing heat from the top plate, e.g., by flowing cold water over the top plate. This creates a static, spatially uniform temperature difference dT across the fluid which "drives" the fluid out of equilibrium. If this temperature difference dT becomes large enough (we will discuss momentarily what "large" means), the fluid becomes linearly unstable and spontaneously develops motion and structure in the form of convection rolls consisting of approximately periodic rising and falling of warm and cold plumes of fluid.

- Experimental Rayleigh-Benard convection is characterized by approximately time-independent, approximately spatially homogeneous boundary conditions, which is about as simple as you can get. There is no net flow of fluid past the observer (as in turbulent pipe flow), and the boxes can be engineered to extremely high precision. It is easy to visualize the fluid flow through sapphire plates, and high-precision data is easily obtained over long intervals of time. For mainly these reasons, convection was and continues to be one of the major experimental sources of insight into nonlinear dynamics and nonequilibrium physics.

- Rayleigh-Benard convection is extremely important in many engineering and natural science contexts: convection underlies plate tectonics in geology, it explains the transfer of heat from the nuclear core of the sun to its surface, it is a dominant effect in weather, and it plays an important role in removing heat from electronic and biological systems.

ATTRACTIVE FEATURES OF RAYLEIGH-BENARD AS EXPERIMENTAL SYSTEM

- Time-independent boundary conditions.
- Spatially homogeneous boundary conditions.
- Nonequilibrium system because of the fixed vertical temperature gradient.
- No mean flow of fluid, which simplifies observation.
- Complexity increases slowly as the Rayleigh number Ra is increased (unlike pipe flow).

POWER SPECTRUM OF TIME SERIES

- Would like to discuss in a little more detail the process of taking the power spectrum of a time series since this is such an essential part of Gollub and Benson's data analysis.

- The input to a power spectrum calculation is a finite sequence of numbers x[i] for 1 <= i <= N, representing a time series of some observable at equally spaced times t[i] = i dt, where dt is a constant sampling interval or time step. If data is not evenly spaced, one can try to interpolate the data to evenly spaced times. It takes some trial and error, some experience, and a good knowledge of the physical characteristics of a problem to know how to choose a good time step dt. Crudely, dt has to be shorter than any natural time scale in the problem, say the period of an oscillation or the reciprocal of an exponential growth or decay rate.

- The output of a power spectrum calculation is a new sequence of NONNEGATIVE numbers P[j] that is roughly half as long as x[i], representing a compression of the input data: P[j] for 1 <= j <= N/2. One is roughly throwing out HALF the available information. Any given sine wave has an amplitude A and a phase phi: x(t) = A Cos[ w t + phi ]. The power spectrum holds onto A^2 and throws out the phase phi. MANY DIFFERENT TIME SIGNALS CAN LEAD TO THE SAME POWER SPECTRUM, all differing by different choices of phases.
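  A quick numerical illustration of this last point, as a sketch using Mathematica's built-in Fourier (which differs from the course's PowerSpectrum.m only in normalization and packaging); the time step, length, and frequency below are illustrative choices:

      (* Two signals differing only in phase give the same power spectrum.
         The frequency f = 2 is chosen to sit exactly on a frequency bin
         j/(N dt), so the comparison is exact, not clouded by leakage. *)
      dt = 1./64; n = 256;
      x1 = Table[Cos[2. Pi 2. (i dt)], {i, n}];
      x2 = Table[Cos[2. Pi 2. (i dt) + 1.3], {i, n}];  (* same signal, shifted phase *)
      p1 = Abs[Fourier[x1][[2 ;; n/2]]]^2;
      p2 = Abs[Fourier[x2][[2 ;; n/2]]]^2;
      Max[Abs[p1 - p2]]  (* zero up to roundoff: the phase information is gone *)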
- The discrete power spectrum P[j] is an approximation to the magnitude squared |A[j]|^2 of the different Fourier modes A[j] Exp[I w[j] t] of frequency w[j] that make up the signal x[i]. The discrete frequencies are given by:

      w[j] = j dw = j / (N dt) ,

  where the frequency increment dw is given by dw = 1 / (N dt) = 1 / T, and the time T = N dt is the total duration of the time series. THERE IS THEREFORE A LARGEST POSSIBLE FREQUENCY THAT CAN BE OBSERVED in a discrete time series,

      w_max = (N/2) dw = 1 / (2 dt) ,

  which is called the NYQUIST FREQUENCY. If you try to take the power spectrum of a sample that has frequencies larger than the Nyquist frequency, you incorrectly observe peaks below the Nyquist frequency, a process called "ALIASING".

CALCULATING POWER SPECTRA USING MATHEMATICA

- One can calculate power spectra quite easily using Mathematica by reading in an external function that I have already written, PowerSpectrum.m, which is part of the following notebook:

      http://www.phy.duke.edu/~hsg/213/misc//213/power-spectnum-examples.nb

  Thus you can try these statements with this new notebook:

      dt = 0.2  (* set time step *)
      x = Table[ Sin[(2. Pi) (i dt)] , {i, 128} ] ;
      ListPlot[ x, PlotJoined -> True ]
      psd = PowerSpectrum[ x, dt ]  (* calculate power spectrum *)
      Dimensions[ psd ]             (* shape of psd variable *)
      PlotSpectrum[ psd ]           (* plot spectrum *)
      PlotLogLinSpectrum[ psd ]     (* log10-linear plot *)
      PlotLogLogSpectrum[ psd ]     (* log10-log10 plot *)

  This time series x[[i]] has frequency f = 1. An infinite continuous function sin(2 Pi t) would have a power spectrum consisting of a delta function centered on 1, P(f) = c delta(f - 1), where the coefficient c is not necessarily one since there are various normalizations of P(f) that one can apply. The finite length of the time series smears the delta function into a broad peak (this happens even for a continuous function) and causes oscillatory "wings" or "shoulders" that decay off to both sides. It is then somewhat less clear what a single frequency looks like in the power spectrum.

UTILITY OF THE POWER SPECTRUM

- From the most practical point of view, power spectra are used for three purposes:

  1. Classify the type of temporal evolution:
     - periodic: peaks at integer multiples of a basic frequency;
     - quasiperiodic: peaks at positions n1 w1 + n2 w2;
     - broad-band or continuous features.
  2. Identify characteristic time scales.
  3. Identify transitions from one 'phase' to some other 'phase', i.e., bifurcations, which are generalizations of phase transitions.

- A simple physical example of the power spectrum is singing into a piano with the dampers raised. If you stop your voice suddenly, the magnitude of vibration of each piano string is an indication of how much energy your voice was generating at that string's frequency.

- To avoid getting lost in mathematical details, I will first give examples of how power spectra are used to analyze empirical data, then return after two lectures to give some mathematical discussion of their properties.

GUIDELINES FOR CALCULATING POWER SPECTRA:

- Without explanation (which will come later), one wants to strive towards these goals for taking accurate numerical power spectra:
  - Take samples with a time step dt as small as possible, to capture high frequencies and to avoid aliasing.
  - Take samples for as long as possible, T = N dt, since this determines both the frequency resolution 1/T and the lowest observable frequencies.
  - Be prepared to take samples at different sampling rates to detect aliasing (high frequencies appearing erroneously as low frequencies); a small demonstration follows.
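  Here is a sketch of aliasing in action, again using the built-in Fourier rather than PowerSpectrum.m; the signal frequency and time step are illustrative:

      (* With dt = 0.2 the Nyquist frequency is 1/(2 dt) = 2.5. A sine at
         f = 3 exceeds it and folds back to appear at |3 - 1/dt| = 2. *)
      dt = 0.2; n = 256;
      x = Table[Sin[2. Pi 3. (i dt)], {i, n}];
      psd = Abs[Fourier[x][[2 ;; n/2]]]^2;
      freqs = Table[j/(n dt), {j, n/2 - 1}];
      ListLinePlot[Transpose[{freqs, psd}]]  (* the peak shows up near f = 2, not f = 3 *)

  Repeating the calculation with a different (incommensurate) dt moves an aliased peak but leaves a genuine peak fixed, which is why trying several sampling rates is a useful diagnostic.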
THE BOUSSINESQ EQUATIONS

- What parameters should an experimentalist vary to carry out a convection experiment? One needs to appeal to theory at this point, since the dynamical equations in dimensionless form suggest two fundamental dimensionless parameters to vary:
  - the Rayleigh number Ra,
  - the Prandtl number Pr.
  These are not the only two parameters but are the most important ones.

- Rayleigh-Benard convection is described by five nonlinear pdes known as the Boussinesq equations, after the French fluid dynamicist Joseph Boussinesq. These consist of four equations for the three velocity components and the pressure (the latter arising from the assumption that the fluid is incompressible, i.e., the density is constant), plus an extra equation from the need to describe how the temperature T(x,y,z,t) varies in space and time. These pdes have not been solved analytically except when the driving is weak, i.e., when the temperature difference dT is relatively small (compared to a characteristic scale that we shall discuss). Just to be somewhat complete, I give here the coupled nonlinear partial differential equations known as the Boussinesq equations that describe buoyancy-induced convection. These consist of five equations for the following five fields:

      Ux(x,y,z,t)   x-component of the velocity U ,
      Uy(x,y,z,t)   y-component of the velocity U ,
      Uz(x,y,z,t)   z-component of the velocity U ,
      p(x,y,z,t)    pressure field ,
      T(x,y,z,t)    temperature deviation field ,

  where "temperature deviation" field means the difference between the actual temperature and the linear profile "a + b z" obtained by assuming conduction without convection. The Boussinesq equations are then given (in dimensionless form) by:

      D[U,t] = - (U . Grad) U - Grad[p]        (Newton's F=MA, momentum conservation)
               + Pr T zhat + Pr DEL^2 U ,

      D[T,t] = - (U . Grad) T                  (entropy generation, heat diffusion equation)
               + Ra U . zhat + DEL^2 T ,

      Div[ U ] = 0 .                           (fluid incompressibility)

  These symbols have the following meanings:

      Grad    denotes the gradient operator that converts a scalar potential into a vector
      zhat    is a unit vector opposite to the direction of gravity
      D[T,t]  is the partial derivative of T w.r.t. time t
      DEL^2   is the 3D Laplacian operator Dxx + Dyy + Dzz
      Pr      is the dimensionless Prandtl number, the ratio of the kinematic
              viscosity nu to the thermal diffusivity kappa. This number is about 1
              for most gases, is huge for the earth's magma, and is tiny for plasmas
              or mercury.
      Ra      is the dimensionless Rayleigh number and has the value

                  Ra = g alpha d^3 (T2 - T1) / (nu kappa) ,

              where

                  g     = gravitational acceleration (m/sec^2)
                  alpha = thermal coefficient of expansion, which sets the buoyancy (1/K)
                  d     = depth of fluid (meters)
                  T2    = temperature at the bottom of the fluid (kelvin)
                  T1    = temperature at the top of the fluid (kelvin)
                  nu    = kinematic viscosity of the fluid (m^2/sec)
                  kappa = thermal diffusivity of the fluid (m^2/sec)

- The fundamental nonlinearity of the Boussinesq equations comes from ADVECTION (the U.Grad terms). Warm fluid is brought up from the bottom to the top and this modifies the temperature gradient that determines the buoyancy force. These are quite difficult mathematical equations and little was known about their mathematical solution until about 1980, when supercomputers started to become powerful enough to solve the time-evolution problem in three space dimensions. Actually, most insights about these equations have come from experiments. Modern convection experiments have reached such a high level of accuracy and reliability that one can usefully think of the experiment as an analog computation that integrates the Boussinesq equations.
  Experiments remain much more efficient than any supercomputer simulation for exploring possible dynamics, and this is likely to continue for the next five years.

- CHALLENGE: See if you remember your buoyancy physics and solve the following problem: Given that an iceberg has density 0.92 gm/cm^3 and that sea water has density 1.03 gm/cm^3, show that only 11% of any iceberg is exposed above the water. The proverbial tip of a real iceberg is always the same fractional amount, no matter the size or shape.

THE BOUSSINESQ APPROXIMATION

- A key physical assumption in the derivation of the Boussinesq equations is that all fluid parameters like the viscosity nu and thermal diffusivity kappa are CONSTANTS throughout the fluid. These actually depend rather weakly on temperature and pressure, but experimentalists have been clever enough to work in regimes where the fluid is Boussinesq to high accuracy. The Boussinesq approximation (constant parameters) is mainly of historical interest since it allowed some analytical progress. Numerical algorithms work just as well with and without this approximation, and so there is much less motivation to require this condition nowadays.

WHAT TO EXPECT BEFORE THE EXPERIMENT IS CARRIED OUT

- "Back-of-the-envelope" estimates of the characteristic time and length scales, as suggested by the geometry of the problem and by the simple physical mechanisms at work. To make some estimates, need:

      rho   ~= 1 g / cm^3
      nu    ~= 0.01 cm^2/sec
      g     ~= 980 cm/sec^2
      alpha ~= 10^-4 at room temperature
      d     ~= 1 cm  (7.9mm and 11.9mm in the actual expts)

  Some time scales (evaluated numerically in the sketch below):

      - vertical viscous diffusion time     d^2/nu
      - vertical thermal diffusion time     d^2/kappa = Pr d^2/nu
      - horizontal thermal diffusion time   L^2/kappa = Pr L^2/nu = Pr Gamma^2 d^2/nu
      - buoyancy acceleration time scale    Sqrt[d /(g alpha dT)]
      - free-fall time scale                Sqrt[d / g]
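  As a quick numerical pass at these scales (a sketch; dT ~ 1 K and Pr ~ 7 for room-temperature water are assumed illustrative values, not taken from the notes):

      (* Characteristic time scales and the Rayleigh number from the
         estimate values above (cgs units); kappa is written as nu/Pr. *)
      nu = 0.01; g = 980.; alpha = 10.^-4; d = 1.; dT = 1.; Pr = 7.;
      tViscous  = d^2/nu                  (* ~ 100 sec *)
      tThermal  = Pr d^2/nu               (* ~ 700 sec *)
      tBuoyancy = Sqrt[d/(g alpha dT)]    (* ~ 3 sec *)
      tFreeFall = Sqrt[d/g]               (* ~ 0.03 sec *)
      ra = g alpha d^3 dT/(nu (nu/Pr))    (* ~ 7 10^3, a few times the critical value 1708 *)

  Note how widely separated these time scales are; the few-second buoyancy time is the one that matches the oscillation periods actually seen in the experiment.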
- Analogies to the logistic equation:
  - Infinitely many bifurcations possible on a finite parameter interval.
  - A particular bifurcation diagram, e.g., period doublings accumulating to complex dynamics, many complicated windows.
- Analogies to linear constant-matrix flows X' = M X.
- Landau hypothesis: see one frequency, then two frequencies, then three frequencies, etc., until many frequencies are present.
- Analogies to the Lorenz equations:
  - Shows no quasiperiodic behavior!
  - Infinitely many bifurcations in a finite parameter interval.
  - One sees transitions S -> P -> C -> P -> I in all kinds of orders.

IMPORTANT POINTS:

- Discussed some features of the Boussinesq equations (five coupled nonlinear pdes) that describe quantitatively buoyancy-induced thermal convection. We saw that the momentum conservation equation (Newton's laws) can be derived heuristically by looking at the various forces acting on a small parcel of fluid. A definitive treatment of this topic is given in the book "Fluid Mechanics, 2nd Edition" by Landau and Lifshitz. A more readable and more experimentally oriented discussion can be found in the book "Physical Fluid Dynamics, 2nd Edition" by David Tritton. Fluid dynamics is a really interesting subject in its own right and is a great way to learn how to think about nonlinear partial differential equations.

- Idea of identifying characteristic time scales in terms of basic physical constants: diffusion constants like the kinematic viscosity nu and thermal diffusivity kappa (both with units of Length^2/sec) give TWO characteristic time scales: a vertical time scale across the depth d of the fluid and a horizontal time scale across the width of the fluid, L. Similarly, buoyancy forces produce an acceleration per unit mass of "alpha g dT", where alpha is the thermal coefficient of expansion, g is the acceleration due to gravity, and dT is the variation of temperature over the small volume. This leads to a characteristic time scale determined by a fluid parcel accelerating under the buoyancy force over the depth d of the fluid, ~ Sqrt[d/(alpha g dT)].

- Introducing dimensionless variables into the Boussinesq equations leads to two important dimensionless parameters that characterize a convecting flow:
  - The Rayleigh number R = (alpha g d^3 (T2-T1))/(nu kappa)
  - The Prandtl number Pr = nu / kappa
  For an infinitely wide fluid layer, the motionless (zero-velocity) conducting state of a fluid becomes unstable to convection (usually in the form of convection rolls) when the Rayleigh number exceeds the value of about 1708.

BOUNDARY CONDITIONS FOR RAYLEIGH-BENARD CONVECTION

- Just as odes require initial data to obtain a well-defined mathematical problem, any set of time-dependent pdes requires in addition boundary conditions (bcs) on the variables to specify a unique mathematical problem. For the Boussinesq equations, the boundary conditions are:

      U = 0 on all walls   ("sticky" or "no-slip" bcs) ,
      T = 0 on the top plate ,
      T = 0 on the bottom plate ,

  where T is the temperature deviation from the conducting profile (the actual plate temperatures satisfy T2 > T1, bottom hotter than top). There are also thermal boundary conditions on the lateral walls, but they depend on details of the actual material. Two commonly used lateral thermal bcs are insulating (no heat flux, so Grad[T].nhat = 0 where nhat is the unit normal vector at the lateral boundary) and perfectly conducting (T = 0 at lateral walls). It turns out that there is no boundary condition on the pressure field p(t,x,y,z), a point of mathematical and numerical subtlety. The pressure adjusts instantly to changes in U and T and technically acts like a Lagrange multiplier that enforces the condition of incompressibility.

- The sticky or no-slip bc, U = 0 in the frame of reference of a solid boundary, is not at all obvious and was originally guessed at and verified by fluid dynamicists. I know of no fundamental derivation of this boundary condition, and its justification came only rather recently when computational scientists carried out detailed molecular dynamics calculations of fluids as huge numbers of molecules colliding together. Various hypotheses of how the particles collide with a rough rigid wall then explain how U = 0 arises.

- For atmospheric and planetary convection, one fluid often is convecting while in contact with another, and a different kind of boundary condition called "no-stress" or "free-slip" is imposed at the interface between the fluids. For a fluid in a box, free-slip boundary conditions correspond to the fact that no shear is possible across a fluid-fluid interface (or fluid-vacuum interface), and so nondiagonal components of the stress tensor, e.g., sigma_{x,z}, must vanish at the interface. This ends up imposing a normal derivative condition on the tangential components of the velocity.

- For an actual experiment, one has to specify other parameters, e.g.:
  - the lengths of the box, Lx, Ly, and Lz;
  - the lateral thermal boundary conditions (e.g., insulating, in which case Grad[T].nhat = 0).

EXPERIMENTAL GOAL FOR RAYLEIGH-BENARD CONVECTION

- One goal of a good Rayleigh-Benard experiment is to be as consistent as possible with the various theoretical assumptions.
  This is NOT easy; e.g., the above equations implicitly assume:

  - Horizontal plates on top and bottom with infinite thermal conductivity, which can't be satisfied in practice, so there are slight temperature differences across the plates. For optical access, experimentalists often choose large sapphire plates, about $20K each.

  - Perfectly flat and parallel plates that are perfectly perpendicular to gravity. One consequence of a cell that is not aligned vertically is that one can get large convection cells of size comparable to the width of the system, rather than determined by the depth of the fluid.

  - Idealized simple lateral thermal boundary conditions, with no horizontal temperature gradients. Hard to achieve because of the thermal mismatch of the horizontal plates with the lateral walls.

  - Constant viscosity and thermal conductivity throughout the fluid (both quantities vary weakly with temperature, by about 10% in many experiments).

- In a typical experiment, one selects a particular fluid like water, which roughly fixes the Prandtl number Pr. (I say roughly fixes since Pr varies with temperature and so can be tuned for a fixed fluid over some modest range by varying the mean temperature of the fluid, i.e., the value (T1+T2)/2.) One then varies the Rayleigh number in small constant steps:

      R1 for time T1 , R2 for time T2 , ...

  and one observes the non-transient behavior after each increase in R. One then attempts to measure various kinds of time series to quantify what happens in the experiment:

  - One could measure a component of the fluid velocity field at a fixed point (x0,y0,z0) in space: x(t) = Vx(x0,y0,z0,t).

  - One could measure the temperature or pressure at a fixed point in space.

  - One could measure the total heat transported from the bottom plate to the top plate. When made dimensionless by normalizing to the heat transport of a conducting fluid with the same boundary conditions, this normalized heat transport is called the Nusselt number N(t). Engineers are especially interested in this quantity.

  - At a fixed time t0, one might try to map out the spatial patterns of the fields, e.g., T(x,y,z,t0).

GOLLUB AND BENSON: APPLICATION OF POWER SPECTRA

- Will now discuss power spectra of actual data in a fluid convection experiment:

      "Many routes to turbulent convection", J. P. Gollub and S. V. Benson,
      J. Fluid Mech., Vol. 100, 449-470 (1980).

  This is a highly quoted, historically important paper. The entire experiment would have been impossible without computers to control the apparatus, to take the data, and to analyze it. The computer used at the time was a PDP-11, before the PC revolution, which began in 1981 when IBM released the first PC with 64 kbytes of RAM and a floppy disk drive.

- The experiments consisted of studying the time dependence of a component of the fluid velocity field Vy(t,X) at a fixed spatial point X in a small convection cell. The authors studied several such time series, varying the temperature difference across the cell (which drives the fluid more and more out of equilibrium), the geometry of the cell (different kinds of rectangular boxes), and the mean temperature of the water (which changes the Prandtl number parameter, Pr).

- From the fluid equations for momentum and heat, one can derive several dimensionless parameters.
  It turns out that one ideally has to study a FOUR-dimensional parameter space, once all variables and parameters are made dimensionless:

  - Rayleigh number R = alpha g d^3 dT / ( nu kappa ) ,
  - Prandtl number Pr = nu / kappa ,
  - Aspect ratios Gamma = Lx / d, Ly / d ,
  - Initial conditions for the velocity, temperature, and pressure fields.

- Since there were no quantitative theoretical predictions around 1980 (there are still very few even to date, although it has now become routine to solve the 3D Boussinesq equations on parallel computers), Gollub and Benson were on their own in terms of what to expect.

- The Prandtl number is essentially fixed once a fluid is selected for study, and the easiest parameter to vary is the vertical temperature difference dT. So the bifurcation parameter usually used is the Rayleigh number. For most experiments, there is a first magical value of the Rayleigh number at which point the fluid undergoes a bifurcation from a steady conducting state (no motion) to a steady convecting state (finite fluid velocity, but the velocity itself does not change with time). This special value is denoted by:

      R_c = the critical Rayleigh number ,

  which a challenging linear stability analysis shows to have the approximate value:

      R_c ~= 1708 ,

  a pure number which should be the same for all fluids in the limit of large aspect ratio. For small containers, such as in the Gollub-Benson paper, the critical Rayleigh number is pushed to larger values by the damping influence of the lateral walls (recall that U = 0 on these walls, and so the fluid velocity is diminished near the walls by continuity) and depends in a complex way on the size of the system.

- The initial conditions are largely unknown. Experiment and theory rely heavily on the fact that Rayleigh-Benard convection is dissipative, so that transients die out and the final dynamics are weakly dependent on the initial state, i.e., there are attractors. We will explain this important insight a bit later in the course.

OBSERVABLES: VELOCITY COMPONENTS BY LDV

- Gollub and Benson used an experimental technique called Laser Doppler Velocimetry (LDV), which is noninvasive, i.e., it doesn't perturb the fluid flow. Older fluid experiments, say in the 1950's, would measure velocities by inserting small wire probes that perturbed the experiment. This technique (hot-wire anemometry) is rarely used anymore. The combination of careful computer control and LDV gave the research community great faith in the accuracy and correctness of the experimental results. With LDV, it is actually easier to measure velocities accurately and noninvasively than to measure temperature or pressure, which would require some kind of physical probe inserted into the fluid that would then distort the fluid motion. A drawback of LDV is that it is difficult to scan quickly through a region of fluid, and so it is difficult to map out the spatial field U(x,y,z,t) as a function of time.

  CHALLENGE: For the experimentalists in the class, see if you can understand in detail how LDV works, and also what its limitations are: what sets the time scale at which you can scan a fluid? How does the accuracy of the velocity depend on the density of plastic particles in solution?

- Measurements consisted of time series of 4096 = 2^12 sequential data points Vy[i dt, X] for a fixed time step dt at a fixed position X in the fluid. These are the time series x[i] with N = 4096 that are input to the power spectrum algorithm.
- Gollub and Benson could also vary the position of measurement X (although not quickly) and map out the velocity field in space, at least in a horizontal plane in the fluid. The magnitudes of the velocities are quite small on a human scale, Vy = O( 2 mm / sec ). As previously discussed, this order of magnitude can be estimated by dividing the characteristic length (the depth of the fluid d) by the characteristic time associated with buoyancy acceleration applied over a length d, t ~= Sqrt[d/(alpha g dT)], so v = O( Sqrt[d alpha g dT] ) ~= 5 mm/sec, not too bad for a crude estimate. The temperature differences are typically a few degrees centigrade. The time variations are also rather slow; typical oscillations have periods of order 5 seconds. The experimental apparatus is hand sized: 16.4mm x 27.7mm x 7.9mm for the width, length, and height respectively. This cell has aspect ratios 27.7/7.9 = 3.5 and 16.4/7.9 = 2.1.

- Experiments take a LONG time to complete, up to several weeks, thus the need for computer control of the experiment and data acquisition.

EXPERIMENTAL RESULTS OF GOLLUB AND BENSON

- The authors observed DIFFERENT MEAN FLOWS, i.e., the time-averaged pattern of the fluid was different for fixed Rayleigh number but different cell sizes (aspect ratios), different Prandtl numbers, and even different initial conditions for fixed physical parameters. The sequence of instabilities depends on the particular mean flow. See Gollub and Benson's Fig. 4 on page 457.

- The authors observed different kinds of time series by analyzing power spectra; see Figures 7, 8, 11, and 12:

  - Periodic: sharp peaks separated by equal distances in frequency. These represent harmonics w_m = m w_b of some basic frequency w_b for different positive integers m. Fig. 7a shows a single peak and one harmonic; one would generally expect an infinity of harmonics of decreasing magnitude for nonlinear systems.

    Note: What is meant by a "sharp" peak? This is the width of a peak in the power spectrum that one would get if one used a pure sine wave (no harmonics) at the same frequency as the peak, sampled at the same rate and for the same amount of time. The resulting peak is as good as you can do with the given sampling rate and observation time and is often called "instrumentally sharp".

  - Incommensurate quasiperiodic evolution: sharp peaks that are separated by different differences in frequency, representing linear combinations of two or more frequencies whose ratio is irrational. In principle, one should be able to choose any two peaks and identify all the others as linear combinations of their frequencies, m f1 + n f2, for positive, zero, and negative integers m and n. Not all predicted peaks are present, not sure why.

  - How does one know that the ratio of frequencies in quasiperiodic time dependence is really irrational? Experimentally, one knows frequencies only to about 4 digits by cleverness, so all ratios seem rational. The essential clue is how the frequencies f1 and f2 of the fundamental peaks vary with the temperature difference (the parameter R/Rc in Gollub and Benson), and how the RATIO of the two peaks varies with R/Rc. Figure 8 of Gollub and Benson clearly shows smooth variation most of the time, so there is no special ratio for most values of the Rayleigh number.

  - Commensurate quasiperiodic evolution: actually identical to periodic, since if w1/w2 = p/q for two integers p and q, then m w1 + n w2 = (w2/q)(mp + nq) = (w1/p)(mp + nq), so all peaks are harmonics of the basic frequency w2/q = w1/p. One can identify such a state as commensurate since w1/w2 becomes rational as some parameter is varied.
  - ENTRAINMENT or PHASE LOCKING: a rational ratio of frequencies PERSISTS or locks as the Rayleigh number Ra is varied, even though the frequencies themselves are still changing with dT. This amazing effect must be purely dynamical; there is no way to write down frequencies that are simple smooth functions of parameters such that entrainment can take place. Fig. 8 implies that the fluid somehow can distinguish rational from irrational numbers. Will discuss the theoretical basis later on, but the eager among you can try to understand Appendix C of Berge, Pomeau, and Vidal or the chapter on quasiperiodic behavior in Ott's book.

  - BROAD BAND: a practical test for the existence of nonperiodic behavior, although this is not strictly a test for chaos since there are random processes (e.g., random walks) that produce broad-band continuous features without deriving from bounded deterministic dynamics. Note that the peaks broaden but remain present as chaos begins; the system still knows about its basic oscillations at this point. For larger Ra, the peaks eventually disappear and there are no obvious periodicities remaining. Essential to using this test is the ability to distinguish SHARP peaks from BROAD peaks.

- For large Rayleigh numbers, R/R_c > 60, the power spectrum has no sharp peaks and is called "featureless". According to Gollub and Benson, the spectrum P(w) falls off algebraically with increasing frequency, roughly as w^{-4.3}. This is a quite interesting result. As we will see later in the course, any signal whose power decays as a power law (algebraically) at high frequency is likely associated with a stochastic or random process. Deterministic processes have power spectra P(w) that decay more rapidly, viz., exponentially rapidly for large enough frequencies. You can read about this in a paper that I wrote with some colleagues at Bell Labs: H. S. Greenside, G. Ahlers, P. Hohenberg, and R. Walden, "A simple stochastic model for the onset of turbulence in Rayleigh-Benard convection", Physica 5D, 322-334 (1982).

- Important point: The power spectrum gives little quantitative information about a time series once the series becomes broad band; one needs deeper tools of analysis. The fractal dimension of a time series will be the key new tool we develop later in the course.

- Gollub and Benson used the widths of peaks in the spectrum to identify quantitatively when the transition to turbulence begins. A beautiful example of how one can extract important information from a power spectrum. Their analysis also shows that the transition from a quasiperiodic state to a chaotic one is continuous, the linewidths growing smoothly with increasing Ra. It also suggests that the transition is sharp.

SUBHARMONIC BIFURCATIONS

- The subharmonic bifurcations are completely new to our discussion. In our earlier discussions, we only saw a general new frequency appear or disappear, but not one with a specific integer ratio like 1/2. Although subharmonic bifurcations can't occur in constant-coefficient linear dynamical systems, they can occur in linear dynamical systems with time-dependent coefficients. Thus we can consider the linear pendulum with a time-dependent force:

      x'' = - ( g(t) / L ) x ,

  where the gravitational acceleration g(t) = g0 + g1 Cos[2 w t] varies periodically with angular frequency 2 w. This is known as the Mathieu equation, and it has a subharmonic bifurcation as the external driving frequency w is varied; see Section II.2 of Berge et al.'s book "Order Within Chaos". A small numerical sketch follows.
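  A minimal sketch of the subharmonic response, with illustrative parameter values (not from the notes) and the pendulum length set to L = 1:

      (* Integrate the Mathieu equation x'' = -(g0 + g1 Cos[2 w t]) x with
         the drive tuned near the principal parametric resonance,
         2 w ~ 2 Sqrt[g0]. The response then oscillates at frequency ~ w,
         i.e., at HALF the driving frequency 2 w, and grows exponentially:
         a subharmonic instability. *)
      g0 = 1.; g1 = 0.2; w = 1.;
      sol = NDSolve[{x''[t] == -(g0 + g1 Cos[2 w t]) x[t],
          x[0] == 0.01, x'[0] == 0}, x, {t, 0, 150}];
      Plot[Evaluate[x[t] /. sol], {t, 0, 150}, PlotRange -> All]

  Detuning w well outside the resonance tongue (e.g., w = 1.3) makes the growth disappear, which is the bifurcation referred to above.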
book, "Order Within Chaos" - It is known theoretically and observed partially experimentally that there can be an infinite number of period doublings, leading directly to chaos. - This behavior is already apparent in simple maps like the logistic map, as we will discuss later in this course. Physicists and mathematicians have shown that the subharmonic cascade has UNIVERSAL features, where universal has a certain technical meaning in the theory of phase transitions. - As an example, for nearly all nonlinear systems, IF a subharmonic cascade occurs, the control parameter values at successive subharmonic bifurcations, say p1, p2, ..., form a geometrically converging sequence whose ratio is universal, i.e., independent of specific details of the nonlinear system. Thus Mitchell Feigenbaum discovered and proved that: p[i] - p[i - 1] ----------------- --> 4.669 201 609 102 990 ... p[i + 1] - p[i] This has been verified experimentally up to i=4 before experimental noise washed out succeeding bifurcations. MISCELLANEOUS COMMENTS ON GOLLUB-BENSON EXPERIMENT - 4096 numbers are a lot of data points for a time series. Physics experiments have a significant advantage over many experiments in biology, chemistry, and economics, in generating large amounts of high quality data. Typical economic time series are at most 1000 points, typical ecological time series are at most 500 points and in both cases it is usually impossible to talk about the reproducibility of the data. One is then quite limited in how many insights can be deduced from such little data. In particular, the information available in the power spectrum, e.g., the resolving power in frequency, is directly determined by the amount of data. - Implicit in the experiment is that dissipation removes much of the need to identify the initial conditions. It is clearly impossible to measure the simultaneous values of V, p, and T everywhere in a 3d domain. Such data is available, however, in related supercomputer simulations. - Power spectrum indicates relative strengths of different frequencies. Experiment shows that these strengths (but not the frequencies) vary as one moves the LDV probe region around, so the oscillators are spatially dependent in a complex way. A careful mapping out and visualization of the oscillators has yet to be attempted, a interesting project for someone inclined to computer simulation or with a new generation experimental method. CHALLENGE: Use a 3D convection code for parameters corresponding to the Gollub/Benson experiment and try to map out the spatial dependence of the Fourier coefficients in the P and QP2 regimes. Is the spatial dependence simply that of the roll structure? - Empirically, one can determine what an instrumentally sharp peak is by feeding into an algorithm a single sine wave of known angular frequency. If peaks in a power spectrum are substantially broader than this, one says that the peaks are not sharp. Broad peaks indicate the presence of noise of some kind. - A careful convection experiment has revealed FOUR incommensurate frequencies (QP4) in a convection experiment and this is the current world's record. See "Nonchaotic Rayleigh-Benard Convection with Four and Five Incommensurate Frequencies", Phys. Rev. Lett Vol. 53, 242 (1984) by R. Walden et al. QUESTIONS - Where does this rich time dependence come from? Remember that the boundary conditions are time-independent at all times, and the fluid starts off in a resting state (fluid velocity is zero everywhere). 
MISCELLANEOUS COMMENTS ON GOLLUB-BENSON EXPERIMENT

- 4096 numbers is a lot of data points for a time series. Physics experiments have a significant advantage over many experiments in biology, chemistry, and economics in generating large amounts of high quality data. Typical economic time series are at most 1000 points, typical ecological time series are at most 500 points, and in both cases it is usually impossible to talk about the reproducibility of the data. One is then quite limited in how many insights can be deduced from such little data. In particular, the information available in the power spectrum, e.g., the resolving power in frequency, is directly determined by the amount of data.

- Implicit in the experiment is that dissipation removes much of the need to identify the initial conditions. It is clearly impossible to measure the simultaneous values of V, p, and T everywhere in a 3d domain. Such data is available, however, in related supercomputer simulations.

- The power spectrum indicates the relative strengths of different frequencies. The experiment shows that these strengths (but not the frequencies) vary as one moves the LDV probe region around, so the oscillators are spatially dependent in a complex way. A careful mapping out and visualization of the oscillators has yet to be attempted, an interesting project for someone inclined to computer simulation or with a new generation of experimental methods.

  CHALLENGE: Use a 3D convection code for parameters corresponding to the Gollub-Benson experiment and try to map out the spatial dependence of the Fourier coefficients in the P and QP2 regimes. Is the spatial dependence simply that of the roll structure?

- Empirically, one can determine what an instrumentally sharp peak is by feeding into an algorithm a single sine wave of known angular frequency. If peaks in a power spectrum are substantially broader than this, one says that the peaks are not sharp. Broad peaks indicate the presence of noise of some kind.

- A careful convection experiment has revealed FOUR incommensurate frequencies (QP4) in a convection experiment and this is the current world's record. See "Nonchaotic Rayleigh-Benard Convection with Four and Five Incommensurate Frequencies", R. Walden et al., Phys. Rev. Lett. Vol. 53, 242 (1984).

QUESTIONS

- Where does this rich time dependence come from? Remember that the boundary conditions are time-independent at all times, and the fluid starts off in a resting state (fluid velocity is zero everywhere).

- What constitute the "oscillators" that are periodic or quasiperiodic?

- What is going on when the power spectrum becomes broad-band, with a continuum of frequencies? Experimentally, the fluid still looks like a few convection rolls when the data is averaged over time.

- Why does the fluid switch from S to P to QP2 to N or to other states? We know how to study the linear stability of a fixed point, but can we study the stability of periodic or quasiperiodic or chaotic behavior? Does this even make sense?

- Are the sequences always in this order, e.g., can N come before P or QP2?

- Are the sequences unique in different experiments?

- Are there any mathematical models that have sequences of this form?

- We still do not know how to predict bifurcation scenarios for either experiments or mathematical models, so this seems like a bad question to ask. But we do know simple models that produce nearly all these kinds of scenarios, e.g., the intermittency of QP2 with N. By analogy with equilibrium physics, we can calculate many properties of phase transitions but rarely can we calculate WHEN a phase transition will take place.

- Why are there different bifurcation sequences in this experiment? There are MULTIPLE STATES for fixed external parameters, and different states lead to different sequences. In the present experiment, multiple states occur in the form of different numbers of rolls (2 or 3) and rolls of different shapes. This is illustrated in Figure 4. What is plotted are mean flows, the AVERAGE velocity component. Thus some of these plots correspond to time-dependent behavior, which are weak variations around an underlying roll structure.

- Relation of time signals to spatial structure? All time signals seem to represent weak perturbations around the average roll structure, but more careful experiments and simulations are still needed.

IMPORTANT POINTS FROM PREVIOUS LECTURE:

- To me, the biggest surprise or shock of the Gollub-Benson experiment is the great complexity that arises from a simple fluid, in a simple box, driven out of equilibrium by two flat surfaces, one hotter than the other. Where does this complexity come from? Is there a simple way to see why this structure appears and what kind? Is the complex dynamics of other systems such as the weather, the human brain, and economic markets of the same kind? How do we go about answering these questions?

- Boundary conditions for the 5 variables of the Boussinesq equations that describe buoyancy-induced convection.

- Discussion of details and figures in the Gollub-Benson paper.

- Use of laser Doppler velocimetry (LDV) to measure velocity time series Vy(x0,y0,z0,t) at a fixed point (x0,y0,z0) in the fluid without perturbing the fluid.

- The experiment involves high accuracy: the temperature difference can be held constant to within 0.05% as a function of time and within 0.5% as a function of space.

- Gollub and Benson had a nice trick for locating the position of a peak in the power spectrum to a much better accuracy (a factor of 100) than what would be naively implied by the resolution dw = 1/T, where T is the total observation time (N dt, where N is the number of equally spaced observations with fixed time step dt). They chose the two points (w1,P1) and (w2,P2) of largest power P(w) near a given peak and then formed the weighted average:

      w = (w1 P1 + w2 P2) / (P1 + P2) .

  Fig. 3 of Gollub and Benson illustrates how these authors verified this claim. (A small sketch of this trick in code follows.)
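  A sketch of this two-point weighted-average trick as a hypothetical helper function (mine, not from the paper or the course notebook); freqs and psd are assumed to be corresponding lists of frequencies and powers from a spectrum calculation:

      peakLocate[freqs_, psd_] := Module[{j},
        j = Ordering[psd, -1][[1]];                     (* bin with the largest power *)
        If[j > 1 && psd[[j - 1]] > psd[[j + 1]], j--];  (* pair it with its larger neighbor *)
        (freqs[[j]] psd[[j]] + freqs[[j + 1]] psd[[j + 1]])/(psd[[j]] + psd[[j + 1]])]
      (* no bounds checking at the ends of the lists; a real version would add it *)

  The idea is that spectral leakage spreads a single frequency over two adjacent bins in proportion to how close the true frequency is to each, so the power-weighted average recovers a sub-bin estimate.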
- There are different time-averaged states (convection rolls) for the same external conditions of Rayleigh number, Prandtl number, and box geometry. Small changes in these parameters can lead to different mean states.

- Discussion of the mysterious INTERMITTENCY: a time series that switches "randomly" between quasiperiodic dynamics and chaotic dynamics. See Fig. 13, page 467, of Gollub and Benson.

THE EXPERIMENTAL BIFURCATION SEQUENCES

- Gollub and Benson identified, QUANTITATIVELY and reproducibly, transitions or bifurcations in their convecting flow, using power spectra. The power spectrum can pick up small changes in a time series that are completely invisible to the unaided eye, and certainly invisible by just direct observation of an experiment.

- The authors labeled different regimes in Figure 6, e.g.:

      S -> P -> P2 -> P4 -> N
      S -> P -> QP2 -> L -> N
      S -> P -> QP2 -> QP3 -> N
      S -> P -> QP2 -> QP2/N
      S -> P -> QP2 -> S -> ...

  as the Rayleigh number is varied. Here:

      S   = stationary or time-independent
      P   = periodic
      P2  = periodic with 1/2 the previous frequency (subharmonic transition)
      P4  = periodic with 1/4 the first frequency
      QP2 = quasiperiodic with two different frequencies
      QP3 = quasiperiodic with three different frequencies
      L   = locking regime of commensurate quasiperiodic behavior
      N   = nonperiodic or broad-band

- The notation QP2/N is an abbreviation for intermittent noise, which had just previously been discovered by some other experiments. In this case, illustrated in Fig. 13 of Gollub and Benson, there are intervals of quasiperiodic signals and intervals of noise (N), and they switch back and forth irregularly. This is a new kind of complexity, in that the signal is not periodic or quasiperiodic, but has parts that are structured and parts that are unstructured. The power spectrum (not shown) has a few broad peaks coming from the QP2 part, and continuous features coming from the noisy part.

  CHALLENGE: How should one interpret the power spectrum of an intermittent signal like QP2/N? If you were given just the power spectrum of a signal x(t), would you be able to tell that intermittency was occurring?

- It is quite intriguing that the transition to chaos occurs after the locking of frequencies ends. This is possibly related to the Ruelle-Takens theorem discussed further below, but actually represents a new phenomenon which no prior theory had anticipated.

- These bifurcation sequences are reproducible for a given mean flow and given Prandtl number. There is a slight hysteresis in the precise values of the Rayleigh number at which these bifurcations occur; not sure this is a significant effect worth worrying about.

BIFURCATION SEQUENCE FOR THE LORENZ EQUATIONS WITH SIGMA=10, B=8/3

- We can perform numerical experiments quite similar in spirit to those of Gollub and Benson; integrating flows numerically is trivial given today's computers and modern numerical algorithms. You have already played with the Runge-Kutta algorithm in a previous homework assignment, which required as input the odes, some initial data, a local truncation error, and an integration time (no decisions or subtle mathematics to integrate!). We choose some initial condition, integrate the Lorenz equations numerically to get a long time series, drop the first part of the time series so as to ignore transients, take the power spectrum of the time series, and classify the resulting dynamics. (A sketch of this procedure follows.)
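  A minimal sketch of such a numerical experiment (the built-in Fourier is used instead of the course's PowerSpectrum.m; r = 28, the sampling choices, and the initial condition are illustrative):

      (* Integrate the Lorenz equations (sigma = 10, b = 8/3, r = 28),
         sample x(t) after dropping the transient, take the spectrum. *)
      r = 28;
      sol = NDSolve[{x'[t] == 10 (y[t] - x[t]),
          y'[t] == r x[t] - y[t] - x[t] z[t],
          z'[t] == x[t] y[t] - (8/3) z[t],
          x[0] == 1., y[0] == 1., z[0] == 1.}, {x, y, z}, {t, 0, 300},
         MaxSteps -> Infinity];
      dt = 0.02; n = 2^12;
      data = Table[x[100. + i dt] /. First[sol], {i, n}];  (* t < 100 dropped as transient *)
      psd = Abs[Fourier[data][[2 ;; n/2]]]^2;
      ListLogPlot[psd, Joined -> True]  (* broad band, no sharp peaks: the N regime *)

  Rerunning with other values of r (e.g., r = 350, in the large-r P regime of the survey below) produces instead a spectrum of instrumentally sharp peaks.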
  Here is a quick survey of what is found (more details are given in Appendix D of the Berge et al. book "Order Within Chaos"):

      0      < r < 1        S         i.e., time-independent fixed point
      1      < r < 24.06    2S        two stable fixed points
      24.06  < r < 24.74    2S + N    N = strange attractor; hysteresis
      24.74  < r < 30.1     N only
      30.1   < r < 99.52    N + P     (alternating N and periodic attractors)
      99.52  < r < 100.80   P only
      100.80 < r < 145      P / N
      145    < r < 166      P only
      166    < r < 214.36   P / N
      214.36 < r            P

  where we use our previous notation of S for stationary, P for periodic, and N for nonperiodic broad-band features (chaos or some kind of turbulence).

- One never sees quasiperiodic behavior of any kind! Later, we will actually prove this fact for the Lorenz equations with these parameters in two lines of mathematical argument, once we understand the ideas of dissipation and attractors in phase space. So the Lorenz model is missing some of the bifurcations observed in most fluid experiments in small boxes, which is possibly worrisome.

- We also never see the behavior observed experimentally by Gollub and Benson and in other experiments, in which one only sees chaos (broad-band spectra) for sufficiently large Rayleigh numbers, i.e., sufficiently strong driving from equilibrium. The Lorenz model has only periodic behavior for sufficiently large reduced Rayleigh numbers! So again something is wrong.

- Technical aside: The Lorenz equations are obviously limited in their ability to mimic real convection because of the way in which they are derived from the Boussinesq equations:

  (1) A physical approximation: Lorenz assumes that the convection is TWO-DIMENSIONAL, in that all fields p, T, Vx, Vy, and Vz depend on the spatial variables x and z only, and not on the variable y. This is not too bad an approximation in a big convection box near onset (for Rayleigh numbers just larger than the critical value at which conduction becomes unstable to convection) but is clearly flawed at higher Rayleigh numbers, since theory and experiment show that two-dimensional rolls become unstable to three-dimensional disturbances and then become three-dimensional themselves.

  (2) A second physical approximation is that the boundary conditions for the fluid velocity are free-slip rather than rigid. NOT a good approximation for the Gollub-Benson expt, in which the fluid comes in contact with rigid horizontal plates.

  (3) In addition to these two physical approximations, there is a mathematical one, in which the fields p, T, Vx, Vy, and Vz are expanded in Fourier series in the spatial variables x and z and then all Fourier coefficients are set to zero except three. This severe mathematical truncation is, again, not so bad sufficiently close to the onset of convection but is badly wrong further above onset, when finer and finer spatial structure appears in an actual fluid, which requires higher mode numbers in the Fourier expansion.

  It turns out that if one uses many Fourier terms (rather than the three used by Lorenz) and if one assumes two-dimensional dynamics with free-slip bcs, then there is no chaos in the Boussinesq equations until the Rayleigh number is huge, hundreds of times the critical value R_c at onset. This implies that if Lorenz had carried out his studies with modern powerful computers, with which he easily could have held onto hundreds of Fourier modes rather than just three, he never would have discovered chaos in the first place!
IMPORTANT POINT: THE HOW OF BIFURCATIONS IS MUCH EASIER THAN THE WHEN

- An extremely important conclusion of many nonlinear dynamics studies, both experimental and theoretical, is that it is much easier to ask HOW a bifurcation takes place than WHEN it takes place (i.e., for what parameter values). When a bifurcation takes place seems to be a dirty problem that depends on subtle details of the actual nonlinearities and the values of coefficients of Taylor series. How a bifurcation takes place, i.e., the mathematical structure of a bifurcation, seems to depend weakly on the details of the equations, and so one can say something useful and universal. The situation is rather similar to phase transitions in ordinary matter. It is extremely difficult to predict the temperature at which a crystal melts (with heating) or at which a metal becomes superconducting upon cooling. But the details of the phase transition seem to be the same at whatever temperature it takes place.

MATHEMATICS OF CALCULATING POWER SPECTRA

- We have discussed power spectra qualitatively so far, mainly so that you could learn how to read them and to distinguish periodic, quasiperiodic, and chaotic behavior. From Gollub and Benson's paper, we saw that power spectra are far more sensitive to dynamical behavior than our eyes, and that power spectra give a fairly unambiguous criterion for judging the onset of chaos: continuous (broad-band) features.

- Before turning to phase space and attractors---the key part of this course---we should discuss a little bit the mathematics of how power spectra are actually calculated from a discrete time series of numbers. Part of this discussion is needed just to show you clearly how the spectrum is obtained, but an even more important part is to discuss an important and somewhat subtle fact: THE APPEARANCE OF BROAD-BAND CONTINUOUS FEATURES IN A POWER SPECTRUM IS USUALLY RELATED TO THE SIGNAL BECOMING RANDOM. So something truly new is happening when a bifurcation to chaos (defined as a broad-band power spectrum with continuous features) takes place: the signal has become unpredictable in a quantitative way that we now discuss.

- Power spectra are best defined and analyzed in an ideal pure mathematical context. There are numerous serious challenges in calculating power spectra (or any statistic) from FINITE SETS of data of finite precision. In particular, one can never make rigorous conclusions from a finite amount of data, so some experience rather than mathematical theory is necessary. Rigorous conclusions are rarely possible from experimental or numerical data, so the mathematicians among you may feel uncomfortable with the style of argument used.

- A power spectrum depends essentially on Fourier analysis, breaking a function into periodic pieces, so we need to review more carefully some basics of this theory. A good review is given in Strang's book "Introduction to Applied Mathematics". The beauty of Fourier analysis is that it leads to expressions that have useful physical meaning (the kinds of oscillations present) AND the expressions are simple to integrate and to differentiate. This is NOT true for many other kinds of complete sets of eigenfunctions, e.g., Chebyshev polynomials, Bessel functions, etc.

- You should keep in mind that Fourier modes are not the only way to decompose a signal.
A modern and "hot" technique uses other basis functions known as WAVELETS, which are complete orthonormal functions that have maximum locality in real space and Fourier space simultaneously. Wavelets, unlike other functions that you have worked with previously, are defined recursively and so can not be written down or evaluated explicitly. - Fourier analysis has two branches, depending on whether a given signal x(t) is periodic with period T or is nonperiodic and lives on the entire real axis, -Infinity < t < Infinity: - Periodic signals have FOURIER SERIES, consisting of infinite sums of oscillators with all frequencies being integer harmonics of some fundamental frequency. - Nonperiodic signals have FOURIER INTEGRALS, with a continuous range of frequencies present. Fourier integrals can be derived as a limit of Fourier series, in which the periodicity interval T goes to infinity. FOURIER SERIES: - If a signal x(t) is periodic with period T, so x(t + T) = x(t) for all times t, then a basic theorem of Fourier analysis says that: x(t) = Sum[ x[m] Exp[ I m w t ], {m, -Infinity, Infinity} ] , where w = 2 Pi / T is the basic angular frequency. The Fourier coefficients are denoted by x[m] and are generally complex numbers. The notation of using the same time series symbol x with a different argument is common in the physics literature and will hopefully not be confusing to you. If the signal x(t) is real-valued, then you should be able to prove that: x[m] = Conjugate[ x[-m] ] , the Fourier coefficients of positive and negative index m are related to each other. This is often called the "reality condition" for a Fourier series to represens a real periodic function. - The power spectrum of the signal is then defined to be: P[m w] = | x[m] |^2 , it is simply the magnitude squared of the mth Fourier coefficient. Periodic signals have discrete power spectra at frequencies w[m] = m w. - The Fourier coefficients x[m] can be recovered from the signal itself by using orthogonality of the Fourier modes Exp[I m w t]. Multiplying both sides of the Fourier series by Exp[- I m' w t] and integrating both sides over the periodicity interval t=0 to t=2Pi/w, we find: Integral[ x(t) Exp[- I m' w t] dt ] = Integral[ Sum[ x[m] Exp[I (m - m') w t] ] dt ] = Sum[ x[m] Integral[ Exp[I (m - m') w t] dt ] ] = Sum[ x[m] (2 Pi / w) delta[m,m'] ] = (2 Pi / w) x[m'] Here I have interchanged the order of summation and integration, and have used orthogonality of the exponentials. We have derived the famous formula for the Fourier coefficients: x[m] = (w / (2 Pi)) Integral[ x(t) Exp[- I m w t] dt ] where the integration bounds go from -Pi/w to Pi/w. - Because the Fourier coefficients are given by the signal and the signal is given by the Fourier coefficients, both are complete and equivalent descriptions of the signal. Further Fourier analysis is a LINEAR operation from a signal to the Fourier coefficients and back. - Example: periodic saw-tooth consisting of replications of signal x(t) = t for t=-T/2 to t=T/2. An easy calculation shows that the Fourier coefficients are given by (please verify this for yourself): x[0] = 0 , x[m] = 1 / (I m w) for m odd , x[m] = - 1 / (I m w) for m even . The Fourier coefficients decrease quite slowly in magnitude as 1/m with increasing mode number m because this function is not infinitely differentiable, there are jump discontinuities at ever integer multiple of T/2. 
The power spectrum is the magnitude squared of these terms: P[ w[m] ] = 2 / (m^2 w^2) , where the factor of 2 comes from combining the +m and -m mode numbers. - These coefficients lead to the following real expression for the periodic saw-tooth: x(t) = (2/w) ( sin(w t) - (1/2) sin(2 w t) + (1/3) sin(3 w t) - ... ) and you should compare this Fourier series with the periodic wave over the interval [-Pi/w,Pi/w] using Mathematica: Plot[ Sin[t] - (1/2) Sin[2 t] + (1/3) Sin[3 t] - (1/4) Sin[4 t] + (1/5) Sin[5 t] , {t, -Pi, Pi} ] FOURIER INTEGRALS - Now consider these Fourier series in the limit that the interval of periodicity goes to infinity, T -> Infinity, in which case the fundamental frequency w goes to zero, approaching some infinitesimal quantity dw. Then the Fourier coefficient x[m] can be thought of as coming from a continuous density x(w), where the frequency w is now treated as a continuous variable. Over a small interval of frequency dw, we have: x[m] = x(w) dw where we set w = m dw. The Fourier sum then becomes an integral: x(t) = Sum[ x[m] Exp[I m w t] ] -> Integral[ ( x(w) dw ) Exp[ I w t ] ] Similarly, the formula for the Fourier coefficient also turns into an integral: x(w) dw = (dw / (2 Pi)) Integral[ x(t) Exp[- I w t] dt ] or x(w) = (1 / (2 Pi)) Integral[ x(t) Exp[- I w t] dt ] Again, x(t) and x(w) are completely equivalent descriptions of the signal, and are related by forward and inverse Fourier transforms which are linear operations. - For the continuous case of a nonperiodic signal, we again define the power spectrum to be the magnitude squared of the Fourier coefficient: P[w] = | x(w) |^2 . - For a real signal x(w) = Conjugate[x(-w)] and so: P[-w] = | x(-w) |^2 = x(-w) Conjugate[x(-w)] = x(-w) x(w) = Conjugate[x(w)] x(w) = P[w] , and so the power spectrum is an EVEN function of frequency. THE DIRAC DELTA-FUNCTION: CONSISTENCY OF THE FOURIER THEOREM - We can substitute the expression for the Fourier coefficient x(w) into the expression for the Fourier integral of x(t). We should expect to recover the signal x(t), so let's see what we get: x(t) = Integral[ x(w) Exp[I w t] dw ] = Integral[ (1/(2 Pi)) Integral[ x(t') Exp[-I w t'] dt'] Exp[I w t] dw ] If we interchange the orders of integration (not an obvious step, but let's be the blunt physicist and go for it), we can write this as follows: x(t) = Integral[ dt' x(t') { (1/(2 Pi)) Integral[ Exp[I w (t-t')] dw ] } ] If this expression is to make sense, the quantity in braces must have a strange behavior: it must be localized in time in just the right way so as to give the contribution x(t) when the integral is performed. This expression is known as the Dirac delta function: delta[t] = (1 / (2 Pi)) Integral[ Exp[I w t] dw ] , and can be treated as a spike centered at t=0 of zero width and infinite height with total area one. Of course, no such thing exists directly, but mathematicians have defined generalized functions that give this concept rigorous meaning, as limiting cases of well-defined functions. - You should appreciate that delta-functions are only useful inside of integrals. SOME EXAMPLES: - What is the power spectrum of a constant signal x=c? Substituting x=c into the formula for the Fourier coefficient of a signal over all time, we get: x[w] = (1/(2 Pi)) Integral[ c Exp[-I w t] dt ] = c delta[w] So we get a delta-function at zero frequency of weight c, and the power spectrum is: P[w] = |c|^2 delta[w]^2 A signal that is constant for all time has only a single nonzero Fourier coefficient at w=0.
CHALLENGE: How do you make sense out of the square of a delta function? - The inverse transform tells us that a delta-function signal, something that turns on and off over a very short time, has finite and constant Fourier coefficients. This is what happens in a nuclear explosion, which generates electromagnetic waves at all frequencies and causes immense damage to electrical equipment of all kinds. Many of you presumably have read that a strategy for beginning a nuclear war is to explode a nuclear device above the atmosphere over a given country to cripple its communications by the resulting broad-band electromagnetic impulse or EMP. - What is the power spectrum for a sinusoidal signal of frequency w0 and phase p, x(t) = A Sin[w0 t + p] ? This can be considered to be the imaginary part of the complex signal A Exp[I(w0 t + p)], so let us work with the complex signal directly. Substituting into our formula: x[w] = (1/(2 Pi)) Integral[ A Exp[I(w0 t + p)] Exp[- I w t] dt ] = A Exp[I p] (1/(2 Pi)) Integral[ Exp[I (w0 - w) t ] dt ] = A Exp[I p] delta[w - w0] with power spectrum P[w] = | x[w] |^2 = |A|^2 delta[w - w0]^2 We get a single spike at frequency w=w0 of magnitude |A|^2. - Similarly, you can show for a decaying exponential that is switched on at t=0, x(t) = Exp[- d t + I w0 t] for t >= 0 (and zero for earlier times), that the power spectrum is the Lorentzian curve P[w] = 1 / ( 4 Pi^2 ( d^2 + (w - w0)^2 ) ) and is continuous. CONSEQUENCES OF FINITE DISCRETE DATA - Given this review, we can now turn to our discussion of the consequences of finite discrete data. - Experimental and numerical time series consist of a finite set of numbers x[i], taken (usually) at constant time intervals, t[i] = i Dt, where Dt is a constant SAMPLING INTERVAL (often loosely called the sampling rate). Unlike a continuous signal defined for all time, an empirical signal is discrete and is known over a finite time interval T = N Dt. Each of these restrictions has separate implications for the calculation of a power spectrum. - In trying to determine what frequencies are present in a signal, one has to separate out the two effects. Discretizing a continuous time signal x(t) by sampling at constant time intervals creates a phenomenon known as ALIASING, in which frequencies higher than the Nyquist frequency 1/(2 Dt) appear at frequencies lower than the Nyquist frequency. This can confuse the interpretation of the spectrum. The easiest ways to avoid aliasing problems are to use the smallest possible sampling interval Dt, to try several incommensurate sampling intervals, or to filter the data first to remove all frequencies higher than the Nyquist frequency. All three tricks were used by Gollub and Benson. - A second and independent difficulty in identifying frequencies present in a signal arises from the FINITE amount of signal (whether continuous or not). This finiteness of observation time has THREE effects: - Peaks in the power spectrum are broadened, so one loses resolution in frequency space. This is undesirable. - Extra smaller peaks called side lobes are created by the finite duration of the signal. The side lobes can appear incorrectly like other frequencies in the signal, or can mask other frequencies in the signal. This is also undesirable. - The frequencies that one can observe become a FINITE set rather than a continuum. - We will see that side lobes can be reduced substantially by multiplying the signal by a WINDOW, a function that forces the signal to zero at the beginning and end. This makes the signal look more periodic.
- In conclusion, sampling of a signal at a constant interval Dt will cause aliasing effects (high frequencies appear as low frequencies) while the finite observation time of a signal (continuous or not) will broaden peaks and create extra lobes that can obscure weak peaks. ALIASING - Sidelobes, peak broadening, and discretization of frequency arise from the finite observation time of a signal. A completely independent and important effect that prevents an easy interpretation of a power spectrum---aliasing---comes from the finite sampling interval of the signal, from the fact that data are collected only at discrete times t[i] = i Dt. This creates an upper bound for the highest observable frequencies in a signal. - To see how aliasing arises, consider a nonperiodic signal x(t) which is known only at discrete time intervals t[i] = i Dt, say x[i] = x( t[i] ) = x( i Dt ) . We don't need to worry about whether there is a finite amount of data at this point, so we will discuss the most general case for which the index i goes from -Infinity to Infinity. - To get the power spectrum of this discrete signal, we first have to calculate the Fourier coefficients according to our formula above: x(w) = (1/(2 Pi)) Integral[ x(t) Exp[-I w t] dt ] , then take the magnitude squared. The bounds in the integral over t go from -Infinity to Infinity. How do we evaluate this formula if x(t) is known only at discrete times? There are several possible approaches, each leading to the same result. - One approach is to APPROXIMATE the continuous integral for the Fourier coefficient by a discrete numerical approximation, e.g., by the rectangle rule of elementary calculus, also known as the mid-point rule. We approximate the original continuous signal in terms of the known values at times t[i] by assuming that the function is constant over the time interval [t[i] - Dt/2, t[i] + Dt/2] with midpoint value x[i] = x[t[i]]. This approximation converts the continuous integral into a sum of areas of little rectangles of width Dt and height x[i]: Integral[ x(t) Exp[-I w t] dt ] approx Dt Sum[ x[i] Exp[-I w i Dt], {i,-Infinity,Infinity} ] I use the notation Dt to distinguish the finite sampling interval from the differential over time, dt. - This approximation of the integral by a sum is actually VERY good, even though the rectangles don't look like the smooth function, provided x(t) goes to zero as t goes to -Infinity or +Infinity. In this case, the difference between the sum and integral decreases as Exp[-c/Dt] where c is some constant, i.e., exponentially rapidly as the sampling interval Dt decreases towards zero. This is a consequence of a deep cancellation that is captured in the so-called Euler-Maclaurin formula of classical mathematics (see Section 7.4.4 of the book "Numerical Methods" by Germund Dahlquist and Ake Bjorck for further explanation). The same theorem proves the same rapid convergence of the sum towards the integral for PERIODIC integrands, so we should be able to approximate Fourier coefficients in Fourier series by the rectangle rule also. For nonperiodic integrals over finite intervals, the rectangle rule converges much more slowly towards the true integral as Dt -> 0, quadratically as Dt^2.
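- Note: you can check this exponential convergence with a few lines of Mathematica (a minimal sketch of my own, using the integrand Exp[-t^2], whose exact integral over the real line is Sqrt[Pi]):
      rectRule[dt_] := dt Sum[ Exp[ -(i dt)^2 ], {i, -Ceiling[10/dt], Ceiling[10/dt]} ]
      Table[ {dt, N[ rectRule[dt] - Sqrt[Pi] ]}, {dt, {1., 0.5, 0.25}} ]
  The error is already only about 10^-4 for a sampling interval of 1 and falls below machine precision by 0.5, consistent with (in fact even faster than) the Exp[-c/Dt] estimate. (I use the lower-case symbol dt in code since Dt is a reserved Mathematica function.)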
- Another way to approximate the integral for the Fourier coefficient in terms of finite amounts of data is to think of the discrete data x[i] at times t[i] = i Dt as the fundamental signal, in which case x(t) is really a sum of delta-functions centered at the times t[i]: x(t) = Dt Sum[ x[i] delta[t - t[i]] , {i,-Infinity,Infinity} ] where t[i] = i Dt. Substituting this into the Fourier integral gives the exact same result as the rectangle rule, as you should verify. - Now let us think carefully about the expression we get by either route for the Fourier coefficient of the discrete signal x[i] = x[t[i]]: x(w) = Dt Sum[ x[i] Exp[-I w i Dt] ] I claim that this formula is PERIODIC IN FREQUENCY WITH PERIOD 2 Pi/Dt where Dt is the sampling interval. The periodicity is evident since for each term in the sum, Exp[-I (w + 2 Pi/Dt) i Dt] = Exp[-I w i Dt] Exp[-I 2 Pi i] = Exp[-I w i Dt] for any integer i, which in turn implies x(w + 2 Pi/Dt) = x(w) , for all frequencies w. This periodicity is a direct consequence of the discrete sampling and implies that we have lost a lot of information about the continuous signal, whose Fourier coefficients were a continuous function x(w) over all negative and positive frequencies. ALIASING CONTINUED: - We were discussing one of two important effects on the power spectrum of a continuous signal that is: (1) discretized with sampling interval Dt and (2) observed for a finite time interval T. The first leads to aliasing, with frequencies higher than the Nyquist frequency being folded back into lower frequencies. - At the end of the last lecture, I showed you that the Fourier coefficients x(w) with angular frequency w of a discretized signal are periodic functions of frequency with period 2 Pi/Dt where Dt is the sampling interval. By periodicity, we can think of x(w) as living on the interval [-Pi/Dt,Pi/Dt] since any interval of length 2 Pi/Dt contains all the information of the periodic function. Since the power spectrum P(w) is defined to be the magnitude squared of a Fourier coefficient: P(w) = | x(w) |^2 , we see that the power spectrum of a discretized signal is also periodic, with period 2 Pi/Dt. - But the power spectrum of a real signal x(t) is also an EVEN FUNCTION of the frequency w. This follows since the Fourier coefficients x(w) of a real signal x(t) obey the "reality condition" x(-w) = Conjugate[ x(w) ] , and so: P(-w) = | x(-w) |^2 = x(-w) Conjugate[x(-w)] = Conjugate[x(w)] x(w) = P(w) . - Here is an important conclusion, which the above analysis justifies: THE POWER SPECTRUM OF A DISCRETE SIGNAL (FINITE OR INFINITE IN DURATION) IS A PERIODIC EVEN FUNCTION OVER THE INTERVAL [-Pi/Dt,Pi/Dt]. This is in contrast to the power spectrum of the underlying continuous function x(t), which is also an even function of w but is nonperiodic over all frequencies. The main implication of this periodicity is that angular frequencies w higher than Pi/Dt (or frequencies f higher than the Nyquist frequency 1/(2 Dt)) are folded back to the basic interval [-Pi/Dt,Pi/Dt], and then appear only in the interval [0,Pi/Dt] by the even symmetry. We conclude: 1. We can not observe frequencies higher than 1/(2 Dt) in a discrete signal (angular frequencies higher than Pi/Dt). 2. Frequencies higher than this cutoff appear incorrectly at lower frequencies; you get the aliased value by subtracting off integer multiples of the frequency periodicity length 2 Pi/Dt until a given frequency lies within the interval [-Pi/Dt,Pi/Dt]. We say that the high frequency is ALIASED to a lower frequency.
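- Note: you can see this folding directly with Mathematica's Fourier function (a small numerical sketch of my own; all of the numbers are illustrative):
      n = 128; dt = 1.; f1 = 0.6;    (* Nyquist frequency is 1/(2 dt) = 0.5 *)
      data = Table[ Cos[2 Pi f1 i dt], {i, 0, n-1} ];
      power = Abs[ Fourier[data] ]^2;
      peaks = Transpose[ { Table[ k/(n dt), {k, 0, n/2} ], Take[power, n/2 + 1] } ];
      First[ Sort[ peaks, #1[[2]] > #2[[2]] & ] ]    (* {frequency, power} of largest peak *)
  The largest peak appears near f = 0.4 = |0.6 - 1/dt|, not at the true frequency 0.6, exactly the folding rule derived above.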
SIMPLE EXAMPLE OF ALIASING - A simple example: Consider a time series x[i] with sampling interval Dt=1 and Nyquist frequency fmax = 1/(2 Dt) = 0.5. A frequency of 0.6 can be mapped into the interval [-0.5,0.5] by subtracting 1/Dt = 1, giving the frequency 0.6 - 1 = -0.4. This peak then appears in the power spectrum at the lower frequency f=0.4 since the power spectrum P[f] of a real signal is an even function of frequency f. If instead the frequency is still higher, with value f=1.2, this is aliased into the fundamental interval as 1.2 - 1 = 0.2 and so appears as a frequency f=0.2. - Another famous example of aliasing is the STROBOSCOPE, invented by Harold Edgerton of MIT. Here an electrical discharge in a xenon gas tube produces a repeatable brilliant short-lived light pulse. These periodic pulses can produce visual aliasing, moving the frequency of rapidly moving mechanical systems (motors, bullets, athletes) into a range that people can observe. Imagine someone jumping up and down but you turn on the room lights only for a very short time when the person is at the top of his or her jump. Then a sinusoidal behavior appears incorrectly as a constant behavior: a finite frequency has been converted to a zero frequency. You are sampling too slowly to pick up the oscillation. DETECTING AND CURING ALIASING - One can not detect aliasing from a single time series without analytical knowledge. To cure: - Take TWO time series with different incommensurate sampling intervals, Dt1 and Dt2. Aliased peaks will change position (why?!) while resolved peaks will not change their frequencies. - Or repeat the time series with a much smaller sampling interval, i.e., make the Nyquist frequency high enough to prevent aliasing. - Or filter the time series through a low-pass filter BEFORE taking the power spectrum (and be careful that the filter does not distort low frequencies). - Or all of the above. It often takes several tries to determine a small enough sampling interval. ALIASED POWER SPECTRA OF PERIODIC SIGNALS OFTEN LOOK QUASIPERIODIC - A serious consequence of aliasing is to make periodic spectra look like quasiperiodic spectra, with harmonics beyond the Nyquist frequency folded back into frequencies that are not integer multiples of the lowest frequency. See if you can work out the general formula for the positions of the frequencies m w1 when aliased by a sampling interval of Dt=1. Another consequence of aliasing is that the power spectrum does not decay to a white noise spectrum at high frequencies. BROADENING OF PEAKS AND SIDE LOBES: - Let us now study the second effect of experimental signals, the consequences of a finite observation time, whether the signal is continuous or discrete. For simplicity, we assume the signal is the complex sinusoid: x(t) = Exp[ I w1 t ] (whose real part is an ordinary cosine) and that we observe the signal over the time interval -T/2 to T/2, where T has no relation to the frequency w1. This is typical of laboratory and numerical experiments on nonlinear equations, for which the resulting nontransient periodic behavior is NOT known ahead of time, so the angular frequency w1 is not known until AFTER the data are taken. - We can consider this finite observation time as really coming from an infinite signal that is zero beyond the observation interval. The integral over -Infinity to Infinity then reduces to an integral over a finite time -T/2 to T/2.
x(w) = (1/(2 Pi)) Integral[ Exp[I w1 t] Exp[- I w t] dt, {t, -T/2, T/2} ] = (T / (2 Pi)) Sin[ (w - w1) T/2 ] / ( (w - w1) T/2 ) - For the infinite signal, we expect an infinitely sharp delta function, indicating that just a single frequency contributes to the time signal. For the finite signal, we get a spectrum that is proportional to: ( Sin(x) / x )^2 where x = (w - w1) T/2. This function has a broadened central peak of width of order 1/T but also something new and annoying: subsidiary peaks called SIDE-LOBES that decrease slowly in magnitude as 1/x^2. - These sidelobes are annoying since they can hide features such as harmonic peaks of low magnitude. The broadening of the central peak is also annoying since it limits your ability to resolve two frequencies that are close in value. - You can understand the presence of sidelobes intuitively with the following argument. A sinusoidal function Exp[I w1 t] has infinite duration and only one Fourier component; the power spectrum looks like a delta function delta[w - w1]. A truncated sinusoidal function, the same signal except all values are zero beyond the observation interval, requires NEW Fourier modes to bring the signal abruptly to zero at the beginning and end of the observation interval. The sidelobes indicate the presence of these new oscillations whose main purpose is to bring the signal to zero abruptly. - This observation suggests that if one could smooth out the function so that it does not jump discontinuously to zero at the ends of the observation time, one might reduce the sidelobes. This idea turns out to be a good one and leads to the idea of WINDOWING the signal BEFORE taking the Fourier transform to analyze the Fourier components. A window is a smooth, slowly varying function that starts at zero, increases to a maximum, and then decreases to zero again over the observation time. The actual shape of the windowing function is not so important compared to the fact that its product with a time signal makes the signal look more periodic. There is a large literature in the signal analysis community (especially electrical engineering) about how best to choose the shape of a windowing function so as to trade off between minimizing sidelobes and preventing the central peak from becoming fatter. - As an example, you should be able to verify that the Fourier transform of a Gaussian window of width s, W(t) = Exp[ - t^2 / (2 s^2) ] , times our favorite sinusoidal signal Exp[I w0 t]: x(t) = Exp[ - t^2 / (2 s^2) ] Exp[ I w0 t ] over a finite interval [-T/2,T/2] is approximately again a Gaussian: x(w) prop to Exp[ - s^2 (w - w0)^2 / 2 ] , which has NO side-lobes. One gets this expression by extending the interval of integration from [-T/2,T/2] to [-Infinity,Infinity], which corresponds to ignoring terms that are exponentially small in (T/s)^2 and so can be made arbitrarily small by making the width s of the Gaussian small compared to the length of the observation interval. The Gaussian has forced the finite signal to go to zero (more precisely, approximately to zero) at the beginning and end of the observation window. This much smoother function does not require the frequencies represented by the sidelobes. In practice, other simple windowing functions are used to reduce or to eliminate sidelobes, e.g., a simple triangle function. Gollub and Benson used a standard window called a "GEO window" to suppress side lobes.
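- Note: the effect of a window is easy to see numerically (a minimal sketch of my own, using a Gaussian window; all parameters are illustrative):
      n = 256; f1 = 0.1003;    (* deliberately not a multiple of 1/n *)
      signal = Table[ Cos[2 Pi f1 i], {i, 0, n-1} ];
      gauss = Table[ Exp[ -(i - (n-1)/2.)^2 / (2 (n/8.)^2) ], {i, 0, n-1} ];
      pRaw = Abs[ Fourier[signal] ]^2;          (* truncated sinusoid *)
      pWin = Abs[ Fourier[signal gauss] ]^2;    (* windowed sinusoid *)
      ListPlot[ Log[ 10, Take[pRaw, n/2] + 10.^-20 ] ];
      ListPlot[ Log[ 10, Take[pWin, n/2] + 10.^-20 ] ];
  On the first log plot the central peak is surrounded by slowly decaying side lobes; on the second the peak is slightly broader but the side lobes are essentially gone.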
DISCRETE FOURIER TRANSFORMS AND NUMERICAL POWER SPECTRA - We now finish our discussion of the mathematics of spectral analysis by discussing the general case of discrete data observed over a finite observation time of span T. We then expect sidelobes, peak broadening, and aliasing all happening together. We now have N data points x[i] at times t = t0 + i Dt with i = 0, ..., N-1. How do we estimate the Fourier coefficients and power spectrum? - We can proceed as we did for discrete data, assuming that the data come from some underlying and unknown continuous signal x(t). Then the expression for the Fourier coefficient x(w) is as before, except now the integration bounds go from 0 to T instead of from -Infinity to Infinity. Applying the rectangle integration rule gives a discrete sum: x(w) = (1/(2 Pi)) Integral[ dt x(t) Exp[-I w t], {t,0,T} ] approx (Dt/(2 Pi)) Sum[ x[i] Exp[-I w i Dt], {i,0,N-1} ] which involves a quantity known as the Discrete Fourier Transform (abbreviated DFT) of the series x[i]: x[w] = Sum[ x[i] Exp[-I w i Dt] , {i, 0, N-1} ] . - This looks like the discrete version of our formula for a Fourier series. If we remember that Fourier transforms give equivalent descriptions of data, then if we start with N numbers x[0],...,x[N-1], we should end up with N Fourier coefficients for a completely discrete and equivalent form of the data. The tricky part is to decide on which discrete frequencies to work with. This can be resolved by observing that: - Only N frequencies are useful, and they should be symmetrically positive and negative so that real functions can be correctly represented by complex terms of positive and negative frequency. - Given a sampling interval Dt, there is a smallest negative frequency and a largest positive frequency, determined by the Nyquist frequency 1 / (2 Dt). - We can therefore try to estimate the Fourier coefficients x[k] at the discrete frequencies: f[k] = k / (N Dt) , k = -N/2 , ..., 0 , ... , N/2 This still doesn't look right because the Fourier coefficients are COMPLEX numbers and we seem to have N+1 complex numbers (equivalent to 2(N+1) real numbers) coming from N real numbers. But if the finite series x[i] is real-valued, one can derive a reality condition on the Fourier coefficients: Conjugate[ x[k] ] = x[N - k] , and so the Fourier coefficients are not independent: they carry the same information for k=1,...,N/2 as for k=-1,...,-N/2. This still looks like too many numbers, since there are N/2 complex numbers for k=1,...,N/2 and also the number x[0]. The resolution is that x[0] and (for even N) x[N/2] are themselves real for a real signal, so the independent count is exactly N real numbers. - Putting everything together, our simple discrete approximation to the integral in terms of x[i], and the choice of N discrete frequencies between plus and minus the Nyquist frequency, suggest discrete transformations of the following form (index i represents time data, index k denotes frequency data in the usual physicists' notation): x[ i ] = Sum[ x[k] Exp[ 2 Pi I k i / N] , {k, 0, N-1} ] , x[ k ] = (1 / N) Sum[ x[i] Exp[- 2 Pi I k i / N] , {i, 0, N-1} ] - Note how all references to time and frequency have cancelled in these DFTs, leaving expressions that are only functions of the number of points. Time and frequency can be restored in the following way: Exp[2 Pi I k i / N] = Exp[I (2 Pi k/(N Dt)) (i Dt) ] so this term involves the discrete time t[i] = i Dt and the angular frequency w[k] = (2 Pi k)/(N Dt).
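- Note: before seeing the algebraic consistency proof below, you can verify numerically that this pair of transformations really inverts itself (a small check of my own):
      n = 8;
      x = Table[ Random[], {n} ];
      xhat = Table[ (1/n) Sum[ x[[i+1]] Exp[-2 Pi I k i / n], {i, 0, n-1} ], {k, 0, n-1} ];
      Chop[ Table[ Sum[ xhat[[k+1]] Exp[2 Pi I k i / n], {k, 0, n-1} ], {i, 0, n-1} ] - x ]
  The final line should print a list of n zeros: transforming forward and then back returns the original data to within roundoff.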
- Mathematica implements a DFT using the built-in function Fourier, which acts on lists of symbols or numbers. Mathematica uses a different normalization from that given above, distributing the 1/N factor equally between the DFT and its inverse. In terms of our notation: DFT[ x ] = (1/ Sqrt[N]) Fourier[ x ] , i.e., Fourier[ x ] = (1 / Sqrt[N]) Sum[ x[i] Exp[...] ] - For practice with DFTs, let us check that the forward and inverse DFTs are consistent. We substitute the expression for the Fourier coefficient x[k] into the discrete Fourier series for x[i] and see if it reduces correctly to x[i]: x[i] = Sum[ x[k] Exp[2 Pi I k i / N] , {k, 0, N-1} ] = Sum[ (1/N) Sum[ x[i'] Exp[-2 Pi I k i' / N] , {i',0,N-1} ] Exp[2 Pi I k i / N] , {k, 0, N-1} ] Now interchange the order of summation for i' and for k, and combine all terms that involve the index k. We get: x[i] = (1/N) Sum[ Sum[ Exp[2 Pi I k (i-i') / N], {k,0,N-1} ] x[i'] , {i',0,N-1} ] Now what does the inner sum reduce to? We would hope that it gives something like a delta function, zero if i != i' and something nonzero for i = i'. This would then pick out just the right term of x[i'] to give the correct value on the left. To see this, we observe that: Exp[2 Pi I k (i - i') / N ] = z^k , where z = Exp[2 Pi I (i - i') / N] is an Nth root of unity, z^N = 1. The inner sum then becomes a sum of a geometric sequence, which can be evaluated in closed form as you hopefully remember from high school: Sum[ Exp[2 Pi I k (i - i') / N], {k, 0, N-1} ] = 1 + z + z^2 + ... + z^(N-1) = (1 - z^N) / (1 - z) If i-i' is not zero, then z^N = 1 and this sum vanishes. If i-i' is zero, then the sum gives trivially the sum of N ones, namely N. We have proved: Sum[ Exp[2 Pi I k (i - i') / N], {k, 0, N-1} ] = N delta[i,i'] , where the Kronecker delta function, defined for integer indices only, is: delta[i,i'] = 0 if i != i' = 1 if i = i' . This trick of identifying an Nth root of unity and summing the geometric series is the basic way to manipulate discrete Fourier transforms. This same proof demonstrates the discrete orthogonality of the Fourier modes Exp[2 Pi I i k / N]. DISCRETE POWER SPECTRUM OF SINE WAVE: - In Chapter 3 of the Berge et al. handout, the authors explicitly calculate the discrete Fourier transform of a periodic function Exp[I w0 t], and use this example to illustrate broadening of peaks and generation of side lobes. Their discussion is almost identical to the continuous-signal case that we worked out above, except in the limit in which there are very few data points, in which case the discrete spectrum is not close to the continuous spectrum. The other significant difference is that the power spectrum is a discrete sampling in frequency space of the original continuous power spectrum. COMPUTATIONAL COMPLEXITY OF THE DFT: O(N Log[2,N]) - Given N real numbers x[i] in a discrete time series, we see that each of the N discrete Fourier coefficients x[k] requires a sum of N terms to be evaluated. The total number of additions and multiplications then grows as N^2 as N gets large. Mathematicians say that the complexity is big-oh of N^2, written as O(N^2), to transform the N numbers x[i] to the N Fourier coefficients x[k]. - This is a large computational cost. For Gollub and Benson's case, they had to study MANY time series of order 4096 points, and a direct use of the DFT would require O(4096^2) calculations per time series, quite expensive.
- There is an amazing alternative algorithm for evaluating the N DFT coefficients of a time series, known as the Fast Fourier Transform, abbreviated FFT. The computational cost is O(N Log[2,N]), a factor of about N/Log[2,N] less than the direct DFT. This decrease can be quite substantial, e.g., for N=4096=2^12, the FFT is about 4096/12 ~ 341 times more rapid. - The FFT is one of the great algorithms of the 20th century. Although known to Gauss and possibly to earlier researchers, it forms the foundation of many modern efficient algorithms for digital computation, e.g., rapid ways to solve Poisson's equation on 2d and 3d domains. It is one of the examples frequently put forward to demonstrate that software is as important as hardware in developing ever more powerful computers. - The speedup of the FFT depends on how many prime factors appear in the integer N that gives the total number of terms in the time series. The more prime factors, the faster the FFT. The optimal speed is attained for a pure power of 2, hence the common choice of 256, 1024, 4096, etc. for the lengths of time series. If a time series has a length that is not a power of two, one can extend the time series with zeros to the next larger power of two, a technique known as "padding the series". - The Mathematica DFT function Fourier switches to an FFT algorithm automatically upon detecting multiple primes in the factorization of N. For taking the DFT of a large sequence of zeros, I get the following timing data with MMa 3.0 on a 166 MHz Pentium machine running Linux with 32 MB of RAM: Timing[ Fourier[ Table[ 0., {4096} ] ] (* 4096 = 2^12 *) ] Find 0.15 sec. Timing[ Fourier[ Table[ 0., {4099} ] ] (* 4099 is prime *) ] Find 32.3 sec. The number 4099 is just a little bigger than 4096, yet the ratio of times to evaluate the DFT of a lot of zeros is about 32.3 / 0.15 ~= 215, which is of order the expected speedup of 4096/Log[2,4096] ~ 341. You can see directly the enormous advantage of using an FFT with a number of points that is a power of two. HOW TO ACTUALLY CALCULATE THE POWER SPECTRUM: - I outline the basic ingredients of any power spectrum algorithm. You should look at my code PowerSpectrum.m ?PowerSpectrum (* look at description *) ??PowerSpectrum (* look at code itself *) to see the full details. Also, you should read the relevant chapter on spectral methods in "Numerical Recipes in C". - First, ONE REMOVES THE MEAN OF THE SERIES so that the zero-frequency power is zero, P(0) = 0. In many time series, the oscillations are small perturbations around a large mean value, and the power spectrum plot would be dominated by the zero-frequency peak if the mean were not removed. - Note: in Mathematica, the mean can be removed from a time series in one statement without looping: xminusmean = x - Apply[Plus,x] / Length[x] The Apply function replaces the head of the second argument, List, with the first argument, Plus, which then sums all the elements. Similarly, Apply[Times,x] forms the product of all the elements of x. Apply is used so heavily that Mathematica has a special abbreviation @@. Thus xminusmean = x - (Plus @@ x) / Length[x] accomplishes the same goal, and is often more convenient when x is a large expression, since you don't have to enclose x in brackets. - Next, ONE WINDOWS THE ZERO-MEAN DATA by multiplying the data with some slowly varying continuous function (e.g., a Gaussian or a tent function) that vanishes, or nearly vanishes, at t=0 and at t=T. The continuous window is evaluated at the same discrete times as the time series itself. The windowed time series is a more nearly periodic signal and gives somewhat broader peaks in the spectrum, with smaller sidelobes.
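- Note: in Mathematica, the windowing step might look like the following sketch (the Gaussian window here is my own illustrative choice, not the GEO window used by Gollub and Benson):
      n = Length[xminusmean] ;
      gauss = Table[ Exp[ -(i - (n-1)/2.)^2 / (2 (n/8.)^2) ], {i, 0, n-1} ] ;
      xwindowed = xminusmean gauss    (* lists of equal length multiply element by element *)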
- Next, ONE TAKES THE DFT OF THE FINITE DATA x[i], 0 <= i <= N-1. If possible, one uses a number of points that is a power of two so that the FFT can be used most efficiently. If the length of your data is not a power of two, one can take the largest subset whose length is a power of two, or one can "pad" the series with zeros to get to the next larger power of two. CHALLENGE: Can you figure out the implications of padding a time series with extra zeros? How does this affect resolution in frequency, side-lobes, etc.? - Given the discrete Fourier coefficients, these are converted into the power spectrum by taking the magnitude squared of the first N/2+1 Fourier coefficients: P[k] = | x[k] |^2 , 0 <= k <= N/2 - The spectrum can then be plotted, P[k] versus frequency f[k], where f[k] = k / (N Dt) , 0 <= k <= N/2 With the PowerSpectrum package that I have written, you would plot the power spectrum as follows: data = Table[ Random[], {128} ] ; psd = PowerSpectrum[ data, 1. ] ; (* Dt = 1 *) PlotSpectrum[ psd ] (* lin-lin plot *) PlotLogLinSpectrum[ psd ] (* log10-lin plot *) PlotLogLogSpectrum[ psd ] (* log10-log10 plot *) The first plot is best for observing the largest peaks and harmonics. The second plot is useful for identifying regions that decay exponentially, which appear as roughly straight lines on such a plot. The third plot is useful for finding parts of the power spectrum that act as a power law of frequency, omega^alpha, which appears as a straight line on such a plot. WHEN NOT TO CONSIDER USING POWER SPECTRA: - Before proceeding to calculate a power spectrum, it is crucial to verify, or at least explicitly to assume, that the signal x[i] is STATISTICALLY STATIONARY. This means that all transients have been allowed to decay and that the underlying physical mechanism is not changing with time. A time series that is not statistically stationary can be considered to be a Fourier series whose coefficients are slowly changing with time. If this occurs, Fourier analysis is inappropriate and a different technique should be tried, e.g., studying numerical derivatives of the time series (which tends to make the series stationary) or using wavelets. - The hypothesis of statistical stationarity is hard to test empirically, but one can try to see if various moments change with time. Thus define a time-dependent average F(s) and a time-dependent variance G(s) by including only data from time t = 0 up to times t <= s <= T: F(s) = < x >_s G(s) = < ( x - <x>_s )^2 >_s where <...>_s denotes an average with respect to times t <= s, for s <= T, the total observation time. Then if x(t) is statistically stationary, one should find F(s) and G(s) to be roughly constant as functions of s. One difficulty with these tests is that the values of F and G only become accurate as s increases. Another is how to quantify "roughly constant": if F has a few wiggles in it, how does one tell whether the wiggles are important or not? - Note: G(s) is called the ROUGHNESS of a function x(t) and is often used in condensed matter physics to measure how rough a surface is, where x(t) is understood as the height above some baseline. - A common example of a nonstationary time series is the RANDOM WALK. Consider a mathematical insect that jumps from one integer to an adjacent integer on the axis of real numbers. The insect goes left or right one unit randomly by flipping a coin. Let x[i] be the time series which is the POSITION of the bug after i coin tosses. Then one can show that x[i] is nonstationary since its second moment < x[i]^2 > grows linearly with i, representing a diffusive motion (a minimal numerical check of this statement is sketched below).
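- Note: here is a small Mathematica sketch of my own that applies the running-variance test G(s) to a random walk:
      n = 2000;
      walk = FoldList[ Plus, 0, Table[ 2 Random[Integer] - 1, {n} ] ];   (* positions x[i] *)
      runningVariance[s_] := Module[ {seg = Take[walk, s], m},
          m = (Plus @@ seg) / Length[seg] ;
          (Plus @@ ((seg - m)^2)) / Length[seg] ]
      Table[ {s, N[ runningVariance[s] ]}, {s, 500, 2000, 500} ]
  For a statistically stationary series the second column would be roughly constant; for the walk it grows with s (roughly linearly on average), signaling nonstationarity.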
Diffusive motion of this kind is also the origin of the spreading of ink in water and of the Brownian motion of pollen grains in water. - When a time series is nonstationary, the random walk suggests that it might be fruitful to explore the spectrum of INCREMENTS of the time series, y[i] = x[i + j] - x[i] for different fixed jump intervals j. Such increments are often stationary even though the time series itself is not. In the case of the random walk, the increments for j=1 are just +/- 1, which is a stationary sequence. Economic time series are often nonstationary with stationary increments. - Another case where power spectra do not give useful summaries is INTERMITTENT time series, in which one kind of statistically stationary behavior alternates with another kind of statistically stationary behavior, e.g., QP2 pieces alternating with N pieces in the notation of Gollub and Benson. Human speech is an example of an intermittent signal: someone may be quiet for many minutes and then say a single word. Fluid turbulence is also intermittent: things are quiet in space for some period of time and then violently disrupted by a burst of motion and energy. - Intermittency can be statistically stationary, but irregular in time and in frequency. Usually intermittent time series satisfy scaling laws, e.g., the mean time between changes in behavior often scales as (p - p_c)^alpha (alpha < 0), diverging as a power law as some parameter p approaches a critical value p_c. Theory can predict the exponent alpha but not the critical value p_c of the parameter. - For intermittent signals, phase information is more important, and Fourier analysis and power spectra are less useful. A recent approach to quantifying intermittent signals has led to the theory of WAVELETS, orthogonal functions that are localized in both time and frequency (or space and wavenumber). A recent reference on wavelets is: "Wavelets and Dilation Equations: A Brief Introduction" by Gilbert Strang, SIAM Review 31(4), 614-627 (1989). CORRELATION FUNCTIONS AND RANDOMNESS: THE WIENER-KHINTCHINE THEOREM - Another reason that I have spent all this time discussing Fourier analysis and power spectra is to derive and to discuss a famous theorem of the 1940's, the Wiener-Khintchine theorem. This theorem suggests an important relation between broad-band power spectra and RANDOMNESS or lack of predictability in a signal. Generally, the appearance of broad-band features in a spectrum implies a specific limit on accurate forecasting. - Berge et al derive the discrete version of the Wiener-Khintchine theorem, which says that the power spectrum has an additional meaning besides the magnitude-squared of Fourier coefficients. The power spectrum is also the Fourier transform of a so-called CORRELATION FUNCTION: C[j] = < ( x[i] - <x> ) ( x[i + j] - <x> ) > = (1/ N) Sum[ (x[i] - <x>) (x[i + j] - <x>), {i, 0, N-1} ] where x[i+j] for i+j > N-1 is defined by the obvious periodic extension of the time series, and <x> denotes the average of x[i] over all known values of x. This bracket notation for averaging is a physics notation; other people often use a bar over the letter (which is more awkward to indicate with the simple ASCII notation that I am using in these class notes). - One can prove that the correlation function satisfies the following properties: C[j] = C[ -j ] ; even around 0 | C[j] | <= C[0] ; bounded by the value at 0 C[0] = < (x - <x>)^2 > = variance ; normalization The correlation function is symmetric and can never exceed the zero-time value.
There is nothing that forbids C[j] from later returning to the value C[0], which is the case for sinusoidal and quasiperiodic signals. - In the continuum limit of a smooth periodic signal, one would define: C(tau) = < ( x(t) - <x> ) ( x(t + tau) - <x> ) > , averaged over one basic period T, and where x(t) = Sum[ x[m] Exp[I m w0 t] ]. Substituting the Fourier series for x(t) and x(t+tau) and interchanging summations with integrations gives immediately: C[tau] = Sum[ |x[m]|^2 Exp[I m w0 tau] ] , i.e., the Fourier coefficients of the correlation function are precisely the power spectrum. Similarly, the power spectrum is given by the Fourier transform of the correlation function. - Note: the notation <...> is used by physicists to denote some kind of average. In the case where a single time series is known, the average is over all time samples. Physicists often find it useful to think of the average in a deeper idealized sense, over realizations of time series drawn from some mystical huge set of all possible time series with the same statistical properties. Such ensemble averages are not necessarily equal to time averages, but for many time series this is the case. Such time series are then called ERGODIC. It is NOT easy to prove theoretically that time series from some physical model are or are not ergodic. - THE CORRELATION FUNCTION MEASURES CORRELATIONS BETWEEN FUTURE PARTS OF THE SIGNAL AND PAST PARTS OF THE SIGNAL. This function decreases to zero when the interval of time tau is so large that the future is decoupled from the past: < ( x(t) - <x> ) ( x(t + tau) - <x> ) > approx < x(t) - <x> > < x(t + tau) - <x> > = 0 for tau sufficiently large and x(t) a random signal. You can see this for the case of x[i] being -1 for heads and 1 for tails upon flipping many coins; the average value is then zero, <x> = 0. But x[i] x[i+j] is positive or negative with equal probability, and so the sum giving the correlation function averages to zero (or at least to a tiny number that goes to zero with high probability as the time series gets longer and longer). DELTA-FUNCTION CORRELATED NOISE - An extreme case of randomness as measured by correlation functions is a delta-function (the Kronecker delta function, not the Dirac one) correlation function. One can think of this as a limiting case of noise with exponentially decaying correlations, in the limit that the correlation time tau -> 0. Delta-function correlated noise is a good approximation to flipping coins many times, in which new events are completely independent of previous events. We then expect: C[j] = delta[j,0] , which says that future parts of the time series (j > 0) are completely uncorrelated with past parts. WARNING: The correlation function is one of many ways to measure the concept of "correlation" and not all definitions agree. So one has to be quite careful to choose a definition appropriate for a given problem. As one example, one can define correlations in terms of something called "mutual information" (the tendency of a joint probability distribution to factor into a product) or in terms of the onset of Gaussian statistics (law of large numbers). - Now the Fourier transform of a delta function is a constant function, i.e., a power spectrum that is constant, representing a continuous white noise spectrum. THUS A RANDOM SIGNAL HAS A CONTINUOUS SPECTRUM and, conversely, a continuous spectrum often implies some randomness or lack of predictability in the corresponding time series.
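- Note: you can estimate C[j] for coin flips in a few lines of Mathematica (a sketch of my own; the mean of +/-1 flips is zero, so the subtraction of <x> is omitted, and RotateLeft implements the periodic extension):
      n = 4096;
      flips = Table[ 2 Random[Integer] - 1, {n} ];
      corr[j_] := (Plus @@ (flips RotateLeft[flips, j])) / n
      Table[ {j, N[ corr[j] ]}, {j, 0, 5} ]
  One finds C[0] = 1 exactly and the remaining values of order 1/Sqrt[n] ~ 0.02, consistent with delta-function correlated noise.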
Note: You show in the current homework that time series generated from the MMa Random[] function are delta-function correlated to rather good accuracy. However, the averaged power spectrum is not constant (flat), which is inconsistent with a true delta-function correlation. Rather sadly, there are no rigorous proofs or even a clear understanding of the relation between randomness and continuous parts of a spectrum; there is some important mathematical work needed here. E.g., I know of no theorem that proves that the power spectrum of the logistic map for parameter a=4 is rigorously continuous in the limit of infinitely long time series. - It is plausible, although not rigorously shown, that time series can not be accurately predicted on time scales much longer than the correlation time, since at that point earlier information is decoupled and useless for forecasting. Decay of the correlation function is the most common way that predictability is quantified for a time series. - If a correlation function falls off substantially over a correlation time tau, one might be tempted to interpret THE TIME TAU AS THE TIME SCALE OVER WHICH THE FUTURE BECOMES UNCORRELATED WITH THE PRESENT. You might then expect that one can not predict the future of the time series that generated the correlation function beyond a horizon of about tau. But this sounds suspiciously like the meaning of the inverse largest Lyapunov exponent 1/lambda_1 that you explored in a recent homework assignment, which was calculated in a completely different way than a correlation time: by studying how a small perturbation of the initial data grew exponentially on the average (so you need to know TWO nearby solutions to calculate lambda_1, while only one time series is necessary to calculate C(tau)). INTERESTING QUESTION: Is the correlation time tau for a correlation function to decay substantially related to the Lyapunov exponent, which measures the average amount of exponential instability? How would you explore possible relations when, in general, neither quantity can be calculated analytically for most mathematical models? A CRITERION FOR COMPLEXITY: SMALL SPATIAL CORRELATION LENGTH - In many mathematical models and laboratory experiments, an observable x depends on position X as well as on time, x = x(X,t). This was the case in Gollub and Benson's experiment, where they measured the y-component of the velocity at different places in the convection cell, over some interval of time. Given x(X,t), one can calculate a SPATIAL CORRELATION FUNCTION for a fixed time t: S(DX) = < (x(X,t) - <x>) (x(X + DX,t) - <x>) > , which is an obvious generalization of the time correlation function we defined earlier. The notation <...> again denotes an average, but now over all spatial points X in some domain, AND over time (and, if we had many duplicated experiments, over ensembles). The quantity S(DX) measures the average correlation in space between points separated by a distance ||DX||, in a specific direction DX/||DX||. - The quantity S(DX) and its Fourier transform are heavily used in X-ray crystallography. There the observable is the X-ray intensity, which is proportional to the electron density, and S(DX) can be used to deduce the positions and kinds of atoms in a sample.
- This real function of a vector argument can be further reduced to a RADIAL CORRELATION FUNCTION, which is its average over all angles for fixed magnitude, i.e., S(r) = (1/(2 Pi)) Integral[ S(DX) dtheta, {theta, 0, 2 Pi} ] , where r = ||DX|| is the fixed size of the increment DX and theta (in 2d) is the orientation of the increment with respect to some axis, e.g., the y-axis. These averages can be done everywhere except near the boundary of the physical domain. Alternatively, you can think of S(r) as the average over time of the spatial correlation functions for ALL pairs of points X1, X2 in the domain that are a distance r from each other. - In some cases, e.g., a turbulent fluid or complex chemical reactions, the radial spatial correlation function S(r) decreases rapidly in magnitude with the size of the increment r. By analogy to how we defined a correlation time tau, one can extract a SPATIAL CORRELATION LENGTH XI (Greek letter xi) that indicates the spatial range over which one part of the system is correlated with another part over all of time. Parts of the system that lie further than xi from each other are uncoupled ON THE AVERAGE and might be statistically independent. - The length xi can be obtained as the inverse of the decay constant for the roughly exponential decay of S(r) at large r. Alternatively, 1/xi can be obtained from the mean wavenumber of the power spectrum of S(DX). The two numbers are comparable in many cases, within a factor of two. Usually only the order of magnitude matters, not the exact value. CAN SYSTEMS BE LARGE COMPARED TO THEIR CORRELATION LENGTH? - An extremely interesting question in any system spread out in space is whether the system can be large compared to the spatial correlation length xi. If this is the case, the system possibly can be understood as consisting of small domains of size xi, and the system is not going to be usefully described by the methods of chaos, which apply to systems with few degrees of freedom. If the system is three-dimensional with a characteristic largest linear dimension L, then the number of degrees of freedom---the minimum number of odes needed to form a good model---is likely to be of order (L/xi)^3. This says that the volume L^3 of the domain consists of many roughly spherical correlated regions of volume xi^3, and so the total number of independent regions is roughly L^3/xi^3. - Researchers have estimated the spatial correlation length xi for a fibrillating pig heart and found that xi (6 mm) is rather small compared to the radius of the heart (25 mm). We deduce that the fibrillating heart is probably not well described by chaos, i.e., by a nonlinear dynamical system with FEW variables. The NORMAL heart, however, has many features that are well described by low-dimensional nonlinear deterministic models. A heart attack is then a bifurcation from simple to complex behavior. - The weather system is complex in this sense, in that the radius of the earth is quite large compared to the empirical spatial correlation length. This is not too surprising since turbulent fluids (of which many kinds are found in the atmosphere and ocean) are complex in this sense also: correlation lengths decrease quite rapidly in size as the strength of the driving (velocity magnitude) is increased. - Specific examples of the relation between xi, chaos, and turbulence (L >> xi) are still being worked out, and have been part of my own research program at Duke.
It is conjectured, but yet to be shown mathematically or numerically, that the fractal dimension of a spatially complex system of width L always grows as a power law: D(L) = c L^d , where c is some constant, L is some characteristic (largest) spatial size of the system, and d is the spatial dimensionality of the system, e.g., d=3 for typical fluids, d=2 for many ecological systems consisting of species living on the floor of a forest. - Another interesting question is the relation between spatial correlation lengths and correlation times: Can a system be random in time (small correlation time) but spatially coherent (large xi)? The answer is believed to be yes, and this is probably the case for Gollub and Benson's experiments. In those experiments, the correlation length is the same size as the width of the box, possibly larger. Can a system be spatially disordered (small xi) but coherent (periodic or quasiperiodic) over long times? Here things are less understood, but the answer is believed to be typically no, although there are some simple examples, e.g., a phonograph record, disordered with holes, spinning at a constant angular frequency. Only a few experiments concerning this point have been studied carefully; there is lots of room for interesting research in many disciplines. - Our current best guess about the difference between chaotic systems and turbulent systems is that the former have sizes comparable to xi, while the latter have sizes large compared to xi. But more precise definitions are needed. - For a good example of using spatial correlation functions to characterize average spatial disorder, see @inproceedings{Gollub91, author = {J. P. Gollub and R. Ramshankar}, title = {Spatiotemporal chaos in interfacial waves}, booktitle = {New Perspectives in Turbulence}, year = {1991}, address = {Berlin}, editor = {L. Sirovich}, pages = {165--194}, publisher = {Springer-Verlag} } which is a reference in BibTeX format (used with LaTeX files). The relation of spatial correlation lengths to fractal dimensions is discussed in some of my recent work, e.g., @Article{OHern96, author = "Corey O'Hern and David Egolf and Henry Greenside", title = "Lyapunov spectral analysis of a nonequilibrium {Ising}-like transition", journal = "Phys. Rev.~E", year = "1996", volume = "53", number = "4", pages = "3374-3386" } POWER-LAW POWER SPECTRA: - One can not always determine a meaningful correlation time from random data. A fascinating and widespread example of this case arises from time series whose power spectra decrease approximately as power laws for ALL frequency values, of the functional form P[ w ] = C w^{-a} , for some constant C and some constant exponent a >= 1. Such time series are then said to have a power-law power spectrum with "index a". Power laws in time or frequency have no characteristic time scale associated with them. - For power-law power spectra, the correlation time is comparable with the length of the time series: a larger amount of data just gives a longer correlation time. You can see that something is funny, since the mean square value of the time series is related to the integral of the power spectrum (just consider the case tau=0 in the equation that gives the correlation function as the Fourier transform of the power spectrum): < x(t)^2 > = Integrate[ P[w] , { w, 1/(N Dt), 1/(2 Dt) } ] . In the limit that the observation time T = N Dt becomes large, the integral diverges for a power-law power spectrum with any exponent a >= 1.
This suggests that THE TIME SERIES DOES NOT HAVE A FINITE VARIANCE and is not statistically stationary: the standard deviation grows steadily over time. Economic time series are of this form, e.g., the Dow Jones average has a power spectrum that is approximately P(f) = 1/f^2. - Some empirical examples with power-law power spectra: - Many electronic systems show voltage or current noise with a approx 1 (the famous "1/f noise") - Economic time series (stock market, exchange rates), a approx 2, 3 - Classical and pop music, a approx 1 - Widths of tree rings, a approx 1.3 - Capacities of water reservoirs - Flood levels of rivers - Sand flowing down a funnel (Behringer et al) - Flows of autos on highways - The appearance of 1/f^a noise in so many contexts is poorly understood. We shall see that the spatial structure of these systems often has an evolving FRACTAL structure, but this merely redefines the problem: what causes the spatial fractal structure? - If you encounter a power-law power spectrum, you can make some progress by looking at new time series obtained by taking successive increments, which tend to be more stationary. Thus given a time series x[i], define y[i;j] = x[i] - x[i - j] for some fixed integer j. The best choice for j is not known in advance, and one typically has to choose several values of the delay index j to see how the results of various statistical tests depend on j. ASYMPTOTIC HIGH-FREQUENCY LIMIT OF POWER SPECTRA - Time series that are differentiable can not have interesting structure on arbitrarily fine time scales; they have to smooth out and look locally like a line at sufficient magnification. This leads to a theorem which proves that bounded smooth (analytic) time series must have power spectra that decrease exponentially at sufficiently high frequencies; the high-frequency part looks linear on a log-linear power spectrum plot. Technical aside: the power spectrum P(w) is the magnitude squared |x(w)|^2 of a Fourier coefficient x(w) of a signal x(t). The Fourier coefficient x(w) can be written as a Fourier integral: x(w) = (1/(2 Pi)) Integral[ x(t) Exp[-I w t] dt ] , where the bounds in the integral go from -Infinity to +Infinity. But if you have studied some complex analysis (hopefully all of you have done so), you should recognize that integrals going from -Infinity to Infinity can be reinterpreted as a CONTOUR INTEGRAL in the complex time plane, with part of the contour running along the real time axis and part of the contour bending away in a large arc into the complex plane. The Fourier coefficient x(w) will then be equal to the sum of residues coming from poles of the complexification of the signal x(t). If all poles are bounded away from the real time axis (so that the signal x(t) is analytic in a strip containing the real axis), you can show, using standard asymptotics-of-integrals tricks of the sort that many of you learned in Richard Palmer's mathematical physics course, that the above integral decays exponentially rapidly for sufficiently large angular frequencies w. How does one extend a time signal to the complex time plane? For general dynamical systems dx/dt = f(x) this is way too hard to do analytically, but fortunately numerical methods like forward-Euler or Runge-Kutta make it quite easy to extend real time series into the complex time plane. As an example, consider the 1d dynamical system dx/dt = f(x) with a real function f(x) and consider a real initial condition x0 at time t=t0.
The forward Euler algorithm is: x(t+h) = x(t) + h f(x(t)) , and there is nothing here to prevent us from taking the time step h to be a complex number, e.g., the imaginary number "0.1 I". You can then take successive time steps along any desired path in the complex time plane and, in this way, map out the values of x(z) for arbitrary complex numbers z. CHALLENGES: (a) Does the value of x(z) depend on which path you took in the complex plane to get to z? (b) Try using forward Euler or Runge-Kutta to calculate the complex-time solutions of some familiar dynamical systems, e.g., exponential decay dx/dt = -x or the harmonic oscillator. - Time series that are associated with noise do have structure on all time scales: magnifying a segment of the series shows just as many oscillations and wiggles as the original time series. The power spectrum of such a signal then typically falls off as a power law at sufficiently high frequencies. SOLUTION OF LINEAR STOCHASTIC DIFFERENTIAL EQUATIONS WITH WHITE NOISE - As an illustration of how power spectra arise that fall off algebraically at high frequencies w >> 1, consider a linear stochastic differential equation driven by a white-noise source xi(t): x'' + a x' + b x = xi(t) , where a and b are constants. It is not so clear how to make mathematical sense out of this equation, since the right side (white noise) is a nondifferentiable function while the left side involves derivatives. Let's just assume that this makes sense and take the Fourier transform of both sides, i.e., find the Fourier coefficient at frequency w of both sides and equate these Fourier coefficients. You get: ( (-I w)^2 + a (-I w) + b ) x(w) = Exp[I theta] , where x(w) is the Fourier coefficient at frequency w of x(t) and where we assume that xi(t), being white noise, has as its Fourier coefficient at frequency w a random complex number of magnitude one, i.e., a number of the form Exp[I theta] where theta is randomly chosen in the interval [0,2Pi]. Taking the absolute value squared of both sides and using the definition P(w) = |x(w)|^2 , we find that the power spectrum of the solution of the sde should be: P(w) = 1 / | b + a (-I w) - w^2 |^2 . This is not a simple formula, but you can see that as the frequency w becomes large, the w^2 term dominates all the other terms and we have: P(w) prop to 1/w^4 for w >> 1 . So this is an example of a mechanism, a noise-driven damped oscillator, which produces a power spectrum that decays as 1/w^4 at high frequencies. If you think about this example, you will see that it is easy to produce asymptotic powers of w that are EVEN negative integers, e.g., -2, -4, -6. But there is no obvious way to produce odd powers, -1, -3, -5, or non-integer powers, say -1.5 or -3.5.
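- Note: the 1/w^4 tail of this spectrum is easy to confirm numerically in Mathematica (a quick check of my own, using the illustrative values a = b = 1):
      a = 1. ; b = 1. ;
      P[w_] := 1 / Abs[ b + a (-I w) - w^2 ]^2
      Plot[ Log[ 10, P[10^u] ], {u, 0, 3} ]      (* log10 of P against log10 of w *)
      Log[ 10, P[1000.] / P[100.] ]              (* change over one decade; find about -4.0 *)
  The log-log slope settles down to -4 at high frequencies, as the argument above predicts.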