
Singular spectrum analysis

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Dkondras (talk | contribs) at 07:41, 6 February 2008 (Created page with '{{underconstruction}} Singular Spectrum Analysis (SSA) is a data-adaptive, nonparametric spectral estimation method based on embedding a [http://en.wikipedia.org/wi...'). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.


Singular Spectrum Analysis (SSA) is a data-adaptive, nonparametric spectral estimation method based on embedding a time series $\{X(t) : t = 1, \ldots, N\}$ in a vector space of dimension $M$. The SSA method proceeds by diagonalizing the $M \times M$ lag-covariance matrix $\mathbf{C}_X$ of $X(t)$ to obtain spectral information on the time series, assumed to be stationary in the weak sense. The matrix $\mathbf{C}_X$ can be estimated directly from the data as a Toeplitz matrix with constant diagonals (Vautard and Ghil, 1989), i.e., its entries $c_{ij}$ depend only on the lag $|i-j|$:

$$c_{ij} = \frac{1}{N-|i-j|} \sum_{t=1}^{N-|i-j|} X(t)\, X(t+|i-j|).$$

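An alternative way to compute $\mathbf{C}_X$ is by using the $N' \times M$ "trajectory matrix" $\mathbf{D}$ that is formed by $M$ lag-shifted copies of $X(t)$, each of length $N' = N - M + 1$; then

$$\mathbf{C}_X = \frac{1}{N'} \mathbf{D}^{\mathrm{T}} \mathbf{D}.$$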
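The two estimates of the lag-covariance matrix can be sketched in Python as follows. This is a minimal illustration assuming NumPy; the function names, the 0-based indexing, and the choice to subtract the sample mean before estimating covariances are assumptions of this sketch, not part of the cited papers.

```python
import numpy as np

def toeplitz_lag_covariance(x, M):
    """Toeplitz estimate: entry (i, j) depends only on the lag |i - j|."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()  # assumption of this sketch: work with the centered series
    N = len(x)
    # c[k] = 1/(N - k) * sum_t x[t] * x[t + k], for lags k = 0 .. M-1
    c = np.array([np.dot(x[:N - k], x[k:]) / (N - k) for k in range(M)])
    lags = np.abs(np.subtract.outer(np.arange(M), np.arange(M)))
    return c[lags]

def trajectory_matrix(x, M):
    """Stack M lag-shifted copies of x, each N - M + 1 long, as columns."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()  # same centering assumption as above
    n_rows = len(x) - M + 1
    return np.column_stack([x[j:j + n_rows] for j in range(M)])

def trajectory_lag_covariance(x, M):
    """Estimate the lag-covariance matrix as D^T D / N'."""
    D = trajectory_matrix(x, M)
    return D.T @ D / D.shape[0]

# Illustrative data: a pure sine wave with a 25-sample period.
t = np.arange(300)
x = np.sin(2 * np.pi * t / 25)
C_toeplitz = toeplitz_lag_covariance(x, M=10)
C_trajectory = trajectory_lag_covariance(x, M=10)
print(np.max(np.abs(C_toeplitz - C_trajectory)))
```

For a long, roughly stationary series the two estimates should be close; only the first is Toeplitz by construction.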

The $M$ eigenvectors $\mathbf{E}_k$ of the lag-covariance matrix $\mathbf{C}_X$ are called temporal empirical orthogonal functions (EOFs). The eigenvalues $\lambda_k$ of $\mathbf{C}_X$ account for the partial variance in the direction $\mathbf{E}_k$, and the sum of the eigenvalues, i.e., the trace of $\mathbf{C}_X$, gives the total variance of the original time series $X(t)$.
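As a rough illustration of this step, the sketch below diagonalizes a lag-covariance matrix and checks that the eigenvalues sum to its trace. It assumes NumPy; the synthetic series, the window width of 24, and all variable names are illustrative choices, and the trajectory-matrix estimate of the covariance is used for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative synthetic data
N, M = 240, 24                  # series length and window width (assumed values)
t = np.arange(N)
x = np.sin(2 * np.pi * t / 12) + 0.1 * rng.standard_normal(N)
x = x - x.mean()                # assumption: centered series

# Trajectory-matrix estimate of the lag-covariance matrix.
n_rows = N - M + 1
D = np.column_stack([x[j:j + n_rows] for j in range(M)])
C = D.T @ D / n_rows

# eigh handles symmetric matrices and returns eigenvalues in ascending order,
# so reorder to put the largest-variance EOFs first.
eigvals, eofs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
eigvals, eofs = eigvals[order], eofs[:, order]

# Fraction of the trace (sum of eigenvalues) carried by each EOF.
explained = eigvals / eigvals.sum()
print(explained[:4])
```

The columns of `eofs` play the role of the EOFs and `eigvals` that of the partial variances.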

Projecting the time series onto each EOF yields the corresponding temporal principal components (PCs) $A_k$:

$$A_k(t) = \sum_{j=1}^{M} X(t+j-1)\, E_k(j).$$
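The projection onto the EOFs amounts to a matrix product between the trajectory matrix and the matrix of eigenvectors. The following sketch, assuming NumPy and an illustrative synthetic series, computes the PCs this way and compares one entry against the explicit sum over lags; all names and parameter values are assumptions of the example.

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative synthetic data
N, M = 240, 24
t = np.arange(N)
x = np.sin(2 * np.pi * t / 12) + 0.1 * rng.standard_normal(N)
x = x - x.mean()                # assumption: centered series

n_rows = N - M + 1
D = np.column_stack([x[j:j + n_rows] for j in range(M)])
C = D.T @ D / n_rows
eigvals, E = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
eigvals, E = eigvals[order], E[:, order]

# PCs: A[t, k] = sum_j x[t + j] * E[j, k] (0-based indices),
# which is row t of the trajectory matrix projected onto EOF k.
A = D @ E

# The same entry written as an explicit sum over the window.
t0, k0 = 5, 0
explicit = sum(x[t0 + j] * E[j, k0] for j in range(M))
print(A[t0, k0], explicit)
```

Each PC has length N - M + 1, and with this covariance estimate the variance of PC k equals the k-th eigenvalue.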

An oscillatory mode is characterized by a pair of nearly equal SSA eigenvalues and associated PCs that are nearly in phase quadrature (Ghil et al., 2002). Such a pair can efficiently represent a nonlinear, anharmonic oscillation. This is because a single pair of data-adaptive SSA eigenmodes can capture the basic periodicity of a quasi-periodic oscillation better than methods with fixed basis functions, such as the sines and cosines used in the Fourier transform, which yield unnecessary overtones.
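A simple way to see such a pair is to analyze a noisy sinusoid. The sketch below, assuming NumPy and a synthetic series with a known 12-sample period, compares the two leading eigenvalues and checks phase quadrature by correlating the first PC with the second PC shifted by a quarter period. The data, the window width, and the use of a lagged correlation as a quadrature check are illustrative assumptions, not a test taken from the cited literature.

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative synthetic data
N, M, period = 240, 24, 12
t = np.arange(N)
x = np.sin(2 * np.pi * t / period) + 0.1 * rng.standard_normal(N)
x = x - x.mean()

n_rows = N - M + 1
D = np.column_stack([x[j:j + n_rows] for j in range(M)])
C = D.T @ D / n_rows
eigvals, E = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
eigvals, E = eigvals[order], E[:, order]
A = D @ E  # principal components, one column per EOF

# Nearly equal leading eigenvalues: a ratio close to 1.
ratio = eigvals[1] / eigvals[0]

# Phase quadrature: the two PCs are nearly uncorrelated at zero lag but
# strongly (anti)correlated when one is shifted by a quarter period.
quarter = period // 4
zero_lag_corr = np.corrcoef(A[:, 0], A[:, 1])[0, 1]
lagged_corr = np.corrcoef(A[:-quarter, 0], A[quarter:, 1])[0, 1]
print(ratio, zero_lag_corr, lagged_corr)
```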

The window width $M$ determines the longest periodicity captured by SSA. Signal-to-noise separation can be obtained by merely inspecting the slope break in a "scree diagram" of eigenvalues $\lambda_k$ or singular values $\lambda_k^{1/2}$ vs. $k$. A Monte-Carlo test (Allen and Robertson, 1996) can be applied to ascertain the statistical significance of the oscillatory pairs detected by SSA. The entire time series, or parts of it that correspond to trends, oscillatory modes or noise, can be reconstructed by using linear combinations of the principal components and EOFs, which provide the reconstructed components (RCs) $R_K$:

$$R_K(t) = \frac{1}{M_t} \sum_{k \in K} \sum_{j=L_t}^{U_t} A_k(t-j+1)\, E_k(j);$$

here $K$ is the set of EOFs on which the reconstruction is based. The values of the normalization factor $M_t$, as well as of the lower and upper bounds of summation $L_t$ and $U_t$, differ between the central part of the time series and its endpoints.
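The reconstruction can be sketched as an average, at each time step, over all window positions that cover that step. In the code below (assuming NumPy), the array `counts` holds the number of contributing terms at each time and so plays the role of the normalization factor, which is smaller near the endpoints than in the central part. The function name `reconstruct` and the synthetic data are illustrative assumptions.

```python
import numpy as np

def reconstruct(A, E, K):
    """Reconstructed component from the EOFs listed in K (0-based indices)."""
    n_rows, M = A.shape[0], E.shape[0]
    N = n_rows + M - 1
    rc = np.zeros(N)
    counts = np.zeros(N)
    for j in range(M):
        # Window position j of each EOF contributes to times j .. j + n_rows - 1.
        rc[j:j + n_rows] += A[:, K] @ E[j, K]
        counts[j:j + n_rows] += 1
    return rc / counts  # fewer terms are averaged near the endpoints

rng = np.random.default_rng(0)  # illustrative synthetic data
N, M = 240, 24
t = np.arange(N)
clean = np.sin(2 * np.pi * t / 12)
x = clean + 0.1 * rng.standard_normal(N)
x = x - x.mean()

n_rows = N - M + 1
D = np.column_stack([x[j:j + n_rows] for j in range(M)])
eigvals, E = np.linalg.eigh(D.T @ D / n_rows)
E = E[:, np.argsort(eigvals)[::-1]]
A = D @ E

rc_pair = reconstruct(A, E, [0, 1])           # leading oscillatory pair
rc_all = reconstruct(A, E, list(range(M)))    # all EOFs
print(np.max(np.abs(rc_all - x)))
```

Summing over all EOFs recovers the centered series up to rounding error, while the leading pair alone gives a smoothed version of the oscillation.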

Multi-channel SSA (or M-SSA) is a natural extension of SSA to an $L$-channel time series of vectors or maps with $N$ data points: $\{X_l(t) : l = 1, \ldots, L;\ t = 1, \ldots, N\}$. In the meteorological literature, extended EOF (EEOF) analysis is often assumed to be synonymous with M-SSA. The two methods are both extensions of classical principal component analysis (PCA) but they differ in emphasis: EEOF analysis typically utilizes a number $L$ of spatial channels much greater than the number $M$ of temporal lags, thus limiting the temporal and spectral information. In M-SSA, on the other hand, one usually chooses $L \le M$. Often M-SSA is applied to a few leading PCA components of the spatial data, with $M$ chosen large enough to extract detailed temporal and spectral information from the multivariate time series.
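One common way to set up the multichannel case is to place the lag-shifted copies of every channel side by side in a single augmented trajectory matrix and then proceed as in the single-channel case. The sketch below assumes NumPy and uses three synthetic channels sharing one oscillation with different phases; the channel count, window width, phases, and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative synthetic data
N, L, M = 300, 3, 15            # samples, channels, window width (L <= M)
t = np.arange(N)
phases = np.array([0.0, 0.7, 1.4])
X = np.sin(2 * np.pi * t[:, None] / 10 + phases) + 0.1 * rng.standard_normal((N, L))
X = X - X.mean(axis=0)          # assumption: center each channel

n_rows = N - M + 1
# Augmented trajectory matrix: M lag-shifted copies per channel, channel by channel.
D = np.hstack([
    np.column_stack([X[j:j + n_rows, l] for j in range(M)])
    for l in range(L)
])                              # shape (n_rows, L * M)

C = D.T @ D / n_rows            # (L*M) x (L*M) covariance across channels and lags
eigvals, E = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
eigvals, E = eigvals[order], E[:, order]

A = D @ E                       # PCs, shared by all channels
# Eigenvector k viewed as an (L, M) array: one lag pattern per channel.
eofs_by_channel = E.T.reshape(L * M, L, M)
print(eigvals[:4] / eigvals.sum())
```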

A gap-filling form of SSA can be used to analyze datasets that are unevenly sampled or contain missing data (Kondrashov and Ghil, 2006). For a univariate time series, the SSA gap-filling procedure utilizes temporal correlations to fill in the missing points. For a multivariate data set, gap filling by M-SSA takes advantage of both spatial and temporal correlations. In either case: (i) estimates of missing data points are produced iteratively and are then used to compute a self-consistent lag-covariance matrix $\mathbf{C}_X$ and its EOFs $\mathbf{E}_k$; and (ii) cross-validation is used to optimize the window width $M$ and the number of leading SSA modes used to fill the gaps with the iteratively estimated "signal", while the noise is discarded.
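The iterative part of the univariate procedure can be illustrated with a much-simplified sketch. It assumes NumPy, marks missing points with NaN, and takes the window width and the number of modes as fixed inputs instead of selecting them by cross-validation, so it is not the full algorithm of Kondrashov and Ghil (2006). The function names, iteration limit, and tolerance are hypothetical choices made for this example.

```python
import numpy as np

def ssa_reconstruct(x, M, n_modes):
    """Reconstruct a centered series from its n_modes leading SSA modes."""
    N = len(x)
    n_rows = N - M + 1
    D = np.column_stack([x[j:j + n_rows] for j in range(M)])
    eigvals, E = np.linalg.eigh(D.T @ D / n_rows)
    E = E[:, np.argsort(eigvals)[::-1][:n_modes]]  # leading EOFs only
    A = D @ E
    rc = np.zeros(N)
    counts = np.zeros(N)
    for j in range(M):
        rc[j:j + n_rows] += A @ E[j]
        counts[j:j + n_rows] += 1
    return rc / counts

def ssa_gap_fill(x, M, n_modes, n_iter=50, tol=1e-6):
    """Fill NaN entries of x by iterating SSA reconstructions."""
    x = np.asarray(x, dtype=float)
    missing = np.isnan(x)
    if not missing.any():
        return x.copy()
    mean = x[~missing].mean()
    filled = np.where(missing, 0.0, x - mean)  # start gaps at the observed mean
    for _ in range(n_iter):
        rc = ssa_reconstruct(filled, M, n_modes)
        change = np.max(np.abs(rc[missing] - filled[missing]))
        filled[missing] = rc[missing]          # update only the missing points
        if change < tol:
            break
    return filled + mean

# Illustrative data: a noisy sine with 24 randomly placed missing points.
rng = np.random.default_rng(1)
N = 240
t = np.arange(N)
clean = np.sin(2 * np.pi * t / 12)
x = clean + 0.05 * rng.standard_normal(N)
gaps = rng.choice(N, size=24, replace=False)
x_gappy = x.copy()
x_gappy[gaps] = np.nan
filled = ssa_gap_fill(x_gappy, M=24, n_modes=2)
print(np.mean(np.abs(filled[gaps] - clean[gaps])))
```

Observed points are left unchanged; only the missing points are updated, and each update feeds into the covariance estimate of the next iteration.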

References

  • Allen, M. R., and A. W. Robertson: Distinguishing modulated oscillations from coloured noise in multivariate datasets, Clim. Dyn., 12, 775-784, 1996.
  • Ghil, M., M. R. Allen, M. D. Dettinger, K. Ide, D. Kondrashov, et al.: Advanced spectral methods for climatic time series, Rev. Geophys., 40(1), 3.1-3.41, doi: 10.1029/2000RG000092, 2002.
  • Kondrashov, D., and M. Ghil: Spatio-temporal filling of missing points in geophysical data sets, Nonl. Proc. Geophys., 13, 151-159, 2006.
  • Vautard, R., and M. Ghil: Singular spectrum analysis in nonlinear dynamics, with applications to paleoclimatic time series, Physica D, 35, 395-424, 1989.
