The concepts of affine equivariance and elliptically symmetric distributions have played a central role in the development of robust multivariate statistical methods over the past 30 years. Statistical methods that are robust over the class of elliptical distributions are more widely applicable than methods based solely on the multivariate normal distribution. The class of elliptical distributions, though, represents only a small class of multivariate models. Important cases that do not fall within this class are mixture models and independent components models. Affine equivariance is a useful property, since methods possessing it behave equally well over different covariance structures. However, there are many cases where interest lies in certain types of covariance structure, e.g. factor analysis models. The investigator's research goals are thus two-fold. First, the investigator is to further develop and study 'invariant coordinate selection.' This is a new multivariate method, recently introduced by the investigator, which is well suited for exploring non-elliptical models. In particular, it can be used to uncover Fisher's linear discriminant subspace for mixture models when the group identifications are unknown, and can be used to uncover the independent components in independent components models. Second, the investigator is to study the properties of certain non-affine equivariant methods, such as orthogonal equivariant M-estimates. The primary goal here is to achieve a better understanding of the types of covariance structure for which such methods may or may not be advantageous.
The need to analyze multivariate data arises in many diverse disciplines, such as computer science, psychology, meteorology, sociology, biology, econometrics and engineering. The primary interest in such data typically is not with an understanding of each variable separately, but rather with the interrelationships among the variables or with unmeasurable 'latent' variables. Many common methods employed in these areas are based upon the multivariate normal model and are now well known to perform poorly if that model does not hold. In particular, only a few errors in the data or a slight deviation from the model can strongly influence the interpretation of an experiment or a data set, sometimes with disastrous consequences. This is particularly problematic with high-dimensional data, i.e. data consisting of many variables, since bad data points or deviations from the model can be difficult to detect whenever they are associated not with just one variable but with a number of the variables. Thus, multivariate methods that are not greatly affected by such problems are crucial to a proper analysis of such data. The investigator anticipates that the intended research will have an important impact not only on steering the direction of research within robust statistics, but also on the methodology used within the many disciplines that routinely deal with multivariate data.
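To make the idea of invariant coordinate selection more concrete, the following is a minimal sketch of one way it might be computed. The abstract does not say which scatter matrices are used, so the pairing here (the sample covariance with a fourth-moment scatter matrix) is an assumption, and the function name `ics_cov_cov4` is hypothetical. The sketch compares the two scatter matrices through a generalized eigendecomposition and projects the centered data onto the resulting directions.

```python
import numpy as np
from scipy.linalg import eigh


def ics_cov_cov4(X):
    """Illustrative invariant coordinate selection.

    Assumption: the scatter pair is (sample covariance, fourth-moment
    scatter). The source text does not specify a pair; other choices
    are possible.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    centered = X - X.mean(axis=0)

    # First scatter matrix: the ordinary sample covariance.
    s1 = centered.T @ centered / (n - 1)

    # Squared Mahalanobis distance of each observation with respect to s1.
    r2 = np.einsum("ij,ij->i", centered @ np.linalg.inv(s1), centered)

    # Second scatter matrix: outer products weighted by r2, scaled by
    # 1 / (p + 2) so it is comparable to a covariance under normality.
    s2 = (centered * r2[:, None]).T @ centered / (n * (p + 2))

    # Generalized eigenproblem s2 b = lambda s1 b. scipy normalizes the
    # eigenvectors so that B.T @ s1 @ B is the identity matrix.
    eigvals, B = eigh(s2, s1)

    # Sort directions by decreasing generalized eigenvalue.
    order = np.argsort(eigvals)[::-1]
    eigvals, B = eigvals[order], B[:, order]

    # Invariant coordinates: centered data projected onto the directions.
    return eigvals, centered @ B
```

Under this choice, the generalized eigenvalues do not change when the data undergo an invertible affine transformation, which is the sense in which the resulting coordinates are invariant. For non-elliptical data such as mixtures, the directions with the most extreme eigenvalues are the natural candidates to inspect for group structure.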
|Effective start/end date|6/1/09 → 5/31/12|
- National Science Foundation (NSF)