Modeling Complex Functional Data and Random Objects in Metric Spaces

Project Details

Description

The rapid advancement of technology has led to a surge in complex data across many disciplines and in society at large. This poses challenges for statistical analysis and is the driving force behind this research project. Examples of such complex data include brain connectivity correlation matrices, taxi trip networks, microbial compositions, and age-at-death distributions. For these data, algebraic operations such as sums and scalar multiplications are not well defined, so the data cannot be analyzed directly by conventional statistical tools that rely on such operations to extract relevant information. This project aims to develop statistical methodology to address these data analytic needs, supported by theory and efficient computational implementations. The project is anticipated to lead to substantial insights, including characterization of the co-evolution of brain regions of interest in early neurodevelopment and comparison of mortality or income distributions across different countries. The methodology will also enable the detection of differences between groups of complex data, such as between brain connectivity networks during normal aging and pathological aging, and the determination of associations between different data objects, such as body mass compositions and physical activity intensity distributions. The project will also provide opportunities for statistical training and research for undergraduate and graduate students.

This research project will develop statistical modeling and inference methods for functional data and random objects that take values in a metric space that, by default, does not possess a vector space structure. The lack of linearity rules out existing methods that have been developed for Euclidean data and necessitates the development of novel tools for the analysis of such data. One focus of the research is the modeling of sparse multivariate functional data. A factor analysis approach will be developed that can handle extreme temporal sparsity through nonparametric regression, which will be used to estimate the cross-sectional covariance matrices under a suitably chosen metric. Another focus of the research is statistical modeling and inference for random objects in metric spaces. Principal component analysis (PCA) methods will be developed to model general object data using metric geometry. An application of the proposed object PCA to samples of random distributions in the Wasserstein space will be investigated, which benefits from the Wasserstein geometry, potentially in conjunction with functional data analysis techniques. Inference methods will be devised for testing homogeneity and independence for samples of object data based on depth profiles, which uniquely characterize the law of random objects for a wide range of metric spaces. These developments will be accompanied by theoretical analysis and justification, as well as scalable and stable algorithms that will be made available as public software.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Status: Active
Effective start/end date: 6/15/23 to 5/31/26

Funding

  • National Science Foundation: $275,000.00
