EAR-Climate: Catalytic: A Modern Spatio-Temporal Hierarchical Modeling Framework for Paleo-Environmental Data (PaleoSTeHM)

Project Details


The geological record provides the only long-term archive of environmental change and variability. Yet that record is sparse and often quite noisy and indirect. Reconstructing past environmental conditions is thus a critical and challenging statistical task. For example, sea level varies over a broad range of spatial and temporal scales. Global average sea-level change must be estimated from sparse, noisy, indirect and typically local records, such as the ecological preferences of fossils preserved in salt marsh sediments. Similar challenges arise in the interpretation of other types of environmental records (for example, of temperature or precipitation). Existing statistical analysis packages are not equipped to easily reconstruct past environmental fields while addressing some of the distinctive characteristics of paleo-environmental records. These challenges include the complex relationships between what can be observed in geological materials and the environmental conditions that they reflect, and the uncertainties that characterize the age of geological samples. Past efforts have thus generally relied on custom-built, problem-specific code, which has limited the adoption of state-of-the-art analysis approaches. The PaleoSTeHM project will therefore develop a new software framework, built on top of modern, scalable software infrastructure for machine learning, that will enable broader use of these approaches. It will thus facilitate the contextualization of current global change and lead to improved projections of future global change risk. The project will also develop resources to train early-career Earth scientists seeking to employ these methods in their research.

The central concept in PaleoSTeHM is that of the spatio-temporal hierarchical statistical model. These models provide a natural, conceptually straightforward (but sometimes computationally challenging) framework for reconstructing past environmental variables over space and time. Hierarchical statistical models, which are frequently implemented in a Bayesian framework, partition the multiple random effects that lead to individual observations into levels, thus clarifying the assumptions in a statistical analysis. They separate the underlying phenomenon of interest and its variability from the noisy mechanisms by which this underlying process is observed. PaleoSTeHM will develop a Python-based framework for spatio-temporal hierarchical modeling of paleodata. By leveraging an existing, widely used machine-learning framework at the base level, PaleoSTeHM will be built to take advantage of current and future computational advances without modifications to the user-facing product. It will facilitate the incorporation of complex likelihood structures, including the embedding of physical simulation models, and thus pave the way for further advances in paleo-modeling.
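To make the idea of levels concrete, the following is a minimal sketch of a hierarchical model for a single site, written with only the Python standard library. It is not PaleoSTeHM code and does not reflect its API. All names (`simulate`, `log_posterior`, `posterior_mean_rate`) and all numerical values are hypothetical, and a linear trend through the origin stands in for the far richer spatio-temporal processes the project targets. The process level is the latent sea level as a function of true age, the data level adds proxy noise and age error, and the parameter level places a Gaussian prior on the rate. Age error is handled with a simple errors-in-variables approximation that folds it into the observation variance.

```python
import math
import random


def simulate(n, rate, obs_sd, age_sd, seed=0):
    """Forward-simulate (measured_age, proxy_level) pairs from the hierarchy."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        true_age = rng.uniform(0.0, 2000.0)               # true sample age (years)
        level = rate * true_age                           # process level: latent sea level (m)
        measured_age = true_age + rng.gauss(0.0, age_sd)  # data level: uncertain dating
        proxy = level + rng.gauss(0.0, obs_sd)            # data level: noisy proxy reading
        records.append((measured_age, proxy))
    return records


def log_posterior(rate, records, obs_sd, age_sd, prior_sd):
    """Unnormalized log posterior of the rate given the records."""
    # Parameter level: zero-mean Gaussian prior on the rate (an assumption).
    lp = -0.5 * (rate / prior_sd) ** 2
    # Approximation: an age error of age_sd years shifts the expected level by
    # about rate * age_sd, so it is added to the observation variance.
    var = obs_sd ** 2 + (rate * age_sd) ** 2
    for measured_age, proxy in records:
        resid = proxy - rate * measured_age
        lp += -0.5 * resid ** 2 / var - 0.5 * math.log(var)
    return lp


def posterior_mean_rate(records, grid, obs_sd, age_sd, prior_sd):
    """Posterior mean of the rate by brute-force evaluation on a grid."""
    logp = [log_posterior(r, records, obs_sd, age_sd, prior_sd) for r in grid]
    top = max(logp)                                # subtract the max for numerical stability
    weights = [math.exp(lp - top) for lp in logp]
    total = sum(weights)
    return sum(r * w for r, w in zip(grid, weights)) / total


# Hypothetical usage: recover a 1 mm/yr trend from 200 noisy, age-uncertain records.
records = simulate(n=200, rate=0.001, obs_sd=0.1, age_sd=50.0, seed=1)
grid = [i * 1e-6 for i in range(2001)]             # candidate rates from 0 to 0.002 m/yr
estimate = posterior_mean_rate(records, grid, obs_sd=0.1, age_sd=50.0, prior_sd=0.01)
print(f"posterior mean rate: {estimate:.6f} m/yr")
```

A grid search is practical here only because there is one unknown. Models with spatial structure and many latent variables would need gradient-based or sampling-based inference, which is the kind of computation a machine-learning framework is suited to.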

This project is co-funded by a collaboration between the Directorate for Geosciences and the Office of Advanced Cyberinfrastructure to support AI/ML and open science activities in the geosciences.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Effective start/end date: 9/1/22 to 8/31/25


Funding

  • National Science Foundation: $800,808.00

