Project Details
Description
Learning from data is performed by iterative algorithms throughout statistics and machine learning. Iterative learning algorithms find models that best fit the labeled training data, in order to make predictions on unseen and unlabeled data. These iterative algorithms are run on computers until an optimality criterion is met or computational resources are exhausted. Empirical evidence suggests that terminating algorithms early, before convergence, enhances prediction performance on unseen data in certain learning scenarios. The project aims to develop theory to explain this early-stopping phenomenon, as well as practical methodologies to determine the optimal early iteration for best predictions on unseen data and to save computational resources. The research will involve students at both undergraduate and graduate levels.

Modern estimators in statistics and machine learning are ubiquitously defined as solutions to optimization problems, whether convex or nonconvex. These optimization problems are solved iteratively using gradient descent and its accelerated variants, or proximal iterative schemes if the objective function is non-smooth. On the other hand, inferential methodologies such as debiasing, the construction of confidence intervals in regression models, or estimation of the generalization error have so far been developed assuming algorithm convergence. This research will bridge this gap and develop inferential methodologies for algorithm iterates, including at a constant number of iterations or far from convergence. The project aims to develop estimators of the generalization error of the iterates, with application to early stopping to minimize population error. The project further plans to equip iterative algorithms with confidence intervals and hypothesis tests valid throughout the trajectory, allowing practitioners to perform hypothesis rejections and discoveries at early iterations, without relying on algorithm convergence.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Status | Active |
---|---|
Effective start/end date | 7/1/24 → 6/30/27 |
Funding
- National Science Foundation: $225,000.00