The main focus of this project will be on applications of modern and sophisticated optimization theory to the problems of statistical learning, regression, and density estimation, with constraints. Classes of functional constraints to be investigated will all involve nonnegativity over a domain of the underlying function or some linear functional of it. Both univariate and multivariate cases will be thoroughly investigated. The research in onstrained estimation problems will be carried out in several functional spaces including polynomials, polynomial splines, wavelets, and more general Chebychev systems, for the univariate case, and polynomials, wavelets, and splines for ultivariate case. In particular various strategies for approximation of nonnegativity constraints will be explored. The main tool for such problems is general convex conic optimization models such as semidefinite and second order cone programming. The shape and nonnegativity constraints will be formulated as conic optimization problems, and when possible, tailor made efficient interior point algorithms with good theoretical and numerical properties will be developed. In the multivariate case, Nesterov's semidefinite programming characterization of sum-of-squares functions will be used as a foundation to build methods for multivariate estimation and learning problems. Observing that the explicit problems above in the most abstract form amount to searching in the cone of nonnegative functions in some well-defined closed linear function space such as Sobolev-Hilbert spaces with an explicit orthonormal basis, and starting from a finite subset of such basis, the approximations are refined successively by adding more elements of the basis until the desired accuracy is achieved. However, the question of which sequence of nonnegative cones within each approximating finite dimensional space should be chosen is of fundamental importance, and will be thoroughly investigated.Two interesting theoretical questions are related to the research on constrained learning and estimation problems which will be investigated in this project. One is the question of which cones have bilinear complementarity conditions. The question of existence of such cones beyond the class of symmetric cones (which include positive semidefinite matrices and second order cones) will be investigated. The second theoretical question which arises from this project is the question of characterizing vector valued functions which are required to be in a given convex cone, for example, polynomials with symmetric matrices as coefficients which are required to be positive semidefinite. Such problems have applications in multivariate shape onstrained problems. The characterization, as well as design of efficient optimization algorithms for them will be investigated. Successful completion of the goals of this project will create strong synergy between statistical learning theory and modern conic optimization problems. Through statistical learning, it is expected that new algorithmic methods impact fields as diverse as biology, econometrics, finance and management science, among others. All such fields have numerous problems where regression, density estimation or classificationhas to be carried out with one or many shape and nonnegativity constraints. Any useful software developed for this project will be made available to the community in the open source format. Both C and C++ libraries, as well as MATLAB-like and R based software will be created and published in open sourceformat.The main educational impact of this project will be training and development of graduate students with expertise in multitudes of disciplines, including mathematical programming, statistical learning, signal processing, data visualization and software engineering. It is expected that the graduate students work will culminate in PhD degrees in operations research.
|Effective start/end date||8/1/09 → 7/31/12|
- National Science Foundation (National Science Foundation (NSF))
Explore the research topics touched on by this project. These labels are generated based on the underlying awards/grants. Together they form a unique fingerprint.