Abstract

This paper studies model selection consistency for high-dimensional sparse regression when the data exhibit both cross-sectional and serial dependence. Most commonly used model selection methods fail to consistently recover the true model when the covariates are highly correlated. Motivated by econometric and financial studies, we consider the case where covariate dependence can be reduced through a factor model, and propose a consistent strategy named Factor-Adjusted Regularized Model Selection (FarmSelect). By learning the latent factors and idiosyncratic components and using both of them as predictors, FarmSelect transforms the problem from model selection with highly correlated covariates to model selection with weakly correlated ones via lifting. Model selection consistency, as well as optimal rates of convergence, are obtained under mild conditions. Numerical studies demonstrate good finite-sample performance in terms of both model selection and out-of-sample prediction. Moreover, our method is flexible in the sense that it pays no price in weakly correlated and uncorrelated cases. Our method is applicable to a wide range of high-dimensional sparse regression problems. An R package, FarmSelect, is also provided for implementation.
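
The lifting step described above can be illustrated with a minimal sketch. It is not the FarmSelect R package or its API. The function names (`estimate_factors`, `lasso_cd`, `factor_adjusted_lasso`) are hypothetical. Using PCA to estimate the factors, an L1 (lasso) penalty as the regularizer, a known number of factors `k`, and the simulated data are all assumptions made only for illustration.

```python
import numpy as np

def estimate_factors(X, k):
    """PCA estimate of k latent factors F (n x k) and idiosyncratic parts U (n x p)."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    left, _, _ = np.linalg.svd(Xc, full_matrices=False)
    F = np.sqrt(n) * left[:, :k]      # normalized so that F'F / n = I
    B = Xc.T @ F / n                  # estimated loadings (p x k)
    U = Xc - F @ B.T                  # idiosyncratic components, orthogonal to F
    return F, U

def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.copy()                      # current residual y - Xb
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = X[:, j] @ r / n + col_sq[j] * b[j]
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r += X[:, j] * (b[j] - new)
            b[j] = new
    return b

def factor_adjusted_lasso(X, y, k, lam):
    """Regress y on [F, U] with the penalty on the U-coefficients only.

    Because U is orthogonal to F, this is equivalent to removing the
    factor directions from y and running the lasso on U alone.
    """
    n = X.shape[0]
    F, U = estimate_factors(X, k)
    yc = y - y.mean()
    y_res = yc - F @ (F.T @ yc) / n   # project the factors out of y
    return lasso_cd(U, y_res, lam)

# Simulated example (assumed setup): two strong factors make the columns of X
# highly correlated, while the idiosyncratic parts are only weakly correlated.
rng = np.random.default_rng(0)
n, p, k = 400, 40, 2
X = rng.normal(size=(n, k)) @ rng.normal(size=(p, k)).T + rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = 3.0
y = X @ beta + 0.5 * rng.normal(size=n)

beta_hat = factor_adjusted_lasso(X, y, k=k, lam=0.5)
selected = np.flatnonzero(beta_hat).tolist()
print("selected covariates:", selected)
```

In this sketch the coefficients on the estimated factors are left unpenalized and act as nuisance parameters; only the coefficients on the idiosyncratic components are penalized, and their nonzero entries define the selected model.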

Original language: English (US)
Journal: Journal of Econometrics
State: Accepted/In press - Jan 1 2020

All Science Journal Classification (ASJC) codes

  • Economics and Econometrics

Keywords

  • Correlated covariates
  • Factor model
  • Model selection consistency
  • Regularized M-estimator
  • Time series
