A simulation-based tree method for building linear models with interactions

Jin Wang, Javier Cabrera, Kwok Leung Tsui

Research output: Contribution to journal › Article

Abstract

Linear models are the most common predictive models for continuous, discrete, or categorical responses, and they often include interaction terms. With more than a few predictors, however, interactions tend to be neglected because they add too many terms to the model. In this paper, we propose a simulation-based tree method to detect the interactions that contribute to prediction. We first bootstrap the observations and randomly choose a subset of variables to build trees. The interactions between the root of each tree and its corresponding leaves are collected, and the number of times each interaction appears is counted. To obtain a benchmark for these counts, we replace the response values with randomly generated values and repeat the procedure. Interactions that occur more often than the benchmark are added to the regression model. Finally, we select variables by running LASSO on the model containing the main effects and the retained interactions. In our experiments, the method performs well, especially on data sets with many interactions.
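The counting and benchmarking steps described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the depth-2 trees, the median-only splits, the per-tree shuffling of the response, the use of the largest null count as the benchmark, and all function names (`best_feature`, `tree_interactions`, `count_interactions`, `select_interactions`) are assumptions made for illustration. The final LASSO step is omitted; the retained pairs would be added as product terms to a model with the main effects before running LASSO.

```python
import random
from collections import Counter
from statistics import mean, median


def sse(values):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_feature(X, y, idx, features):
    """Pick the feature whose median split gives the lowest squared error.

    Assumption: only the median is tried as a threshold, to keep this short.
    Returns (feature, left_idx, right_idx), or None if no split is possible.
    """
    best, best_err = None, float("inf")
    for f in features:
        t = median([X[i][f] for i in idx])
        left = [i for i in idx if X[i][f] <= t]
        right = [i for i in idx if X[i][f] > t]
        if not left or not right:
            continue
        err = sse([y[i] for i in left]) + sse([y[i] for i in right])
        if err < best_err:
            best, best_err = (f, left, right), err
    return best


def tree_interactions(X, y, idx, features):
    """Grow a depth-2 tree and return the (root, child) variable pairs used."""
    root = best_feature(X, y, idx, features)
    if root is None:
        return set()
    f_root, left, right = root
    pairs = set()
    for child in (left, right):
        split = best_feature(X, y, child, features)
        if split is not None and split[0] != f_root:
            pairs.add(tuple(sorted((f_root, split[0]))))
    return pairs


def count_interactions(X, y, n_trees, n_vars, rng, randomize=False):
    """Count how often each variable pair appears across bootstrapped trees."""
    counts = Counter()
    n, p = len(X), len(X[0])
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap the rows
        features = rng.sample(range(p), n_vars)      # random variable subset
        y_used = y
        if randomize:
            # Assumption: the null response is a fresh shuffle for each tree.
            y_used = y[:]
            rng.shuffle(y_used)
        counts.update(tree_interactions(X, y_used, idx, features))
    return counts


def select_interactions(X, y, n_trees=200, n_vars=3, seed=0):
    """Keep pairs that occur more often than the largest null count."""
    rng = random.Random(seed)
    observed = count_interactions(X, y, n_trees, n_vars, rng)
    null = count_interactions(X, y, n_trees, n_vars, rng, randomize=True)
    benchmark = max(null.values(), default=0)  # assumed benchmark rule
    selected = sorted(pair for pair, c in observed.items() if c > benchmark)
    return selected, observed, benchmark


# Synthetic data (hypothetical): the response depends on the product x0 * x1.
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(4)] for _ in range(80)]
y = [4 * row[0] * row[1] + data_rng.gauss(0, 0.1) for row in X]

selected, observed, benchmark = select_interactions(X, y)
print("benchmark:", benchmark)
print("selected interactions:", selected)
```

On this synthetic example the pair of variables 0 and 1 should appear far more often in the trees grown on the real response than any pair does under the randomized response, which is the contrast the benchmark is meant to capture.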

Original language: English (US)
Journal: Communications in Statistics - Theory and Methods
State: Accepted/In press - Jan 1 2020
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Statistics and Probability

Keywords

  • Simulation
  • interaction
  • prediction
  • regression
  • tree
