Abstract
To explore novel methods for the analysis of metabolomics data, we compared the ability of Partial Least Squares Discriminant Analysis (PLS-DA) and Bayesian networks (BN) to build predictive plasma metabolite models of asthma status at age three in 411 three-year-olds (59 cases and 352 controls) from the Vitamin D Antenatal Asthma Reduction Trial (VDAART) study. The standard PLS-DA approach had impressive accuracy for the prediction of asthma at age three, with an Area Under the Curve Convex Hull (AUCCH) of 81%. However, a permutation test indicated the possibility of overfitting. In contrast, a predictive Bayesian network including 42 metabolites had a significantly higher AUCCH of 92.1% (p for difference < 0.001), with no evidence that this accuracy was due to overfitting. Both models provided biologically informative insights into asthma, in particular a role for dysregulated arginine metabolism and several exogenous metabolites that deserve further investigation as potential causative agents. Because the BN model outperformed the PLS-DA model in accuracy and showed a lower risk of overfitting, it may represent a viable alternative to typical analytical approaches for the investigation of metabolomics data.
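The permutation test mentioned in the abstract can be illustrated with a minimal sketch. It is not the study's procedure: the toy mean-difference scorer stands in for PLS-DA, a plain rank-based AUC stands in for the AUCCH, and the data are synthetic noise. All function names (`auc`, `fit_and_score`, `permutation_p_value`) are hypothetical. The idea is to refit the model on label-shuffled data many times; if shuffled labels reach a training AUC as high as the real labels, the apparent accuracy is likely due to overfitting.

```python
import random
from statistics import mean


def auc(scores, labels):
    """Rank-based AUC: chance that a random case outscores a random control (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_and_score(X, y):
    """Toy stand-in for a fitted model (assumption, not PLS-DA):
    project samples onto the difference of class means and return the training AUC."""
    n_features = len(X[0])
    mu1 = [mean(r[j] for r, lab in zip(X, y) if lab == 1) for j in range(n_features)]
    mu0 = [mean(r[j] for r, lab in zip(X, y) if lab == 0) for j in range(n_features)]
    w = [a - b for a, b in zip(mu1, mu0)]
    scores = [sum(wj * xj for wj, xj in zip(w, r)) for r in X]
    return auc(scores, y)


def permutation_p_value(X, y, n_perm=200, seed=0):
    """Refit on shuffled labels; p = share of permutations scoring at least the observed AUC."""
    rng = random.Random(seed)
    observed = fit_and_score(X, y)
    y_perm = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # breaks any real link between features and labels
        if fit_and_score(X, y_perm) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


# Synthetic example: 40 samples, 30 pure-noise "metabolites", 10 cases and 30 controls.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(30)] for _ in range(40)]
y = [1] * 10 + [0] * 30
observed_auc, p_value = permutation_p_value(X, y)
print(f"training AUC = {observed_auc:.2f}, permutation p = {p_value:.3f}")
```

With many features and few samples, a training AUC can look high even on noise, which is why the comparison against the permuted-label distribution matters more than the AUC itself.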
| Field | Value |
|---|---|
| Original language | American English |
| Article number | 68 |
| Journal | Metabolites |
| Volume | 8 |
| Issue number | 4 |
| DOIs | |
| State | Published - Dec 2018 |
| Externally published | Yes |
ASJC Scopus subject areas
- Endocrinology, Diabetes and Metabolism
- Biochemistry
- Molecular Biology
Keywords
- Arginine metabolism
- Asthma
- Bayesian networks
- Overfitting
- Partial least-squares discriminant analysis
Title
Partial least squares discriminant analysis and Bayesian networks for metabolomic prediction of childhood asthma