Spectral learning of large structured HMMs for comparative epigenomics

Chicheng Zhang, Jimin Song, Kevin C. Chen, Kamalika Chaudhuri

Research output: Contribution to journalConference articlepeer-review

1 Scopus citations

Abstract

We develop a latent variable model and an efficient spectral algorithm motivated by the recent emergence of very large data sets of chromatin marks from multiple human cell types. A natural model for chromatin data in one cell type is a Hidden Markov Model (HMM); we model the relationship between multiple cell types by connecting their hidden states by a fixed tree of known structure. The main challenge with learning parameters of such models is that iterative methods such as EM are very slow, while naive spectral methods result in time and space complexity exponential in the number of cell types. We exploit properties of the tree structure of the hidden states to provide spectral algorithms that are more computationally efficient for current biological datasets. We provide sample complexity bounds for our algorithm and evaluate it experimentally on biological data from nine human cell types. Finally, we show that beyond our specific model, some of our algorithmic ideas can be applied to other graphical models.

Original languageEnglish (US)
Pages (from-to)469-477
Number of pages9
JournalAdvances in Neural Information Processing Systems
Volume2015-January
StatePublished - 2015
Event29th Annual Conference on Neural Information Processing Systems, NIPS 2015 - Montreal, Canada
Duration: Dec 7 2015Dec 12 2015

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Signal Processing
  • Computer Networks and Communications

Fingerprint

Dive into the research topics of 'Spectral learning of large structured HMMs for comparative epigenomics'. Together they form a unique fingerprint.

Cite this