TY - JOUR
T1 - Predictive learning on hidden tree-structured ising models
AU - Nikolakakis, Konstantinos E.
AU - Kalogerias, Dionysios S.
AU - Sarwate, Anand D.
N1 - Funding Information: This work was supported in part by DARPA and SSC Pacific under contract N66001-15-C-4070 and the United States National Science Foundation under award CCF-1453432, and the United States National Institutes of Health under award 1R01DA040487. Publisher Copyright: © 2021 Konstantinos E. Nikolakakis, Dionysios S. Kalogerias and Anand D. Sarwate.
PY - 2021
Y1 - 2021
N2 - We provide high-probability sample complexity guarantees for exact structure recovery and accurate predictive learning using noise-corrupted samples from an acyclic (tree-shaped) graphical model. The hidden variables follow a tree-structured Ising model distribution, whereas the observable variables are generated by a binary symmetric channel taking the hidden variables as its input (flipping each bit independently with some constant probability q ∈ [0, 1/2)). In the absence of noise, predictive learning on Ising models was recently studied by Bresler and Karzand (2020); this paper quantifies how noise in the hidden model impacts the tasks of structure recovery and marginal distribution estimation by proving upper and lower bounds on the sample complexity. Our results generalize state-of-the-art bounds reported in prior work, and they exactly recover the noiseless case (q = 0). In fact, for any tree with p vertices and probability of incorrect recovery δ > 0, the sufficient number of samples remains logarithmic as in the noiseless case, i.e., O(log(p/δ)), while the dependence on q is O(1/(1 - 2q)4, for both aforementioned tasks. We also present a new equivalent of Isserlis’ Theorem for sign-valued tree-structured distributions, yielding a new low-complexity algorithm for higher-order moment estimation.
AB - We provide high-probability sample complexity guarantees for exact structure recovery and accurate predictive learning using noise-corrupted samples from an acyclic (tree-shaped) graphical model. The hidden variables follow a tree-structured Ising model distribution, whereas the observable variables are generated by a binary symmetric channel taking the hidden variables as its input (flipping each bit independently with some constant probability q ∈ [0, 1/2)). In the absence of noise, predictive learning on Ising models was recently studied by Bresler and Karzand (2020); this paper quantifies how noise in the hidden model impacts the tasks of structure recovery and marginal distribution estimation by proving upper and lower bounds on the sample complexity. Our results generalize state-of-the-art bounds reported in prior work, and they exactly recover the noiseless case (q = 0). In fact, for any tree with p vertices and probability of incorrect recovery δ > 0, the sufficient number of samples remains logarithmic as in the noiseless case, i.e., O(log(p/δ)), while the dependence on q is O(1/(1 - 2q)4, for both aforementioned tasks. We also present a new equivalent of Isserlis’ Theorem for sign-valued tree-structured distributions, yielding a new low-complexity algorithm for higher-order moment estimation.
KW - Chow-Liu Algorithm
KW - Distribution Estimation
KW - Hidden Markov Random Fields
KW - Ising Model
KW - Noisy Data
KW - Predictive Learning
KW - Structure Learning
UR - http://www.scopus.com/inward/record.url?scp=85105890907&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85105890907&partnerID=8YFLogxK
M3 - Article
VL - 22
JO - Journal of Machine Learning Research
JF - Journal of Machine Learning Research
SN - 1532-4435
ER -