Provable Hierarchy-Based Meta-Reinforcement Learning

Kurtland Chua, Qi Lei, Jason D. Lee

Research output: Contribution to journalConference articlepeer-review

Abstract

Hierarchical reinforcement learning (HRL) has seen widespread interest as an approach to tractable learning of complex modular behaviors. However, existing works either assume access to expert-constructed hierarchies, or use hierarchy-learning heuristics with no provable guarantees. To address this gap, we analyze HRL in the meta-RL setting, where a learner learns latent hierarchical structure during meta-training for use in a downstream task. We consider a tabular setting where natural hierarchical structure is embedded in the transition dynamics. Analogous to supervised meta-learning theory, we provide “diversity conditions” which, together with a tractable optimism-based algorithm, guarantee sample-efficient recovery of this natural hierarchy. Furthermore, we provide regret bounds on a learner using the recovered hierarchy to solve a meta-test task. Our bounds incorporate common notions in HRL literature such as temporal and state/action abstractions, suggesting that our setting and analysis capture important features of HRL in practice.

Original languageAmerican English
Pages (from-to)10918-10967
Number of pages50
JournalProceedings of Machine Learning Research
Volume206
StatePublished - 2023
Event26th International Conference on Artificial Intelligence and Statistics, AISTATS 2023 - Valencia, Spain
Duration: Apr 25 2023Apr 27 2023

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability

Cite this