TY - GEN
T1 - Safe Exploration in Reinforcement Learning by Reachability Analysis over Learned Models
AU - Wang, Yuning
AU - Zhu, He
N1 - Publisher Copyright: © The Author(s) 2024.
PY - 2024
Y1 - 2024
AB - We introduce VELM, a reinforcement learning (RL) framework grounded in verification principles for safe exploration in unknown environments. VELM ensures that an RL agent systematically explores its environment, adhering to safety properties throughout the learning process. VELM learns environment models as symbolic formulas and conducts formal reachability analysis over the learned models for safety verification. An online shielding layer is then constructed to confine the RL agent’s exploration solely within a state space verified as safe in the learned model, thereby bolstering the overall safety profile of the RL system. Our experimental results demonstrate the efficacy of VELM across diverse RL environments, highlighting its capacity to significantly reduce safety violations in comparison to existing safe learning techniques, all without compromising the RL agent’s reward performance.
KW - Controller Synthesis
KW - Reinforcement Learning
KW - Safe Exploration
KW - Safety Verification
UR - http://www.scopus.com/inward/record.url?scp=85200781697&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85200781697&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-65633-0_11
DO - 10.1007/978-3-031-65633-0_11
M3 - Conference contribution
SN - 9783031656323
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 232
EP - 255
BT - Computer Aided Verification - 36th International Conference, CAV 2024, Proceedings
A2 - Gurfinkel, Arie
A2 - Ganesh, Vijay
PB - Springer Science and Business Media Deutschland GmbH
T2 - 36th International Conference on Computer Aided Verification, CAV 2024
Y2 - 24 July 2024 through 27 July 2024
ER -