Candelieri, A., Ponti, A., Archetti, F. (2022). Safe-Exploration of Control Policies from Safe-Experience via Gaussian Processes. In Learning and Intelligent Optimization: 16th International Conference, LION 16, Milos Island, Greece, June 5–10, 2022, Revised Selected Papers (pp. 232-247). Springer Science and Business Media Deutschland GmbH [10.1007/978-3-031-24866-5_18].
Safe-Exploration of Control Policies from Safe-Experience via Gaussian Processes
Candelieri, A; Ponti, A; Archetti, F
2022
Abstract
Control of many real-life systems relies strongly on the knowledge of a domain expert, who usually adopts a safe control policy to deal with uncertainty. Here, safe means that the policy aims to avoid system disruptions or significant deviations from the desired behaviour, usually at the cost of sub-optimal performance. This paper proposes a statistically sound approach that exploits the collected experience to safely explore new policies, accepting a reasonable risk in terms of safety while improving performance. Gaussian Process regression is the core of the approach: it provides a probabilistic approximation of both the system's dynamics and its performance, based on historical data collected while applying the safe policy. Being a probabilistic model, a Gaussian Process provides both an estimate of the level of safety and, more importantly, the associated predictive uncertainty, which is crucial for implementing the safe exploration of new, more efficient policies. The approach avoids the typically expensive implementation of a digital twin of the system, which is required by simulation-optimization approaches, and it also avoids formulating the task as a stochastic programming problem. Results on two case studies inspired by real-life systems are presented, showing improved performance with respect to the initial safe policy while keeping the systems reasonably safe.
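The role of predictive uncertainty described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation. It assumes a single scalar control parameter, a hypothetical "safety margin" observed under the safe policy (positive means safe), a zero-mean Gaussian Process with a squared-exponential kernel and fixed hyperparameters, and an illustrative acceptance rule: a candidate is explored only if the predicted probability of a non-negative margin is at least 1 - risk. All names and numbers (`X_SAFE`, `MARGINS`, `risk`, the candidate values) are made up for illustration.

```python
import math

def rbf(a, b, length=1.0, var=1.0):
    """Squared-exponential kernel for scalar inputs (assumed, fixed hyperparameters)."""
    return var * math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_posterior(X, y, x_new, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, y))   # alpha = K^{-1} y
    k_star = [rbf(a, x_new) for a in X]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = max(rbf(x_new, x_new) - sum(t * t for t in v), 0.0)
    return mean, math.sqrt(var)

def prob_safe(mean, std):
    """P(margin >= 0) when the margin is distributed as N(mean, std^2)."""
    if std == 0.0:
        return 1.0 if mean >= 0.0 else 0.0
    return 0.5 * (1.0 + math.erf(mean / (std * math.sqrt(2.0))))

def safe_candidates(X, y, candidates, risk=0.1):
    """Keep candidates whose predicted probability of being safe is >= 1 - risk."""
    kept = []
    for c in candidates:
        mean, std = gp_posterior(X, y, c)
        if prob_safe(mean, std) >= 1.0 - risk:
            kept.append(c)
    return kept

# Hypothetical history collected under the safe policy:
# control settings and the safety margin observed at each one.
X_SAFE = [0.0, 0.5, 1.0, 1.5]
MARGINS = [1.0, 0.9, 0.7, 0.4]

if __name__ == "__main__":
    for c in [1.25, 3.0, 8.0]:
        m, s = gp_posterior(X_SAFE, MARGINS, c)
        print(f"x={c}: mean={m:.3f}, std={s:.3f}, P(safe)={prob_safe(m, s):.3f}")
    print("accepted:", safe_candidates(X_SAFE, MARGINS, [1.25, 3.0, 8.0]))
```

In this sketch a candidate close to the experienced settings has low predictive uncertainty and is accepted, whereas a candidate far from the data reverts to the prior, so its probability of being safe drops towards 0.5 and it is rejected. This is how the uncertainty, and not only the mean estimate, limits how far exploration moves away from the safe policy.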