Candelieri, A., Giordani, I., Archetti, F. (2017). Automatic configuration of kernel-based clustering: an optimization approach. In Learning and Intelligent Optimization: 11th International Conference, LION 11, Nizhny Novgorod, Russia, June 19-21, 2017, Revised Selected Papers (pp. 34-49). Springer Verlag [10.1007/978-3-319-69404-7_3].
Automatic configuration of kernel-based clustering: an optimization approach
Candelieri, A; Giordani, I; Archetti, F
2017
Abstract
This paper generalizes a method originally developed by the authors for data-driven localization of leakages in urban Water Distribution Networks. The method uses clustering for exploratory analysis and a pool of Support Vector Machines to process online sensor readings. Its performance depends on certain hyperparameters, which are treated as decision variables in a sequential model-based optimization process. The objective function is related to clustering performance, computed through an external validity index defined according to the leakage localization goal. Thus, as is usual in hyperparameter tuning of machine learning algorithms, the objective function is a black box. The paper shows that a Bayesian framework offers not only good performance but also the flexibility to include the automatic configuration of the algorithm in the optimization loop. Both Gaussian Processes and Random Forests are considered for fitting the surrogate model of the objective function, while results from a simple grid search serve as the baseline.
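
The sequential model-based optimization loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the one-dimensional `objective` is a hypothetical stand-in for the external validity index, the lower-confidence-bound acquisition rule and all names and parameter values (`smbo`, `grid_search`, `kappa`, the kernel length scale) are assumptions made for illustration, and only the Gaussian Process surrogate is shown (no Random Forest).

```python
import math
import random


def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two scalars (length scale assumed)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gp_posterior(X, y, x_new, noise=1e-6):
    """Zero-mean GP posterior mean and standard deviation at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    k_star = [rbf(a, x_new) for a in X]
    alpha = solve(K, y)        # K^{-1} y
    v = solve(K, k_star)       # K^{-1} k_*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = max(rbf(x_new, x_new) - sum(k * w for k, w in zip(k_star, v)), 0.0)
    return mean, math.sqrt(var)


def objective(x):
    """Hypothetical black-box objective to minimize (toy stand-in)."""
    return (x - 0.3) ** 2 + 0.1 * math.sin(15 * x)


def smbo(n_init=3, n_iter=12, kappa=2.0, seed=0):
    """Fit the surrogate, pick the next point by lower confidence bound, repeat."""
    rng = random.Random(seed)
    candidates = [i / 200 for i in range(201)]
    X = rng.sample(candidates, n_init)       # distinct initial design points
    y = [objective(x) for x in X]
    for _ in range(n_iter):
        y_mean = sum(y) / len(y)
        y_centered = [val - y_mean for val in y]
        pool = [c for c in candidates if c not in X]

        def lcb(c):
            m, s = gp_posterior(X, y_centered, c)
            return m - kappa * s             # low mean or high uncertainty

        x_next = min(pool, key=lcb)
        X.append(x_next)
        y.append(objective(x_next))
    best = min(range(len(y)), key=y.__getitem__)
    return X[best], y[best], X, y


def grid_search(budget):
    """Baseline: evaluate the objective on an evenly spaced grid."""
    grid = [i / (budget - 1) for i in range(budget)]
    vals = [objective(x) for x in grid]
    best = min(range(budget), key=vals.__getitem__)
    return grid[best], vals[best]


if __name__ == "__main__":
    bx, by, X, y = smbo()
    gx, gy = grid_search(len(X))             # same evaluation budget
    print("SMBO best:", bx, by)
    print("Grid best:", gx, gy)
```

Both strategies receive the same number of objective evaluations, which mirrors the abstract's use of grid search as a baseline. In a real setting each evaluation would involve running the clustering and computing the validity index, so the surrogate's cost is small by comparison.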