Given a sample of unlabeled observations, the goal of a novelty detection method is to identify which units substantially deviate from the observed labeled patterns. In a model-based framework, it is therefore essential, first, to learn the components that correspond to the manifest groups in the training set. Second, one needs to account for the lack of knowledge about the statistical novelties. Third, contaminated elements in the known classes could greatly jeopardize the identification of new groups. Motivated by these challenges, we propose a two-stage Bayesian non-parametric novelty detector. At stage one, robust estimates are extracted from the training set; this information is then used to elicit informative priors within a flexible semiparametric mixture. This general paradigm can be easily adapted to complex modeling frameworks: we provide here an application to functional data from a food authenticity study.
Denti, F., Cappozzo, A., Greselin, F. (2021). Outlier and novelty detection for Functional data: a semi-parametric Bayesian approach. In S. Ingrassia, A. Punzo, R. Rocci (Eds.), Models and Learning in Clustering and Classification (pp. 33-38). LEDIpublishing.
Outlier and novelty detection for Functional data: a semi-parametric Bayesian approach
Francesco Denti; Francesca Greselin
2021