Cappozzo, A., Greselin, F., Murphy, B. (2018). The role of trimming and variable selection in robust model-based classification for food authenticity studies. In A.G. Colubi (Ed.), COMPSTAT 2018 Book of Abstracts (p. 35). COMPSTAT and CRoNoS.
The role of trimming and variable selection in robust model-based classification for food authenticity studies
Cappozzo, A; Greselin, F; Murphy, B
2018
Abstract
Food authenticity studies deal with detecting products that are not what they claim to be, thereby preventing economic fraud or possible damage to health. To identify illegal sub-samples, we introduce robustness into a semi-supervised model-based classification rule: labelled and unlabelled data are jointly modeled by a Gaussian mixture model with parsimonious covariance structure. To avoid singularity issues, we restrict the ratio of the eigenvalues of the group scatter matrices. Adulterated observations are detected by monitoring their contributions to the overall observed likelihood, following the established technique of impartial trimming: the illegal sub-sample is the one that is least plausible under the estimated model. A wrapper approach to variable selection is then considered, which provides information about the discriminant variables and reduces the number of features in a high-dimensional context. Experiments on real data, artificially adulterated, are provided to underline the benefits of the proposed method.
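The trimming idea in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes a single Gaussian component, uses no labels, and omits both the eigenvalue-ratio restriction and the variable selection step. The function names `gaussian_logpdf` and `trimmed_gaussian_fit`, and the parameters `alpha` and `n_iter`, are hypothetical and chosen only for illustration. The sketch alternates between estimating the model on the retained observations and discarding the fraction `alpha` of observations with the lowest estimated log-density.

```python
import numpy as np

def gaussian_logpdf(X, mean, cov):
    """Log-density of each row of X under a Gaussian N(mean, cov)."""
    d = X.shape[1]
    diff = X - mean
    # Squared Mahalanobis distances, solving a linear system
    # rather than inverting the covariance matrix explicitly.
    maha = np.einsum("ij,ij->i", diff, np.linalg.solve(cov, diff.T).T)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def trimmed_gaussian_fit(X, alpha=0.1, n_iter=50):
    """Illustrative impartial trimming for ONE Gaussian (a simplification).

    Returns the mean and covariance estimated on the retained rows,
    plus the indices of the trimmed (least plausible) rows.
    """
    n = X.shape[0]
    n_keep = n - int(np.floor(n * alpha))
    keep = np.arange(n)  # start from all observations
    for _ in range(n_iter):
        mean = X[keep].mean(axis=0)
        cov = np.cov(X[keep], rowvar=False)
        logdens = gaussian_logpdf(X, mean, cov)
        # Retain the n_keep rows with the highest log-density.
        new_keep = np.sort(np.argsort(logdens)[-n_keep:])
        if np.array_equal(new_keep, keep):
            break  # the retained set no longer changes
        keep = new_keep
    # Final estimates on the retained set.
    mean = X[keep].mean(axis=0)
    cov = np.cov(X[keep], rowvar=False)
    trimmed = np.setdiff1d(np.arange(n), keep)
    return mean, cov, trimmed
```

In this simplified setting, the returned `trimmed` indices play the role of the flagged sub-sample. A mixture version would replace the single log-density with the observation's contribution to the mixture likelihood.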