Minimum Spanning Tree (MST) is a well-known clustering algorithm that provides a graphical tree representation of the objects in a data set by exploiting local information to link pairs of similar objects. The a-posteriori analysis of this tree in terms of nodes and edges provides the basis for deriving simple classifiers, namely semi-supervised classification approaches based on the minimum spanning tree. In this work, we propose different metrics to evaluate the ability of the MST to group objects belonging to the same a-priori known classes. The classification capability of the proposed approach, using 13 different distance measures, was compared on 31 data sets with that of classical supervised classification approaches: N-Nearest Neighbour (N3), Binned Nearest Neighbour (BNN), Partial Least Squares-Discriminant Analysis (PLS-DA), K-Nearest Neighbour (KNN), exponentially weighted K-Nearest Neighbour (wKNN) and Support Vector Machine with radial basis functions (SVM-RBF). The proposed approach proved competitive with the classical supervised classification methods considered. Finally, we analysed the role of the 13 different measures in terms of performance and percentage of not-assigned objects.
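The general idea described in the abstract can be illustrated with a minimal sketch. It is not the method or the metrics proposed in the paper: the Euclidean distance, the function names (`mst_edges`, `same_class_edge_fraction`, `classify_by_neighbours`), the edge-purity score and the majority-vote rule are all assumptions chosen for illustration. The sketch builds an MST with Prim's algorithm, measures the fraction of tree edges that link objects of the same class, and assigns an unlabelled object the majority class of its labelled tree neighbours, leaving it not assigned on a tie.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance (one possible distance measure, assumed here)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mst_edges(points, dist=euclidean):
    """Prim's algorithm on the complete graph; returns n-1 edges as (i, j)."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known link from the tree to each node
    parent = [-1] * n
    edges = []
    if n:
        best[0] = 0.0
    for _ in range(n):
        # Pick the cheapest node that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def same_class_edge_fraction(edges, labels):
    """Hypothetical purity score: share of MST edges joining equal labels."""
    same = sum(1 for i, j in edges if labels[i] == labels[j])
    return same / len(edges)


def classify_by_neighbours(edges, labels, target):
    """Majority vote over labelled MST neighbours; None means not assigned."""
    votes = Counter()
    for i, j in edges:
        if i == target and labels[j] is not None:
            votes[labels[j]] += 1
        elif j == target and labels[i] is not None:
            votes[labels[i]] += 1
    ranked = votes.most_common(2)
    if not ranked or (len(ranked) == 2 and ranked[0][1] == ranked[1][1]):
        return None  # no labelled neighbour, or a tie between classes
    return ranked[0][0]


# Toy data: two well-separated groups of three objects each.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["A", "A", "A", "B", "B", "B"]
edges = mst_edges(points)
print(same_class_edge_fraction(edges, labels))
```

Any other distance function with the same two-argument signature can be passed as `dist`, which is how a comparison across several distance measures could be set up in this sketch.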
Todeschini, R., Valsecchi, C. (2022). Evaluation of classification performances of minimum spanning trees by 13 different metrics. MATCH, 87(2), 273-298 [10.46793/match.87-2.273T].
Evaluation of classification performances of minimum spanning trees by 13 different metrics
Todeschini, Roberto; Valsecchi, Cecile
2022