Ballante, E., Galvani, M., Uberti, P., Figini, S. (2021). Polarized Classification Tree Models: Theory and Computational Aspects. JOURNAL OF CLASSIFICATION, 38(3), 481-499 [10.1007/s00357-021-09383-8].

Polarized Classification Tree Models: Theory and Computational Aspects

Pierpaolo Uberti
2021

Abstract

This paper introduces a new approach to classification models, called the Polarized Classification Tree model. From a methodological perspective, a new index of polarization is proposed to measure the goodness of splits in the growth of a classification tree. The newly introduced measure addresses weaknesses of the classical measures used in classification trees (Gini and Information Gain): it not only measures impurity but also reflects the distribution of each covariate in the node, so that more discriminating covariates are employed to split the data at each node. From a computational perspective, a new algorithm that uses the proposed measure to grow a tree is proposed and implemented. A simulation exercise has been carried out to show how the proposal works. The results obtained in the simulation framework suggest that the proposal significantly outperforms the impurity measures commonly adopted in classification tree modeling. Moreover, empirical evidence on real data shows that Polarized Classification Tree models are competitive with, and sometimes better than, classical classification tree models.
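For orientation, the two classical measures named in the abstract (Gini and Information Gain) can be sketched as follows. This is a minimal illustration of the baseline splitting criteria only. It does not implement the polarization index, which is not defined in this record. The function names `gini`, `entropy`, `split_gain`, and `best_split` are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity of a node: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of the class distribution in a node."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def split_gain(parent, left, right, impurity):
    """Impurity decrease of a binary split; with `entropy` this is Information Gain."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted


def best_split(x, y, impurity=gini):
    """Scan thresholds on one numeric covariate and return (threshold, gain).

    Assumption for illustration: splits are of the form x <= t, with t taken
    from the observed values of x.
    """
    best = (None, 0.0)
    for t in sorted(set(x))[:-1]:
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        gain = split_gain(y, left, right, impurity)
        if gain > best[1]:
            best = (t, gain)
    return best
```

Both criteria depend only on the class proportions in the parent and child nodes, which is the limitation the abstract attributes to impurity-based measures.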
Type: Journal article - Scientific article
Keywords: Classification trees; Polarization measures; Splitting rules
Language: English
Year: 2021
Volume: 38
Issue: 3
First page: 481
Last page: 499
Access rights: reserved
Files in this item:
File: Ballante-2021-J Classif-VoR.pdf
Access: Archive managers only
Description: Original paper
Attachment type: Publisher's Version (Version of Record, VoR)
Size: 801.48 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/394652