Drias, H., Moulai, H., Drias, Y. (2020). An Automated Unsupervised Discretization Method: A Novel Approach. VIETNAM JOURNAL OF COMPUTER SCIENCE, 7(3), 301-322 [10.1142/S2196888820500177].

An Automated Unsupervised Discretization Method: A Novel Approach

Drias Y.
2020

Abstract

This paper proposes a novel discretization scheme that aims to enable scalability while also addressing at least three other major challenges. It is based on a Left-to-Right (LR) scanning process, which partitions the input stream into intervals. This task can be implemented directly as an algorithm or by using a generator that automatically builds the discretization program. We focus on unsupervised discretization and design a method called Unsupervised Left to Right Discretization (ULR-Discr). Extensive experiments were conducted using various cut-point functions on small, large and medical public datasets. First, ULR-Discr variants based on different statistics are compared with one another to observe the impact of the cut-point function on accuracy and runtime. Then the proposed method is compared with traditional and recent techniques for classification. The results show that classification accuracy improves considerably when our method is used for discretization.
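The LR scanning idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' ULR-Discr implementation: the abstract does not specify the cut-point functions, so the function names (`lr_discretize`, `mean_gap_rule`), the mean-distance rule, the `gap` parameter, and the decision to sort the values first are all assumptions made for illustration only.

```python
from statistics import mean
from typing import Callable, List, Sequence

# Type of a cut-point rule: given the values already in the current
# interval and the next value, decide whether to start a new interval.
CutRule = Callable[[Sequence[float], float], bool]


def mean_gap_rule(gap: float) -> CutRule:
    """Hypothetical cut-point rule (an assumption, not from the paper):
    cut when the next value is farther than `gap` from the mean of the
    current interval."""
    def should_cut(current: Sequence[float], value: float) -> bool:
        return abs(value - mean(current)) > gap
    return should_cut


def lr_discretize(values: Sequence[float], should_cut: CutRule) -> List[List[float]]:
    """Scan the values once from left to right and group them into intervals.

    Sorting first is an assumption of this sketch; a streaming variant
    would consume the values in arrival order instead.
    """
    intervals: List[List[float]] = []
    current: List[float] = []
    for v in sorted(values):
        # Close the current interval when the rule asks for a cut.
        if current and should_cut(current, v):
            intervals.append(current)
            current = []
        current.append(v)
    if current:
        intervals.append(current)
    return intervals


def interval_index(intervals: List[List[float]], value: float) -> int:
    """Map a value to the index of the first interval whose upper bound covers it."""
    for i, interval in enumerate(intervals):
        if value <= interval[-1]:
            return i
    return len(intervals) - 1  # values above the last bound go to the last interval


if __name__ == "__main__":
    data = [1.0, 1.2, 0.9, 5.0, 5.3, 9.8]
    bins = lr_discretize(data, mean_gap_rule(2.0))
    print(bins)
    print([interval_index(bins, x) for x in data])
```

Swapping in a different `should_cut` function changes the statistic that drives the cuts without touching the scan itself, which is the kind of comparison between cut-point functions the abstract mentions.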
Type: Journal article - Scientific article
Keywords: Classifier; Data preprocessing; Lexical generator; Massive data; Unsupervised discretization
Language: English
Year: 2020
Volume: 7
Issue: 3
First page: 301
Last page: 322
Access: open
Files in this item:
Drias-2020-Vietnam Journal of Computer Science-VoR.pdf

open access

Description: This is an Open Access article published by World Scientific Publishing Company. It is distributed under the terms of the Creative Commons Attribution 4.0 (CC BY)
Attachment type: Publisher's Version (Version of Record, VoR)
License: Creative Commons
Size: 1.05 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/506819