Code smells are archetypes of design shortcomings in the code that can potentially cause problems during maintenance. One known approach for detecting code smells is via detection rules: a combination of different object-oriented metrics with pre-defined threshold values. The usage of inadequate thresholds when using this approach could lead to either having too few observations (too many false negatives) or too many observations (too many false positives). Furthermore, without a clear methodology for deriving thresholds, one is left with those suggested in literature (or by the tool vendors), which may not necessarily be suitable to the context of analysis. In this paper, we propose a data-driven (i.e., benchmark-based) method to derive threshold values for code metrics, which can be used for implementing detection rules for code smells. Our method is transparent, repeatable and enables the extraction of thresholds that respect the statistical properties of the metric in question (such as scale and distribution). Thus, our approach enables the calibration of code smell detection rules by selecting relevant systems as benchmark data. To illustrate our approach, we generated a benchmark dataset based on 74 systems of the Qualitas Corpus, and extracted the thresholds for five smell detection rules.
ARCELLI FONTANA, F., Ferme, V., Zanoni, M., Yamashita, A. (2015). Automatic metric thresholds derivation for code smell detection. In Proceedings of the 6th International Workshop on Emerging Trends in Software Metrics (WETSoM 2015) (pp.44-53). IEEE Computer Society [10.1109/WETSoM.2015.14].
Automatic metric thresholds derivation for code smell detection
ARCELLI FONTANA, FRANCESCAPrimo
;ZANONI, MARCO
;
2015
Abstract
Code smells are archetypes of design shortcomings in the code that can potentially cause problems during maintenance. One known approach for detecting code smells is via detection rules: a combination of different object-oriented metrics with pre-defined threshold values. The usage of inadequate thresholds when using this approach could lead to either having too few observations (too many false negatives) or too many observations (too many false positives). Furthermore, without a clear methodology for deriving thresholds, one is left with those suggested in literature (or by the tool vendors), which may not necessarily be suitable to the context of analysis. In this paper, we propose a data-driven (i.e., benchmark-based) method to derive threshold values for code metrics, which can be used for implementing detection rules for code smells. Our method is transparent, repeatable and enables the extraction of thresholds that respect the statistical properties of the metric in question (such as scale and distribution). Thus, our approach enables the calibration of code smell detection rules by selecting relevant systems as benchmark data. To illustrate our approach, we generated a benchmark dataset based on 74 systems of the Qualitas Corpus, and extracted the thresholds for five smell detection rules.File | Dimensione | Formato | |
---|---|---|---|
thresholds.pdf
Solo gestori archivio
Descrizione: articolo
Dimensione
232.35 kB
Formato
Adobe PDF
|
232.35 kB | Adobe PDF | Visualizza/Apri Richiedi una copia |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.