Decision making activities stress data and information quality requirements. The quality of data sources is frequently very poor, so a cleansing process is required before such data can be used for decision making. When alternative (and more trusted) data sources are not available, data can be cleansed only by using business rules derived from domain knowledge. Business rules focus on fixing inconsistencies, but an inconsistency can be cleansed in different ways (i.e., the correction may be non-deterministic), so the choice of how to cleanse data can affect, even strongly, the aggregate values computed for decision making purposes. The paper proposes a methodology exploiting Finite State Systems to quantitatively estimate how computed variables and indicators might be affected by the uncertainty related to low data quality, independently of the data cleansing methodology used. The methodology has been implemented and tested on a real case scenario, providing effective results.
Mezzanzanica, M., Boselli, R., Cesarini, M., Mercorio, F. (2012). Data Quality Sensitivity Analysis on Aggregate Indicators. In DATA 2012 - Proceedings of the International Conference on Data Technologies and Applications (pp. 97-108). SciTePress [10.5220/0004040300970108].
Data Quality Sensitivity Analysis on Aggregate Indicators
MEZZANZANICA, MARIO; BOSELLI, ROBERTO; CESARINI, MIRKO; MERCORIO, FABIO
2012
File | Attachment type | Access | Size | Format
---|---|---|---|---
data2012cr.pdf | Author’s Accepted Manuscript, AAM (Post-print) | Open access | 216.15 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.