This paper explores the application of formal methods (specifically, model checking) to the field of data quality. A model expressing the consistency of longitudinal data is derived from domain knowledge. This model is used (1) to automatically verify the consistency of the data stored in a database and (2) to automatically generate a universal cleanser, i.e., a cleanser that summarises all the feasible corrections for any kind of inconsistency that may affect the data (as far as they can be inferred from the formal consistency model). The universal cleanser serves as a repository of corrective interventions that is useful for developing cleansing routines. We applied our approach to a real-world scenario: a formal verification was performed on labour market data to evaluate the consistency of people's working careers. The results show that the proposed approach can improve both data quality evaluation and the development of cleansing activities.
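The idea summarised above can be illustrated with a minimal sketch. It is not the model or the tooling used in the paper: the two states, the two events, the transition table, and the function names `first_inconsistency` and `build_cleanser` are all hypothetical, chosen only to show how a consistency model over a career's event sequence can both flag an inconsistency and enumerate candidate corrections. An explicit walk over a tiny finite-state model stands in here for a real model checker.

```python
from itertools import product

# Hypothetical toy consistency model (an assumption, not taken from the paper):
# a career is a sequence of "start" / "cease" events over two states.
STATES = ("unemployed", "employed")
EVENTS = ("start", "cease")
TRANSITIONS = {
    ("unemployed", "start"): "employed",
    ("employed", "cease"): "unemployed",
}


def first_inconsistency(events, state="unemployed"):
    """Return (index, state) for the first event the model rejects, else None."""
    for i, event in enumerate(events):
        next_state = TRANSITIONS.get((state, event))
        if next_state is None:
            return i, state
        state = next_state
    return None


def build_cleanser():
    """Map every inconsistent (state, event) pair to the single events that,
    inserted just before the offending event, make the sequence valid again."""
    cleanser = {}
    for state, event in product(STATES, EVENTS):
        if (state, event) in TRANSITIONS:
            continue  # consistent pair, nothing to correct
        fixes = []
        for candidate in EVENTS:
            mid = TRANSITIONS.get((state, candidate))
            if mid is not None and (mid, event) in TRANSITIONS:
                fixes.append(candidate)
        cleanser[(state, event)] = fixes
    return cleanser


if __name__ == "__main__":
    career = ["start", "cease", "cease"]  # the second "cease" has no open job
    problem = first_inconsistency(career)
    print(problem)
    if problem is not None:
        index, state = problem
        print(build_cleanser()[(state, career[index])])
```

In this sketch the table returned by `build_cleanser` plays the role of the repository of corrective interventions: it is computed once from the model, independently of any particular dataset, and is then looked up whenever `first_inconsistency` reports a problem.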
Mezzanzanica, M., Boselli, R., Cesarini, M., Mercorio, F. (2012). Towards the use of Model Checking for performing Data Consistency Evaluation and Cleansing. In ICIQ 2012. The 17th International Conference on Information Quality (pp. 163-177). Paris: CNAM.
Towards the use of Model Checking for performing Data Consistency Evaluation and Cleansing
Mezzanzanica, Mario; Boselli, Roberto; Cesarini, Mirko; Mercorio, Fabio
2012
File | Size | Format | |
---|---|---|---|
ICIQ2012_pubblicato.pdf (open access) | 1.05 MB | Adobe PDF | |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.