Boselli, R., Cesarini, M., Mercorio, F., Mezzanzanica, M. (2014). A Policy-Based Cleansing and Integration Framework for Labour and Healthcare Data. In A. Holzinger, I. Jurisica (Eds.), Interactive Knowledge Discovery and Data Mining in Biomedical Informatics. State-of-the-Art and Future Challenges (pp. 141-168). Berlin Heidelberg: Springer Verlag [10.1007/978-3-662-43968-5_8].

A Policy-Based Cleansing and Integration Framework for Labour and Healthcare Data

Boselli, R.; Cesarini, M.; Mercorio, F.; Mezzanzanica, M.
2014

Abstract

Public administrations and healthcare organizations collect large amounts of data, and integrating the data scattered across several information systems can facilitate the comprehension of complex scenarios and support the activities of decision makers. Unfortunately, the quality of information system archives is very poor, as widely stated in the existing literature. Data cleansing is one of the most frequently used data improvement techniques. Data can be cleansed in several ways; the optimal choice, however, strictly depends on the integration and analysis processes to be performed. Therefore, the design of a data analysis process should consider the data integration, cleansing, and analysis activities holistically. However, in the existing literature, data integration and cleansing issues have mostly been addressed in isolation. In this paper we describe how a model-based cleansing framework is extended to also address integration activities. The combined approach facilitates the rapid prototyping, development, and evaluation of data pre-processing activities. Furthermore, the combined use of formal methods and visualization techniques strongly empowers the data analyst, who can effectively evaluate how cleansing and integration activities affect the data analysis. An example focusing on labour and healthcare data integration is presented.
Type: Book chapter or essay
Keywords: Data Quality; Data Integration; Model-based Reasoning; Data Visualisation
Language: English
Book title: Interactive Knowledge Discovery and Data Mining in Biomedical Informatics. State-of-the-Art and Future Challenges
Editors: Holzinger, A.; Jurisica, I.
Year: 2014
ISBN: 978-3-662-43967-8
Volume: 8401
Publisher: Springer Verlag
First page: 141
Last page: 168

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/52122