Bicocca Open Archive


A teaching proposal for a short course on biomedical data science

Chicco D. (first author); Coelho V.

2025

Abstract

As the availability of big biomedical data grows, there is an increasing need for university students professionally trained to analyze these data and interpret the results correctly. We propose here a study plan for a master's degree course on biomedical data science, based on our experience during the last academic year. In our university course, we explained how to find an open biomedical dataset, how to clean it correctly, and how to prepare it for a computational statistics or machine learning phase. In doing so, we introduced common health data science terms and explained how to avoid common mistakes in the process. Moreover, we clarified how to perform an exploratory data analysis (EDA) and how to reasonably interpret its results. We also described how to properly execute a supervised or unsupervised machine learning analysis, and how to understand and interpret its outcomes. Finally, we explained how to validate the findings obtained. We illustrated all these steps in the context of open science principles, suggesting that students use only open source programming languages (R or Python in particular), open biomedical data (if available), and open access scientific articles (if possible). We believe our teaching proposal can be useful and of interest to anyone starting to prepare a course on biomedical data science.


Subtype	Journal article - Scientific article
Keywords	health informatics, teaching, biomedical data science
Content language	English
Ahead-of-print date or first online publication date	14 Apr 2025
Publication date	2025
Journal	PLOS COMPUTATIONAL BIOLOGY
Volume	21
Issue	4
Article number	e1012946
Article DOI	https://dx.doi.org/10.1371/journal.pcbi.1012946
Fulltext	open
Citation	Chicco, D., Coelho, V. (2025). A teaching proposal for a short course on biomedical data science. PLOS COMPUTATIONAL BIOLOGY, 21(4) [10.1371/journal.pcbi.1012946].
Appears in types	01 - Journal article

Files in this record:

File	Size	Format
Chicco-2025-PLoS Computational Biology-VoR.pdf (open access; description: CC BY 4.0, "This is an open access article distributed under the terms of the Creative Commons Attribution License"; attachment type: Publisher’s Version (Version of Record, VoR); license: Creative Commons)	2.25 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/550421
