Bicocca Open Archive

Camerlenghi, F., Dumitrascu, B., Ferrari, F., Engelhardt, B., Favaro, S. (2020). Nonparametric Bayesian multiarmed bandits for single-cell experiment design. THE ANNALS OF APPLIED STATISTICS, 14(4), 2003-2019 [10.1214/20-AOAS1370].

Nonparametric Bayesian multiarmed bandits for single-cell experiment design

Camerlenghi F. (co-first author); Dumitrascu B. (co-first author); Ferrari F. (co-first author); Engelhardt B. E.; Favaro S.

2020

Abstract

The problem of maximizing cell type discovery under budget constraints is a fundamental challenge for the collection and analysis of single-cell RNA-sequencing (scRNA-seq) data. In this paper we introduce a simple, computationally efficient and scalable Bayesian nonparametric sequential approach to optimize the budget allocation when designing a large-scale experiment for the collection of scRNA-seq data for the purpose of, but not limited to, creating cell atlases. Our approach relies on the following tools: (i) a hierarchical Pitman–Yor prior that recapitulates biological assumptions regarding cellular differentiation, and (ii) a Thompson sampling multiarmed bandit strategy that balances exploitation and exploration to prioritize experiments across a sequence of trials. Posterior inference is performed by using a sequential Monte Carlo approach which allows us to fully exploit the sequential nature of our species sampling problem. We empirically show that our approach outperforms state-of-the-art methods and achieves near-Oracle performance on simulated and scRNA-seq data alike.
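
As a rough illustration of the Thompson sampling idea in the abstract, the sketch below allocates a sampling budget across a few candidate experiments ("arms") so as to favor arms that still seem likely to yield unseen cell types. It is not the authors' method. It assumes an independent Pitman–Yor prior per arm with fixed, arbitrarily chosen hyperparameters `sigma` and `theta`, instead of the hierarchical prior, and it replaces sequential Monte Carlo with the known closed-form result that, after `n` cells showing `k` distinct types, the unseen probability mass under such a prior follows a Beta(theta + sigma*k, n - sigma*k) law. The function names and the toy populations are hypothetical.

```python
import random


def sample_unseen_mass(rng, n, k, sigma, theta):
    """Draw the total probability of not-yet-seen types for one arm.

    Assumes a Pitman-Yor(sigma, theta) prior on the arm's type proportions:
    after n cells with k distinct types, the unseen mass is
    Beta(theta + sigma * k, n - sigma * k) distributed.
    """
    if n == 0:
        return 1.0  # nothing observed yet, so every type is still unseen
    return rng.betavariate(theta + sigma * k, n - sigma * k)


def thompson_discovery(arms, budget, sigma=0.5, theta=1.0, seed=0):
    """Spend `budget` draws across arms using Thompson sampling.

    `arms` is a list of (types, weights) pairs describing toy populations.
    Returns the number of draws per arm and the set of types seen per arm.
    """
    rng = random.Random(seed)
    seen = [set() for _ in arms]
    pulls = [0] * len(arms)
    for _ in range(budget):
        # One posterior draw of the discovery probability for every arm.
        draws = [
            sample_unseen_mass(rng, pulls[j], len(seen[j]), sigma, theta)
            for j in range(len(arms))
        ]
        # Act greedily with respect to the sampled values.
        j = max(range(len(arms)), key=draws.__getitem__)
        types, weights = arms[j]
        cell_type = rng.choices(types, weights=weights, k=1)[0]
        seen[j].add(cell_type)
        pulls[j] += 1
    return pulls, seen


# Toy populations (hypothetical): one arm with few types, one with many.
arms = [
    (["a1", "a2"], [0.9, 0.1]),
    (["b%d" % i for i in range(50)], [1.0] * 50),
]
pulls, seen = thompson_discovery(arms, budget=200)
print("draws per arm:", pulls)
print("distinct types found per arm:", [len(s) for s in seen])
```

Because each arm's discovery probability is sampled rather than fixed at its posterior mean, arms with few observations keep being tried occasionally (exploration), while arms that continue to produce new types receive most of the budget (exploitation).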

Subtype: Journal article - Scientific article
Keywords: Cell type discovery; Experimental sampling design; Hierarchical Pitman–Yor model; Multiarmed bandits; scRNA-seq; Sequential Monte Carlo; Thompson sampling
Content language: English
Ahead-of-print date or first online publication date: 19 Dec 2020
Publication date: 2020
Journal: THE ANNALS OF APPLIED STATISTICS
Volume: 14
Issue: 4
First page: 2003
Last page: 2019
Article DOI: https://dx.doi.org/10.1214/20-AOAS1370
Fulltext: none
Appears in types: 01 - Journal article

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/298273
