The availability of complex-structured data has sparked new research directions in statistics and machine learning. Bayesian nonparametrics is at the forefront of this trend thanks to two crucial features: its coherent probabilistic framework, which naturally leads to principled prediction and uncertainty quantification, and its infinite-dimensionality, which removes parametric restrictions and ensures full modeling flexibility. In this paper, we provide a concise overview of Bayesian nonparametrics, starting from its foundations and the Dirichlet process, the most popular nonparametric prior. We describe the use of the Dirichlet process in species discovery, density estimation, and clustering problems. Among the many generalizations of the Dirichlet process proposed in the literature, we single out the Pitman–Yor process and compare it to the Dirichlet process. Their different features are showcased with real-data illustrations. Finally, we consider more complex data structures, which require dependent versions of these models. One of the most effective strategies for achieving this is to use hierarchical constructions. We highlight the role of the dependence structure in the borrowing of information and illustrate its effectiveness on unbalanced datasets.

Catalano, M., Lijoi, A., Prünster, I., Rigon, T. (2023). Bayesian modeling via discrete nonparametric priors. Japanese Journal of Statistics and Data Science, 6(2), 607-624 [10.1007/s42081-023-00210-5].

Bayesian modeling via discrete nonparametric priors

Rigon, T
2023

Type: Journal article - Scientific article
Keywords: Clustering; Density estimation; Dependence; Dirichlet process; Exchangeability; Mixture model; Partial exchangeability; Pitman–Yor process; Species discovery
Language: English
Date: 22-Jun-2023
Year: 2023
Volume: 6
Issue: 2
First page: 607
Last page: 624
Access: open
Files in this item:
File: 10281-453727_VoR.pdf
Access: open access
Attachment type: Publisher’s Version (Version of Record, VoR)
License: Creative Commons
Size: 488.93 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/453727