Hierarchical Mixture of Finite Mixtures

Colombi, Alessandro; Camerlenghi, Federico
2025

Abstract

Statistical modelling in the presence of data organized in groups is a crucial task in Bayesian statistics. The present paper conceives a mixture model based on a novel family of Bayesian priors designed for multilevel data and obtained by normalizing a finite point process. In particular, the work extends the popular Mixture of Finite Mixtures model to the hierarchical framework to capture heterogeneity within and between groups. A full distribution theory for this new family and the induced clustering is developed, including the marginal, posterior, and predictive distributions. Efficient marginal and conditional Gibbs samplers are designed to provide posterior inference. The proposed mixture model outperforms the Hierarchical Dirichlet Process, the foremost tool for handling multilevel data, in terms of analytical feasibility, clustering discovery, and computational time. The motivating application comes from the analysis of shot put data, which contains performance measurements of athletes across different seasons. In this setting, the proposed model is exploited to induce clustering of the observations across seasons and athletes. By linking clusters across seasons, similarities and differences in athletes’ performances are identified.
Journal article - Scientific article
Model-based clustering, multilevel data, Partial exchangeability, Sports analytics, vector of finite Dirichlet processes
English
15 Dec 2024
2025
none
Colombi, A., Argiento, R., Camerlenghi, F., Paci, L. (2025). Hierarchical Mixture of Finite Mixtures. BAYESIAN ANALYSIS [10.1214/24-ba1501].

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/547529