Location-scale Dirichlet process mixtures of Gaussians (DPM-G) have proved extremely useful in dealing with density estimation and clustering problems in a wide range of domains. Motivated by an astronomical application, in this work we address the robustness of DPM-G models to affine transformations of the data, a natural requirement for any sensible statistical method for density estimation and clustering. First, we devise a coherent prior specification of the model which makes posterior inference invariant with respect to affine transformations of the data. Second, we formalise the notion of asymptotic robustness under data transformation and show that mild assumptions on the true data generating process are sufficient to ensure that DPM-G models feature such a property. Our investigation is supported by an extensive simulation study and illustrated by the analysis of an astronomical dataset consisting of physical measurements of stars in the field of the globular cluster NGC 2419.
Arbel, J., Corradin, R., Nipoti, B. (2021). Dirichlet process mixtures under affine transformations of the data. COMPUTATIONAL STATISTICS, 36(1), 577-601 [10.1007/s00180-020-01013-y].
Dirichlet process mixtures under affine transformations of the data
Corradin R.;Nipoti B.
2021
Abstract
Location-scale Dirichlet process mixtures of Gaussians (DPM-G) have proved extremely useful in dealing with density estimation and clustering problems in a wide range of domains. Motivated by an astronomical application, in this work we address the robustness of DPM-G models to affine transformations of the data, a natural requirement for any sensible statistical method for density estimation and clustering. First, we devise a coherent prior specification of the model which makes posterior inference invariant with respect to affine transformations of the data. Second, we formalise the notion of asymptotic robustness under data transformation and show that mild assumptions on the true data generating process are sufficient to ensure that DPM-G models feature such a property. Our investigation is supported by an extensive simulation study and illustrated by the analysis of an astronomical dataset consisting of physical measurements of stars in the field of the globular cluster NGC 2419.File | Dimensione | Formato | |
---|---|---|---|
Arbel2020_Article_DirichletProcessMixturesUnderA.pdf
accesso aperto
Tipologia di allegato:
Publisher’s Version (Version of Record, VoR)
Dimensione
4.31 MB
Formato
Adobe PDF
|
4.31 MB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.