Observed departures from the classical Gaussian mixture model in real datasets have recently led to the introduction of more flexible tools for modeling heterogeneous skew data. Among the latest proposals in the literature, we consider mixtures of skew normal distributions, which incorporate asymmetry in the components, as well as mixtures of t distributions, which down-weight the contribution of extreme observations. Mixtures of skew t distributions have widened the application of model-based clustering and classification to a great many real datasets, as they can adapt to both asymmetry and leptokurtosis in the grouped data. Unfortunately, when data contamination occurs far from the bulk of the data, or even between the groups, classical inference for these models is not reliable. Our proposal addresses robust estimation of mixtures of skew normal distributions, so as to resist sparse outliers and even pointwise contamination that could arise in data collection. We introduce a constructive way to obtain a robust estimator for the mixture of skew normal model, by incorporating impartial trimming and constraints in the EM algorithm. At each E-step, a small percentage of the least plausible observations under the estimated model is tentatively trimmed; at the M-step, constraints on the scatter matrices are employed to avoid singularities and reduce spurious maximizers. Applications to artificial and real data show the effectiveness of our proposal, and the joint role of trimming and constraints in achieving robustness.
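The interplay of trimming and constraints described above can be illustrated with a minimal sketch. It is not the authors' estimator: it assumes univariate Gaussian components instead of skew normal ones, so the scatter matrices reduce to variances, and it enforces the constraint by simple clipping of the variance ratio. The names `trimmed_em`, `constrain_variances`, `trim_fraction`, `max_ratio` and `init_var` are hypothetical and chosen only for this illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def constrain_variances(variances, max_ratio):
    """Raise small variances so that max(var) / min(var) <= max_ratio.

    Plain clipping, used here as a simplified stand-in for a constrained update.
    """
    floor = max(max(variances) / max_ratio, 1e-12)
    return [max(v, floor) for v in variances]


def trimmed_em(data, init_means, init_var=1.0, trim_fraction=0.1,
               max_ratio=10.0, n_iter=50):
    """Illustrative trimmed, constrained EM for a univariate Gaussian mixture."""
    n, k = len(data), len(init_means)
    n_keep = n - int(trim_fraction * n)
    weights = [1.0 / k] * k
    means = list(init_means)
    variances = [init_var] * k
    kept = list(range(n))
    for _ in range(n_iter):
        # E-step: weighted component densities for every observation.
        dens = [[weights[j] * normal_pdf(x, means[j], variances[j])
                 for j in range(k)] for x in data]
        total = [sum(row) for row in dens]
        # Trimming: keep the n_keep observations with the highest mixture density.
        kept = sorted(range(n), key=lambda i: total[i], reverse=True)[:n_keep]
        # Posterior membership probabilities, computed for kept observations only.
        resp = {i: [d / max(total[i], 1e-300) for d in dens[i]] for i in kept}
        # M-step on the kept observations.
        for j in range(k):
            mass = max(sum(resp[i][j] for i in kept), 1e-300)
            weights[j] = mass / n_keep
            means[j] = sum(resp[i][j] * data[i] for i in kept) / mass
            variances[j] = sum(resp[i][j] * (data[i] - means[j]) ** 2
                               for i in kept) / mass
        # Constraint: bound the ratio between the largest and smallest variance.
        variances = constrain_variances(variances, max_ratio)
    return weights, means, variances, sorted(kept)


# Two groups plus ten identical outliers (pointwise contamination).
random.seed(0)
data = ([random.gauss(0.0, 1.0) for _ in range(100)]
        + [random.gauss(10.0, 1.0) for _ in range(100)]
        + [100.0] * 10)
weights, means, variances, kept = trimmed_em(
    data, init_means=[-1.0, 11.0], trim_fraction=0.05)
```

In this toy example the trimmed set is recomputed at every iteration, so an observation discarded early can re-enter later, which is one reading of "tentatively trimmed". Without the variance constraint, a component collapsing onto the ten identical points would yield an unbounded likelihood; the bound on the variance ratio rules this out.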
García-Escudero, L., Greselin, F., McLachlan, G., Mayo-Iscar, A. (2015). Robust estimation for mixtures of skew data. In Proceedings of the 8th International Conference of the ERCIM WG on Computing & Statistics (CMStatistics 2015) (p. 212). London: CFE and CMStatistics networks.
Robust estimation for mixtures of skew data
Greselin, F.
2015
File | Access | Size | Format |
---|---|---|---|
GGMM Ercim 2015 Robust estimation for mixtures of skew data.pdf | open access | 453.17 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.