The Correlation Clustering problem has been introduced recently [5] as a model for clustering data when a binary relationship between data points is known. More precisely, for each pair of points we have two scores measuring respectively the similarity and dissimilarity of the two points, and we would like to compute an optimal partition where the value of a partition is obtained by summing up scores of pairs involving points from a same cluster and scores of pairs involving points from different clusters. A closely related problem is Consensus Clustering, where we are given a set of partitions and we would like to obtain a partition that best summarizes the input partitions. The latter problem is a restricted case of Correlation Clustering. In this paper we prove that Min Consensus Clustering is APX-hard even for three input partitions, answering an open question, while Max Consensus Clustering admits a PTAS on instances with a bounded number of input partitions. We exhibit a combinatorial and practical 4/5-approximation algorithm based on a greedy technique for Max Consensus Clustering on three partitions. Moreover, we prove that a PTAS exists for Max Correlation Clustering when the maximum ratio between two scores is at most a constant.

Bonizzoni, P., Della Vedova, G., Dondi, R., Jiang, T. (2005). Correlation Clustering and Consensus Clustering. In Algorithms and Computation 16th International Symposium, ISAAC 2005, Sanya, Hainan, China, December 19-21, 2005, Proceedings (pp.226-235) [10.1007/11602613_24].

Correlation Clustering and Consensus Clustering

Bonizzoni, P;Della Vedova, G;
2005

Abstract

The Correlation Clustering problem has been introduced recently [5] as a model for clustering data when a binary relationship between data points is known. More precisely, for each pair of points we have two scores measuring respectively the similarity and dissimilarity of the two points, and we would like to compute an optimal partition where the value of a partition is obtained by summing up scores of pairs involving points from a same cluster and scores of pairs involving points from different clusters. A closely related problem is Consensus Clustering, where we are given a set of partitions and we would like to obtain a partition that best summarizes the input partitions. The latter problem is a restricted case of Correlation Clustering. In this paper we prove that Min Consensus Clustering is APX-hard even for three input partitions, answering an open question, while Max Consensus Clustering admits a PTAS on instances with a bounded number of input partitions. We exhibit a combinatorial and practical 4/5-approximation algorithm based on a greedy technique for Max Consensus Clustering on three partitions. Moreover, we prove that a PTAS exists for Max Correlation Clustering when the maximum ratio between two scores is at most a constant.
paper
Algorithms; Approximation theory; Combinatorial mathematics; Computational complexity; Correlation methods; Problem solving
English
16th International Symposium on Algorithms and Computation, ISAAC 2005 - December 19-21, 2005
2005
Deng, X; Du, DZ
Algorithms and Computation 16th International Symposium, ISAAC 2005, Sanya, Hainan, China, December 19-21, 2005, Proceedings
9783540309352
2005
3827 LNCS
226
235
http://dx.doi.org/10.1007/11602613_24
none
Bonizzoni, P., Della Vedova, G., Dondi, R., Jiang, T. (2005). Correlation Clustering and Consensus Clustering. In Algorithms and Computation 16th International Symposium, ISAAC 2005, Sanya, Hainan, China, December 19-21, 2005, Proceedings (pp.226-235) [10.1007/11602613_24].
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/10281/34695
Citazioni
  • Scopus 19
  • ???jsp.display-item.citation.isi??? 10
Social impact