The reconstruction of the two distinct copies of each chromosome, called haplotypes, is an essential step in characterizing the genome of an individual. Here we address a successful approach for haplotype assembly, the weighted Minimum Error Correction (wMEC) problem, which consists of computing the two haplotypes that partition the sequencing reads into two disjoint subsets with the smallest number of corrections to the Single Nucleotide Polymorphism (SNP) values. To solve this problem we propose GenHap, a computational method based on Genetic Algorithms, which can reach optimal solutions thanks to a global search process. To evaluate the effectiveness of GenHap, we test it on a synthetic (yet realistic) dataset based on the PacBio RS II sequencing technology, and compare its performance against HapCol, an efficient state-of-the-art algorithm for haplotype assembly. We show that GenHap always obtains high-accuracy solutions (in terms of haplotype error rate) and is up to 20× faster than HapCol on this dataset.
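The wMEC objective described above can be made concrete with a small sketch. The read representation (a dict mapping SNP index to an `(allele, weight)` pair), the function names, and the toy data are all assumptions made for illustration; they are not taken from GenHap. The exhaustive search at the end only stands in for the optimizer on a tiny instance: GenHap itself uses a Genetic Algorithm, which is not reproduced here.

```python
from itertools import product

def consensus_and_cost(reads, n_snps):
    """Consensus haplotype of one read group and the weight needed to correct it.

    At each SNP the allele with the larger total weight wins; the weight of the
    losing allele is the cost of correcting the disagreeing reads.
    """
    hap, cost = [], 0
    for j in range(n_snps):
        support = [0, 0]  # total weight backing allele 0 and allele 1
        for read in reads:
            if j in read:
                allele, weight = read[j]
                support[allele] += weight
        if support == [0, 0]:
            hap.append(None)  # SNP not covered by this group
        else:
            best = 0 if support[0] >= support[1] else 1
            hap.append(best)
            cost += support[1 - best]
    return hap, cost

def wmec_cost(reads, assignment, n_snps):
    """Weighted correction cost of a bipartition (assignment[i] is 0 or 1)."""
    group0 = [r for r, g in zip(reads, assignment) if g == 0]
    group1 = [r for r, g in zip(reads, assignment) if g == 1]
    hap0, cost0 = consensus_and_cost(group0, n_snps)
    hap1, cost1 = consensus_and_cost(group1, n_snps)
    return (hap0, hap1), cost0 + cost1

def brute_force(reads, n_snps):
    """Try every bipartition; feasible only for toy inputs (2**len(reads) cases)."""
    best = None
    for assignment in product((0, 1), repeat=len(reads)):
        haps, cost = wmec_cost(reads, assignment, n_snps)
        if best is None or cost < best[2]:
            best = (assignment, haps, cost)
    return best

# Hypothetical toy instance: 5 reads over 3 SNPs, weights are made-up confidences.
READS = [
    {0: (0, 10), 1: (0, 10)},
    {1: (0, 10), 2: (1, 10)},
    {0: (1, 10), 1: (1, 10)},
    {1: (1, 10), 2: (0, 3)},
    {0: (0, 2), 1: (1, 10), 2: (0, 10)},  # low-weight call at SNP 0 is a likely error
]

if __name__ == "__main__":
    assignment, haplotypes, cost = brute_force(READS, n_snps=3)
    print(assignment, haplotypes, cost)
```

On this toy instance no bipartition is conflict-free, so the minimum is reached by correcting the single low-weight call in the last read. A heuristic such as a Genetic Algorithm would search the same space of assignments with the same cost function, without enumerating every case.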

Tangherloni, A., Spolaor, S., Rundo, L., Nobile, M., Cazzaniga, P., Mauri, G., et al. (2018). GenHap: Evolutionary Computation For Haplotype Assembly. Paper presented at: Conference on Computational Intelligence Methods for Bioinformatics and Biostatistics, Caparica, Portugal.

GenHap: Evolutionary Computation For Haplotype Assembly

Tangherloni, A; Spolaor, S; Rundo, L; Nobile, M; Cazzaniga, P; Mauri, G; Besozzi, D; Merelli, I
2018

slide + paper
Haplotype assembly, Genetic algorithms, Combinatorial optimization, Weighted Minimum Error Correction problem
English
Conference on Computational Intelligence Methods for Bioinformatics and Biostatistics
Use this identifier to cite or link to this document: https://hdl.handle.net/10281/327532