A Fast and Practical Approach to Genotype Phasing and Imputation on a Pedigree with Erroneous and Incomplete Information

PIROLA, YURI (first author);
DELLA VEDOVA, GIANLUCA (second author);
BONIZZONI, PAOLA (last author)
2012

Abstract

The MINIMUM-RECOMBINANT HAPLOTYPE CONFIGURATION problem (MRHC) has been highly successful in providing a sound combinatorial formulation for the important problem of genotype phasing on pedigrees. Despite several algorithmic advances that have improved its efficiency, its applicability to real data sets has been limited because it does not account for important phenomena such as mutations, genotyping errors, and missing data. In this work, we propose the MINIMUM-RECOMBINANT HAPLOTYPE CONFIGURATION WITH BOUNDED ERRORS problem (MRHCE), which extends the original MRHC formulation by incorporating the two most common characteristics of real data: errors and missing genotypes (including untyped individuals). We describe a practical algorithm for MRHCE that is based on a reduction to the well-known Satisfiability problem (SAT) and exploits recent advances in the constraint programming literature. An experimental analysis demonstrates the biological soundness of the phasing model and the effectiveness (in both accuracy and performance) of the algorithm under several scenarios. The analysis on real data and the comparison with state-of-the-art programs reveal that our approach couples better scalability to large and complex pedigrees with the explicit inclusion of genotyping errors into the model.
Type: Journal article - Scientific article
Keywords: haplotype inference, recombinations, pedigrees, genotyping errors, missing genotypes, bioinformatics, algorithm design and analysis, computational biology
Language: English
Year: 2012
Volume: 9
Issue: 6
First page: 1582
Last page: 1594
Access: reserved
Pirola, Y., DELLA VEDOVA, G., Biffani, S., Stella, A., Bonizzoni, P. (2012). A Fast and Practical Approach to Genotype Phasing and Imputation on a Pedigree with Erroneous and Incomplete Information. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 9(6), 1582-1594 [10.1109/TCBB.2012.100].

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/39594