Background: Single nucleotide polymorphism (SNP) genotype data are increasingly available in cattle populations and, among other things, can be used to predict carriers of specific mutations. A practical statistical method for accurately classifying individuals into carriers and non-carriers is therefore useful. In this paper, we compared, through cross-validation, five classification models for the identification of carriers of the TUBD1 recessive mutation on BTA19 (Bos taurus autosome 19), which is known to be associated with high calf mortality: Lasso-penalized logistic regression (Lasso), support vector machines with either a linear or a radial kernel (SVML and SVMR), k-nearest neighbors (KNN), and multi-allelic gene prediction (MAG). A population of 3116 Fleckvieh and 392 Brown Swiss animals genotyped with the 54K SNP-chip was available for the analysis.

Results: In general, SNP genotypes proved to be very effective for the identification of mutation carriers. The best predictive models were Lasso, SVML and MAG, with average error rates of 0.2 %, 0.4 % and 0.6 %, respectively, in Fleckvieh, and 1.2 %, 0.9 % and 1.7 % in Brown Swiss. For these three models, the false positive rate was 0.1 %, 0.1 % and 0.2 %, respectively, in Fleckvieh, and 3.0 %, 2.4 % and 1.6 % in Brown Swiss; the false negative rate was 4.4 %, 7.6 % and 1.0 % in Fleckvieh, and 0.0 %, 0.1 % and 0.8 % in Brown Swiss. MAG appeared to be more robust to sample size reduction: with 25 % of the data, its average error rate was 0.7 % in Fleckvieh and 2.2 % in Brown Swiss, compared to 2.1 % and 5.5 % with Lasso, and 2.6 % and 12.0 % with SVML.

Conclusions: The use of SNP genotypes is a very effective and efficient technique for the identification of mutation carriers in cattle populations. Very few misclassifications were observed, both overall and within the carrier and non-carrier classes. This indicates that the approach is very reliable for potential applications in cattle breeding.
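The abstract reports three separate quantities per model: the overall error rate, the false positive rate and the false negative rate. As a minimal sketch of how these can differ, the Python snippet below computes all three from true and predicted labels. It assumes the conventional definitions (false positives divided by true non-carriers, false negatives divided by true carriers), which the abstract does not spell out. The function name `classification_rates` and the toy labels are hypothetical and are not data from the study.

```python
def classification_rates(y_true, y_pred):
    """Return (error_rate, false_positive_rate, false_negative_rate).

    Labels: 1 = carrier (positive class), 0 = non-carrier.
    Assumed definitions: FPR = FP / (FP + TN), FNR = FN / (FN + TP).
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:  # truth == 1 and pred == 0
            fn += 1
    total = tp + fp + tn + fn
    error_rate = (fp + fn) / total
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return error_rate, fpr, fnr


# Hypothetical toy example: 1000 animals, 50 of them true carriers.
# 2 carriers are missed and 1 non-carrier is wrongly flagged.
y_true = [1] * 50 + [0] * 950
y_pred = [1] * 48 + [0] * 2 + [0] * 949 + [1] * 1

err, fpr, fnr = classification_rates(y_true, y_pred)
print(f"error rate: {err:.3%}, FPR: {fpr:.3%}, FNR: {fnr:.3%}")
```

Because carriers are a small minority in this toy data set, a handful of missed carriers barely moves the overall error rate but produces a visibly larger false negative rate. This is one plausible reading of why the per-class rates are reported alongside the average error.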

Biscarini, F., Schwarzenbacher, H., Pausch, H., Nicolazzi, E., Pirola, Y., Biffani, S. (2016). Use of SNP genotypes to identify carriers of harmful recessive mutations in cattle populations. BMC GENOMICS, 17(1), 857 [10.1186/s12864-016-3218-9].

Use of SNP genotypes to identify carriers of harmful recessive mutations in cattle populations

Pirola Y.;
2016

Journal article - Scientific article
Carrier identification; Cattle; Haplotypes; KNN; Lasso-penalised logistic regression; MAG; Recessive mutations; SNP genotypes; Support vector machines; Algorithms; Animals; Cattle; Female; Genetic Carrier Screening; Male; Reproducibility of Results; Support Vector Machine; Genes, Recessive; Genotype; Heterozygote; Mutation; Polymorphism, Single Nucleotide
Language: English
Year: 2016
Volume: 17
Issue: 1
First page: 857
Last page: 857
Access: open

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/286960