Whole-Exome Sequencing Data - Identifying Somatic Mutations

SPINELLI, ROBERTA;PIAZZA, ROCCO GIOVANNI;PIROLA, ALESSANDRA;VALLETTA, SIMONA;MOGAVERO, ANGELA;MAREGA, MANUELA;KUNDANINGATTU RAMAN, HIMA;GAMBACORTI PASSERINI, CARLO
2014

Abstract

The use of next-generation sequencing instruments to study hematological malignancies generates a tremendous amount of sequencing data. This creates a challenging bioinformatics problem: storing, managing, and analyzing terabytes of sequencing data, often generated from very different data sources. Our project focuses mainly on sequence analysis of human cancer genomes, with the aim of identifying the genetic lesions underlying tumor development. However, the automated detection of somatic mutations and the statistical testing needed to identify genetic lesions remain open problems. We therefore propose a computational procedure for handling large-scale sequencing data in order to detect exonic somatic mutations in a tumor sample. The proposed pipeline comprises several steps based on open-source software and the R language: alignment, detection of mutations, annotation, functional classification, and visualization of results. We analyzed Illumina whole-exome sequencing data from five leukemic patients and five paired controls, plus one colon cancer sample and its paired control. The results were validated by Sanger sequencing.
Type: Book chapter or essay
Keywords: R language, somatic mutation, aCML, CML
Language: English
Book title: Springer Handbook of Bio-/Neuroinformatics
Editor: Kasabov, N
Year: 2014
ISBN: 978-3-642-30573-3
Publisher: Springer
Pages: 419-427
Spinelli, R., Piazza, R., Pirola, A., Valletta, S., Roberta, R., Mogavero, A., et al. (2014). Whole-Exome Sequencing Data - Identifying Somatic Mutations. In N. Kasabov (Ed.), Springer Handbook of Bio-/Neuroinformatics (pp. 419-427). Springer [10.1007/978-3-642-30574-0_25].

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/40433