The interpretation of genome-wide association study is difficult, as it is hard to understand how polymorphisms can affect gene regulation, in particular for trans-regulatory elements located far from their controlling gene. Using RNA or protein expression data as phenotypes, it is possible to correlate their variations with specific genotypes. This technique is usually referred to as expression Quantitative Trait Loci (eQTLs) analysis and only few packages exist for the integration of genotype patterns and expression profiles. In particular, tools are needed for the analysis of next-generation sequencing (NGS) data on a genome-wide scale, which is essential to identify eQTLs able to control a large number of genes (hotspots). Here we present SPIRE (Software for Polymorphism Identification Regulating Expression), a generic, modular and functionally highly flexible pipeline for eQTL processing. SPIRE integrates different univariate and multivariate approaches for eQTL analysis, paying particular attention to the scalability of the procedure in order to support cis- as well as trans-mapping, thus allowing the identification of hotspots in NGS data. In particular, we demonstrated how SPIRE can handle big association study datasets, reproducing published results and improving the identification of trans-eQTLs. Furthermore, we employed the pipeline to analyse novel data concerning the genotypes of two different C. elegans strains (N2 and Hawaii) and related miRNA expression data, obtained using RNA-Seq. A miRNA regulatory hotspot was identified in chromosome 1, overlapping the transcription factor grh-1, known to be involved in the early phases of embryonic development of C. elegans. In a follow-up qPCR experiment we were able to verify most of the predicted eQTLs, as well as to show, for a novel miRNA, a significant difference in the sequences of the two analysed strains of C. elegans. SPIRE is publicly available as open source software at https://bitbucket.org/bereste/spire, together with some example data, a readme file, supplementary material and a short tutorial.
Kel, I., Chang, Z., Galluccio, N., Romeo, M., Beretta, S., Diomede, L., et al. (2016). SPIRE, a modular pipeline for eQTL analysis of RNA-Seq data, reveals a regulatory hotspot controlling miRNA expression in: C. elegans. MOLECULAR BIOSYSTEMS, 12(11), 3447-3458 [10.1039/c6mb00453a].
SPIRE, a modular pipeline for eQTL analysis of RNA-Seq data, reveals a regulatory hotspot controlling miRNA expression in: C. elegans
BERETTA, STEFANO;MILANESI, LUCIANO;MERELLI, IVAN
2016
Abstract
The interpretation of genome-wide association study is difficult, as it is hard to understand how polymorphisms can affect gene regulation, in particular for trans-regulatory elements located far from their controlling gene. Using RNA or protein expression data as phenotypes, it is possible to correlate their variations with specific genotypes. This technique is usually referred to as expression Quantitative Trait Loci (eQTLs) analysis and only few packages exist for the integration of genotype patterns and expression profiles. In particular, tools are needed for the analysis of next-generation sequencing (NGS) data on a genome-wide scale, which is essential to identify eQTLs able to control a large number of genes (hotspots). Here we present SPIRE (Software for Polymorphism Identification Regulating Expression), a generic, modular and functionally highly flexible pipeline for eQTL processing. SPIRE integrates different univariate and multivariate approaches for eQTL analysis, paying particular attention to the scalability of the procedure in order to support cis- as well as trans-mapping, thus allowing the identification of hotspots in NGS data. In particular, we demonstrated how SPIRE can handle big association study datasets, reproducing published results and improving the identification of trans-eQTLs. Furthermore, we employed the pipeline to analyse novel data concerning the genotypes of two different C. elegans strains (N2 and Hawaii) and related miRNA expression data, obtained using RNA-Seq. A miRNA regulatory hotspot was identified in chromosome 1, overlapping the transcription factor grh-1, known to be involved in the early phases of embryonic development of C. elegans. In a follow-up qPCR experiment we were able to verify most of the predicted eQTLs, as well as to show, for a novel miRNA, a significant difference in the sequences of the two analysed strains of C. elegans. SPIRE is publicly available as open source software at https://bitbucket.org/bereste/spire, together with some example data, a readme file, supplementary material and a short tutorial.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.