The biological interpretation of large-scale gene expression data is one of the challenges in current bioinformatics. The state-of-the-art approach is to cluster the data and then characterize each cluster functionally through enrichment in Gene Ontology (GO) terms. To better assist the interpretation of results, it may be useful to establish connections among different clusters. This machine learning step is sometimes termed cluster meta-analysis, and several approaches have already been proposed; they usually rely on enrichments based on flat lists of GO terms. However, GO terms are organized in taxonomical graphs, whose structure should be taken into account when performing enrichment studies. To tackle this problem, we propose a kernel approach that can exploit this graph structure. Finally, we compare our approach against a specific flat-list method by analyzing the cdc15 subset of the well-known Spellman yeast cell cycle dataset.
Zoppis, I., Merico, D., Antoniotti, M., Mishra, B., Mauri, G. (2007). Discovering relations among GO-annotated clusters by graph kernel methods. In Bioinformatics Research and Applications. Third International Symposium, ISBRA 2007, Atlanta, GA, USA, May 7-10, 2007. Proceedings (pp. 158-169). Berlin: Springer [10.1007/978-3-540-72031-7_15].
Discovering relations among GO-annotated clusters by graph kernel methods
ZOPPIS, ITALO FRANCESCO;ANTONIOTTI, MARCO;MAURI, GIANCARLO
2007