BAR expressolog identification: expression profile similarity ranking of homologous genes in plant species

Patel RV, Nahal HK, Breit R, Provart NJ

Plant J. 2012 Sep;71(6):1038-50

PubMed PMID: 22607031

Abstract

Large numbers of sequences are now readily available for many plant species, allowing easy identification of homologous genes. However, orthologous gene identification across multiple species is made difficult by evolutionary events such as whole-genome or segmental duplications. Several developmental atlases of gene expression have been produced in the past couple of years, and it may be possible to use these transcript abundance data to refine ortholog predictions. In this study, clusters of homologous genes between seven plant species – Arabidopsis, soybean, Medicago truncatula, poplar, barley, maize and rice – were identified. Following this, a pipeline to rank homologs within gene clusters by both sequence and expression profile similarity was devised by determining equivalent tissues between species, with the best expression profile match being termed the ‘expressolog’. Five electronic fluorescent pictograph (eFP) browsers were produced as part of this effort, to aid in visualization of gene expression data and to complement existing eFP browsers at the Bio-Array Resource (BAR). Within the eFP browser framework, these expression profile similarity rankings were incorporated into an Expressolog Tree Viewer to allow cross-species homolog browsing by both sequence and expression pattern similarity. Global analyses showed that orthologs with the highest sequence similarity do not necessarily exhibit the highest expression pattern similarity. Other orthologs may show different expression patterns, indicating that such genes may require re-annotation or more specific annotation. Ultimately, it is envisaged that this pipeline will aid in improvement of the functional annotation of genes and translational plant research.
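The ranking step described in the abstract can be illustrated with a minimal sketch. It assumes Pearson correlation as the expression similarity measure, since the abstract does not name the metric, and it assumes the expression values have already been aligned to equivalent tissues across species. The gene identifiers, tissue order, expression values, and the function names `pearson` and `rank_homologs` are hypothetical and are not taken from the BAR pipeline.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Flat profile: correlation is undefined; treated here as no similarity.
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_homologs(query_profile, homolog_profiles):
    """Return (gene_id, score) pairs sorted by descending expression similarity."""
    scored = [
        (gene_id, pearson(query_profile, profile))
        for gene_id, profile in homolog_profiles.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical expression levels over four equivalent tissues
# (for example root, leaf, flower, seed), in the same order for every gene.
query = [10.0, 200.0, 50.0, 5.0]
homologs = {
    "homologA": [12.0, 180.0, 60.0, 8.0],
    "homologB": [300.0, 20.0, 15.0, 250.0],
    "homologC": [40.0, 41.0, 45.0, 42.0],
}

ranking = rank_homologs(query, homologs)
expressolog = ranking[0][0]  # best expression profile match within the cluster
```

In this toy cluster the top-ranked gene plays the role of the expressolog. A fuller version would also carry a sequence similarity score for each homolog, so that the two rankings can be compared as the abstract describes.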