Poster Presentation 40th Annual Lorne Genome Conference 2019

Assessing the biological signal of different RNA fractions for computational deconvolution of healthy tissues (#114)

Francisco Avila Cobos 1 2, Lucia Lorenzi 1 2, Jo Vandesompele 1 2, Gary Schroth 3, Joseph Powell 4, Katleen De Preter 1 2, Pieter Mestdagh 1 2
  1. Center for Medical Genetics, Ghent University, Ghent, Belgium
  2. Cancer Research Institute Ghent, CRIG, Ghent, Belgium
  3. Illumina, San Diego, California, USA
  4. Garvan-Weizmann Centre for Cellular Genomics, Garvan Institute of Medical Research, Sydney, NSW, Australia

Background

Transcriptome analysis has contributed significantly to our understanding of the processes involved in disease and development, but the heterogeneous nature of the samples and tissues under investigation has been largely neglected. Multiple computational approaches have been developed to infer the abundance of different cell types in heterogeneous samples, a task known as computational deconvolution [1]. Although potentially applicable to other RNA fractions, the available methods have been designed and tested on protein-coding genes (mRNAs) only. Using expression data for known and novel long non-coding RNAs (lncRNAs), circular RNAs (circRNAs), microRNAs (miRNAs) and mRNAs, obtained by RNA sequencing of 160 different normal cell types and 45 tissues in the RNA Atlas project [2], we investigated how these additional RNA fractions perform in the computational deconvolution workflow.
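
As an illustration of what computational deconvolution involves, the sketch below assumes a linear mixing model, in which a bulk expression profile is a non-negative weighted sum of cell-type reference profiles, and estimates the weights by projected gradient descent. This is only one common formulation; the approaches covered in [1] may differ, and nothing here is taken from the workflow used in this study. The function name `deconvolve`, the toy signature matrix and the toy mixture are all hypothetical.

```python
def deconvolve(signature, mixture, n_iter=2000):
    """Estimate cell-type proportions under an assumed linear mixing model.

    signature: list of rows (one per marker gene), each row holding the
               expression of that gene in every reference cell type.
    mixture:   bulk expression of the same marker genes in one tissue.
    Returns non-negative weights rescaled to sum to 1.
    """
    n_genes = len(signature)
    n_types = len(signature[0])
    # Step size 1 / ||S||_F^2 is a safe (conservative) choice for
    # gradient descent on 0.5 * ||S p - m||^2.
    step = 1.0 / sum(v * v for row in signature for v in row)
    weights = [0.0] * n_types
    for _ in range(n_iter):
        residual = [
            sum(signature[g][k] * weights[k] for k in range(n_types)) - mixture[g]
            for g in range(n_genes)
        ]
        gradient = [
            sum(signature[g][k] * residual[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        weights = [max(0.0, weights[k] - step * gradient[k]) for k in range(n_types)]
    total = sum(weights)
    return [w / total for w in weights] if total > 0 else weights


# Hypothetical toy data: 3 marker genes, 2 cell types.
signature = [
    [10.0, 1.0],   # gene high in cell type A
    [1.0, 10.0],   # gene high in cell type B
    [5.0, 5.0],    # gene shared by both
]
# Bulk profile built as 0.7 * A + 0.3 * B.
mixture = [7.3, 3.7, 5.0]

props = deconvolve(signature, mixture)
print(props)  # close to the 0.7 / 0.3 mix used to build the toy profile
```

In practice a dedicated non-negative least squares solver would usually replace the hand-written loop; the loop is kept here so that the example has no dependencies.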

Results

Tissues and cell types in the RNA Atlas dataset were matched using the UBERON ontology. For each cell type, we defined cell-type-specific markers from the matching mRNA, lncRNA, miRNA and circRNA expression data. These markers were then used to estimate the proportion of each cell type in each tissue through computational deconvolution. For any given tissue, we defined the “signal” as the sum of the proportions of all its constituent cell types. This signal was computed separately for mRNA, miRNA, lncRNA and circRNA markers.
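
The signal definition above can be sketched as follows. The tissue-to-cell-type matching, the cell-type names and all proportions are invented toy values for illustration only, not results from this study, and `tissue_signal` is a hypothetical helper name.

```python
def tissue_signal(proportions, constituent_cell_types):
    """Sum the estimated proportions of the cell types that make up a tissue.

    proportions:            dict mapping cell type -> estimated proportion
                            (over all reference cell types).
    constituent_cell_types: cell types matched to the tissue.
    Cell types missing from the estimate contribute 0.
    """
    return sum(proportions.get(cell_type, 0.0) for cell_type in constituent_cell_types)


# Hypothetical matching of one tissue to its constituent cell types.
liver_cell_types = ["hepatocyte", "endothelial cell", "fibroblast"]

# Hypothetical deconvolution output for that tissue, one estimate per RNA fraction.
estimates = {
    "mRNA": {
        "hepatocyte": 0.60, "endothelial cell": 0.20,
        "fibroblast": 0.10, "keratinocyte": 0.10,
    },
    "miRNA": {
        "hepatocyte": 0.40, "endothelial cell": 0.10,
        "fibroblast": 0.10, "keratinocyte": 0.40,
    },
}

# Signal is computed separately for each RNA fraction.
signals = {
    fraction: tissue_signal(props, liver_cell_types)
    for fraction, props in estimates.items()
}
print(signals)
```

Under this definition, a signal near 1 means that almost all of the estimated composition was assigned to cell types that belong to the tissue, while proportion assigned to unrelated cell types lowers it.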

Conclusions

We found that mRNAs carried the most biological signal across tissues, closely followed by lncRNAs. Moreover, despite their lower overall performance, both miRNAs and circRNAs can deconvolve specific tissues with higher accuracy than mRNAs and lncRNAs.

References

  1. Avila Cobos F. et al. Computational deconvolution of transcriptomics data from mixed cell populations. Bioinformatics. (DOI: 10.1093/bioinformatics/bty019)
  2. Lorenzi L. et al. RNA Atlas: A nucleotide resolution map of the human transcriptome. ISMB2018 (RNA COSI)