Oral Presentation 40th Annual Lorne Genome Conference 2019

Comparative analysis of H3K27me3 domains establishes a repressive index for inferring regulatory genes governing cell identity from any chordate cell type (#35)

Nathan Palpant 1
  1. University of Queensland, St. Lucia, QLD, Australia

Identifying the mechanisms governing development and disease remains difficult because regulatory genes are hard to enrich for in an unsupervised manner. We evaluated chromatin states from 111 NIH Roadmap Epigenomics samples and found that genes frequently marked by broad H3K27me3 domains across diverse cell types, a property we call repressive tendency (RT), are significantly enriched for cell-type-specific regulatory genes. We found that a gene's RT value can act as a fixed variable to weight any quantitative gene expression data, resulting in enrichment of the regulatory genes governing that cell type. This analysis approach, which we call TRIAGE (Transcriptional Regulatory Inference Analysis of Gene Expression), is unsupervised and does not depend on external reference data, statistical cutoffs, or prior knowledge. We used consortium data from the Human Cell Atlas, FANTOM, and a draft map of the human proteome to show that TRIAGE can enrich for regulatory genes from any cell or tissue type using any quantitative readout of gene expression, including RNA-seq (bulk or single cell), CAGE, or quantitative proteomics. Given the highly conserved role of regulatory genes, we show that TRIAGE can be applied to quantitative gene expression data from any chordate species, from tunicates to mammals, to identify the regulatory drivers of disease and development. Lastly, we used TRIAGE to analyze scRNA-seq data from cardiac differentiation and identified CRLF1, GAD1, and SIX3 as novel candidate genes governing germ layer specification. Using CRISPRi hPSCs, we show that loss of function of these genes blocks derivation of definitive endoderm.
Taken together, TRIAGE provides a computational approach for analyzing any quantitative readout of gene expression to identify regulatory genes underlying cell identity and fate in health and disease, from any somatic cell type and chordate species, thus opening new opportunities to discover mechanisms underlying organ development, disease, and regeneration.
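
The weighting step described above, in which a gene's fixed RT value scales its measured expression, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name triage_weight, the log transform of expression, and all gene values below are assumptions made up for illustration only.

```python
import math


def triage_weight(expression, repressive_tendency):
    """Rank genes by expression weighted by a fixed per-gene RT value.

    ASSUMPTIONS (not stated in the abstract): expression is log-transformed
    with log1p before weighting, and genes without an RT value get weight 0.
    """
    scores = {}
    for gene, value in expression.items():
        rt = repressive_tendency.get(gene, 0.0)
        scores[gene] = math.log1p(value) * rt
    # Highest weighted score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy inputs: both the expression values and the RT values are invented.
expression = {"ACTB": 500.0, "GAPDH": 800.0, "GATA4": 20.0, "NKX2-5": 15.0}
repressive_tendency = {"ACTB": 0.01, "GAPDH": 0.01, "GATA4": 0.9, "NKX2-5": 0.8}

ranked = triage_weight(expression, repressive_tendency)
for gene, score in ranked:
    print(f"{gene}\t{score:.3f}")
```

In this toy input, the highly expressed housekeeping genes fall to the bottom of the ranking because their hypothetical RT values are low, while the modestly expressed transcription factors rise to the top. The RT table is the same for every sample, which is what allows the weighting to be applied without reference data or cutoffs.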