Poster Presentation 40th Annual Lorne Genome Conference 2019

Stemformatics: visualise and download curated stem cell data (#128)

Jarny Choi 1, Chris M Pacheco 1, Rowland Mosbergen 1, Othmar Korn 2, Tyrone Chen 1, Isha Nagpal 1, Steve Englart 1, Paul W Angel 1, Christine Wells 1
  1. The University of Melbourne, Parkville, VIC, Australia
  2. The University of Queensland, Brisbane, Queensland, Australia

Stemformatics is an established gene expression data portal containing over 420 public gene expression datasets derived from microarray, RNA sequencing and single-cell profiling technologies. Initially developed for the stem cell community, it has a major focus on pluripotency, tissue stem cells, and staged differentiation. Stemformatics includes curated ‘collections’ of data relevant to cell reprogramming, as well as hematopoiesis and leukaemia. Rather than simply rehosting datasets as they appear in public repositories, Stemformatics applies a stringent set of quality control metrics and uses its own pipelines to process handpicked datasets from raw files. About 30% of the datasets processed by Stemformatics fail these quality control metrics and never reach the portal, which ensures that the hosted data are of high quality and have been processed in a consistent manner. Stemformatics provides intuitive, easy-to-use tools for biologists to explore the data visually, including interactive gene expression profiles, principal component analysis (PCA) plots and hierarchical clusters, among others. Users can also download multiple processed datasets autonomously, which is potentially useful for bioinformaticians who require high-quality datasets for method testing and validation. The addition of tools that facilitate cross-dataset comparisons provides users with snapshots of gene expression in multiple cell types and tissues, assisting the identification of cell-type-restricted genes or potential housekeeping genes. We are also prototyping tools that show integrated data, such as an integrated PCA of blood cells, which can serve as a reference onto which other datasets are projected. Stemformatics is freely available at stemformatics.org.