High-throughput sequencing enabled scientists and clinicians to produce sequencing data quickly, cheaply, and at scale. The amount of genetic data generated is doubling almost every year. At 100s of gigabytes per human genome sample, the computational demand adds up quickly. In addition, re-analysis of genetic data as a component of large population studies is on the rise. When there are tens of thousands of patients, analyzing genomic data can take years using CPU-servers. This data explosion is constantly challenging conventional methods used in genomics. More and more deep learning genomics applications are introduced to be able to accurately get meaningful results from this large volume data.
To enable researchers to efficiently analyze trends in genomic data from entire populations without the need to wait for years, we have made a software framework to accelerate these analyses including conventional and deep learning by using the power of Graphics Processing units (GPUs). GPUs have rapidly evolved to become high-performance accelerators for data parallel computing. Modern GPUs contain hundreds of processing units, The massively parallel hardware architecture make them particularly well-suited to many of the bioinformatics workloads and deep learning application that occupy HPC clusters, leading to their incorporation as HPC accelerators. Beyond their appeal as cost-effective HPC accelerators, GPUs also have the potential to significantly reduce space, power, and cooling demand relative to traditional CPU clusters of similar computational capability.