Poster Presentation 40th Annual Lorne Genome Conference 2019

SquiggleKit: A toolkit for manipulating nanopore signal data (#152)

James M Ferguson 1, Shaun Carswell 1, Hasindu Gamaarachchi 1, Kirston Barton 1, Martin A Smith 1
  1. Garvan Institute of Medical Research, Darlinghurst, NSW, Australia

Oxford Nanopore Technologies has brought real-time, single-molecule, long-read sequencing to the forefront of research. Many advances in this technology have come from signal analysis; however, working with raw ‘squiggle’ data remains a challenge, both when using existing tools and when developing new ones. Here, we present our expanding toolkit for manipulating nanopore signal data, including software to manage fast5 files (fast5_fetcher) and to interrogate signal data (sigtools). fast5_fetcher reduces file handling time by optimising the extraction of individual reads from collections of fast5 files, which significantly accelerates signal data analysis pipelines such as Nanopolish. Furthermore, sigtools facilitates the interrogation of raw signal data, with methods for detecting homopolymers and various barcodes, including barcodes for thousands of single cells. All methods have been designed to favour speed and ease of use, in keeping with the real-time and portable nature of the technology. We anticipate that these tools will facilitate the development of machine learning algorithms predicated on nanopore sequencing data.