Cryptosporidium is the second leading cause of death due to diarrhoeal disease worldwide, particularly among infants, young children, and the immunocompromised. Despite its global importance, prevalence and transmission remain high because treatment options are lacking and few tools exist to accurately track transmission and identify infection sources in a local context.
Human cryptosporidiosis is caused by two major species, C. hominis and C. parvum. Current studies of the molecular epidemiology, species sub-structuring and transmission dynamics of these species are largely limited to single-locus genotyping or to a small number of marker genes/loci. There is no standardized approach for applying these markers in population genetic studies of Cryptosporidium. Further, they have not been shown to accurately resolve population structure or to reflect the underlying genetic relationships within and among populations.
The draft C. parvum and C. hominis genomes were published in 2002 and 2004 respectively. However, resequencing of non-reference field isolates has only recently been undertaken and is largely limited to one study in Bangladesh. We have compiled all currently available C. hominis isolates (n ~ 45) together with 28 newly sequenced isolates (from infected individuals in Ghana, Madagascar, Gabon and Tanzania) and undertaken a comprehensive analysis of their genomic variants and population structure. We assess global variation across the genome, map this variation to coding regions and test the genomes for evidence of selection, clonality and ‘hotspots’ of recombination/rearrangement. Finally, we examine overall population structuring within and among these isolates using whole-genome data and compare it with the structure inferred from existing ‘population’ marker loci for the species.