Long non-coding RNAs (lncRNAs) have recently emerged as prominent regulators of gene expression in eukaryotes. These transcripts, which are longer than 200 nt and have little to no protein-coding potential, often drive the modification and maintenance of gene activation or gene silencing states via chromatin conformation rearrangements. In plants, lncRNAs have been shown to participate in gene regulation and are essential to processes such as vernalization (Csorba et al. 2014) and photomorphogenesis (Wang et al. 2014). Despite their prominent functions, only just over a dozen plant lncRNAs have been experimentally and functionally characterised.
Little is known about the evolutionary patterns of lncRNAs in plants. The rates of divergence are much higher in lncRNAs than in protein-coding mRNAs, making it difficult to identify lncRNA conservation using traditional sequence comparison methods. One of the few studies that has tried to address this found only 4 lncRNAs with positional conservation and 15 conserved at the sequence level in Brassicaceae (Mohammadin et al. 2015).
Here, we characterised the splicing conservation of lncRNAs in Brassicaceae. We generated a whole-genome alignment of 16 Brassicaceae species and used it to identify syntenic lncRNA orthologues. Using a scoring system trained on transcriptomes from A. thaliana and B. oleracea, we identified splice sites across the whole alignment and measured their conservation. Our analysis revealed that 38% of all intergenic lncRNAs (~900) display splicing conservation in at least one exon, an estimate that is substantially higher than previous estimates of lncRNA conservation in this group. Our findings agree with similar studies in vertebrates (Nitsche et al. 2015), suggesting that splicing conservation can be evidence of stabilising selection and can thus be used to identify functional lncRNAs in plants.