Nicotiana attenuata Assembly and Gene Annotation
About Nicotiana attenuata
Nicotiana attenuata is a species of wild diploid tobacco, known as coyote tobacco, which is native to North America. While used for medicinal and ceremonial purposes by some Native Americans, it is not a commercial tobacco plant.
The pyridine alkaloid nicotine, whose addictive properties are well known to humans, is the signature compound of the genus Nicotiana (Solanaceae). In nature, nicotine is arguably one of the most broadly effective plant defence metabolites, in that it poisons acetylcholine receptors and is thereby toxic to all heterotrophs with neuromuscular junctions. Field studies using genetically modified Nicotiana attenuata (coyote tobacco) plants have revealed that this toxin fulfills multifaceted ecological functions that contribute to plant fitness.
The genome of N. attenuata was sequenced by the Max Planck Institute for Chemical Ecology using 30× Illumina short reads, 4.5× 454 reads, and 10× PacBio single-molecule long reads. The reads were assembled into 2.37 Gb of sequence, representing 92% of the expected genome size. A 50× optical map and a high-density linkage map were then generated for super-scaffolding, which anchored 825.8 Mb to 12 linkage groups and resulted in a final assembly with an N50 contig size of 90.4 kb and a scaffold size of 524.5 kb.
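The assembly figures above can be related to one another with a little arithmetic. The following is a minimal sketch, not the method used by the assembly authors: the `n50` helper and the toy contig lengths are hypothetical and only illustrate how an N50 value is conventionally defined, while the remaining numbers are taken from the paragraph above.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembled length."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy contig lengths, NOT real N. attenuata data.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)  # total is 300; 80 + 70 reaches 150, so N50 is 70

# Figures quoted in the text above.
assembled_gb = 2.37        # assembled sequence, in Gb
assembled_fraction = 0.92  # share of the expected genome size
anchored_mb = 825.8        # sequence anchored to the 12 linkage groups, in Mb

# Expected genome size implied by "2.37 Gb representing 92%".
implied_genome_gb = assembled_gb / assembled_fraction

# Share of the assembled sequence that was anchored to linkage groups.
anchored_fraction = anchored_mb / (assembled_gb * 1000)

print(f"Toy N50: {toy_n50}")
print(f"Implied expected genome size: {implied_genome_gb:.2f} Gb")
print(f"Anchored share of assembly: {anchored_fraction:.1%}")
```

Under these figures the expected genome size works out to roughly 2.58 Gb, and a little over a third of the assembled sequence is anchored to linkage groups.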
Gene annotation of N. attenuata was carried out together with that of N. obtusifolia. Integrating the hint-guided AUGUSTUS and MAKER2 gene prediction pipelines yielded 33,449 predicted gene models in the N. attenuata genome. More than 71% of these gene models were fully supported by RNA-seq reads, and 12,617 and 18,176 of these genes are orthologous to Arabidopsis and tomato genes, respectively.
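To put the orthology counts in proportion, the short sketch below expresses them as shares of the predicted gene models. It uses only the counts quoted above; the variable names are illustrative.

```python
# Counts quoted in the text above.
gene_models = 33_449
orthologs = {
    "Arabidopsis": 12_617,
    "tomato": 18_176,
}

# Share of N. attenuata gene models with an ortholog in each species.
ortholog_share = {
    species: count / gene_models for species, count in orthologs.items()
}

for species, share in ortholog_share.items():
    print(f"{species}: {share:.1%} of predicted gene models")
```

On these counts, roughly 38% of the gene models have an Arabidopsis ortholog and roughly 54% have a tomato ortholog.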
- Wild tobacco genomes reveal the evolution of nicotine biosynthesis.
Shuqing Xu, Thomas Brockmöller, Aura Navarro-Quezada, Heiner Kuhl, Klaus Gase, Zhihao Ling, Wenwu Zhou, Christoph Kreitzer, Mario Stanke et al. 2017. PNAS. 114(23):6133-6138.
General information about this species can be found in Wikipedia.
|Assembly|NIATTr2, INSDC Assembly GCA_001879085.1, Nov 2016|
|Golden Path Length|2,365,682,703 bp|
|Data source|Max Planck Institute for Chemical Ecology|
|Non-coding genes|3,084|
|Small non-coding genes|2,790|
|Long non-coding genes|294|