Nicotiana attenuata Assembly and Gene Annotation

About Nicotiana attenuata

Nicotiana attenuata is a species of wild diploid tobacco, known as coyote tobacco, which is native to North America. While used for medicinal and ceremonial purposes by some Native Americans, it is not a commercial tobacco plant.

The pyridine alkaloid nicotine, whose addictive properties are well known to humans, is the signature compound of the genus Nicotiana (Solanaceae). In nature, nicotine is arguably one of the most broadly effective plant defence metabolites: it poisons acetylcholine receptors and is therefore toxic to all heterotrophs with neuromuscular junctions. Field studies using genetically modified Nicotiana attenuata plants have revealed that this toxin fulfils multifaceted ecological functions that contribute to plant fitness.


The N. attenuata genome was sequenced by the Max Planck Institute for Chemical Ecology using 30× Illumina short reads, 4.5× 454 reads, and 10× PacBio single-molecule long reads. The reads were assembled into 2.37 Gb of sequence, representing 92% of the expected genome size. A 50× optical map and a high-density linkage map were then generated for super-scaffolding, which anchored 825.8 Mb to 12 linkage groups and resulted in a final assembly with a contig N50 of 90.4 kb and a scaffold size of 524.5 kb.
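To make the assembly figures easier to interpret, the following minimal Python sketch shows how an N50 value is conventionally computed and derives two numbers implied by the text: the expected genome size and the fraction of the assembly anchored to linkage groups. The `n50` helper and the toy contig lengths are illustrative assumptions, not part of the published pipeline; only the constants 2.37 Gb, 92%, and 825.8 Mb come from the description above.

```python
def n50(lengths):
    """Return the N50 of a set of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy contig lengths (hypothetical, for illustration only).
toy_contigs = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_contigs)

# Figures quoted in the assembly description.
assembled_gb = 2.37          # total assembled sequence
fraction_of_expected = 0.92  # share of the expected genome size
anchored_mb = 825.8          # sequence anchored to the 12 linkage groups

# Derived values (simple arithmetic on the quoted figures).
expected_gb = assembled_gb / fraction_of_expected
anchored_fraction = anchored_mb / (assembled_gb * 1000)

print(f"toy N50: {toy_n50}")
print(f"expected genome size: {expected_gb:.2f} Gb")
print(f"anchored fraction: {anchored_fraction:.1%}")
```

Under these figures the expected genome size works out to roughly 2.6 Gb, and a little over a third of the assembled sequence is anchored to linkage groups.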


The N. attenuata genome was annotated together with that of N. obtusifolia. Integrating the hint-guided AUGUSTUS and MAKER2 gene prediction pipelines yielded 33,449 predicted gene models in the N. attenuata genome. More than 71% of these gene models were fully supported by RNA-seq reads, and 12,617 and 18,176 of them are orthologous to Arabidopsis and tomato genes, respectively.


  1. Wild tobacco genomes reveal the evolution of nicotine biosynthesis.
    Shuqing Xu, Thomas Brockmöller, Aura Navarro-Quezada, Heiner Kuhl, Klaus Gase, Zhihao Ling, Wenwu Zhou, Christoph Kreitzer, Mario Stanke et al. 2017. PNAS. 114(23):6133-6138.

More information

Assembly: NIATTr2, INSDC Assembly GCA_001879085.1, Nov 2016
Database version: 95.1
Base Pairs: 2,365,682,703
Golden Path Length: 2,365,682,703
Genebuild by: MPI
Genebuild method: Imported from ENA
Data source: Max Planck Institute for Chemical Ecology

Gene counts

Coding genes: 33,320
Non coding genes: 3,084
Small non coding genes: 2,790
Long non coding genes: 294
Gene transcripts: 36,404
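
The gene counts above can be cross-checked with a few lines of Python. This is only a sanity check on the listed numbers; the dictionary keys are hypothetical names chosen for this sketch, and the one-transcript-per-gene reading is an inference from the totals rather than a statement made by the data source.

```python
# Gene counts as listed in the table above (key names are illustrative).
counts = {
    "coding": 33_320,
    "non_coding": 3_084,
    "small_non_coding": 2_790,
    "long_non_coding": 294,
    "transcripts": 36_404,
}

# Small and long non-coding genes should add up to the non-coding total.
non_coding_sum = counts["small_non_coding"] + counts["long_non_coding"]

# Coding plus non-coding genes gives the total number of annotated genes.
gene_total = counts["coding"] + counts["non_coding"]

# Average number of transcripts per annotated gene.
transcripts_per_gene = counts["transcripts"] / gene_total

print(f"non-coding subtotal matches: {non_coding_sum == counts['non_coding']}")
print(f"total genes: {gene_total:,}")
print(f"transcripts per gene: {transcripts_per_gene:.2f}")
```

The subtotals are consistent, and the transcript count equals the total gene count, which suggests that this gene build carries a single transcript per gene.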

