Nicotiana attenuata Assembly and Gene Annotation
About Nicotiana attenuata
Nicotiana attenuata is a species of wild diploid tobacco, known as coyote tobacco, which is native to North America. While used for medicinal and ceremonial purposes by some Native Americans, it is not a commercial tobacco plant.
The pyridine alkaloid nicotine, whose addictive properties are well known to humans, is the signature compound of the genus Nicotiana (Solanaceae). In nature, nicotine is arguably one of the most broadly effective plant defence metabolites, in that it poisons acetylcholine receptors and is thereby toxic to all heterotrophs with neuromuscular junctions. Field studies using genetically modified Nicotiana attenuata (coyote tobacco) plants have revealed that this toxin fulfills multifaceted ecological functions that contribute to plant fitness.
Assembly
The N. attenuata genome was sequenced by the Max Planck Institute for Chemical Ecology using 30x Illumina short reads, 4.5x 454 reads, and 10x PacBio single-molecule long reads. The reads were assembled into 2.37 Gb of sequence, representing 92% of the expected genome size. A 50x optical map and a high-density linkage map were then generated for super-scaffolding, which anchored 825.8 Mb to 12 linkage groups and resulted in a final assembly with a contig N50 of 90.4 kb and a scaffold size of 524.5 kb.
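For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed: sort the sequence lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the toy contig lengths are illustrative assumptions and are not taken from the N. attenuata assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Toy example (hypothetical contig lengths, not real assembly data).
toy_contigs = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_contigs)
print(toy_n50)  # 70: 80 + 70 = 150 is the first running total >= 145
```

A higher N50 generally indicates a more contiguous assembly, although it says nothing about correctness.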
Annotation
Gene annotation for N. attenuata was carried out together with that of N. obtusifolia. A pipeline integrating both hint-guided AUGUSTUS and MAKER2 gene prediction predicted 33,449 gene models in the N. attenuata genome. More than 71% of these gene models were fully supported by RNA-seq reads, and 12,617 and 18,176 of these genes are orthologous to Arabidopsis and tomato genes, respectively.
Repeated sequences were called with the Repeat Detector, which is part of the Ensembl Genomes repeat feature pipelines. Repeats length: 1,551,386,522 bp. Repeats content: 65.6%.
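The repeat content figure can be reproduced from the numbers on this page. The sketch below assumes that the percentage is the repeat length divided by the Golden Path Length listed under Statistics; the page does not state the denominator explicitly, so treat that as an assumption. The variable names are illustrative.

```python
# Values copied from this page.
REPEAT_LENGTH_BP = 1_551_386_522
# Assumption: the denominator is the Golden Path Length from Statistics.
GOLDEN_PATH_LENGTH_BP = 2_365_682_703

repeat_percent = 100 * REPEAT_LENGTH_BP / GOLDEN_PATH_LENGTH_BP
print(f"Repeats content: {repeat_percent:.1f}%")  # Repeats content: 65.6%
```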
References
- Wild tobacco genomes reveal the evolution of nicotine biosynthesis.
Shuqing Xu, Thomas Brockmöller, Aura Navarro-Quezada, Heiner Kuhl, Klaus Gase, Zhihao Ling, Wenwu Zhou, Christoph Kreitzer, Mario Stanke et al. 2017. PNAS. 114(23):6133-6138.
More information
General information about this species can be found in Wikipedia.
Statistics
Summary
Assembly | NIATTr2, INSDC Assembly GCA_001879085.1, Nov 2016 |
Database version | 113.1 |
Golden Path Length | 2,365,682,703 |
Genebuild by | MPI-CE |
Genebuild method | Import |
Data source | Max Planck Institute for Chemical Ecology |
Gene counts
Coding genes | 33,320 |
Non coding genes | 3,084 |
Small non coding genes | 2,790 |
Long non coding genes | 294 |
Gene transcripts | 36,404 |
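As a quick sanity check, the gene counts above can be verified to be internally consistent. The sketch below only uses the figures from the table; the variable names are illustrative. The observation that transcripts equal coding plus non-coding genes holds for these numbers and would be consistent with one transcript per gene, but that interpretation is an assumption and is not stated on this page.

```python
# Figures copied from the Gene counts table.
coding_genes = 33_320
non_coding_genes = 3_084
small_non_coding_genes = 2_790
long_non_coding_genes = 294
gene_transcripts = 36_404

# Small and long non-coding genes should sum to the non-coding total.
non_coding_consistent = (
    small_non_coding_genes + long_non_coding_genes == non_coding_genes
)

# Assumption: one transcript per gene, so transcripts equal total genes.
transcripts_match_genes = (
    coding_genes + non_coding_genes == gene_transcripts
)

print(non_coding_consistent, transcripts_match_genes)  # True True
```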