Capsicum annuum Assembly and Gene Annotation
About Capsicum annuum
Hot pepper (Capsicum annuum), one of the oldest domesticated crops in the Americas, is the most widely grown spice crop in the world. It is a diploid (2n=2x=24), self-pollinating crop closely related to potato, tomato and tobacco (Solanaceae).
Assembly
The Mexican landrace Criollo de Morelos 334 (CM334) was used to generate 650.2 Gb (186x coverage) of whole-genome shotgun sequence by Illumina sequencing of genomic libraries with insert sizes ranging from 180 bp to 20 kb. After filtering, a 3.06 Gb assembly was obtained with SOAPdenovo and SSPACE. It contains 37,989 scaffolds (N50=2.47 Mb), of which 1,276 contain 90% of the genome sequence. Scaffolds were anchored using a high-density genetic map of 6,281 markers built from 120 recombinant inbred lines. A total of 1,048 scaffolds (75.6% of the genome) were ordered along 12 chromosome pseudomolecules.
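The N50 and coverage figures above can be illustrated with a minimal Python sketch. The function names and the toy scaffold lengths are hypothetical and chosen only for illustration; they are not taken from the CM334 assembly or from the tools named above. N50 is computed here in the usual way: the length of the scaffold at which the running total of scaffold lengths, sorted from longest to shortest, first reaches half of the total assembly length.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)   # longest scaffold first
    half_total = sum(ordered) / 2             # half of the assembly length
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:             # crossed the 50% mark
            return length


def implied_genome_size_gb(sequence_gb, coverage):
    """Genome size (Gb) implied by total sequence yield and fold coverage."""
    return sequence_gb / coverage


# Toy scaffold lengths (hypothetical): total 300, so half is 150.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))                            # 80 + 70 = 150, so N50 is 70

# Figures quoted in the text: 650.2 Gb of sequence at 186x coverage.
print(round(implied_genome_size_gb(650.2, 186), 2))  # about 3.5 Gb
```

Dividing the quoted yield by the quoted coverage gives roughly 3.5 Gb, which suggests the 186x figure was calculated against an estimated genome size somewhat larger than the 3.06 Gb assembly. That reading is an inference from the numbers, not something stated on this page.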
Annotation
A total of 34,903 protein-coding genes were predicted in the Pepper Genome Annotation (PGA). These were subsequently evaluated using 19.8 Gb of Illumina RNA-seq data. Overall, 93.2% of the predicted coding sequences were supported by Illumina data, demonstrating the high accuracy of gene prediction by PGA. Manual curation was used to improve gene models.
References
- Genome sequence of the hot pepper provides insights into the evolution of pungency in Capsicum species.
Kim S, Park M, Yeom SI et al. 2014. Nature Genetics. 46(3):270-278.
Picture credit: Sandra Knapp (Creative Commons license)
Links
More information
General information about this species can be found in Wikipedia.
Statistics
Summary
Assembly | ASM51225v2, INSDC Assembly GCA_000512255.2 |
Database version | 113.2 |
Golden Path Length | 3,063,864,880 |
Genebuild by | PGA |
Genebuild method | Import |
Data source | Seoul National University |
Gene counts
Coding genes | 35,845 |
Gene transcripts | 35,845 |