Papaver somniferum Assembly and Gene Annotation
About Papaver somniferum
The opium poppy (Papaver somniferum) is a diploid (2n=22) species of the Papaveraceae family. The sap, known as opium, contains various alkaloids, including morphine and codeine, with effects ranging from pain relief and cough suppression to euphoria, sleepiness, and addiction. The plant has been a source of painkillers since the Neolithic. Opium poppy remains the only commercially viable source of the morphinan subclass of benzylisoquinoline alkaloids. This is the genome of cultivar High Noscapine 1 (HN1).
Assembly
After two cycles of self-pollination, one HN1 plant was selected to prepare DNA from leaves for Illumina paired-end and mate-pair sequencing (214x). Subsequently, this plant was self-pollinated and its progeny grown to obtain fine leaf material for Ultra High Molecular Weight grade and Next Generation Sequencing grade DNA preparations. This DNA was used for 10X Genomics (40x) and Single Molecule Real-Time PacBio sequencing (66.8x). The genome size was estimated to be 2.87 Gb based on 61-mer frequency. The genome assembly was conducted using DeNovoMAGIC from NRGene, yielding a highly contiguous 2.73 Gb assembly with a scaffold N50 of 15.6 Mb and a contig N50 of 121 kb. Overall, 81.6% of sequences were assigned to individual chromosomes using a linkage map generated by sequence-based genotyping of 84 F2 plants.
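For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed: sort the sequence lengths from longest to shortest and report the length at which the running total first reaches half of the assembly. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the poppy assembly; the last line simply compares the assembled size with the k-mer based estimate using the rounded figures given in the text.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together contain at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Toy scaffold lengths (hypothetical values, for illustration only).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_lengths)

# Fraction of the estimated genome (2.87 Gb) captured by the
# 2.73 Gb assembly, using the rounded values quoted in the text.
assembled_fraction = 2.73 / 2.87

print(f"toy N50: {toy_n50}")
print(f"assembled fraction of estimated genome: {assembled_fraction:.1%}")
```

With the rounded figures, the assembly covers roughly 95% of the estimated genome size.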
Annotation
RNA sequencing of seven tissues (leaf, petal, stamen, capsule, stem, fine root, tap root) was carried out. Transcripts were annotated with an in-house pipeline chaining Hisat2, Stringtie and Ballgown. In addition, Trinity was used for de novo transcriptome assembly and generated EST evidence for gene prediction. Gene models were predicted using the MAKER pipeline. The BUSCO test reported 95.3% of complete gene models (38% single-copy and 62% duplicated genes) plus 1.4% additional fragmented models.
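The BUSCO figures above can be unpacked with a little arithmetic. The sketch below assumes that the 38% and 62% values are shares of the complete models (they sum to 100%), not shares of the whole BUSCO set; that reading is an interpretation of the text, not something the source states explicitly. Variable names are illustrative.

```python
# BUSCO summary values quoted in the text (percent of all BUSCO genes).
complete = 95.3
fragmented = 1.4

# Whatever is neither complete nor fragmented is missing.
missing = round(100 - complete - fragmented, 1)

# ASSUMPTION: 38% / 62% are fractions of the complete set,
# so rescale them to percentages of the whole BUSCO set.
single_copy_of_total = round(complete * 0.38, 1)
duplicated_of_total = round(complete * 0.62, 1)

print(f"missing: {missing}%")
print(f"complete single-copy (of all BUSCOs): {single_copy_of_total}%")
print(f"complete duplicated (of all BUSCOs): {duplicated_of_total}%")
```

Under that assumption, well over half of all BUSCO genes are present in more than one copy.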
Repeats were annotated with the Ensembl Genomes repeat feature pipeline. There are:
- 785,658 low-complexity (Dust) features, covering 23 Mb (0.8% of the genome)
- 193,205 RepeatMasker features (with the nrTEplants library), covering 65 Mb (2.4% of the genome)
- 367,896 tandem repeat (TRF) features, covering 60 Mb (2.2% of the genome)
- Repeat Detector repeats with a total length of 1.84 Gb (67.9% of the genome)
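The percentages above can be cross-checked against the Golden Path Length listed under Statistics (2,715,377,404 bp). This is a rough sketch that assumes the percentages are taken relative to that length and that 1 Mb = 1,000,000 bp. Because the covered lengths are quoted as rounded values, a recomputed figure may differ from the quoted one by about a tenth of a percent. The dictionary keys and the function name are illustrative.

```python
# Golden Path Length from the Statistics table, in base pairs.
GENOME_BP = 2_715_377_404

# Covered lengths as quoted in the text, in megabases (rounded values).
repeat_mb = {
    "Dust": 23,
    "RepeatMasker": 65,
    "TRF": 60,
    "Repeat Detector": 1840,  # 1.84 Gb
}


def percent_of_genome(megabases, genome_bp=GENOME_BP):
    """Convert a covered length in Mb to a percentage of the genome."""
    return 100 * megabases * 1_000_000 / genome_bp


coverage = {
    name: round(percent_of_genome(mb), 1) for name, mb in repeat_mb.items()
}

for name, pct in coverage.items():
    print(f"{name}: {pct}% of the genome")
```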
References
- The opium poppy genome and morphinan production.
Guo L, Winzer T, Yang X, Li Y, Ning Z, He Z, Teodor R, Lu Y, Bowser TA, Graham IA, Ye K.
Picture credit: Franz Eugen Köhler, public domain
More information
General information about this species can be found in Wikipedia.
Statistics
Summary
Assembly | ASM357369v1, INSDC Assembly GCA_003573695.1 |
Database version | 113.1 |
Golden Path Length | 2,715,377,404 |
Genebuild by | Xi'an Jiaotong University |
Genebuild method | Import |
Data source | Xi'an Jiaotong University |
Gene counts
Coding genes | 41,770 |
Gene transcripts | 41,770 |