Cucumis sativus Assembly and Gene Annotation
About Cucumis sativus
Cucumber (Cucumis sativus) is a widely cultivated plant in the gourd
family, Cucurbitaceae. It is a creeping vine that bears cucumiform
fruits that are used in salads. There are three main varieties of
cucumber: slicing, pickling, and seedless. Within these varieties,
several cultivars have been created. In North America, the term
"wild cucumber" refers to plants in the genera Echinocystis and Marah,
but these are not closely related to the cucumber. The cucumber is originally from South
Asia, but now grows on most continents. Many different types of cucumber
are traded on the global market.
Assembly
The 'Chinese long' inbred line 9930 was selected for the genome sequencing project by the Cucumber Genome Initiative. A total of 26.5 billion high-quality base pairs were generated, or 72.2-fold genome coverage, of which the Sanger reads provided 3.9-fold coverage and the Illumina GA reads provided 68.3-fold coverage. The GA reads ranged in length from 42 to 53 bp.
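The coverage figures above are related by simple arithmetic. As a minimal sketch (the variable names are illustrative and not part of the source), the following Python checks that the per-platform coverages add up to the total, and derives the genome size implied by the stated yield and fold coverage.

```python
# Figures quoted in the paragraph above.
total_bases = 26.5e9        # high-quality base pairs generated
total_coverage = 72.2       # fold coverage from all reads
sanger_coverage = 3.9       # fold coverage from Sanger reads
illumina_coverage = 68.3    # fold coverage from Illumina GA reads

# The per-platform coverages should sum to the total coverage.
combined = sanger_coverage + illumina_coverage

# Coverage = bases sequenced / genome size, so the genome size implied
# by the two quoted numbers is bases / coverage.
implied_genome_size = total_bases / total_coverage

print(f"combined coverage: {combined:.1f}x")
print(f"implied genome size: {implied_genome_size / 1e6:.0f} Mb")
```

The implied size is about 367 Mb, which is larger than the assembled length. This suggests the fold coverage was calculated against an estimated genome size rather than against the final assembly, although the text does not say so explicitly.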
The final assembly is 195,669,205 bp long and consists of 190 scaffolds (N50 = 29,076,228 bp) and 11,366 contigs (N50 = 42,349 bp).
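For readers unfamiliar with the N50 statistic quoted above: it is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly. The sketch below is a generic illustration in Python, not the pipeline used for this assembly, and the scaffold lengths are made up.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Hypothetical scaffold lengths, for illustration only.
toy_scaffolds = [50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # 40: 50 + 40 = 90 covers half of 150
```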
Annotation
The Cucumber Genome Initiative predicted protein-coding genes using three methods (cDNA-EST, homology-based and ab initio), and a consensus gene set was built by merging all of the results. A total of 26,682 genes were predicted, with a mean coding sequence size of 1,046 bp and an average of 4.39 exons per gene. Under an 80% sequence overlap threshold, 26.7% of the genes were supported by all three gene prediction methods, 25% had both ab initio prediction and homology-based evidence, and 7.4% had ab initio prediction and cDNA-EST expression evidence; the remaining genes were primarily derived from pure ab initio prediction, but the majority of these were supported by multiple gene finders. About 81% of the genes have homologues in the TrEMBL protein database, and 66% can be classified by InterPro. In total, 82% of the genes have either known homologues or can be functionally classified.
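The share of genes in the "remaining" category is not stated directly, but it follows from the three percentages given above. The following Python is a rough sketch of that arithmetic; the category labels are illustrative, and the gene counts it prints are approximations derived from rounded percentages, not figures from the source.

```python
total_genes = 26_682  # predicted genes, from the text above

# Percent of genes per evidence category, as quoted in the text.
support = {
    "all three methods": 26.7,
    "ab initio + homology": 25.0,
    "ab initio + cDNA-EST": 7.4,
}

# Whatever is left falls into the "remaining" category, which the text
# describes as primarily pure ab initio predictions.
remaining = 100.0 - sum(support.values())
support["remaining (mostly ab initio only)"] = remaining

for category, percent in support.items():
    approx_genes = round(total_genes * percent / 100)
    print(f"{category}: {percent:.1f}% (about {approx_genes:,} genes)")
```

Under these figures, roughly 40.9% of the predicted genes belong to the remaining category.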
Repeated sequences were called with the Repeat Detector, which is part of the Ensembl Genomes repeat feature pipelines. Repeat length: 93,375,853 bp; repeat content: 48.2%.
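The repeat content can be checked against the two genome lengths given on this page. The sketch below, in Python, assumes that repeat content is simply repeat length divided by genome length; the page does not state which length was used as the denominator, so both are tried.

```python
repeat_length = 93_375_853         # bp, from the text above
golden_path_length = 193_829_320   # bp, from the Statistics summary
assembly_length = 195_669_205      # bp, from the Assembly section

# Assumption: repeat content = repeat length / genome length.
vs_golden_path = 100 * repeat_length / golden_path_length
vs_assembly = 100 * repeat_length / assembly_length

print(f"against golden path length: {vs_golden_path:.1f}%")
print(f"against assembly length:    {vs_assembly:.1f}%")
```

Only the golden path length reproduces the quoted 48.2%, so that appears to be the denominator behind the reported figure.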
Variation
Variation from the European Variation Archive was added.
A monogenic locus for resistance to cucumber vein yellowing virus (CVYV) was mapped in cucumber using a bulked segregant analysis (BSA) strategy coupled with whole-genome resequencing. A total of 135 F3 families from a segregating population derived from a susceptible pickling cucumber and a resistant long Dutch type cucumber were phenotyped for CVYV resistance.[4]
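To give a sense of how a BSA-seq comparison works in general, the sketch below computes a delta SNP-index, a statistic commonly used in such analyses. This is an assumption for illustration only: the page does not say which statistic the cited study used, and the function names and read counts are hypothetical.

```python
def snp_index(ref_reads, alt_reads):
    """Fraction of reads at a site that carry the alternative allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("site has no read coverage")
    return alt_reads / total


def delta_snp_index(resistant, susceptible):
    """Difference in SNP-index between the two bulks at one site.

    Each argument is a (ref_reads, alt_reads) tuple for that bulk.
    """
    return snp_index(*resistant) - snp_index(*susceptible)


# Made-up read counts at a single SNP, for illustration only.
resistant_bulk = (2, 38)      # mostly alternative allele
susceptible_bulk = (30, 10)   # mostly reference allele

delta = delta_snp_index(resistant_bulk, susceptible_bulk)
print(f"delta SNP-index: {delta:.2f}")
```

Sites linked to the resistance locus would be expected to show a large difference between the bulks, while unlinked sites should stay near zero.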
References
- Cucumber. Wikipedia.
- The genome of the cucumber, Cucumis sativus L. Sanwen Huang, Ruiqiang Li [...] Songgang Li. 2009. Nature Genetics. 41:1275-1281.
- NCBI page for Cucumis sativus (cucumber) assembly. NCBI.
- Mapping Cucumber Vein Yellowing Virus Resistance in Cucumber (Cucumis sativus L.) by Using BSA-seq Analysis. Pujol M, Alexiou KG, Fontaine AS, Mayor P, Miras M, Jahrmann T, Garcia-Mas J, Aranda MA. 2019. Frontiers in Plant Science. 10:1583.
Picture credit: By Francisco Manuel Blanco (O.S.A.) [Public domain], via Wikimedia Commons
More information
General information about this species can be found in Wikipedia.
Statistics
Summary
Assembly | ASM407v2, INSDC Assembly GCA_000004075.2, Oct 2014 |
Database version | 113.2 |
Golden Path Length | 193,829,320 |
Genebuild by | BGI |
Genebuild method | Import |
Data source | Beijing Genomics Institute |
Gene counts
Coding genes | 23,780 |
Non coding genes | 682 |
Small non coding genes | 681 |
Long non coding genes | 1 |
Gene transcripts | 24,462 |