Cucumis sativus (ASM407v2)

Cucumis sativus Assembly and Gene Annotation

About Cucumis sativus

Cucumber (Cucumis sativus) is a widely cultivated plant in the gourd family, Cucurbitaceae. It is a creeping vine that bears cucumiform fruits that are used in salads. There are three main varieties of cucumber: slicing, pickling, and seedless. Within these varieties, several cultivars have been created. In North America, the term "wild cucumber" refers to plants in the genera Echinocystis and Marah, but these are not closely related. The cucumber is originally from South Asia, but now grows on most continents. Many different types of cucumber are traded on the global market.


The 'Chinese long' inbred line 9930 was selected for the genome sequencing project by the Cucumber Genome Initiative. A total of 26.5 billion high-quality base pairs were generated, corresponding to 72.2-fold genome coverage, of which the Sanger reads provided 3.9-fold coverage and the Illumina GA reads provided 68.3-fold coverage. The GA reads ranged in length from 42 to 53 bp.
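The coverage figures above can be cross-checked with simple arithmetic. The sketch below uses only the numbers quoted in this paragraph: it sums the per-technology coverage and back-calculates the genome size implied by the total. The constant and function names are illustrative, not part of any pipeline.

```python
# Figures quoted in the paragraph above.
TOTAL_BASES = 26.5e9     # high-quality base pairs generated
SANGER_FOLD = 3.9        # coverage contributed by Sanger reads
ILLUMINA_FOLD = 68.3     # coverage contributed by Illumina GA reads


def implied_genome_size(total_bases: float, fold_coverage: float) -> float:
    """Genome size (bp) implied by: coverage = total bases / genome size."""
    return total_bases / fold_coverage


total_fold = SANGER_FOLD + ILLUMINA_FOLD
genome_size_bp = implied_genome_size(TOTAL_BASES, total_fold)

print(f"Total coverage: {total_fold:.1f}-fold")
print(f"Implied genome size: {genome_size_bp / 1e6:.0f} Mb")
```

The implied size (roughly 367 Mb) is noticeably larger than the assembled length reported on this page, which suggests the coverage was computed against an estimated genome size rather than against the assembled sequence.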

The final assembly is 195,669,205 bp long and consists of 190 scaffolds (N50 = 29,076,228 bp) and 11,366 contigs (N50 = 42,349 bp).
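For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed from a list of sequence lengths. The toy lengths are hypothetical and are not the cucumber scaffolds or contigs.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Hypothetical toy lengths, for illustration only.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # 70: the two longest sequences hold 150 of 300 bases
```

A scaffold N50 of about 29 Mb therefore means that at least half of the assembled bases lie in scaffolds of that length or longer.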


The Cucumber Genome Initiative predicted protein-coding genes using three methods (cDNA-EST, homology-based and ab initio), and a consensus gene set was built by merging all of the results. A total of 26,682 genes were predicted, with a mean coding sequence size of 1,046 bp and an average of 4.39 exons per gene. Under an 80% sequence overlap threshold, 26.7% of the genes were supported by all three gene prediction methods, 25% had both ab initio prediction and homology-based evidence, and 7.4% had ab initio prediction and cDNA-EST expression evidence; the remaining genes were derived primarily from ab initio prediction alone, but the majority of these were supported by multiple gene finders. About 81% of the genes have homologues in the TrEMBL protein database, and 66% can be classified by InterPro. In total, 82% of the genes have either known homologues or can be functionally classified.
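The evidence breakdown above leaves the size of the "remaining" category implicit. As a rough sketch using only the percentages and gene total quoted in this paragraph, the remainder and approximate gene counts can be derived as follows; the category labels are paraphrases, and the counts are rounded estimates rather than reported values.

```python
TOTAL_GENES = 26_682  # predicted genes, as quoted above

# Percentage of genes by supporting evidence, as quoted above.
support_pct = {
    "all three methods": 26.7,
    "ab initio + homology": 25.0,
    "ab initio + cDNA-EST": 7.4,
}

# The text attributes the remainder primarily to ab initio prediction alone.
remainder_pct = 100.0 - sum(support_pct.values())
support_pct["remainder (mostly ab initio only)"] = remainder_pct

for label, pct in support_pct.items():
    approx_genes = round(TOTAL_GENES * pct / 100)  # estimate, not a reported count
    print(f"{label:35s} {pct:5.1f}%  ~{approx_genes:,} genes")
```

On these figures, about 41% of the predicted genes fall into the remaining category.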

Repeated sequences were called with the Repeat Detector, which is part of the Ensembl Genomes repeat feature pipelines. Repeat length: 93,375,853 bp. Repeat content: 48.2%.
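The repeat percentage depends on which genome length is used as the denominator. The short check below compares the two lengths given on this page, the final assembly length and the Golden Path Length from the summary statistics; which of them the pipeline actually uses is inferred from the arithmetic, not stated in the source.

```python
REPEAT_BP = 93_375_853         # repeat length quoted above
GOLDEN_PATH_BP = 193_829_320   # Golden Path Length from the summary statistics
ASSEMBLY_BP = 195_669_205      # final assembly length quoted on this page

pct_golden_path = 100 * REPEAT_BP / GOLDEN_PATH_BP
pct_assembly = 100 * REPEAT_BP / ASSEMBLY_BP

print(f"Repeat content vs golden path: {pct_golden_path:.1f}%")
print(f"Repeat content vs assembly:    {pct_assembly:.1f}%")
```

Only the Golden Path Length reproduces the reported 48.2%, so that appears to be the denominator behind the quoted figure.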


Variation from the European Variation Archive was added.

A monogenic locus for resistance to Cucumber vein yellowing virus (CVYV) was mapped in cucumber using a Bulked Segregant Analysis (BSA) strategy coupled with whole-genome resequencing. A total of 135 F3 families from a segregating population derived from a susceptible pickling cucumber and a resistant long Dutch type cucumber were phenotyped for CVYV resistance.[4]
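To illustrate the general idea behind BSA combined with resequencing, the sketch below computes a delta SNP-index, a statistic commonly used in BSA-seq analyses. It is not necessarily the method used in the cited study; the read counts are hypothetical, and the example assumes the counted allele comes from the resistant parent.

```python
def snp_index(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a site that carry the counted allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def delta_snp_index(resistant, susceptible):
    """Difference in SNP-index between the two bulks at one site.

    Each argument is an (alt_reads, total_reads) pair.
    """
    return snp_index(*resistant) - snp_index(*susceptible)


# Hypothetical read counts at two sites, for illustration only.
linked = delta_snp_index(resistant=(38, 40), susceptible=(2, 40))
unlinked = delta_snp_index(resistant=(21, 40), susceptible=(19, 40))
print(f"linked site: {linked:.2f}, unlinked site: {unlinked:.2f}")
```

Sites close to the resistance locus are expected to show a large difference between the bulks, while unlinked sites stay near zero.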


References

  1. Cucumber.
  2. The genome of the cucumber, Cucumis sativus L.
    Sanwen Huang, Ruiqiang Li [...] Songgang Li. 2009. Nature Genetics. 41:1275-1281.
  3. NCBI page for Cucumis sativus (cucumber) assembly.
  4. Mapping Cucumber Vein Yellowing Virus Resistance in Cucumber (Cucumis sativus L.) by Using BSA-seq Analysis.
    Pujol M, Alexiou KG, Fontaine AS, Mayor P, Miras M, Jahrmann T, Garcia-Mas J, Aranda MA. 2019. Frontiers in Plant Science. 10:1583.

Picture credit: By Francisco Manuel Blanco (O.S.A.) [Public domain], via Wikimedia Commons

More information

General information about this species can be found in Wikipedia.



Assembly: ASM407v2, INSDC Assembly GCA_000004075.2, Oct 2014
Database version: 109.2
Golden Path Length: 193,829,320
Genebuild by: BGI
Genebuild method: Import
Data source: Beijing Genomics Institute

Gene counts

Coding genes: 23,780
Non coding genes: 682
Small non coding genes: 681
Long non coding genes: 1
Gene transcripts: 24,462
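The gene counts above can be checked for internal consistency. The sketch below uses only the figures listed in this section; the dictionary keys are illustrative names, not database field names.

```python
counts = {
    "coding_genes": 23_780,
    "non_coding_genes": 682,
    "small_non_coding_genes": 681,
    "long_non_coding_genes": 1,
    "gene_transcripts": 24_462,
}

total_genes = counts["coding_genes"] + counts["non_coding_genes"]
non_coding_parts = (
    counts["small_non_coding_genes"] + counts["long_non_coding_genes"]
)
transcripts_per_gene = counts["gene_transcripts"] / total_genes

print(f"Total genes: {total_genes:,}")
print(f"Non-coding subtotals match: {non_coding_parts == counts['non_coding_genes']}")
print(f"Transcripts per gene: {transcripts_per_gene:.2f}")
```

The transcript count equals the sum of coding and non-coding genes, which is consistent with one annotated transcript per gene in this gene set.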