Cucumis sativus Assembly and Gene Annotation

About Cucumis sativus

Cucumber (Cucumis sativus) is a widely cultivated plant in the gourd family, Cucurbitaceae. It is a creeping vine that bears cucumiform fruits that are used as vegetables. There are three main varieties of cucumber: slicing, pickling, and seedless. Within these varieties, several cultivars have been created. In North America, the term "wild cucumber" refers to plants in the genera Echinocystis and Marah, but these are not closely related. The cucumber is originally from South Asia, but now grows on most continents. Many different types of cucumber are traded on the global market. [1]

Assembly

The 'Chinese long' inbred line 9930 was selected for the genome sequencing project. A total of 26.5 billion high-quality base pairs were generated, or 72.2-fold genome coverage, of which the Sanger reads provided 3.9-fold coverage and the Illumina GA reads provided 68.3-fold coverage. The GA reads ranged in length from 42 to 53 bp. [2]
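The coverage figures above can be cross-checked with simple arithmetic: the two per-platform values should sum to the total, and dividing the total bases by the total fold coverage gives the genome size implied by those numbers. The following is a minimal Python sketch under that assumption; the variable names are illustrative, and the implied size is derived only from the figures quoted here.

```python
# Figures quoted in the Assembly section.
total_bases = 26.5e9       # high-quality base pairs generated
sanger_cov = 3.9           # fold coverage from Sanger reads
illumina_cov = 68.3        # fold coverage from Illumina GA reads

# The per-platform coverages should add up to the reported total (72.2-fold).
total_cov = sanger_cov + illumina_cov

# Genome size implied by "bases generated / fold coverage".
implied_genome_size = total_bases / total_cov

print(f"total coverage: {total_cov:.1f}x")
print(f"implied genome size: {implied_genome_size / 1e6:.0f} Mb")
```

The implied size is an estimate of the whole genome, so it is expected to be larger than the assembled length reported below.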

The final assembly is of length 195,669,205 bp consisting of 190 scaffolds (N50 = 29,076,228) and 11,366 contigs (N50 = 42,349) [3].
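For readers unfamiliar with the N50 statistic quoted above, it is commonly defined as the length of the sequence at which the running total of lengths, sorted from longest to shortest, first reaches half of the total assembly length. The sketch below illustrates that definition in Python; the function name `n50` and the toy lengths are hypothetical and are not the actual cucumber scaffold lengths.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)   # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length that pushes the running total to at least
        # half of the assembly is the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy example: total = 30, half = 15; 10 + 8 = 18 >= 15, so N50 = 8.
toy_scaffolds = [10, 8, 6, 4, 2]
print(n50(toy_scaffolds))
```

A scaffold N50 far larger than the contig N50, as reported here, indicates that most contigs were linked into a small number of long scaffolds.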

Annotation

Protein-coding gene prediction was done using three methods (cDNA-EST, homology-based and ab initio), and a consensus gene set was built by merging all of the results. A total of 26,682 genes were predicted, with a mean coding sequence size of 1,046 bp and an average of 4.39 exons per gene. Under an 80% sequence overlap threshold, 26.7% of the genes were supported by all three gene prediction methods, 25% had both ab initio prediction and homology-based evidence, and 7.4% had ab initio prediction and cDNA-EST expression evidence; the remaining genes were primarily derived from pure ab initio prediction, but the majority of these were supported by multiple gene finders. About 81% of the genes have homologs in the TrEMBL protein database, and 66% can be classified by InterPro. In total, 82% of the genes have either known homologs or can be functionally classified [2].
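The evidence breakdown above leaves the share of "remaining" genes implicit. A short Python sketch, using only the percentages and gene total quoted in this section, makes it explicit and converts each share into an approximate gene count. The category labels are illustrative, and the counts are rounded estimates rather than figures reported in the paper.

```python
total_genes = 26_682

# Percentages quoted above (80% sequence overlap threshold).
support_pct = {
    "all three methods": 26.7,
    "ab initio + homology": 25.0,
    "ab initio + cDNA-EST": 7.4,
}

# Whatever is not covered by the three categories is the "remaining" share.
remaining_pct = 100.0 - sum(support_pct.values())
support_pct["remaining (mostly ab initio only)"] = remaining_pct

# Approximate gene counts per category (rounded estimates).
approx_counts = {
    label: round(total_genes * pct / 100) for label, pct in support_pct.items()
}

for label, pct in support_pct.items():
    print(f"{label}: {pct:.1f}% (~{approx_counts[label]} genes)")
```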

References

  1. Cucumber.
    Wikipedia.
  2. The genome of the cucumber, Cucumis sativus L.
    Sanwen Huang, Ruiqiang Li, [...] Songgang Li. 2009. Nature Genetics. 41:1275-1281.
  3. NCBI page for Cucumis sativus (cucumber) assembly.
    NCBI.

Picture credit: By Francisco Manuel Blanco (O.S.A.) [Public domain], via Wikimedia Commons

More information

General information about this species can be found in Wikipedia.

Statistics

Summary

Assembly: ASM407v2, INSDC Assembly GCA_000004075.2, Oct 2014
Database version: 93.2
Base Pairs: 193,829,320
Golden Path Length: 193,829,320
Genebuild by: EnsemblPlants
Genebuild method: Generated from ENA annotation
Data source: European Nucleotide Archive

Gene counts

Coding genes: 23,780
Non coding genes: 299
Small non coding genes: 299
Gene transcripts: 24,079
