Daucus carota (ASM162521v1)

Daucus carota Assembly and Gene Annotation

About Daucus carota subsp. sativus

Carrot is a globally important root crop whose production quadrupled between 1976 and 2013, outpacing both the overall rate of increase in vegetable production and world population growth. This growth was driven by the development of high-value products for fresh consumption, juices, and natural pigments, and of cultivars adapted to warmer production regions.

Assembly

An orange, doubled-haploid, Nantes-type carrot (DH1) was used for genome sequencing by the United States Department of Agriculture. BAC end sequences and a newly developed linkage map with 2,075 markers were used to correct 135 scaffolds with one or more chimeric regions.

The resulting v2.0 assembly spans 421.5 Mb and contains 4,907 scaffolds (N50 of 12.7 Mb), accounting for approximately 90% of the estimated genome size. The scaftig N50 of 31.2 kb is similar to those of other high-quality genome assemblies such as potato. About 86% (362 Mb) of the assembled genome is included in only 60 superscaffolds anchored to the nine pseudomolecules. The longest superscaffold spans 30.2 Mb, or 85% of chromosome 4.
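The N50 figures quoted above follow the usual definition: the length L such that sequences of length L or longer together cover at least half of the total assembly. The sketch below illustrates that definition on a made-up list of lengths (`toy_scaffolds` is hypothetical, not the carrot scaffold set) and re-derives the anchored fraction from the two figures given in the text.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy lengths, for illustration only.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_scaffolds)

# Figures taken from the text: 362 Mb anchored out of 421.5 Mb assembled.
anchored_percent = 100 * 362 / 421.5

print(f"toy N50: {toy_n50}")
print(f"anchored fraction: {anchored_percent:.0f}%")
```

With the real scaffold lengths (for example, parsed from the assembly FASTA index), the same function would be expected to reproduce the reported scaffold N50.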

Annotation

In the gene annotation carried out by the United States Department of Agriculture, 32,113 genes were predicted, of which 79% had substantial homology with known genes. The majority (98.7%) of gene predictions had supporting cDNA and/or EST evidence, indicating high accuracy of gene prediction. Relative to five other closely related genomes, carrot was enriched for genes involved in a wide range of molecular functions. Also identified were 564 tRNAs, 31 rRNA fragments, 532 small nuclear RNA (snRNA) genes, and 248 microRNAs (miRNAs) distributed among 46 families.

Repeated sequences were called with the Repeat Detector, which is part of the Ensembl Genomes repeat feature pipelines. Repeats length: 144,037,532 bp. Repeats content: 34.2%.
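The repeat content can be reproduced from the repeat length, assuming the denominator is the Golden Path Length listed in the Statistics summary (the page does not state which denominator was used, so this is an assumption). A minimal check:

```python
# Values copied from this page.
repeats_length_bp = 144_037_532
golden_path_length_bp = 421_502_825  # assumed denominator

# Percentage of the assembly covered by called repeats.
repeat_content_percent = 100 * repeats_length_bp / golden_path_length_bp

print(f"repeat content: {repeat_content_percent:.1f}%")
```

Under that assumption the result rounds to the 34.2% reported above.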

References

  1. A high-quality carrot genome assembly provides new insights into carotenoid accumulation and asterid genome evolution.
    Massimo Iorizzo, Shelby Ellison, Douglas Senalik, Peng Zeng, Pimchanok Satapoomin, Jiaying Huang, Megan Bowman, Marina Iovene, Walter Sanseverino, Pablo Cavagnaro et al. 2016. Nature Genetics. 48:657-666.

More information

General information about this species can be found in Wikipedia.

Statistics

Summary

Assembly: ASM162521v1, INSDC Assembly GCA_001625215.1, May 2016
Database version: 112.1
Golden Path Length: 421,502,825
Genebuild by: USDA
Genebuild method: Import
Data source: United States Department of Agriculture

Gene counts

Coding genes: 32,109
Non coding genes: 2,154
Small non coding genes: 2,089
Long non coding genes: 65
Gene transcripts: 34,263
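The gene counts above are internally consistent, which a short check makes explicit. The variable names are illustrative. The reading that each gene carries a single transcript is an inference from the totals, not something the page states.

```python
# Values copied from the Gene counts table.
coding_genes = 32_109
non_coding_genes = 2_154
small_non_coding_genes = 2_089
long_non_coding_genes = 65
gene_transcripts = 34_263

# Small and long non-coding genes should add up to the non-coding total.
non_coding_sum = small_non_coding_genes + long_non_coding_genes

# Coding plus non-coding genes gives the total number of genes.
total_genes = coding_genes + non_coding_genes

print(f"non-coding sum matches: {non_coding_sum == non_coding_genes}")
print(f"total genes: {total_genes:,}")
print(f"transcripts per gene: {gene_transcripts / total_genes:.2f}")
```

The total gene count equals the transcript count, which is consistent with one transcript per gene in this imported genebuild.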