Brassica oleracea (BOL)

Brassica oleracea Assembly and Gene Annotation

About Brassica oleracea

Brassica oleracea is a widely cultivated vegetable species integral to human diets, with different strains giving rise to many common vegetables, such as kales, cabbages, Brussels sprouts, broccoli, kohlrabi and cauliflower. It is also a precursor of B. napus: having hybridised with B. rapa, it contributed what is known as the C genome of B. napus.

Assembly

The genomic sequence within this version of Ensembl comprises 33,459 scaffolds (>200 bp) with an N50 of 850 kb. It was assembled at NRC-Saskatoon using a hybrid approach from Illumina, Roche 454 and Sanger sequence data. The assembly has been orientated and assigned to the nine pseudochromosomes using dense genotyping-by-sequencing genetic maps.
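For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is conventionally computed: scaffolds are sorted from longest to shortest, and the N50 is the length of the scaffold at which the running total first reaches half of the summed length. The function name `n50` and the toy scaffold lengths are illustrative assumptions, not values from this assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    The N50 is the length L such that scaffolds of length >= L
    together cover at least half of the total assembled length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy lengths (in bp), chosen only to illustrate the calculation.
toy_scaffolds = [900, 700, 400, 300, 200]
print(n50(toy_scaffolds))
```

With these toy values the total is 2,500 bp, so the running sum first passes the halfway point of 1,250 bp at the second scaffold.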

Annotation

Gene prediction of the assembled genomic scaffolds was conducted by JCVI and NRC-Saskatoon using MAKER and PASA with RNA-seq data from four tissues and homologous protein sequences. Functional annotation for the gene models is provided through similarity to Arabidopsis thaliana genes.

Repeated sequences were called with the Repeat Detector, which is part of the Ensembl Genomes repeat feature pipelines. Repeats length: 175,672,448 bp. Repeats content: 35.9526%.
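As a rough consistency check, the repeat content figure appears to be the repeat length divided by the Golden Path Length of 488,622,507 listed in the Statistics summary on this page. The sketch below assumes that interpretation; the page itself does not state which denominator was used.

```python
# Figures copied from this page. Treating "Repeats content" as
# repeat length / Golden Path Length is an assumption.
REPEAT_LENGTH_BP = 175_672_448
GOLDEN_PATH_LENGTH_BP = 488_622_507


def repeat_content_percent(repeat_bp: int, genome_bp: int) -> float:
    """Return the percentage of the genome covered by repeats."""
    return 100.0 * repeat_bp / genome_bp


percent = repeat_content_percent(REPEAT_LENGTH_BP, GOLDEN_PATH_LENGTH_BP)
print(f"{percent:.4f}%")
```

Under this assumption the computed value agrees with the quoted 35.9526% to four decimal places.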

References

  1. Transcriptome and methylome profiling reveals relics of genome dominance in the mesopolyploid Brassica oleracea.
    Parkin IA, Koh C, Tang H, Robinson SJ, Kagale S, Clarke WE, Town CD, Nixon J, Krishnakumar V, Bidwell SL et al. 2014. Genome Biol. 15:R77.

Picture credit: "Fractal Broccoli" by Jon Sullivan. Licensed under Public domain via Wikimedia Commons.

  • http://brassica.info
    This site collates and exchanges open source information relating to Brassica genomics and genetics.
  • http://brassicadb.org
    The Brassica database (BRAD) is a web-based database of genetic data at the whole genome scale for important Brassica crops.

More information

General information about this species can be found in Wikipedia.

Statistics

Summary

Assembly: BOL, INSDC Assembly GCA_000695525.1, May 2014
Database version: 111.1
Golden Path Length: 488,622,507
Genebuild by: CanSeq
Genebuild method: Import
Data source: CanSeq

Gene counts

Coding genes: 59,225
Non coding genes: 1,361
Small non coding genes: 1,339
Long non coding genes: 22
Gene transcripts: 60,586
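
The gene counts above can be cross-checked with simple arithmetic. The sketch below uses only the figures listed on this page. The variable names are illustrative, and reading the categories as "small plus long equals non coding" and "coding plus non coding equals transcripts" is an assumption rather than something the page states.

```python
# Counts copied from the "Gene counts" list on this page.
coding_genes = 59_225
non_coding_genes = 1_361
small_non_coding_genes = 1_339
long_non_coding_genes = 22
gene_transcripts = 60_586

# Do the small and long categories account for all non coding genes?
non_coding_sum = small_non_coding_genes + long_non_coding_genes
print(non_coding_sum == non_coding_genes)

# Does the transcript count equal the total number of genes?
total_genes = coding_genes + non_coding_genes
print(total_genes == gene_transcripts)
```

Both checks hold for these figures. The second would be consistent with one transcript per gene model, although the page does not say so explicitly.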