Brassica oleracea Assembly and Gene Annotation
About Brassica oleracea
Brassica oleracea is a widely cultivated vegetable species integral to human diets, with different strains giving rise to many common vegetables, such as kales, cabbages, Brussels sprouts, broccoli, kohlrabi and cauliflower. It is also a progenitor of B. napus: having hybridised with B. rapa, it contributed what is known as the C genome of B. napus.
Assembly
The genomic sequence within this version of Ensembl comprises 33,459 scaffolds (>200 bp) with an N50 of 850 kb. It was assembled at NRC-Saskatoon using a hybrid approach combining Illumina, Roche 454 and Sanger sequence data. The assembly has been orientated and assigned to the nine pseudochromosomes using dense genotyping-by-sequencing genetic maps.
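For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is conventionally computed: sort the scaffold lengths from longest to shortest and report the length at which the running total first reaches half of the total assembled length. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the B. oleracea assembly or from any Ensembl tool.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembled length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Made-up scaffold lengths, for illustration only (not real data).
toy_scaffolds = [900, 700, 400, 300, 200]
print(n50(toy_scaffolds))
```

With real data, the lengths would come from the scaffold FASTA or its index, restricted to the same >200 bp cut-off used for the published figure.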
Annotation
Gene prediction of the assembled genomic scaffolds was conducted by JCVI and NRC-Saskatoon using MAKER and PASA with RNA-seq data from four tissues and homologous protein sequences. Functional annotation for the gene models is provided through similarity to Arabidopsis thaliana genes.
Repeated sequences were called with the Repeat Detector, which is part of the Ensembl Genomes repeat feature pipelines. Total repeat length: 175,672,448 bp. Repeat content: 35.9526%.
References
- Transcriptome and methylome profiling reveals relics of genome dominance in the mesopolyploid Brassica oleracea.
Parkin IA, Koh C, Tang H, Robinson SJ, Kagale S, Clarke WE, Town CD, Nixon J, Krishnakumar V, Bidwell SL et al. 2014. Genome Biol. 15:R77.
Picture credit:
"Fractal Broccoli" by Jon Sullivan. Licensed under Public domain via Wikimedia Commons.
Links
- http://brassica.info
This site collates and exchanges open source information relating to Brassica genomics and genetics.
- http://brassicadb.org
The Brassica database (BRAD) is a web-based database of genetic data at the whole genome scale for important Brassica crops.
More information
General information about this species can be found in Wikipedia.
Statistics
Summary
Assembly | BOL, INSDC Assembly GCA_000695525.1, May 2014 |
Database version | 113.1 |
Golden Path Length | 488,622,507 |
Genebuild by | CanSeq |
Genebuild method | Import |
Data source | CanSeq |
Gene counts
Coding genes | 59,225 |
Non coding genes | 1,361 |
Small non coding genes | 1,339 |
Long non coding genes | 22 |
Gene transcripts | 60,586 |
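The figures on this page can be cross-checked with simple arithmetic. The sketch below assumes that the repeat content percentage is the total repeat length divided by the Golden Path Length; the page does not state this, but the numbers are consistent with it. The variable names are illustrative only.

```python
# Values copied from this page.
golden_path_length = 488_622_507   # Golden Path Length
repeat_length = 175_672_448        # Repeats length
coding_genes = 59_225
non_coding_genes = 1_361
small_non_coding_genes = 1_339
long_non_coding_genes = 22
gene_transcripts = 60_586

# Assumption: repeat content = repeat length / Golden Path Length.
repeat_percent = 100 * repeat_length / golden_path_length
print(f"Repeat content: {repeat_percent:.4f}%")

# The non-coding total is the sum of its small and long classes.
assert small_non_coding_genes + long_non_coding_genes == non_coding_genes

# Coding plus non-coding genes equals the transcript count, which is
# consistent with (but does not prove) one transcript per gene model.
total_genes = coding_genes + non_coding_genes
print(f"Total genes: {total_genes:,}")
assert total_genes == gene_transcripts
```

The computed percentage agrees with the reported 35.9526% to the precision shown, which supports the assumed denominator.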