Brassica rapa Assembly and Gene Annotation

The Brassica rapa genome browser has been developed through a joint effort by the Ensembl Genomes group and Rothamsted Research. From release 13 of Ensembl Genomes, the EBI will be maintaining the genome browser for B. rapa in the context of Ensembl Plants.

Rothamsted Research acknowledges:

  • BBSRC for funding (grant number BB/E017797/1).
  • Ian Bancroft and Martin Trick at the John Innes Centre for providing the gene annotation.
  • Nick James and Sean May (NASC) for their assistance in establishing BrassEnsembl.

About Brassica rapa

Brassica rapa (Chinese cabbage) is a widely cultivated leaf and root vegetable. The genome was sequenced as a contribution to the Multinational Brassica Genome Sequencing Project and was published in August 2011 (Wang X, et al. Nature Genetics 2011).

Assembly

The genomic sequence within this version of Ensembl includes 193 large scaffolds assembled by CAAS-IVF, which have been orientated and assigned to pseudochromosomes using publicly available genetic markers.

Annotation

Gene prediction of the assembled genomic scaffolds has been conducted by CAAS-IVF using GLEAN and BLAT. Functional annotation for the gene models is provided through similarity to Arabidopsis thaliana genes (E=1E-5) and Gene Ontology terms are provided through significant similarity to UniProtKB proteins (E=1E-5).

Sequence alignments

Further annotations generated by Rothamsted Research (RRes) are displayed as additional tracks:

  • Arabidopsis coding sequences aligned using BLAT:
    • Alignment parameters: minmatch(2), minscore(30), min identity(80), maxGap(2), evalue threshold(1e-5).
    • Dataset: 33,410 Arabidopsis TAIR v9 coding sequences.
    • External links: AtEnsembl transcripts.
  • A 95k Brassica UniGene set generated by JCVI aligned using BLAT:
    • Alignment parameters: default BLAT (minmatch(2), minscore(30), min identity(90), maxGap(2), evalue threshold(1e-20)).
    • Dataset: 94,558 Brassica UniGenes.
  • A 135k Brassica UniGene set generated by RRes aligned using BLAT:
    • Alignment parameters: default BLAT (minmatch(2), minscore(30), min identity(90), maxGap(2), evalue threshold(1e-20)).
    • Dataset: 135,201 Brassica UniGenes.
  • B. rapa BAC end sequences aligned using Decypher tera-blastn:
    • Alignment parameters: match_score(1), mismatch_score(-3), open_penalty(-5), extend_penalty(-2), gapped_alignment(banded), query_filtered, max_score(10), max alignment number(10), evalue threshold(1e-50), word_size(9), query_increment(3), extension_threshold(20), percent identity(95).
    • Dataset: 196,837 B. rapa BAC end sequences obtained from GenBank 5-Aug-2010.
    • External links: GenBank.
  • B. rapa ESTs aligned using Decypher tera-blastn:
    • Alignment parameters: match_score(1), mismatch_score(-1), open_penalty(-1), extend_penalty(-2), gapped_alignment(banded), query_filtered, max_score(10), max alignment number(10), evalue threshold(1e-20), word_size(9), query_increment(3), extension_threshold(20), percent identity(90).
    • Dataset: 902,700 Brassica ESTs obtained from GenBank 13-Aug-2010.
    • External links: GenBank.
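
The identity and E-value thresholds listed for each track can be read as a simple per-track filter. The following is a minimal Python sketch of that idea, not the pipeline actually used by RRes: the track labels, the `TrackFilter` class and the `passes` function are hypothetical names introduced for illustration, and treating the thresholds as inclusive cut-offs is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackFilter:
    """Thresholds quoted for one alignment track."""
    min_identity: float  # percent identity, 0-100
    max_evalue: float    # E-value threshold


# Threshold values are copied from the track list above.
# The dictionary keys are made-up labels, not Ensembl track names.
TRACK_FILTERS = {
    "arabidopsis_cds": TrackFilter(min_identity=80, max_evalue=1e-5),
    "unigene_95k": TrackFilter(min_identity=90, max_evalue=1e-20),
    "unigene_135k": TrackFilter(min_identity=90, max_evalue=1e-20),
    "bac_ends": TrackFilter(min_identity=95, max_evalue=1e-50),
    "ests": TrackFilter(min_identity=90, max_evalue=1e-20),
}


def passes(track: str, identity: float, evalue: float) -> bool:
    """Return True if a hit meets both thresholds for the given track.

    Assumption: both cut-offs are inclusive.
    """
    f = TRACK_FILTERS[track]
    return identity >= f.min_identity and evalue <= f.max_evalue


# Hypothetical hits: (track, percent identity, E-value).
hits = [
    ("bac_ends", 96.0, 1e-60),
    ("bac_ends", 94.0, 1e-60),   # fails on identity
    ("ests", 92.5, 1e-10),       # fails on E-value
]
kept = [h for h in hits if passes(*h)]
```

The BAC end track is the strictest of the five on both criteria, so a hit that would be retained as an EST alignment can still be rejected there.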

References

  1. The genome of the mesopolyploid crop species Brassica rapa.
    Wang X, Wang H, Wang J, Sun R, Wu J, Liu S, Bai Y, Mun JH, Bancroft I, Cheng F et al. 2011. Nat. Genet. 43:1035-1039.

Picture credit: School Division, Houghton Mifflin Company

More information

General information about this species can be found in Wikipedia.

Statistics

Summary

Assembly: IVFCAASv1, INSDC Assembly GCA_000309985.1, Aug 2009
Database version: 90.1
Base Pairs: 283,822,785
Golden Path Length: 283,822,783
Genebuild by: IVFCAAS
Genebuild method: Imported from IVFCAAS by BrassEnsembl
Data source: Brassica database (BRAD)

Gene counts

Coding genes: 41,018
Non coding genes: 1,291
Small non coding genes: 1,272
Long non coding genes: 19
Gene transcripts: 42,316
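
As a quick consistency check, the gene counts above can be combined arithmetically. This is a small Python sketch using only the figures listed in this section; the variable names are illustrative, and the assumption that the total gene count is the sum of coding and non coding genes is ours rather than something stated on this page.

```python
# Figures copied from the gene counts listed above.
coding_genes = 41_018
noncoding_genes = 1_291
small_noncoding_genes = 1_272
long_noncoding_genes = 19
gene_transcripts = 42_316

# The two non coding subclasses should add up to the non coding total.
assert small_noncoding_genes + long_noncoding_genes == noncoding_genes

# Assumption: total genes = coding + non coding.
total_genes = coding_genes + noncoding_genes

# Transcripts in excess of one per gene, and the mean per gene.
extra_transcripts = gene_transcripts - total_genes
transcripts_per_gene = gene_transcripts / total_genes

print(f"Total genes: {total_genes:,}")
print(f"Transcripts beyond one per gene: {extra_transcripts}")
print(f"Mean transcripts per gene: {transcripts_per_gene:.4f}")
```

Under that assumption the annotation carries almost exactly one transcript per gene, which suggests that very few alternative transcripts are included in the imported gene set.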
