Brachypodium distachyon Assembly and Gene Annotation
About Brachypodium distachyon
Brachypodium distachyon, like Arabidopsis thaliana, has several features that recommend it as a model plant for functional genomic studies, especially in the grasses. It has a small, diploid genome (~355 Mbp), small physical size, a short life cycle and few growth requirements. Brachypodium is related to the major cereal grain species, but is understood to be more closely related to the Triticeae (wheat and barley) than to the other cereals.
This release represents the second improved Brachypodium distachyon (Bd21) genome, including ~270 Mb of improved Brachypodium sequence. These regions were improved by dividing the gene space into overlapping pieces of ~2 Mb. Each region was manually inspected and then finished using a variety of technologies, including Sanger (primer walks on subclones and fosmid templates, transposon sequencing on subclone templates), Illumina (small-insert shatter libraries) and clone-based shotgun sequencing using both Sanger and Illumina libraries. In total, 1,496 gaps were closed and 1.43 Mb of sequence was added to the assembly. Overall contiguity (contig N50) increased by a factor of 63, from 347.8 kb to 22 Mb.
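As a brief illustration of the contiguity metric quoted above, the following sketch computes a contig N50 and checks the stated fold improvement. The function name `n50` and the toy contig lengths are hypothetical and are not taken from the assembly; only the two N50 figures (347.8 kb and 22 Mb) come from the text.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled sequence.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy contig lengths (hypothetical values, for illustration only).
toy_contigs = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_contigs)

# Fold change in contig N50, using the figures quoted in the text.
old_n50_kb = 347.8      # previous contig N50, in kb
new_n50_kb = 22_000.0   # improved contig N50 (22 Mb), in kb
fold = new_n50_kb / old_n50_kb

print(toy_n50, round(fold))
```

With the toy lengths, the two longest contigs (100 + 80) already cover half of the total of 300, so the N50 is the length of the second contig.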
74,756 transcript assemblies were constructed from 160M paired-end Illumina RNA-seq reads, and 17,647 transcript assemblies from ~1.9M 454 reads. The transcript assemblies from RNA-seq reads were made using PERTRAN. 76,209 transcript assemblies were constructed using PASA from 314,866 sequences in total, consisting of the RNA-seq transcript assemblies above as well as Sanger ESTs. Loci were determined by transcript assembly alignments and/or EXONERATE alignments of proteins from arabidopsis (Arabidopsis thaliana), rice, sorghum, foxtail, grape, soybean and Swiss-Prot eukaryote proteins to the Brachypodium distachyon Bd21 genome, soft-masked for repeats using RepeatMasker. Each locus was extended by up to 2 kb on both ends unless the extension ran into another locus on the same strand. Gene models were predicted by the homology-based predictors FGENESH+, FGENESH_EST (similar to FGENESH+, but using ESTs rather than protein/translated ORFs as splice site and intron input) and GenomeScan.
The end result was 34,310 loci containing protein-coding transcripts, with 52,972 protein-coding transcripts in total.
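The locus extension rule in the paragraph above (up to 2 kb on both ends, stopping at another locus on the same strand) can be sketched roughly as follows. This is one possible reading of the rule, not the annotation pipeline's actual code: the `Locus` class, the `extend_loci` function and the example coordinates are all hypothetical, and the sketch assumes a single sequence with 1-based inclusive coordinates and no overlapping input loci on the same strand.

```python
from dataclasses import dataclass


@dataclass
class Locus:
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def extend_loci(loci, max_ext=2000):
    """Extend each locus by up to max_ext bp on both ends.

    An extension is cut short where it would run into the original
    span of a neighbouring locus on the same strand. Loci on the
    opposite strand are ignored. Results are grouped by strand
    ("+" first), each group sorted by start.
    """
    extended = []
    for strand in ("+", "-"):
        same = sorted((l for l in loci if l.strand == strand),
                      key=lambda l: l.start)
        for i, loc in enumerate(same):
            new_start = max(1, loc.start - max_ext)
            new_end = loc.end + max_ext
            if i > 0:
                # Do not extend into the previous same-strand locus.
                new_start = min(loc.start, max(new_start, same[i - 1].end + 1))
            if i + 1 < len(same):
                # Do not extend into the next same-strand locus.
                new_end = max(loc.end, min(new_end, same[i + 1].start - 1))
            extended.append(Locus(new_start, new_end, strand))
    return extended


# Hypothetical example: two "+" loci 1 kb apart and one "-" locus.
example = [Locus(5000, 6000, "+"), Locus(7000, 8000, "+"), Locus(6500, 9000, "-")]
result = extend_loci(example)
for locus in result:
    print(locus)
```

In this reading, the two "+" loci each extend into the 1 kb gap between them but stop at the other's original boundary, while the "-" locus gets the full 2 kb on both sides because its neighbours lie on the opposite strand.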
Brachypodium sylvaticum transcriptome
De novo gene models from the RNA-Seq analysis of three Brachypodium sylvaticum populations were mapped to the B. distachyon reference genome. Assembled data are available from the Jaiswal lab and raw reads are available from INSDC project PRJNA182761.
Triticum aestivum transcriptome
Wheat RNA-Seq, EST and UniGene datasets have been aligned to the Brachypodium distachyon genome:
- Wheat 454 RNA-seq data from study ERP001415 were aligned using GMAP.
- All publicly available wheat EST data were aligned using Exonerate, following the standard Ensembl pipeline.
- Wheat UniGene cluster sequence data were aligned using Exonerate, following the standard Ensembl pipeline.
Brachypodium variation data
Approximately 394,000 genetic variations have been identified by the alignment of transcriptome assemblies from three slender false brome (Brachypodium sylvaticum) populations. Two populations come from B. sylvaticum's native range (Greece and Spain) and one comes from its invasive range (Oregon). Both the transcriptome alignments and the variation data are available in Ensembl Plants.
Wheat inter-homoeologous variants
As part of the wheat genome analysis, we have aligned a set of Triticum aestivum (bread wheat) homoeologous SNPs (SNPs between the component A, B and D genomes of wheat) against the Brachypodium distachyon genome. SNPs have been classified into two groups: 1) SNPs that differ between the A and D genomes (where the B genome is unknown), and 2) SNPs that are the same between the A and D genomes, but differ in B.
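The two-group classification described above can be expressed as a small decision rule. The sketch below is only an illustration of that rule: the function name `classify_homoeologous_snp`, the use of `None` for an unknown B genome allele, and the example alleles are all assumptions, not part of the wheat analysis itself.

```python
def classify_homoeologous_snp(a, d, b=None):
    """Classify an inter-homoeologous SNP by its A, D and B genome alleles.

    Returns 1 if A and D differ and B is unknown (b is None),
    2 if A and D agree but B differs,
    and None if the site fits neither group.
    """
    if b is None:
        return 1 if a != d else None
    if a == d and b != a:
        return 2
    return None


# Hypothetical alleles, for illustration only.
group_ad = classify_homoeologous_snp("A", "G")       # A and D differ, B unknown
group_b = classify_homoeologous_snp("C", "C", "T")   # A and D agree, B differs
neither = classify_homoeologous_snp("C", "C", "C")   # no difference at all

print(group_ad, group_b, neither)
```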
The wheat sequence alignments and the projected homoeologous SNPs are available as tracks under the "Wheat SNPs and alignments" section of the "Configure this page" menu.
Links (Brachypodium distachyon)
- Gramene species page for Brachypodium
- The International Brachypodium Initiative (IBI) community portal
- The John Innes Centre ModelCrop.org
- MIPS Brachypodium genome database
- Phytozome entry page for Brachypodium distachyon
- Jaiswal Lab at Oregon State University
- INSDC project PRJNA182761
Links (Triticum aestivum)
- MIPS Wheat Genome Database
- ENA study ERP000319: 454 pyrosequencing of the Triticum aestivum (bread wheat) genome to 5X coverage
- ENA study ERP001415: 454 sequencing of Triticum aestivum (bread wheat) cv. Chinese spring cDNA samples from a pool of tissues, from plants under drought stress and from circadian-sampled leaves
- Triticum aestivum ESTs at ENA
- Triticum aestivum UniGene cluster sequences at NCBI
References
- Genome sequencing and analysis of the model grass Brachypodium distachyon.
The International Brachypodium Initiative. 2010. Nature. 463:763-768.
- Sequencing and De Novo Transcriptome Assembly of Brachypodium sylvaticum (Poaceae).
Samuel E. Fox, Justin Preece, Jeffrey A. Kimbrel, Gina L. Marchini, Abigail Sage, Ken Youens-Clark, Mitchell B. Cruzan, and Pankaj Jaiswal. 2013. Applications in Plant Sciences. 1(3):1200011.
General information about this species can be found in Wikipedia.
|Assembly|Brachypodium_distachyon_v3.0, INSDC Assembly GCA_000005505.4, Feb 2018|
|Golden Path Length|271,163,419 bp|
|Genebuild method|Imported from ENA|
|Data source|Joint Genome Institute|
|Non-coding genes|815|
|Small non-coding genes|784|
|Long non-coding genes|31|