Brachypodium distachyon Assembly and Gene Annotation
About Brachypodium distachyon
Brachypodium distachyon, like Arabidopsis thaliana, has several features that recommend it as a model plant for functional genomic studies, especially in the grasses. It has a small, diploid genome (~355 Mb), small physical size, a short life-cycle and few growth requirements. Brachypodium is related to the major cereal grain species but is understood to be more closely related to the Triticeae (wheat and barley) than to the other cereals.
Assembly
This release represents the second improved assembly of the Brachypodium distachyon (Bd21 strain) genome from JGI, including ~270 Mb of improved Brachypodium sequence. These regions were improved by dividing the gene space into overlapping pieces of ~2 Mb. Each region was manually inspected and then finished using a variety of technologies, including Sanger (primer walks on subclones and fosmid templates, transposon sequencing on subclone templates), Illumina (small insert shatter libraries) and clone-based shotgun sequencing using both Sanger and Illumina libraries. In total, 1,496 gaps were closed and 1.43 Mb was added to the assembly. Overall contiguity (contig N50) increased by a factor of 63, from 347.8 kb to 22 Mb.
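The contig N50 figures quoted above can be illustrated with a minimal sketch. The function name `contig_n50` and the toy contig lengths are illustrative assumptions; only the two N50 values (347.8 kb and 22 Mb) come from the text.

```python
def contig_n50(lengths):
    """Return the N50 of a list of contig lengths (in bp).

    N50 is the length of the contig at which the running total,
    taken from the longest contig downwards, first reaches at
    least half of the summed length of all contigs.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, not Brachypodium data).
toy_n50 = contig_n50([2, 2, 2, 3, 3, 4, 8, 8])

# Fold increase in contig N50 reported for this assembly.
old_n50_bp = 347_800       # 347.8 kb, from the text
new_n50_bp = 22_000_000    # 22 Mb, from the text
fold_increase = new_n50_bp / old_n50_bp
print(f"toy N50 = {toy_n50}, fold increase = {fold_increase:.1f}")
```

The ratio of the two reported values rounds to the factor of 63 stated in the assembly description.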
Annotation
74,756 transcript assemblies were constructed from 160 million paired-end Illumina RNA-seq reads, and 17,647 transcript assemblies from ~1.9 million 454 reads. The transcript assemblies from RNA-seq reads were made using PERTRAN. 76,209 transcript assemblies were then constructed using PASA from 314,866 sequences in total, consisting of the RNA-seq transcript assemblies above as well as Sanger ESTs. Loci were determined by transcript assembly alignments and/or EXONERATE alignments of proteins from arabidopsis (Arabidopsis thaliana), rice, sorghum, foxtail, grape, soybean and Swiss-Prot eukaryote proteins to the Brachypodium distachyon Bd21 genome, which was soft-repeatmasked using RepeatMasker. Each locus was extended by up to 2 kb on both ends unless the extension ran into another locus on the same strand. Gene models were predicted by the homology-based predictors FGENESH+, FGENESH_EST (similar to FGENESH+, but with ESTs rather than protein/translated ORFs as splice site and intron input) and GenomeScan.
The end result was 34,310 loci containing protein-coding transcripts, with 52,972 protein-coding transcripts in total.
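The locus extension rule described above (up to 2 kb on both ends, stopping at another locus on the same strand) can be sketched as follows. This is an illustrative reading of the rule, not the JGI pipeline's code: the `Locus` class, the `extend_loci` function, 1-based inclusive coordinates, and the assumption that input loci on the same strand do not overlap are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Locus:
    start: int   # 1-based, inclusive (assumed convention)
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def extend_loci(loci, seq_length, max_ext=2000):
    """Extend each locus by up to max_ext bp on both ends.

    An extension is clipped so that it never enters a neighbouring
    locus on the same strand, and never leaves the sequence.
    Loci on the opposite strand do not limit the extension.
    """
    extended = []
    for strand in ("+", "-"):
        same = sorted((l for l in loci if l.strand == strand),
                      key=lambda l: l.start)
        for i, locus in enumerate(same):
            # Closest positions not occupied by a same-strand neighbour.
            left_limit = same[i - 1].end + 1 if i > 0 else 1
            right_limit = same[i + 1].start - 1 if i + 1 < len(same) else seq_length
            new_start = max(locus.start - max_ext, left_limit, 1)
            new_end = min(locus.end + max_ext, right_limit, seq_length)
            # Never shrink the original locus.
            extended.append(Locus(min(new_start, locus.start),
                                  max(new_end, locus.end),
                                  strand))
    return extended


# Hypothetical loci on a 10 kb sequence.
example = [Locus(5000, 6000, "+"), Locus(7000, 8000, "+"), Locus(6500, 6800, "-")]
result = extend_loci(example, seq_length=10_000)
```

In this sketch the two plus-strand loci stop at each other's boundaries, while the minus-strand locus is extended by the full 2 kb in both directions.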
Sequence alignments
Brachypodium sylvaticum transcriptome
De novo gene models from the RNA-seq analysis of three Brachypodium sylvaticum populations were mapped to the B. distachyon reference genome. Assembled data is available from the Jaiswal lab and raw reads are available from INSDC project PRJNA182761.
Triticum aestivum transcriptome
Wheat RNA-Seq, EST and UniGene datasets have been aligned to the Brachypodium distachyon genome:
- Wheat 454 RNA-seq data from study ERP001415 were aligned using GMAP.
- All publicly available wheat EST data were aligned using Exonerate, following the standard Ensembl pipeline.
- Wheat UniGene cluster sequence data were aligned using Exonerate, following the standard Ensembl pipeline.
Variation
Brachypodium variation data
Approximately 394,000 genetic variants have been identified by the alignment of transcriptome assemblies from three slender false brome (Brachypodium sylvaticum) populations. Two populations come from B. sylvaticum's native range (Greece and Spain) and one comes from its invasive range (Oregon). Both the transcriptome alignments and the variation data are available in Ensembl Plants.
Wheat inter-homoeologous variants
As part of the wheat genome analysis, we have aligned a set of Triticum aestivum (bread wheat) homoeologous SNPs (SNPs between the component A, B, and D genomes of wheat) against the Brachypodium distachyon genome. SNPs have been classified into two groups: 1) SNPs that differ between the A and D genomes (where the B genome is unknown), and 2) SNPs that are the same between the A and D genomes but differ in B.
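The two SNP groups described above can be expressed as a small classification rule. This is a sketch under stated assumptions: the function name, the group labels `"A_vs_D"` and `"B_specific"`, and the use of `None` for an unknown base are hypothetical and are not taken from the Ensembl data model.

```python
from typing import Optional


def classify_homoeologous_snp(a: Optional[str],
                              b: Optional[str],
                              d: Optional[str]) -> Optional[str]:
    """Assign a homoeologous SNP to one of the two groups in the text.

    a, b, d are the bases observed in the A, B and D genomes;
    None means the base is unknown. Returns None when the SNP
    fits neither group.
    """
    if a is None or d is None:
        return None  # both groups require known A and D bases
    if b is None and a != d:
        return "A_vs_D"      # group 1: A and D differ, B unknown
    if b is not None and a == d and b != a:
        return "B_specific"  # group 2: A and D agree, B differs
    return None


print(classify_homoeologous_snp("G", None, "T"))
print(classify_homoeologous_snp("C", "T", "C"))
```

Cases such as all three genomes differing fall outside both groups and return `None` here, since the text does not describe how they are handled.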
The wheat sequence alignments and the projected homoeologous SNPs are available as tracks under the "Wheat SNPs and alignments" section of the "Configure this page" menu.
References
- Genome sequencing and analysis of the model grass Brachypodium distachyon. The International Brachypodium Initiative. 2010. Nature. 463:763-768.
- Sequencing and De Novo Transcriptome Assembly of Brachypodium sylvaticum (Poaceae). Samuel E. Fox, Justin Preece, Jeffrey A. Kimbrel, Gina L. Marchini, Abigail Sage, Ken Youens-Clark, Mitchell B. Cruzan, and Pankaj Jaiswal. 2013. Applications in Plant Sciences. 1(3):1200011.
Links
Links (Brachypodium distachyon)
- Gramene species page for Brachypodium
- The International Brachypodium Initiative (IBI) community portal
- The John Innes Centre ModelCrop.org
- MIPS Brachypodium genome database
- Phytozome entry page for Brachypodium distachyon
- Jaiswal Lab at Oregon State University
- INSDC project PRJNA182761
Links (Triticum aestivum)
- MIPS Wheat Genome Database
- ENA study ERP000319: 454 pyrosequencing of the Triticum aestivum (bread wheat) genome to 5X coverage
- ENA study ERP001415: 454 sequencing of Triticum aestivum (bread wheat) cv. Chinese spring cDNA samples from a pool of tissues, from plants under drought stress and from circadian-sampled leaves
- Triticum aestivum ESTs at ENA
- Triticum aestivum UniGene cluster sequences at NCBI
More information
General information about this species can be found in Wikipedia.
Statistics
Summary
Assembly | Brachypodium_distachyon_v3.0, INSDC Assembly GCA_000005505.4, Feb 2018 |
Database version | 113.4 |
Golden Path Length | 271,163,419 |
Genebuild by | JGI |
Genebuild method | Import |
Data source | Joint Genome Institute |
Gene counts
Coding genes | 34,310 |
Non coding genes | 815 |
Small non coding genes | 784 |
Long non coding genes | 31 |
Gene transcripts | 53,787 |
Other
Short Variants | 327,200 |
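
The gene counts above can be cross-checked with a few lines of arithmetic. The sketch below parses rows in the "name | value |" layout used on this page; the function name `parse_stats` is an assumption, and the figure of 52,972 protein-coding transcripts is taken from the Annotation section.

```python
GENE_COUNT_ROWS = """\
Coding genes | 34,310 |
Non coding genes | 815 |
Small non coding genes | 784 |
Long non coding genes | 31 |
Gene transcripts | 53,787 |
"""


def parse_stats(text):
    """Parse 'name | value |' rows into a dict of integer counts."""
    stats = {}
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) >= 2 and parts[0] and parts[1]:
            stats[parts[0]] = int(parts[1].replace(",", ""))
    return stats


stats = parse_stats(GENE_COUNT_ROWS)

# Small and long non-coding genes should sum to the non-coding total.
non_coding_sum = stats["Small non coding genes"] + stats["Long non coding genes"]

# Transcripts not accounted for by the protein-coding transcripts
# reported in the Annotation section (52,972).
PROTEIN_CODING_TRANSCRIPTS = 52_972
other_transcripts = stats["Gene transcripts"] - PROTEIN_CODING_TRANSCRIPTS

print(non_coding_sum, other_transcripts)
```

The remaining transcript count happens to equal the number of non-coding genes, which would be consistent with one transcript per non-coding gene, although the page does not state this explicitly.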