Aegilops tauschii Assembly and Gene Annotation
About Aegilops tauschii
Aegilops tauschii (goatgrass) is the diploid progenitor of the bread wheat D genome, providing important evolutionary information for wheat. Bread wheat is hexaploid, the result of hybridization between wild A. tauschii and a cultivated tetraploid wheat, Triticum turgidum. This spontaneous event occurred about 8,000 years ago in the Fertile Crescent.
The genome of Aegilops tauschii accession AL8/78 was sequenced by the BGI using a whole-genome shotgun strategy and assembled using SOAPdenovo software. The genome assembly achieved contigs with an N50 size of 4.51 kbp. With paired-end information and additional Roche/454 long-read sequences incorporated, the draft assembly totalled 4.23 Gbp, with a scaffold N50 length of 57.6 kbp.
The chloroplast genome component and its gene annotation are also present, imported from ENA entry JQ754651.
A total of 34,498 protein-coding genes were predicted using FGENESH and GeneID, supplemented with evidence from RNA-Seq and EST sequences. For more details about genome sequencing and gene prediction, see Jia et al. (2013).
Non-coding RNA genes have been annotated using tRNAscan-SE (Lowe and Eddy 1997), RFAM (Griffiths-Jones et al. 2005), and RNAmmer (Lagesen et al. 2007); additional analysis tools have also been applied.
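For readers unfamiliar with the N50 statistic quoted above: it is the length L such that the contigs (or scaffolds) of length L or longer together contain at least half of all assembled bases. The Python sketch below illustrates the calculation. The function name `n50` and the toy lengths are illustrative assumptions; they are not values from the A. tauschii assembly, nor code from SOAPdenovo.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp).

    Sequences are visited from longest to shortest; the N50 is the
    length of the sequence at which the running total first reaches
    half of the summed length of all sequences.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Toy contig lengths, invented for illustration only.
    example_lengths = [80, 70, 50, 40, 30, 20]
    print(n50(example_lengths))  # 70
```

In the toy example the total is 290 bp, so the threshold is 145 bp; the two longest sequences (80 + 70 = 150 bp) are enough to pass it, which gives an N50 of 70 bp.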
Triticeae Repeats from TREP were aligned to the A. tauschii genome using RepeatMasker.
Regulation and sequence alignments
RNA-Seq data, ESTs and UniGene datasets have also been aligned to the Aegilops tauschii genome:
- Aegilops tauschii 454 RNA-Seq data from the ENA studies listed under "Links (Aegilops tauschii)" were aligned using STAR.
- Wheat UniGene cluster sequence data were aligned using Exonerate, following the standard Ensembl pipeline.
- All publicly available wheat EST data were aligned using STAR.
Analysis of the bread wheat genome using comparative whole genome shotgun sequencing - Brenchley et al. 
The wheat genome assemblies previously generated by Brenchley et al. (PMID:23192148) have been aligned to the bread wheat survey sequence, Brachypodium, barley and the wild wheat progenitors (Triticum urartu and Aegilops tauschii). Homoeologous variants inferred between the three wheat genomes (A, B, and D) are displayed in the context of the gene models of these five genomes.
Sequences of diploid progenitor and ancestral species permitted homoeologous variants to be classified into two groups: 1) SNPs that differ between the A and D genomes (where the B genome is unknown), and 2) SNPs that are the same between the A and D genomes but differ in B.
The wheat gene alignments and the projected wheat SNPs are available on the Location view, as additional tracks under the "Wheat SNPs and alignments" section of the "Configure this page" menu.
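The two-group classification can be expressed as a small decision rule. The Python sketch below is only an illustration of that rule as described here, not the pipeline used by Brenchley et al.; the function name, the group labels, and the convention of passing `None` for an unknown B-genome allele are all hypothetical.

```python
from typing import Optional


def classify_homoeologous_snp(
    a_allele: str, d_allele: str, b_allele: Optional[str] = None
) -> Optional[str]:
    """Assign a variant to one of the two groups described in the text.

    a_allele, d_allele: alleles inferred for the A and D genomes.
    b_allele: allele inferred for the B genome, or None if unknown
              (hypothetical convention for this sketch).
    Returns "A_vs_D", "AD_vs_B", or None if neither group applies.
    """
    # Group 1: A and D differ; the B genome is treated as unknown.
    if a_allele != d_allele:
        return "A_vs_D"
    # Group 2: A and D agree, and a known B allele differs from them.
    if b_allele is not None and b_allele != a_allele:
        return "AD_vs_B"
    # A and D agree and B is unknown or identical: not a classified variant.
    return None


if __name__ == "__main__":
    print(classify_homoeologous_snp("A", "G"))       # A_vs_D
    print(classify_homoeologous_snp("A", "A", "G"))  # AD_vs_B
    print(classify_homoeologous_snp("A", "A", "A"))  # None
```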
Links (Aegilops tauschii)
- ENA study: SRP002455: Discovery of SNPs and genome-specific mutations by comparative analysis of transcriptomes of hexaploid wheat and its diploid ancestors
- ENA study: DRP000562: RNASeq from seedling leaves of Aegilops tauschii
- TREP, the Triticeae Repeat Sequence Database
Links (Triticum aestivum)
- MIPS Wheat Genome Database
- ENA study ERP000319: 454 pyrosequencing of the Triticum aestivum (bread wheat) genome to 5X coverage
- Triticum aestivum UniGene cluster sequences at NCBI
- Triticum aestivum ESTs at ENA
References
- Aegilops tauschii draft genome sequence reveals a gene repertoire for wheat adaptation.
Jia J, Zhao S, Kong X, Li Y, Zhao G, He W, Appels R, Pfeifer M, Tao Y, Zhang X et al. 2013. Nature. 496:91-95.
- Discovery of high-confidence single nucleotide polymorphisms from large-scale de novo analysis of leaf transcripts of Aegilops tauschii, a wild wheat progenitor.
Iehisa JC, Shimizu A, Sato K, Nasuda S, Takumi S. 2012. DNA Res. 19:487-497.
- Homoeolog-specific transcriptional bias in allopolyploid wheat.
Akhunova AR, Matniyazov RT, Liang H, Akhunov ED. 2010. BMC Genomics. 11:505.
Image credit: Mark Nesbitt [CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0) or GFDL (http://www.gnu.org/copyleft/fdl.html)], via Wikimedia Commons.
General information about this species can be found in Wikipedia.
| Assembly | ASM34733v1, INSDC Assembly GCA_000347335.1, Apr 2013 |
| Golden Path Length | 3,313,764,331 bp |
| Genebuild method | Imported from ENA |
| Data source | Beijing Genomics Institute |
| Non coding genes | 2,219 |
| Small non coding genes | 2,004 |
| Long non coding genes | 215 |