Triticum urartu Assembly and Gene Annotation

Wheat genomics resources are developed as part of our involvement in the Triticeae Genomics For Sustainable Agriculture consortium, funded by the BBSRC and led by TGAC.

About Triticum urartu

Triticum urartu is the diploid progenitor of the bread wheat A-genome, providing important evolutionary information for bread and durum wheat. It is closely related to einkorn wheat, T. monococcum.

Assembly

The genome of Triticum urartu accession G1812 was sequenced by the BGI using a whole-genome shotgun strategy and assembled using SOAPdenovo software. The genome assembly reached 3.92 Gbp with an N50 size of 3.42 kbp. After gap closure, the draft assembly was 4.66 Gbp, with a scaffold N50 length of 63.69 kbp.
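
For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed: the N50 is the length L such that sequences of length L or longer together contain at least half of all assembled bases. The function name `n50` and the toy lengths are illustrative assumptions, not part of the published assembly pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Illustrative helper only; not taken from the assembly pipeline.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)   # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running total covers half of
        # the assembly is the N50.
        if running >= half_total:
            return length


# Toy example with made-up lengths (total 290, half 145):
# 80 + 70 = 150 >= 145, so the N50 is 70.
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n50(toy_lengths))
```

The same function applies to contig and scaffold lengths alike; only the input list changes.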

The chloroplast genome and its gene annotation are also present. They were imported from ENA entry KC912693.

Annotation

34,879 protein-coding genes were predicted. For more details about genome sequencing and the protein-coding gene prediction protocol, see [1].

Non-coding RNA genes have been annotated using tRNAscan-SE (Lowe, T.M. and Eddy, S.R. 1997), Rfam (Griffiths-Jones et al. 2005), and RNAmmer (Lagesen, K. et al. 2007); additional analysis tools have also been applied.

Triticeae Repeats from TREP were aligned to the T. urartu genome using RepeatMasker.

Regulation and sequence alignment

RNA-Seq data, ESTs and UniGene datasets have also been aligned to the Triticum urartu genome:

Analysis of the bread wheat genome using comparative whole genome shotgun sequencing - Brenchley et al. [4]

The wheat genome assemblies previously generated by Brenchley et al. (PMID:23192148) have been aligned to the bread wheat survey sequence, Brachypodium, barley and the wild wheat progenitors (Triticum urartu and Aegilops tauschii). Homoeologous variants inferred between the three wheat genomes (A, B, and D) are displayed in the context of the gene models of these five genomes.

Sequences of diploid progenitor and ancestral species permitted homoeologous variants to be classified into two groups: 1) SNPs that differ between the A and D genomes (where the B genome is unknown), and 2) SNPs that are the same between the A and D genomes but differ in B.
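
The two-group rule above can be sketched in a few lines of Python. This is only an illustration of the classification as worded on this page, not the method used by Brenchley et al.; the function name, the group labels and the convention of passing `None` for an unknown B-genome base are all assumptions made for the example.

```python
from typing import Optional


def classify_homoeologous_snp(a: str, d: str, b: Optional[str]) -> Optional[str]:
    """Classify one site from its A, D and (optional) B genome bases.

    Hypothetical helper: labels and the None-for-unknown convention
    are assumptions made for this sketch.
    """
    a, d = a.upper(), d.upper()
    b = b.upper() if b else None

    # Group 1: A and D differ, and the B genome base is unknown.
    if a != d and b is None:
        return "group1_A_vs_D"
    # Group 2: A and D agree, and the known B base differs from them.
    if a == d and b is not None and b != a:
        return "group2_B_differs"
    # Anything else (no difference, or A != D with B known) falls
    # outside the two groups described in the text.
    return None


print(classify_homoeologous_snp("A", "G", None))   # group 1
print(classify_homoeologous_snp("C", "C", "T"))    # group 2
print(classify_homoeologous_snp("A", "A", "A"))    # not a variant
```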

The wheat gene alignments and the projected wheat SNPs are available on the Location view, as additional tracks under the "Wheat SNPs and alignments" section of the "Configure this page" menu.

Transcriptome assembly in diploid einkorn wheat Triticum monococcum - Fox et al. [5]

Genome-wide transcriptomes of two Triticum monococcum subspecies, the wild winter wheat T. monococcum ssp. aegilopoides (accession G3116) and the domesticated spring wheat T. monococcum ssp. monococcum (accession DV92), were constructed by generating de novo assemblies of RNA-Seq data derived from both etiolated and green seedlings. Assembled data are available from the Jaiswal lab, and raw reads are available from INSDC projects PRJNA203221 and PRJNA195398.

The de novo transcriptome assemblies of DV92 and G3116 represent 120,911 and 117,969 transcripts, respectively. They were mapped to the bread wheat, barley and Triticum urartu genomes using STAR.

Links

Links (Triticum urartu)

  • GigaDB
  • ENA study: SRP002455: Discovery of SNPs and genome-specific mutations by comparative analysis of transcriptomes of hexaploid wheat and its diploid ancestors
  • TREP, the Triticeae Repeat Sequence Database

References

  1. Draft genome of the wheat A-genome progenitor Triticum urartu.
    Ling HQ, Zhao S, Liu D, Wang J, Sun H, Zhang C, Fan H, Li D, Dong L, Tao Y et al. 2013. Nature. 496:87-90.
  2. Image credit: Mark Nesbitt [CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0) or GFDL (http://www.gnu.org/copyleft/fdl.html)], via Wikimedia Commons.
  3. Homoeolog-specific transcriptional bias in allopolyploid wheat.
    Akhunova AR, Matniyazov RT, Liang H, Akhunov ED. 2010. BMC Genomics. 11:505.

More information

General information about this species can be found in Wikipedia.

Statistics

Summary

Assembly: ASM34745v1, INSDC Assembly GCA_000347455.1, Apr 2013
Database version: 94.1
Base Pairs: 3,008,981,354
Golden Path Length: 3,747,163,292
Genebuild by: BGI
Genebuild method: Imported from ENA
Data source: Beijing Genomics Institute

Gene counts

Coding genes: 34,903
Non coding genes: 1,876
Small non coding genes: 1,664
Long non coding genes: 212
Gene transcripts: 36,779
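
As a quick sanity check, the gene counts above can be verified to be internally consistent. The sketch below uses only the figures from this table; the dictionary keys and the function name are illustrative assumptions. That the transcript total equals coding plus non-coding genes is an observation about these particular numbers, not a general rule.

```python
# Figures copied from the gene counts table; key names are illustrative.
gene_counts = {
    "coding": 34_903,
    "non_coding": 1_876,
    "small_non_coding": 1_664,
    "long_non_coding": 212,
    "transcripts": 36_779,
}


def check_gene_counts(counts):
    """Return which of the two simple consistency checks hold."""
    return {
        # Small and long non-coding genes should sum to all non-coding genes.
        "non_coding_breakdown": (
            counts["small_non_coding"] + counts["long_non_coding"]
            == counts["non_coding"]
        ),
        # In this release the transcript total matches coding + non-coding
        # genes, which would be the case with one transcript per gene.
        "transcripts_match_genes": (
            counts["coding"] + counts["non_coding"] == counts["transcripts"]
        ),
    }


print(check_gene_counts(gene_counts))
```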
