Triticum urartu (ASM34745v1)

Triticum urartu Assembly and Gene Annotation

Wheat genomics resources are developed as part of our involvement in the Triticeae Genomics For Sustainable Agriculture consortium, which is funded by the BBSRC and led by TGAC.

About Triticum urartu

Triticum urartu is the diploid progenitor of the bread wheat A-genome, providing important evolutionary information for bread and durum wheat. It is closely related to einkorn wheat, T. monococcum.


The genome of Triticum urartu accession G1812 was sequenced by the BGI using a whole-genome shotgun strategy and assembled using SOAPdenovo software. The genome assembly reached 3.92 Gb with an N50 size of 3.42 kb. After gap closure, the draft assembly was 4.66 Gb, with a scaffold N50 length of 63.69 kb.
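The N50 figures quoted above summarise assembly contiguity: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The following is a minimal sketch of that calculation in Python. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the T. urartu assembly or from the SOAPdenovo output.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Illustrative helper (hypothetical name): sort lengths from longest to
    shortest and return the length at which the running total first reaches
    half of the summed length.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy scaffold lengths (made-up values, not real assembly data).
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n50(toy_lengths))  # total is 290, half is 145; 80 + 70 = 150, so N50 is 70
```

A larger N50 means that more of the assembly sits in long sequences, which is why the scaffold N50 after gap closure (63.69 kb) is reported alongside the initial value (3.42 kb).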

The chloroplast genome component and its gene annotation are also present. Both were imported from ENA entry KC912693.


Ling et al. (2013) predicted 34,879 protein-coding genes using an ab initio approach combined with sequence similarity and RNA-seq data.

Additional analysis was carried out in-house: non-coding RNA genes were annotated using tRNAscan-SE, Rfam, and RNAmmer, and Triticeae repeats from TREP were aligned to the T. urartu genome using RepeatMasker.

Regulation and sequence alignment

RNA-seq data, ESTs and UniGene datasets have also been aligned to the Triticum urartu genome:

  • Triticum urartu 454 RNA-seq data from ENA study SRP002455, published by Akhunova et al., were aligned using STAR.
  • Wheat UniGene cluster sequence data were aligned using Exonerate, following the standard Ensembl pipeline.
  • All publicly available wheat EST data were aligned using STAR.

Analysis of the bread wheat genome using comparative whole genome shotgun sequencing

The wheat genome assemblies previously generated by Brenchley et al. have been aligned to the bread wheat survey sequence, Brachypodium, barley and the wild wheat progenitors (Triticum urartu and Aegilops tauschii). Homoeologous variants inferred between the three wheat genomes (A, B, and D) are displayed in the context of the gene models of these five genomes.

Sequences of diploid progenitor and ancestral species permitted homoeologous variants to be classified into two groups: 1) SNPs that differ between the A and D genomes (where the B genome is unknown), and 2) SNPs that are the same between the A and D genomes but differ in B.
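The two-group classification described above can be sketched as a simple decision rule. This is a hedged illustration only, not the method used by Brenchley et al.: the function name `classify_homoeologous_snp`, the group labels, and the use of `None` for an unknown B-genome base are all assumptions introduced for the example.

```python
from typing import Optional


def classify_homoeologous_snp(a: str, d: str, b: Optional[str]) -> Optional[str]:
    """Assign a site to one of the two groups described in the text.

    a, d: bases observed in the A and D genomes.
    b:    base observed in the B genome, or None when it is unknown.
    Returns a hypothetical group label, or None if the site fits neither group.
    """
    if a != d and b is None:
        # Group 1: A and D differ, B genome unknown.
        return "A_differs_from_D"
    if a == d and b is not None and b != a:
        # Group 2: A and D agree, B differs.
        return "B_differs_from_AD"
    return None


# Toy examples (made-up bases).
print(classify_homoeologous_snp("A", "G", None))  # A_differs_from_D
print(classify_homoeologous_snp("C", "C", "T"))   # B_differs_from_AD
print(classify_homoeologous_snp("C", "C", "C"))   # None
```

Sites that fit neither description, such as positions where all three genomes agree, are returned as `None` rather than being forced into a group.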

Transcriptome assembly in diploid einkorn wheat Triticum monococcum

Genome-wide transcriptomes of two Triticum monococcum subspecies were constructed by Fox et al.: the wild winter wheat T. monococcum ssp. aegilopoides (accession G3116) and the domesticated spring wheat T. monococcum ssp. monococcum (accession DV92). The transcriptomes were built by generating de novo assemblies of RNA-seq data derived from both etiolated and green seedlings. Assembled data are available from the Jaiswal lab, and raw reads are available from INSDC projects PRJNA203221 and PRJNA195398.

The de novo transcriptome assemblies of DV92 and G3116 represent 120,911 and 117,969 transcripts, respectively. They were mapped to the bread wheat, barley and Triticum urartu genomes using STAR.


  1. Draft genome of the wheat A-genome progenitor Triticum urartu.
    Ling HQ, Zhao S, Liu D, Wang J, Sun H, Zhang C, Fan H, Li D, Dong L, Tao Y et al. 2013. Nature. 496:87-90.
  2. Image credit: Mark Nesbitt CC-BY-SA-3.0
  3. Homoeolog-specific transcriptional bias in allopolyploid wheat.
    Akhunova AR, Matniyazov RT, Liang H, Akhunov ED. 2010. BMC Genomics. 11:505.

Links (Triticum urartu)

  • GigaDB
  • ENA study: SRP002455: Discovery of SNPs and genome-specific mutations by comparative analysis of transcriptomes of hexaploid wheat and its diploid ancestors
  • TREP, the Triticeae Repeat Sequence Database

More information

General information about this species can be found in Wikipedia.



Assembly: ASM34745v1, INSDC Assembly GCA_000347455.1, Apr 2013
Database version: 104.1
Golden Path Length: 3,747,163,292
Genebuild by: BGI
Genebuild method: Import
Data source: Beijing Genomics Institute

Gene counts

Coding genes: 34,903
Non coding genes: 1,876
Small non coding genes: 1,664
Long non coding genes: 212
Gene transcripts: 36,779