Triticum urartu Assembly and Gene Annotation
About Triticum urartu
Triticum urartu is the diploid progenitor of the bread wheat A-genome, providing important evolutionary information for bread and durum wheat. It is closely related to einkorn wheat, T. monococcum.
The genome of Triticum urartu accession G1812 was sequenced by BGI using a whole-genome shotgun strategy and assembled with SOAPdenovo. The genome assembly reached 3.92 Gb with an N50 size of 3.42 kb. After gap closure, the draft assembly was 4.66 Gb, with a scaffold N50 length of 63.69 kb.
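The N50 values quoted above summarise assembly contiguity: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly. The following is a minimal Python sketch of that calculation. The function name `n50` and the toy lengths are illustrative assumptions, not part of the published pipeline or data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together account for at least half of the total length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, not taken from the T. urartu assembly).
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n50(toy_lengths))  # prints 70: 80 + 70 = 150 >= 290 / 2
```

Because a few long sequences dominate the cumulative sum, joining contigs into scaffolds raises the N50 even when the number of assembled bases changes only modestly.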
The chloroplast genome component and its gene annotation are also present. These were imported from ENA entry KC912693.
Ling et al. (2013) predicted 34,879 protein-coding genes using an ab initio approach combined with sequence similarity and RNA-seq data.
Additional analysis was carried out in-house: non-coding RNA genes were annotated using tRNAScan-SE, Rfam, and RNAmmer, and Triticeae repeats from TREP were aligned to the T. urartu genome using RepeatMasker.
Regulation and sequence alignment
RNA-seq data, ESTs and UniGene datasets have also been aligned to the Triticum urartu genome:
- Triticum urartu 454 RNA-seq data from ENA study SRP002455, published by Akhunova et al. (2010), were aligned using STAR.
- Wheat UniGene cluster sequence data were aligned using Exonerate, following the standard Ensembl pipeline.
- All publicly available wheat EST data were aligned using STAR.
Analysis of the bread wheat genome using comparative whole genome shotgun sequencing
The wheat genome assemblies previously generated by Brenchley et al. have been aligned to the bread wheat survey sequence, Brachypodium, barley and the wild wheat progenitors (Triticum urartu and Aegilops tauschii). Homoeologous variants inferred between the three wheat genomes (A, B, and D) are displayed in the context of the gene models of these five genomes.
Sequences of diploid progenitor and ancestral species permitted homoeologous variants to be classified into two groups: (1) SNPs that differ between the A and D genomes (where the B genome is unknown), and (2) SNPs that are the same in the A and D genomes but differ in B.
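The two-group classification can be illustrated with a short Python sketch. It assumes each variant site is reduced to one base call per subgenome, with `None` standing for an unknown B call. The function name, the group labels, and the "unclassified" fallback are hypothetical and are not taken from the original analysis.

```python
from typing import Optional


def classify_homoeologous_snp(a: str, d: str, b: Optional[str] = None) -> str:
    """Assign a site to one of the two groups described in the text.

    a, d: base calls for the A and D genomes.
    b:    base call for the B genome, or None when it is unknown.
    """
    if a != d:
        # Group 1: A and D differ. B is not consulted here, because the
        # text describes the B genome as unknown for this group.
        return "group1_A_differs_from_D"
    if b is not None and b != a:
        # Group 2: A and D agree, but B carries a different base.
        return "group2_B_differs_from_A_and_D"
    # Assumption: sites that fit neither description are left unclassified.
    return "unclassified"


# Toy calls (hypothetical, for illustration only).
print(classify_homoeologous_snp("A", "G"))       # group 1
print(classify_homoeologous_snp("C", "C", "T"))  # group 2
print(classify_homoeologous_snp("C", "C", "C"))  # unclassified
```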
Transcriptome assembly in diploid einkorn wheat Triticum monococcum
Fox et al. constructed genome-wide transcriptomes of two Triticum monococcum subspecies, the wild winter wheat T. monococcum ssp. aegilopoides (accession G3116) and the domesticated spring wheat T. monococcum ssp. monococcum (accession DV92), by generating de novo assemblies of RNA-seq data derived from both etiolated and green seedlings. The assembled data are available from the Jaiswal lab, and the raw reads are available from INSDC projects PRJNA203221 and PRJNA195398.
- Draft genome of the wheat A-genome progenitor Triticum urartu.
Ling HQ, Zhao S, Liu D, Wang J, Sun H, Zhang C, Fan H, Li D, Dong L, Tao Y et al. 2013. Nature. 496:87-90.
- Image credit: Mark Nesbitt CC-BY-SA-3.0
- Homoeolog-specific transcriptional bias in allopolyploid wheat.
Akhunova AR, Matniyazov RT, Liang H, Akhunov ED. 2010. BMC Genomics. 11:505.
Links (Triticum urartu)
- ENA study: SRP002455: Discovery of SNPs and genome-specific mutations by comparative analysis of transcriptomes of hexaploid wheat and its diploid ancestors
- TREP, the Triticeae Repeat Sequence Database
Links (Triticum aestivum)
- MIPS Wheat Genome Database
- Triticum monococcum resources from the Jaiswal Lab at Oregon State University
- ENA study ERP000319: 454 pyrosequencing of the Triticum aestivum (bread wheat) genome to 5X coverage
- Triticum aestivum UniGene cluster sequences at NCBI
- Triticum aestivum ESTs at ENA
General information about this species can be found in Wikipedia.
| Assembly | ASM34745v1, INSDC Assembly GCA_000347455.1, Apr 2013 |
| Golden Path Length (bp) | 3,747,163,292 |
| Data source | Beijing Genomics Institute |
| Non-coding genes | 1,876 |
| Small non-coding genes | 1,664 |
| Long non-coding genes | 212 |