Hordeum vulgare Assembly and Gene Annotation

About Hordeum vulgare

Hordeum vulgare (barley) is the world's fourth most important cereal crop and an important model for ecological adaptation, having been cultivated in all temperate regions from the Arctic Circle to the tropics. It was one of the first domesticated cereal grains, originating in the Fertile Crescent over 10,000 years ago. About two-thirds of the global barley crop is used for animal feed, while the remaining third underpins the malting, brewing and distilling industries. Although human food is not its primary use, barley offers potential health benefits and is still the major calorie source in several parts of the world. Barley is a diploid member of the grass family, making it a natural model for the genetics and genomics of the Triticeae tribe, including polyploid wheat and rye. With a haploid genome size of ~5.3 Gbp in 7 chromosomes, it is one of the largest diploid genomes sequenced to date.

Assembly

The barley genome assembly presented here was produced by the International Barley Genome Sequencing Consortium (IBSC) [1] using a hierarchical approach. Initially, multiplexed short-read BAC-by-BAC contig assemblies (N50: 79 kbp) were scaffolded using physical, genetic and optical maps (N50: 1.9 Mbp) and assigned to chromosomes using a POPSEQ genetic map. Finally, the linear order and orientation of the scaffold sequences were determined using chromosome conformation capture sequencing (Hi-C) [2].
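The N50 values quoted above summarise assembly contiguity: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length. The sketch below illustrates that definition only. The function name `n50` and the toy lengths are hypothetical and are not taken from the IBSC pipeline or data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest and accumulated until
    the running total reaches at least half of the total length; the
    length of the sequence that crosses that threshold is the N50.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths in kbp (illustrative values, not barley data).
toy_contigs = [80, 70, 50, 40, 30, 20]
print(n50(toy_contigs))
```

A larger N50 after scaffolding, as in the step from 79 kbp to 1.9 Mbp, means that half of the assembled sequence lies in much longer pieces.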

The final chromosome-scale assembly consisted of 6,347 ordered super-scaffolds composed of merged assemblies of individual BACs, representing 4.79 Gbp (~95%) of the genomic sequence content, of which 4.54 Gbp have been assigned to precise chromosomal locations in the Hi-C map.
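As a quick check on the figures above, the share of the assembled sequence that has a precise chromosomal location can be derived from the two totals given in the text. This is plain arithmetic on the rounded values quoted on this page, so the result is approximate; the variable names are illustrative.

```python
# Totals quoted in the text, in Gbp (rounded values).
assembled_gbp = 4.79   # sequence in the ordered super-scaffolds
anchored_gbp = 4.54    # sequence placed precisely in the Hi-C map

# Fraction of the assembly with a precise chromosomal position.
anchored_fraction = anchored_gbp / assembled_gbp

# Sequence in the assembly without a precise position.
unanchored_gbp = assembled_gbp - anchored_gbp

print(f"Anchored: {anchored_fraction:.1%}")
print(f"Not precisely placed: {unanchored_gbp:.2f} Gbp")
```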

The chloroplast genome component and its gene annotation are also present (KC912687).

Annotation

Mapping of transcriptome data and reference protein sequences from other plant species identified 83,105 putative gene loci, including protein-coding genes, non-coding RNAs, pseudogenes and transcribed transposons. These loci were filtered and divided into 39,734 high-confidence and 41,949 low-confidence genes based on sequence homology. Additionally, 19,908 long non-coding RNAs and 792 microRNA precursor loci were predicted. Using a set of conserved eukaryotic core genes (BUSCO), it was estimated that the predicted gene models represent 98% of the cv. Morex barley gene complement.
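The locus counts above can be reconciled with a short calculation. The sketch assumes that the gap between the putative loci and the two confidence classes corresponds to loci dropped during filtering; the text says only that the loci were "filtered and divided", so that reading is an inference rather than a documented figure.

```python
# Counts quoted in the text.
putative_loci = 83_105
high_confidence = 39_734
low_confidence = 41_949

# Loci that ended up in one of the two confidence classes.
classified = high_confidence + low_confidence

# Assumption: the remainder was removed by the filtering step.
not_classified = putative_loci - classified

# Share of classified genes that are high-confidence.
high_confidence_share = high_confidence / classified

print(classified)
print(not_classified)
print(f"{high_confidence_share:.1%}")
```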

Regulation

Mappings for probes from the Barley1 GeneChip array, the Agilent barley full-length cDNA array, and the barley PGRC1 10k A and B array set can be viewed in the browser. For example, see the results for Contig2083_s_at.

Variation

Five sources of barley variation data are shown:

  1. Variation data from WGS survey sequencing of four cultivars (Barke, Bowman, Igri and Haruna Nijo) and a wild barley (H. spontaneum). The data were collected as described in [3].
  2. SNPs discovered from RNA-Seq performed on the embryo tissues of 9 spring barley varieties (Barke, Betzes, Bowman, Derkado, Intro, Optic, Quench, Sergeant and Tocada) and Morex using Illumina HiSeq 2000 [3].
  3. Approximately five million variations from population sequencing of 90 Morex x Barke individuals [4].
  4. Approximately six million variations from population sequencing of 84 Oregon Wolfe barley individuals [4].
  5. SNPs from the Illumina iSelect 9k barley SNP chip [6]. Approximately 2,600 mapped genetic markers associated with these SNPs [5] are also displayed.

References

  1. A chromosome conformation capture ordered sequence of the barley genome.
    Mascher M, Gundlach H, Himmelbach A, Beier S, Twardziok SO, Wicker T, Radchuk V, Dockter C, Hedley PE, Russell J et al. 2017. Nature. 544:427-433.
  2. Comprehensive mapping of long-range interactions reveals folding principles of the human genome.
    Lieberman-Aiden E, van Berkum NL, Williams L, Imakaev M, Ragoczy T, Telling A, Amit I, Lajoie BR, Sabo PJ, Dorschner MO et al. 2009. Science. 326:289-93.
  3. A physical, genetic and functional sequence assembly of the barley genome.
    International Barley Genome Sequencing Consortium, Mayer KF, Waugh R, Brown JW, Schulman A, Langridge P, Platzer M, Fincher GB, Muehlbauer GJ, Sato K et al. 2012. Nature. 491:711-716.
  4. A sequence-ready physical map of barley anchored genetically by two million single-nucleotide polymorphisms.
    Ariyadasa R, Mascher M, Nussbaumer T, Schulte D, Frenkel Z, Poursarebani N, Zhou R, Steuernagel B, Gundlach H, Taudien S et al. 2014. Plant Physiol. 164
  5. Anchoring and ordering NGS contig assemblies by population sequencing (POPSEQ).
    Mascher M, Muehlbauer GJ, Rokhsar DS, Chapman J, Schmutz J, Barry K, Muñoz-Amatriaín M, Close TJ, Wise RP, Schulman AH et al. 2013. Plant J. 76:718-727.

Picture credit: Lucash (Own work) [CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0) or GFDL (http://www.gnu.org/copyleft/fdl.html)], via Wikimedia Commons

More information

General information about this species can be found in Wikipedia.

Statistics

Summary

Assembly: IBSC v2
Database version: 89.3
Base Pairs: 8,059,148,479
Golden Path Length: 4,833,907,081
Genebuild by: IBSC_1.0
Genebuild method: Import
Data source: IBSC

Gene counts

Coding genes: 39,809
Non coding genes: 36
Small non coding genes: 36
Gene transcripts: 248,381

Other

Short Variants: 16,552,953
