Avena sativa Sang Assembly and Gene Annotation
About Avena sativa cv. Sang
Cultivated oat (Avena sativa L.) is an allohexaploid (AACCDD, 2n = 6x = 42) thought to have been domesticated more than 3,000 years ago while growing as a weed in wheat, emmer and barley fields in Anatolia. Oat has a low carbon footprint, substantial health benefits and the potential to replace animal-based food products. Oat is a member of Poaceae, an economically important grass family that includes wheat, rice, barley, common millet, maize, sorghum and sugarcane. Avena species exist in nature as diploids, tetraploids and hexaploids and exhibit the greatest genetic diversity around the Mediterranean, Middle East, Canary Islands and Himalayas. Currently, oat is a global crop with production ranking seventh among cereals (http://www.fao.org/faostat/en/, accessed May 2021). Compared with other cereals, oat requires fewer treatments with insecticides, fungicides or fertilizers during cultivation. Whole-grain oats are a healthy source of antioxidants, polyunsaturated fatty acids, proteins and dietary fibre such as β-glucan, which is important in post-meal glycaemic responses and for preventing cardiovascular disease. Cereals such as wheat, barley and rye store high amounts of gluten proteins in their grain; by contrast, oat and rice store globular proteins in their grain.
Assembly
We produced a chromosome-scale reference sequence of oat cv. ‘Sang’ comprising 21 pseudochromosomes, with a BUSCO score of 98.7%, following the short-read strategy used for wheat, barley and rye. The integrity of the assembly was verified by inspecting Hi-C contact matrices and the consensus genetic map, and by comparison with the independent long-read assembly of hexaploid oat OT3098.
Annotation
Gene models were predicted in the oat genome using an automated annotation pipeline, assisted by RNA-sequencing (RNA-seq) and Iso-seq transcriptome data, protein homology and ab initio prediction. This yielded 80,608 high-confidence protein-coding loci (98.5% BUSCO), 83.5% of which showed evidence of transcription in at least one condition. Another 71,727 low-confidence protein-coding loci primarily represent gene fragments, pseudogenes and gene models with weak support. The overall amount and composition of transposable elements is very similar between the Sang and OT3098 assemblies.
- The mosaic oat genome gives insights into a uniquely healthy cereal crop.
Kamal N, Tsardakas Renhuldt N, Bentzer J, Gundlach H, Haberer G, Juhász A, Lux T, Bose U, Tye-Din JA, Lang D, van Gessel N, Reski R, Fu YB, Spégel P, Ceplitis A, Himmelbach A, Waters AJ, Bekele WA, Colgrave ML, Hansson M, Stein N, Mayer KFX, Jellen EN, Maughan PJ, Tinker NA, Mascher M, Olsson O, Spannagl M, Sirijovski N. Nature 606 (7912)
More information
General information about this species can be found in Wikipedia.
Statistics
Summary
Assembly | Asativa_sang.v1.1, INSDC Assembly GCA_910574605.1 |
Database version | 113.1 |
Golden Path Length | 11,012,379,496 bp |
Genebuild by | IPK-Gatersleben |
Genebuild method | External annotation import |
Data source | IPK-Gatersleben |
Gene counts
Coding genes | 80,606 |
Pseudogenes | 2 |
Gene transcripts | 94,976 |
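
The figures reported on this page can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of any Ensembl or IPK tooling: the constant names and the `summarize` function are hypothetical, and the values are copied by hand from the text and tables above. Two assumptions are worth flagging. First, the mean pseudochromosome length treats the whole golden path as if it sat on the 21 pseudochromosomes, which overstates the mean if unplaced scaffolds are included. Second, the coding-gene and pseudogene counts sum to the number of high-confidence loci, but whether the two pseudogenes were reclassified from that set is an inference, not something the page states.

```python
# Figures copied by hand from the Assembly, Annotation and Statistics sections.
HIGH_CONFIDENCE_LOCI = 80_608     # high-confidence protein-coding loci
TRANSCRIBED_FRACTION = 0.835      # share with evidence of transcription
CODING_GENES = 80_606             # "Coding genes" in the gene counts
PSEUDOGENES = 2                   # "Pseudogenes" in the gene counts
TRANSCRIPTS = 94_976              # "Gene transcripts" in the gene counts
GOLDEN_PATH_BP = 11_012_379_496   # golden path length in base pairs
PSEUDOCHROMOSOMES = 21            # from the Assembly section
SUBGENOMES = 3                    # A, C and D (AACCDD)


def summarize():
    """Derive a few summary values from the reported figures (hypothetical helper)."""
    imported_total = CODING_GENES + PSEUDOGENES
    return {
        # Coding genes plus pseudogenes in the imported gene set.
        "imported_gene_total": imported_total,
        # True if that total equals the high-confidence locus count.
        "matches_high_confidence": imported_total == HIGH_CONFIDENCE_LOCI,
        # Average number of transcripts per coding gene.
        "transcripts_per_coding_gene": TRANSCRIPTS / CODING_GENES,
        # Approximate number of loci with transcription evidence.
        "transcribed_loci_estimate": round(HIGH_CONFIDENCE_LOCI * TRANSCRIBED_FRACTION),
        # 21 pseudochromosomes split evenly over three subgenomes.
        "chromosomes_per_subgenome": PSEUDOCHROMOSOMES // SUBGENOMES,
        # Rough mean length; assumes the golden path is entirely pseudochromosomes.
        "mean_bp_per_pseudochromosome": GOLDEN_PATH_BP // PSEUDOCHROMOSOMES,
        # Genome size in gigabases.
        "genome_size_gb": GOLDEN_PATH_BP / 1e9,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value}")
```

Read this way, the reported counts are internally consistent: the imported gene set has roughly 1.18 transcripts per coding gene, and each of the three subgenomes contributes seven pseudochromosomes, matching 2n = 6x = 42.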