Secale cereale Assembly and Gene Annotation
About Secale cereale
Rye (Secale cereale L.), a member of the grass tribe Triticeae and close relative of wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.), is grown primarily for human consumption and animal feed. Rye is uniquely stress tolerant (biotic and abiotic) and thus shows high yield potential under marginal conditions. This makes rye an important crop along the northern boreal-hemiboreal belt, a climatic zone predicted to expand considerably in Eurasia and North America with anthropogenic global warming [1]. Currently, rye is produced on 4.1 million ha, 81% of which is in northeastern Europe.
Assembly
De novo scaffolds representing 6.74 Gb of the estimated 7.9 Gb ‘Lo7’ genome were assembled from >1.8 Tb of short-read sequence. These scaffolds were ordered, oriented and curated using: (1) chromosome-specific shotgun (CSS) reads [8], (2) 10x Chromium linked reads, (3) genetic map markers [9], (4) three-dimensional chromosome conformation capture sequencing (Hi-C) [18] and (5) a Bionano optical genome map. After intensive manual curation, 92% of this assembled sequence (~78% of the estimated genome size) was arranged first into super-scaffolds (N50 > 29 Mb) and then into pseudomolecules. Shotgun reads (~947 Gb of data, ~120× mean depth of coverage) were mapped back to the assembly to confirm a near-unimodal coverage distribution consistent with a high-quality assembly.
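The percentages and coverage figure quoted above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function and variable names are hypothetical, and it assumes that the ~120× mean depth is expressed relative to the estimated 7.9 Gb genome size rather than the 6.74 Gb assembly.

```python
# Figures quoted in the Assembly paragraph (all sizes in gigabases, Gb).
ESTIMATED_GENOME_GB = 7.9           # estimated 'Lo7' genome size
ASSEMBLED_SCAFFOLDS_GB = 6.74       # total length of de novo scaffolds
FRACTION_IN_PSEUDOMOLECULES = 0.92  # share of assembled sequence that was placed
MAPPED_SHOTGUN_GB = 947.0           # shotgun data mapped back to the assembly


def assembly_summary(genome_gb, scaffolds_gb, placed_fraction, reads_gb):
    """Derive the ratios implied by the quoted assembly figures."""
    placed_gb = scaffolds_gb * placed_fraction
    return {
        "scaffold_share_of_genome": scaffolds_gb / genome_gb,
        "placed_gb": placed_gb,
        "placed_share_of_genome": placed_gb / genome_gb,
        # Assumption: depth is computed against the estimated genome size.
        "mean_depth": reads_gb / genome_gb,
    }


summary = assembly_summary(
    ESTIMATED_GENOME_GB,
    ASSEMBLED_SCAFFOLDS_GB,
    FRACTION_IN_PSEUDOMOLECULES,
    MAPPED_SHOTGUN_GB,
)

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the scaffolds cover roughly 85% of the estimated genome, about 6.2 Gb is placed in pseudomolecules (close to the quoted ~78% of the genome), and the mean depth comes out near 120×.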
Annotation
De novo annotation yielded 34,441 high-confidence (HC) genes, including 96.4% of the BUSCO (v.3) near-universal single-copy ortholog set; 19,456 full-length DNA long terminal repeat (LTR) retrotransposons (LTR-RTs) from six transposon families; 13,238 putative microRNAs (miRNAs) in 90 miRNA families; and 1,382,323 tandem repeat arrays.
- Chromosome-scale genome assembly provides insights into rye biology, evolution and agronomic potential.
Rabanus-Wallace MT, Hackauf B, Mascher M, Lux T, Wicker T, Gundlach H, Baez M, Houben A, Mayer KFX, Guo L, Poland J, Pozniak CJ, Walkowiak S, Melonek J, Praz CR, Schreiber M, Budak H, Heuberger M, Steuernagel B, Wulff B, Börner A, Byrns B, Čížková J, Fowler DB, Fritz A, Himmelbach A, Kaithakottil G, Keilwagen J, Keller B, Konkin D, Larsen J, Li Q, Myśków B, Padmarasu S, Rawat N, Sesiz U, Biyiklioglu-Kaya S, Sharpe A, Šimková H, Small I, Swarbreck D, Toegelová H, Tsvetkova N, Voylokov AV, Vrána J, Bauer E, Bolibok-Bragoszewska H, Doležel J, Hall A, Jia J, Korzun V, Laroche A, Ma XF, Ordon F, Özkan H, Rakoczy-Trojanowska M, Scholz U, Schulman AH, Siekmann D, Stojałowski S, Tiwari VK, Spannagl M, Stein N. Nat Genet 53(4)
Picture credit: Wikipedia
Statistics
Summary
Assembly | Rye_Lo7_2018_v1p1p1, INSDC Assembly GCA_902687465.1, Jan 2021 |
Database version | 113.1 |
Golden Path Length | 6,735,227,109 |
Genebuild method | External annotation import |
Data source | IPK |
Gene counts
Coding genes | 34,441 |
Gene transcripts | 34,441 |