Chenopodium quinoa (ASM168347v1)

Chenopodium quinoa Assembly and Gene Annotation

About Chenopodium quinoa

Chenopodium quinoa (Quinoa) is a highly nutritious crop that is adapted to thrive in a wide range of agroecosystems. It was presumably first domesticated more than 7,000 years ago by pre-Columbian cultures and was known as the Inca ‘mother grain’. It is an allotetraploid (2n=4x=36). Quinoa has adapted to the high plains of the Andean Altiplano (>3,500 m above sea level), where it has developed tolerance to several abiotic stresses. Quinoa has gained international attention because of the nutritional value of its seeds, which are gluten-free, have a low glycaemic index, and contain an excellent balance of essential amino acids, fibre, lipids, carbohydrates, vitamins, and minerals. It has the potential to provide a highly nutritious food source that can be grown on marginal lands not currently suitable for other major crops. This genome corresponds to coastal Chilean quinoa accession PI 614886, also known as NSL 106399 and QQ74.

Assembly

DNA extracted from leaf and flower tissue of a single plant was sequenced and assembled using single-molecule real-time technology from Pacific Biosciences and optical and chromosome-contact maps from BioNano Genomics and Dovetail Genomics. The assembly contains 3,486 scaffolds, with a scaffold N50 of 3.84 Mb and 90% of the assembled genome contained in 439 scaffolds. The total assembly size of 1.39 Gb is similar to the reported size estimates of the quinoa genome (1.45–1.50 Gb). To combine scaffolds into pseudomolecules, an existing linkage map from quinoa was integrated with two new linkage maps. The resulting map of 6,403 unique markers spans a total length of 2,034 cM and consists of 18 linkage groups, corresponding to the haploid chromosome number of quinoa.
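The scaffold statistics quoted above (a scaffold N50 of 3.84 Mb, and 90% of the assembly in 439 scaffolds) follow a standard definition: sort scaffolds from longest to shortest and walk down the list until the running total reaches the chosen fraction of the assembly. The sketch below illustrates that definition only. The function name `nx` and the toy scaffold lengths are hypothetical, since the per-scaffold lengths of ASM168347v1 are not listed on this page.

```python
def nx(lengths, fraction=0.5):
    """Return (Nx, Lx) for a list of scaffold lengths.

    Nx is the length of the scaffold at which the cumulative sum of
    scaffolds, sorted longest first, first reaches `fraction` of the
    total assembly size. Lx is the number of scaffolds needed to get there.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)

    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count


# Toy scaffold lengths (hypothetical, NOT the quinoa assembly).
toy_scaffolds = [100, 80, 60, 40, 20]

n50, l50 = nx(toy_scaffolds, 0.5)
n90, l90 = nx(toy_scaffolds, 0.9)
print(f"N50 = {n50} (L50 = {l50}), N90 = {n90} (L90 = {l90})")
```

Applied to the real scaffold lengths, `nx(lengths, 0.5)` would be expected to return the N50 reported above, and the second element of `nx(lengths, 0.9)` would correspond to the 439-scaffold figure.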

Annotation

Protein-coding genes were annotated using a combination of ab initio prediction and transcript evidence gathered from RNA sequenced from multiple tissues using both RNA-seq and PacBio isoform sequencing approaches. The number of gene models obtained is in line with that of other sequenced tetraploid species. A majority (97.3%) of the 956 genes in the Plantae BUSCO dataset were identified in the annotation, which suggests a complete assembly and annotation.

Repeats were annotated with the Ensembl Genomes repeat feature pipeline. There are: 2,286,474 low-complexity (Dust) features, covering 99 Mb (7.4% of the genome); 331,989 RepeatMasker features (with the nrTEplants library), covering 142 Mb (10.7% of the genome); 988,319 tandem repeat (TRF) features, covering 314 Mb (23.5% of the genome); and Repeat Detector repeats with a total length of 820 Mb (61.5% of the genome).
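The percentages above can be cross-checked against the Golden Path Length given in the Statistics section (1,333,398,936 bp). The sketch below assumes that this length is the denominator used for "% of the genome" and that 1 Mb means 1,000,000 bp. Both are assumptions, and the helper name `percent_of_genome` is hypothetical. Because the Mb figures are rounded, a recomputed percentage can differ from the reported one in the last digit.

```python
GOLDEN_PATH_BP = 1_333_398_936  # Golden Path Length from the Statistics section

# Repeat class -> (covered Mb, reported % of genome), as listed on this page.
REPEATS = {
    "Low complexity (Dust)": (99, 7.4),
    "RepeatMasker (nrTEplants)": (142, 10.7),
    "Tandem repeats (TRF)": (314, 23.5),
    "Repeat Detector": (820, 61.5),
}


def percent_of_genome(covered_mb, genome_bp=GOLDEN_PATH_BP):
    """Convert a covered length in Mb to a percentage of the genome length."""
    return 100.0 * covered_mb * 1_000_000 / genome_bp


recomputed = {name: percent_of_genome(mb) for name, (mb, _) in REPEATS.items()}

for name, (mb, reported) in REPEATS.items():
    print(f"{name}: {mb} Mb -> {recomputed[name]:.2f}% (reported {reported}%)")
```

The four reported percentages add up to more than 100%, so the repeat classes presumably overlap and should not be summed to estimate the total repeat content.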

References

  1. The genome of Chenopodium quinoa.
    Jarvis DE, Ho YS, Lightfoot DJ, Schmöckel SM, Li B, Borm TJ, Ohyanagi H, Mineta K, Michell CT, Saber N, Kharbatia NM, Rupper RR, Sharp AR, Dally N, Boughton BA, Woo YH, Gao G, Schijlen EG, Guo X, Momin AA, Negrão S, Al-Babili S, Gehring C, Roessner U, Jung C, Murphy K, Arold ST, Gojobori T, Linden CG, van Loo EN, Jellen EN, Maughan PJ, Tester M.

Picture credit: Michael Hermann CC BY-SA 4.0

More information

General information about this species can be found in Wikipedia.

Statistics

Summary

Assembly: ASM168347v1, INSDC Assembly GCA_001683475.1
Database version: 111.1
Golden Path Length: 1,333,398,936
Genebuild by: ChenopodiumDB
Genebuild method: External annotation import
Data source: King Abdullah University of Science and Technology

Gene counts

Coding genes: 43,952
Gene transcripts: 43,952