Oryza glaberrima (Oryza_glaberrima_V1)

Oryza glaberrima Assembly and Gene Annotation

About Oryza glaberrima

Oryza glaberrima (African rice) is a cultivated grain distinct from its better-known cousin Oryza sativa (Asian rice). African rice was independently domesticated ~3000 years ago in the Niger River Delta from its still-extant progenitor, Oryza barthii. While lacking many of the agronomic and quality traits found in Asian rice, O. glaberrima is significant for its resistance to many pests and diseases and for its tolerance of drought and infertile soils. Interspecific crosses between African and Asian rice have produced cultivars with improved yield and quality traits that have been adopted by many African countries to meet the growing need for rice as a staple food. From a scientific perspective, comparing the genome of O. glaberrima with that of O. sativa reveals commonalities and differences that provide insight into the genetic basis of domestication and other traits. Like Asian rice, African rice has a diploid A-type genome, with 12 chromosomes and an estimated size of ~358 Mb.

Assembly

The genome sequence was generated and assembled by the Arizona Genomics Institute (AGI) using strain IRGC:96717. The current assembly is Oryza_glaberrima_AGI1.1. It incorporates the previously assembled chromosome 3 short arm (Chr3s) sequence and consists of 12 chromosome pseudomolecules and 1,939 unplaced scaffolds. Chr3s was sequenced and assembled using a heavily manually edited physical map. BAC clones were shotgun sequenced with Sanger technology to 8x coverage and finished to phase II. The tile sequence was assembled manually. The rest of the genome was sequenced with a hybrid BAC-pooling and whole-genome shotgun approach, to 30x coverage, using Roche GSFLX 454 Titanium sequencing technology. These sequences were assembled and combined with a subset of previously sequenced BAC clones to produce a whole-genome assembly. The underlying scaffolds have been deposited in GenBank under accession number ADWL01000000.

Annotation

Protein-coding genes were annotated by the Munich Information Center for Protein Sequences (MIPS), led by Klaus Mayer, using an evidence-based approach. Annotation of repeats and transposable elements, as well as prediction of ncRNA and tRNA genes, was conducted at AGI.

Variation

Variation data comes from two (unpublished) sources:

  1. 20 diverse accessions of Oryza glaberrima and
  2. 19 accessions of its wild progenitor, Oryza barthii, collected from geographically distributed regions of Africa.

Briefly, whole-genome shotgun (WGS) reads were generated using low-coverage Illumina sequencing. Filtered reads were aligned to the O. glaberrima reference assembly using BWA, and SNPs were called using a combination of SAMtools and GATK with standard quality and coverage filters, giving a final set of ~8 million SNPs.
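As an illustration of the final filtering step only, the following is a minimal Python sketch of applying quality and coverage filters to VCF records. The thresholds (MIN_QUAL, MIN_DEPTH, MAX_DEPTH) and the function names are hypothetical; the cutoffs actually used for this data set are not stated here, and the real filtering was done with SAMtools and GATK rather than with this code.

```python
# Hypothetical thresholds: the actual cutoffs used for this data set are not documented here.
MIN_QUAL = 30.0
MIN_DEPTH = 5
MAX_DEPTH = 200

BASES = {"A", "C", "G", "T"}


def parse_info(info_field):
    """Turn a VCF INFO string such as 'DP=12;AF=0.5' into a dict of strings."""
    info = {}
    for item in info_field.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            info[key] = value
    return info


def passes_filters(vcf_line):
    """Return True if a VCF data line is a single-base, biallelic SNP
    whose QUAL and INFO/DP values fall within the thresholds."""
    fields = vcf_line.rstrip("\n").split("\t")
    ref, alt, qual = fields[3], fields[4], fields[5]
    # Keep only A/C/G/T -> A/C/G/T substitutions with one alternate allele.
    if ref not in BASES or alt not in BASES:
        return False
    if qual == "." or float(qual) < MIN_QUAL:
        return False
    depth = parse_info(fields[7]).get("DP")
    if depth is None:
        return False
    return MIN_DEPTH <= int(depth) <= MAX_DEPTH


def filter_vcf(lines):
    """Yield header lines unchanged, plus data lines that pass the filters."""
    for line in lines:
        if line.startswith("#") or passes_filters(line):
            yield line
```

In use, filter_vcf would be given the lines of an uncompressed VCF file, and the yielded lines written to a new file. Upper depth limits are commonly applied alongside lower ones because unusually deep coverage often indicates collapsed repeats.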

These unpublished data were kindly contributed by Rod Wing of the Arizona Genomics Institute and collaborator Carlos Machado of the University of Maryland, as part of the Oryza Genome Evolution project funded by NSF Award #1026200.

References

  1. Genomes of 13 domesticated and wild rice relatives highlight genetic conservation, turnover and innovation across the genus Oryza.
    Stein JC, Yu Y, Copetti D, Zwickl DJ, Zhang L, Zhang C, Chougule K, Gao D, Iwata A, Goicoechea JL, Wei S, Wang J, Liao Y, Wang M, Jacquemin J, Becker C, Kudrna D, Zhang J, Londono CEM, Song X, Lee S, Sanchez P, Zuccolo A, Ammiraju JSS, Talag J, Danowitz A, Rivera LF, Gschwend AR, Noutsos C, Wu CC, Kao SM, Zeng JW, Wei FJ, Zhao Q, Feng Q, El Baidouri M, Carpentier MC, Lasserre E, Cooke R, Rosa Farias DD, da Maia LC, Dos Santos RS, Nyberg KG, McNally KL, Mauleon R, Alexandrov N, Schmutz J, Flowers D, Fan C, Weigel D, Jena KK, Wicker T, Chen M, Han B, Henry R, Hsing YC, Kurata N, de Oliveira AC, Panaud O, Jackson SA, Machado CA, Sanderson MJ, Long M, Ware D, Wing RA.
  2. The International Oryza Map Alignment Project: development of a genus-wide comparative genomics platform to help solve the 9 billion-people question.
    Jacquemin J, Bhatia D, Singh K, Wing RA.
  3. Genetic diversity and domestication history of African rice (Oryza glaberrima) as inferred from multiple gene sequences.
    Li ZM, Zheng XM, Ge S. 2011. Theor. Appl. Genet. 123:21-31.
  4. Rice structural variation: a comparative analysis of structural variation between rice and three of its closest relatives in the genus Oryza.
    Hurwitz BL, Kudrna D, Yu Y, Sebastian A, Zuccolo A, Jackson SA, Ware D, Wing RA, Stein L. 2010. Plant J. 63:990-1003.
  5. Patterns of sequence divergence and evolution of the S orthologous regions between Asian and African cultivated rice species.
    Guyot R, Garavito A, Gavory F, Samain S, Tohme J, Ghesquière A, Lorieux M. 2011. PLoS ONE. 6:e17726.
  6. Exceptional lability of a genomic complex in rice and its close relatives revealed by interspecific and intraspecific comparison and population analysis.
    Tian Z, Yu Y, Lin F, Yu Y, Sanmiguel PJ, Wing RA, McCouch SR, Ma J, Jackson SA. 2011. BMC Genomics. 12:142.
  7. Distinct evolutionary patterns of Oryza glaberrima deciphered by genome sequencing and comparative analysis.
    Sakai H, Ikawa H, Tanaka T, Numa H, Minami H, Fujisawa M, Shibata M, Kurita K, Kikuta A, Hamada M et al. 2011. Plant J. 66:796-805.
  8. Orthologous comparisons of the Hd1 region across genera reveal Hd1 gene lability within diploid Oryza species and disruptions to microsynteny in Sorghum.
    Sanyal A, Ammiraju JS, Lu F, Yu Y, Rambo T, Currie J, Kollura K, Kim HR, Chen J, Ma J et al. 2010. Mol. Biol. Evol. 27:2487-2506.
  9. Paleogenomic analysis of the short arm of chromosome 3 reveals the history of the African and Asian progenitors of cultivated rices.
    Roulin A, Chaparro C, Piégu B, Jackson S, Panaud O. 2010. Genome Biol Evol. 2:132-139.

Gramene/Ensembl Genomes Annotation

Additional annotations generated by the Gramene and Ensembl Plants project include:

  • Gene phylogenetic trees with other Gramene species.
  • LastZ Whole Genome Alignment to Arabidopsis thaliana, Oryza sativa Japonica (IRGSP v1) and other Oryza AA genomes.
  • Orthologue-based DAGchainer synteny detection against other AA genomes.
  • Mapping of multiple sequence-based feature sets to the genome using the Gramene BLAT pipeline.
  • Identification of various repeat features by programs such as RepeatMasker (with MIPS and AGI repeat libraries), Dust and TRF.
  • Variation effect prediction with sequence ontology.

More information

General information about this species can be found in Wikipedia.

Statistics

Summary

Assembly: Oryza_glaberrima_V1, INSDC Assembly GCA_000147395.1, Sep 2010
Database version: 111.2
Golden Path Length: 316,419,574
Genebuild by: OGE
Genebuild method: Import
Data source: Oryza Genome Evolution Project

Gene counts

Coding genes: 33,164
Non coding genes: 966
Small non coding genes: 852
Long non coding genes: 114
Gene transcripts: 34,130

Other

FGENESH gene prediction: 27,943
Short Variants: 7,704,409
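
The figures in this Statistics section can be cross-checked with simple arithmetic. The short Python sketch below is illustrative only: the variable names are made up for this example, the ~358 Mb value is the estimated genome size quoted in the About section, and the variant spacing is a crude average over the assembled sequence that assumes variants are spread evenly, which they generally are not.

```python
# Values copied from the Statistics section; variable names are illustrative.
coding_genes = 33_164
non_coding_genes = 966
small_non_coding_genes = 852
long_non_coding_genes = 114
gene_transcripts = 34_130
golden_path_length_bp = 316_419_574
short_variants = 7_704_409

# Estimated genome size (~358 Mb) as quoted in the About section.
estimated_genome_size_bp = 358_000_000

# Small and long non-coding genes together account for all non-coding genes.
non_coding_sum = small_non_coding_genes + long_non_coding_genes

# The transcript count equals the total gene count, which is consistent
# with one transcript per gene in this imported gene build.
total_genes = coding_genes + non_coding_genes

# Share of the estimated genome size represented in the assembly.
assembled_fraction = golden_path_length_bp / estimated_genome_size_bp

# Average number of assembled bases per recorded short variant.
bp_per_variant = golden_path_length_bp / short_variants

print(f"non-coding genes (small + long): {non_coding_sum:,}")
print(f"total genes: {total_genes:,}; transcripts: {gene_transcripts:,}")
print(f"assembled fraction of estimated genome: {assembled_fraction:.1%}")
print(f"average spacing: one short variant per {bp_per_variant:.0f} bp")
```

The assembled fraction comes out a little under 90%, so roughly a tenth of the estimated genome is not represented in the golden path.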