Oryza glaberrima Assembly and Gene Annotation
About Oryza glaberrima
Oryza glaberrima (African rice) is a cultivated grain distinct from its better-known cousin Oryza sativa (Asian rice). African rice was independently domesticated ~3000 years ago in the Niger River Delta from its still-extant progenitor, Oryza barthii. While lacking many of the agronomic and quality traits found in Asian rice, O. glaberrima is significant for its resistance to many pests and diseases and for its tolerance of drought and infertile soils. Interspecific crosses between African and Asian rice have produced cultivars with improved yield and quality traits that have been adopted by many African countries to meet the growing need for rice as a staple food. From a scientific perspective, comparing the genome of O. glaberrima with that of O. sativa reveals commonalities and differences that provide insight into the genetic basis of domestication and other traits. Like Asian rice, African rice has a diploid A-type genome with 12 chromosomes and an estimated size of ~358 Mb.
Assembly
The genome sequence was generated and assembled by the Arizona Genomics Institute (AGI) using strain IRGC:96717. The current assembly is Oryza_glaberrima_AGI1.1. It incorporates the previously assembled chromosome 3 short arm (Chr3s) sequence and consists of 12 chromosome pseudomolecules and 1,939 unplaced scaffolds. Chr3s was sequenced and assembled using a heavily manually edited physical map. BAC clones were shotgun Sanger sequenced to 8x coverage and finished to phase II. Assembly of the tile sequence was performed manually. The rest of the genome was sequenced with a hybrid BAC pooling and whole genome shotgun approach, to 30x coverage, using Roche GS FLX 454 Titanium sequencing technology. These sequences were assembled and combined with a subset of previously sequenced BAC clones to produce a whole genome assembly. The underlying scaffolds have been deposited in GenBank under accession number ADWL01000000.
Annotation
Protein-coding genes were annotated by the Munich Information Center for Protein Sequences (MIPS) led by Klaus Meyer using an evidence-based approach. Annotation of repeats and transposable elements, and prediction of ncRNA and tRNA genes, were conducted at AGI.
Variation
Variation data comes from two (unpublished) sources:
- 20 diverse accessions of Oryza glaberrima and
- 19 accessions of its wild progenitor, Oryza barthii, collected from geographically distributed regions of Africa.
Briefly, whole genome shotgun (WGS) reads were generated using low-coverage Illumina sequencing. Filtered reads were aligned to the O. glaberrima reference using BWA, and SNPs were called using a combination of SAMtools and GATK with standard quality and coverage filters, giving a final set of ~8 million SNPs.
These unpublished data were kindly contributed by Rod Wing of the Arizona Genomics Institute and collaborator Carlos Machado of the University of Maryland, as part of the Oryza Genome Evolution project funded by NSF Award #1026200.
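The quality and coverage filtering step mentioned above can be illustrated with a minimal Python sketch. It is not the project's pipeline: the thresholds MIN_QUAL and MIN_DEPTH, the helper names, and the example records are all hypothetical, and the sketch only assumes that calls are available as tab-separated VCF-style lines with a QUAL column and a DP entry in the INFO column.

```python
# Minimal sketch: keep SNP records that pass assumed quality and depth cutoffs.
# MIN_QUAL and MIN_DEPTH are hypothetical values, not those used by the project.
MIN_QUAL = 30.0   # assumed minimum Phred-scaled variant quality
MIN_DEPTH = 5     # assumed minimum total read depth (INFO key "DP")

# Made-up records for illustration only.
EXAMPLE_RECORDS = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\t.\tA\tG\t50\t.\tDP=12",    # passes both filters
    "1\t2000\t.\tC\tT\t10\t.\tDP=20",    # quality too low
    "1\t3000\t.\tG\tA\t60\t.\tDP=2",     # depth too low
    "1\t4000\t.\tT\tTA\t99\t.\tDP=30",   # insertion, not a SNP
]


def parse_info(info_field):
    """Turn an INFO string such as 'DP=12;AF=0.5' into a dict of strings."""
    info = {}
    for item in info_field.split(";"):
        key, _, value = item.partition("=")
        info[key] = value
    return info


def is_snp(ref, alt):
    """True when REF is one base and every ALT allele is a single A/C/G/T."""
    return len(ref) == 1 and all(
        len(allele) == 1 and allele in "ACGT" for allele in alt.split(",")
    )


def passes_filters(line, min_qual=MIN_QUAL, min_depth=MIN_DEPTH):
    """Apply the SNP, quality, and depth checks to one data line."""
    fields = line.rstrip("\n").split("\t")
    ref, alt, qual, info = fields[3], fields[4], fields[5], fields[7]
    if not is_snp(ref, alt):
        return False
    if qual == "." or float(qual) < min_qual:
        return False
    depth = parse_info(info).get("DP", "")
    return depth.isdigit() and int(depth) >= min_depth


def filter_snps(lines, min_qual=MIN_QUAL, min_depth=MIN_DEPTH):
    """Return the data lines that pass all filters; header lines are skipped."""
    return [
        line for line in lines
        if not line.startswith("#") and passes_filters(line, min_qual, min_depth)
    ]


kept = filter_snps(EXAMPLE_RECORDS)
print(len(kept), "of", len(EXAMPLE_RECORDS) - 2, "records kept")
```

In practice such filters are applied by the calling tools themselves; the sketch only shows the kind of per-record decision that "quality and coverage filters" implies.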
Gramene/Ensembl Genomes Annotation
Additional annotations generated by the Gramene and Ensembl Plants projects include:
- Gene phylogenetic trees with other Gramene species.
- LastZ Whole Genome Alignment to Arabidopsis thaliana, Oryza sativa Japonica (IRGSP v1) and other Oryza AA genomes.
- Orthologue-based DAGchainer synteny detection against other AA genomes.
- Mapping of multiple sequence-based feature sets to the genome using the Gramene BLAT pipeline.
- Identification of various repeat features by programs such as RepeatMasker (with MIPS and AGI repeat libraries), Dust, and TRF.
- Variation effect prediction with sequence ontology.
Statistics
Summary
Assembly | Oryza_glaberrima_V1, INSDC Assembly GCA_000147395.1, Sep 2010 |
Database version | 113.2 |
Golden Path Length | 316,419,574 |
Genebuild by | OGE |
Genebuild method | Import |
Data source | Oryza Genome Evolution Project |
Gene counts
Coding genes | 33,164 |
Non coding genes | 966 |
Small non coding genes | 852 |
Long non coding genes | 114 |
Gene transcripts | 34,130 |
Other
FGENESH gene prediction | 27,943 |
Short Variants | 7,704,409 |
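The figures in these tables can be cross-checked with simple arithmetic. The Python sketch below copies the values from the tables and the ~358 Mb size estimate from the species description. The derived ratios are illustrative only: they assume the Golden Path Length is a fair denominator (it may include gaps), and the function name is hypothetical.

```python
# Values copied from the Statistics tables and the species description.
GOLDEN_PATH_LENGTH = 316_419_574
ESTIMATED_GENOME_SIZE = 358_000_000   # "~358 Mb" estimate, approximate
CODING_GENES = 33_164
NON_CODING_GENES = 966
SMALL_NON_CODING_GENES = 852
LONG_NON_CODING_GENES = 114
GENE_TRANSCRIPTS = 34_130
SHORT_VARIANTS = 7_704_409


def summarize_statistics():
    """Check that the counts add up and derive two rough ratios."""
    return {
        # Small + long non-coding genes should equal the non-coding total.
        "non_coding_consistent":
            SMALL_NON_CODING_GENES + LONG_NON_CODING_GENES == NON_CODING_GENES,
        # Here the transcript count equals coding + non-coding genes,
        # i.e. one transcript per gene on average.
        "one_transcript_per_gene":
            CODING_GENES + NON_CODING_GENES == GENE_TRANSCRIPTS,
        # Share of the estimated genome size represented in the assembly.
        "assembled_fraction": GOLDEN_PATH_LENGTH / ESTIMATED_GENOME_SIZE,
        # Average number of short variants per kilobase of assembly.
        "variants_per_kb": SHORT_VARIANTS / GOLDEN_PATH_LENGTH * 1000,
    }


summary = summarize_statistics()
for name, value in summary.items():
    print(name, value)
```

Under these assumptions the assembly covers roughly 88% of the estimated genome size, and the short variant set averages about 24 variants per kilobase.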