Oryza barthii Assembly and Gene Annotation

Project funding: National Science Foundation Plant Genome Research Program (#1026200) for the Oryza Genome Evolution (OGE) Project.

These pre-publication data are being released under the guidelines of the Fort Lauderdale Agreement, which reaffirms the balance between fair use (i.e. no pre-emptive publication) and early disclosure. Users are encouraged to use these data to advance their research on individual loci but are asked to respect the rights of the investigators who generated these data to publish the whole-genome level description of the O. barthii genome in a peer-reviewed journal. This description includes whole-genome comparative analyses, genome size evolution, gene family evolution, gene organization and movement, heterochromatin, and centromere evolution.

This genome falls under the scope of the International Oryza Map Alignment Project (I-OMAP) consortium, an internationally coordinated effort to create high-quality reference assemblies representing the diversity of wild and crop-progenitor species in the genus Oryza (Jacquemin et al., 2012). For inquiries and information on how to cite these data, please contact Dr. Rod Wing.

About Oryza barthii

Oryza barthii is the progenitor of the West African cultivated rice, O. glaberrima. It belongs to the AA genome group and has 12 chromosomes and a nuclear genome size of 411 Mb (estimated by flow cytometry). It is found in open habitats: mopane or savanna woodland, savanna, and fadama. It grows in deep water, seasonally flooded land, stagnant water, and slowly flowing water or pools, and prefers clay or black cotton soils. This work was part of the OGE project funded by NSF Award #1026200.

Assembly

The genome sequence was generated and assembled by the Arizona Genomics Institute (AGI) using accession IRGC105608. The sequence data were generated on 454 and Illumina platforms and assembled with Newbler and ALLPATHS-LG. The estimated coverage from whole-genome shotgun (WGS) sequencing was 110x. Assembly statistics: total sequence length 308,273,932 bp; number of contigs 25,404; contig N50 18,930 bp.
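For readers unfamiliar with the contig N50 statistic quoted above, the following is a minimal sketch of how it is commonly defined: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled length. The function name `n50` and the toy contig lengths are illustrative assumptions; they are not O. barthii data, and this is not necessarily how AGI computed the reported value.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating their lengths.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first contig that brings the running sum
        # to at least half of the total assembled length.
        if 2 * running >= total:
            return length


# Toy contig lengths, invented for illustration only.
toy_contigs = [100, 80, 60, 40, 20]
print(n50(toy_contigs))  # 80: 100 + 80 = 180 covers at least half of 300
```

A larger N50 means that more of the assembly sits in long contigs, which is why it is reported alongside the total length and the contig count.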

Annotation

Protein-coding gene annotation was performed with the evidence-based MAKER-P genome annotation pipeline. Non-coding RNA genes were predicted with Infernal, and tRNA genes with tRNAscan-SE. RepeatMasker was used to annotate repeats and transposable elements with Oryza-specific de novo repeat libraries. These analyses were conducted at the Arizona Genomics Institute (AGI), led by Dr. Rod Wing.

Gramene/Ensembl Genomes Annotation

Additional annotations generated by the Gramene and Ensembl Plants projects include:

  • Gene phylogenetic trees with other Gramene species.
  • LASTZ whole-genome alignments to Arabidopsis thaliana, Oryza sativa Japonica (IRGSP v1), and other Oryza AA genomes.
  • Ortholog-based DAGchainer synteny detection against other AA genomes.
  • Mapping of multiple sequence-based feature sets to the genome using the Gramene BLAT pipeline.
  • Identification of various repeat features by programs such as RepeatMasker (with MIPS and AGI repeat libraries), Dust, and TRF.

More information

General information about this species can be found in Wikipedia.

Statistics

Summary

Assembly: O.barthii_v1, INSDC Assembly GCA_000182155.2, Mar 2014
Database version: 94.3
Base Pairs: 308,272,304
Golden Path Length: 308,272,304
Genebuild by: OGE
Genebuild method: Imported from OGE
Data source: Oryza Genome Evolution Project

Gene counts

Coding genes: 34,575
Non coding genes: 916
Small non coding genes: 907
Long non coding genes: 9
Gene transcripts: 42,595

Other

FGENESH gene prediction: 43,127
