Oryza nivara (Oryza_nivara_v1.0)

Oryza nivara Assembly and Gene Annotation

Project funding: National Science Foundation Plant Genome Research Program (#1026200) for the Oryza Genome Evolution (OGE) Project. These pre-publication data are being released under the guidelines of the Fort Lauderdale Agreement, which reaffirms the balance between fair use (i.e. no pre-emptive publication) and early disclosure. You are encouraged to use these data to advance your research on individual loci, but are asked to respect the rights of the investigators who generated these data to publish the whole-genome level description of O. nivara in a peer-reviewed journal. This description includes whole-genome comparative analyses, genome size evolution, gene family evolution, gene organisation and movement, heterochromatin, and centromere evolution. This genome falls under the scope of the I-OMAP (International Oryza Map Alignment Project) consortium, an internationally coordinated effort to create high-quality reference assemblies representing the diversity of wild and crop-progenitor species in the genus Oryza (Jacquemin et al., 2012). For enquiries and information on how to cite these data, please contact Dr. Rod Wing.

About Oryza nivara

Oryza nivara is a wild rice from India and one of the rice species used in the OMAP project. It belongs to the AA genome group. Breeders are interested in this organism because it exhibits resistance to grassy stunt virus. It is found in swampy areas, at the edges of ponds and tanks, beside streams, in ditches, and in or around rice fields. It usually grows in shallow water up to 0.3 m deep, in seasonally dry, open habitats. It has 12 chromosomes and a nuclear genome size of 448 Mb (estimated by flow cytometry). This work was part of the Oryza Genome Evolution project, funded by NSF Award #1026200, and was carried out in collaboration with Yue-Ie Hsing.

Assembly

The genome sequence was generated and assembled by the Arizona Genomics Institute (AGI) using accession IRGC100897. Illumina reads from libraries of different insert sizes were assembled with ALLPATHS-LG, scaffolds were constructed with SSPACE, and gaps were closed with GapFiller. The estimated whole-genome shotgun (WGS) coverage was 102x. The assembly has a total sequence length of 337,950,324 bp in 16,484 contigs, with a contig N50 of 37,688 bp.
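The contig N50 quoted above is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembled bases. The following is a minimal sketch of that calculation, not the pipeline AGI used; the function name `n50` and the toy contig lengths are illustrative assumptions. It also compares the assembled length with the flow-cytometry genome size given on this page.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (0 if empty)."""
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


# Toy contig lengths (hypothetical, NOT taken from the O. nivara assembly).
toy_contigs = [100, 80, 60, 40, 20]
print("Toy N50:", n50(toy_contigs))

# Figures quoted on this page.
assembled_bp = 337_950_324        # total sequence length
flow_cytometry_bp = 448_000_000   # 448 Mb nuclear genome size estimate

assembled_fraction = assembled_bp / flow_cytometry_bp
print(f"Assembled fraction of estimated genome: {assembled_fraction:.1%}")
```

On the page's own figures, the assembled sequence corresponds to roughly three quarters of the flow-cytometry estimate.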

Annotation

Protein-coding gene annotation was performed with the evidence-based MAKER-P genome annotation pipeline. Non-coding RNA genes were predicted with Infernal, and tRNA genes with tRNAscan. RepeatMasker was used to annotate repeats and transposable elements with Oryza-specific de novo repeat libraries. These analyses were conducted at the Arizona Genomics Institute (AGI), led by Dr. Rod Wing.

Gramene/Ensembl Genomes Annotation

Additional annotations generated by the Gramene and Ensembl Plants projects include:

  • Gene phylogenetic trees with other Gramene species.
  • LastZ Whole Genome Alignment to Arabidopsis thaliana, Oryza sativa Japonica (IRGSP v1) and other Oryza AA genomes.
  • Orthologue-based DAGchainer synteny detection against other AA genomes.
  • Mapping of multiple sequence-based feature sets to the genome using the Gramene BLAT pipeline.
  • Identification of various repeat features by programs such as RepeatMasker (with MIPS and AGI repeat libraries), Dust and TRF.

More information

General information about this species can be found in Wikipedia.

Statistics

Summary

Assembly: Oryza_nivara_v1.0, INSDC Assembly GCA_000576065.1, Feb 2014
Database version: 104.10
Golden Path Length: 337,950,324
Genebuild by: OGE
Genebuild method: Import
Data source: Oryza Genome Evolution Project

Gene counts

Coding genes: 36,313
Non coding genes: 713
Small non coding genes: 694
Long non coding genes: 19
Gene transcripts: 49,073
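The gene counts above can be cross-checked with simple arithmetic. The sketch below uses only the numbers in this table; treating the transcript count as covering both coding and non-coding genes is an assumption, since the page does not state it.

```python
# Counts copied from the "Gene counts" table on this page.
coding_genes = 36_313
non_coding_genes = 713
small_non_coding_genes = 694
long_non_coding_genes = 19
gene_transcripts = 49_073

# Small and long non-coding genes should add up to the non-coding total.
non_coding_sum = small_non_coding_genes + long_non_coding_genes
print("Non-coding subtotal matches:", non_coding_sum == non_coding_genes)

# ASSUMPTION: the transcript count spans coding and non-coding genes.
total_genes = coding_genes + non_coding_genes
transcripts_per_gene = gene_transcripts / total_genes
print(f"Total genes: {total_genes:,}")
print(f"Mean transcripts per gene: {transcripts_per_gene:.2f}")
```

Under that assumption, the annotation averages a little over one transcript per gene.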

Other

FGENESH gene prediction: 45,102
