Oryza longistaminata Assembly and Gene Annotation

Project funding: National Science Foundation Plant Genome Research Program (#1026200) for the Oryza Genome Evolution (OGE) Project. These pre-publication data are being released under the guidelines of the Fort Lauderdale Agreement, which reaffirms the balance between fair use (i.e. no pre-emptive publication) and early disclosure. You are encouraged to use these data to advance your research on individual loci, but are asked to respect the rights of the investigators who generated these data to publish the whole-genome level description of O. longistaminata in a peer-reviewed journal. This description includes whole-genome comparative analyses, genome size evolution, gene family evolution, gene organisation and movement, heterochromatin, and centromere evolution. This genome falls under the scope of the I-OMAP (International Oryza Map Alignment Project) consortium, an internationally coordinated effort to create high-quality reference assemblies representing the diversity of wild and crop-progenitor species in the genus Oryza (Jacquemin et al., 2012). For enquiries and information on how to cite these data, please contact Dr. Rod Wing.

About Oryza longistaminata

Oryza longistaminata (AA genome type) is a wild rice. It is a perennial, tall (2 m or more), erect, rhizomatous grass; ligule of lower leaves >15 mm, acute or 2-cleft; panicles open to intermediately open; spikelets 4.5-11.4 mm long and 2-3 mm wide, awned (2-5 cm long); anther 1.5-8.2 mm long.

Assembly

A whole-genome shotgun assembly (i.e. Illumina sequence, SOAPdenovo assembly) of O. longistaminata was generated by Professor Wen Wang (Kunming Institute of Zoology, Chinese Academy of Sciences) in collaboration with BGI-Shenzhen. The genome assembly was composed of 135,973 scaffolds spanning 344.6 Mb, with an N50 scaffold size of 62.4 kb. Using this assembly, the Arizona Genomics Institute (AGI) selected scaffolds and contigs that were syntenic to the short arm of chromosome 3 of O. sativa ssp. japonica, and the order and orientation of each scaffold/contig were confirmed using Genome Puzzle Master software (GPM, unpublished) to produce a Chr3S pseudomolecule. The final O. longistaminata chromosome 3 short arm is a single scaffold of 14,404,039 bp composed of 4,724 contigs.
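For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed: sort the scaffold lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the toy lengths are illustrative assumptions; they are not taken from this assembly or from the pipeline that produced it.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembly size.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy scaffold lengths (hypothetical, not real O. longistaminata data).
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # prints 70: 80 + 70 = 150, half of the 300 total
```

Applied to the real scaffold lengths, the same calculation would be expected to give a value close to the reported 62.4 kb, although conventions for excluding very short scaffolds can shift the result.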

Annotation

Annotation of protein-coding genes, repeats and transposable elements was conducted at the Arizona Genomics Institute (AGI), led by Dr. Rod Wing. MAKER-P was used as the evidence-based genome annotation pipeline. RepeatMasker was used to annotate repeats and transposable elements using species-specific de novo repeat libraries. Non-coding RNA genes were predicted by AGI with Infernal, and tRNA genes with tRNAscan-SE.

Picture credit: Paul Sanchez, Arizona Genomics Institute.

More information

General information about this species can be found in Wikipedia.

Statistics

Summary

Assembly: O_longistaminata_v1.0, INSDC Assembly GCA_000789195.1, Dec 2014
Database version: 98.2
Base Pairs: 326,442,508
Golden Path Length: 326,442,508
Genebuild by: OGE
Genebuild method: Import
Data source: Beijing Genomics Institute

Gene counts

Coding genes: 31,686
Non coding genes: 1,121
Small non coding genes: 1,101
Long non coding genes: 20
Gene transcripts: 32,807
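
As a quick arithmetic check on the gene counts listed above, the sketch below confirms that the small and long non-coding categories sum to the non-coding total, and that coding plus non-coding genes equals the transcript count. The variable names are illustrative. The equality would be consistent with one transcript per gene, but that interpretation is an assumption and is not stated on this page.

```python
# Figures copied from the gene counts listed above.
coding_genes = 31_686
non_coding_genes = 1_121
small_non_coding_genes = 1_101
long_non_coding_genes = 20
gene_transcripts = 32_807

# Sub-categories of non-coding genes should add up to the non-coding total.
non_coding_total = small_non_coding_genes + long_non_coding_genes
assert non_coding_total == non_coding_genes

# Coding plus non-coding genes matches the number of transcripts.
transcript_total = coding_genes + non_coding_genes
assert transcript_total == gene_transcripts

print(non_coding_total, transcript_total)
```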

About this species