Oryza longistaminata Assembly and Gene Annotation
About Oryza longistaminata
Oryza longistaminata (AA genome type) is a wild rice: a perennial, tall (2 m or more), erect, rhizomatous grass. The ligule of the lower leaves is >15 mm long and acute or 2-cleft; panicles are open to intermediately open; spikelets are 4.5-11.4 mm long and 2-3 mm wide, and awned (awns 2-5 cm long); anthers are 1.5-8.2 mm long.
A whole-genome shotgun assembly (Illumina sequencing, SOAPdenovo assembly) of O. longistaminata was generated by Professor Wen Wang (Kunming Institute of Zoology, Chinese Academy of Sciences) in collaboration with BGI-Shenzhen. The genome assembly is composed of 135,973 scaffolds spanning 344.6 Mb, with an N50 scaffold size of 62.4 kb. Using this assembly, the Arizona Genomics Institute (AGI) selected scaffolds and contigs that were syntenic to the short arm of chromosome 3 of O. sativa ssp. japonica. The order and orientation of each scaffold or contig was confirmed using the Genome Puzzle Master software (GPM, unpublished) to produce a Chr3S pseudomolecule. The final O. longistaminata chromosome 3 short arm is a single scaffold of 14,404,039 bp composed of 4,724 contigs.
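The N50 scaffold size quoted above is a standard contiguity statistic: the length of the scaffold at which the running total, taken from the longest scaffold downwards, first reaches half of the total assembly length. The sketch below illustrates that definition only. The function name `n50` and the toy lengths are hypothetical, and this is not the code that was used to produce the reported 62.4 kb figure.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Illustrative helper (not from the original annotation pipeline):
    sort lengths from longest to shortest and return the length at which
    the cumulative sum first reaches half of the total.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths (in kb), chosen only for illustration.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

In practice the lengths would be read from the assembly FASTA file, and tools may differ slightly in how they treat gaps or very short scaffolds, so values computed this way need not match a published figure exactly.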
Annotation of protein-coding genes, repeats and transposable elements was conducted at the Arizona Genomics Institute (AGI), led by Dr. Rod Wing. MAKER-P was used as the evidence-based genome annotation pipeline. RepeatMasker was used to annotate repeats and transposable elements using species-specific de novo repeat libraries. Non-coding RNA genes were predicted by AGI with Infernal, and tRNA genes with tRNAscan-SE.
- Gramene species page for Oryza
- Oryza Genome Evolution Project in AGI (Arizona Genomics Institute)
- International Rice Genome Sequencing Project (IRGSP)
- Rice Knowledge Bank
- Hierarchical scaffolding with Bambus.
Pop M, Kosack DS, Salzberg SL. 2004. Genome Res. 14:149-159.
Picture credit: Paul Sanchez, Arizona Genomics Institute.
General information about this species can be found in Wikipedia.
|Assembly||O_longistaminata_v1.0, INSDC Assembly GCA_000789195.1, Dec 2014|
|Golden Path Length||326,442,508|
|Data source||Beijing Genomics Institute|
|Non coding genes||1,121|
|Small non coding genes||1,101|
|Long non coding genes||20|