Oryza barthii Assembly and Gene Annotation
About Oryza barthii
Oryza barthii is the progenitor of the West African cultivated rice, O. glaberrima. It belongs to the AA genome group, has 12 chromosomes, and has a nuclear genome size of 411 Mb as estimated by flow cytometry. It is found in open habitats in mopane or savanna woodland, savanna, or fadama. It grows in deep water, seasonally flooded land, stagnant water, and slowly flowing water or pools, and prefers clay or black cotton soils. This work was part of the Oryza Genome Evolution (OGE) project, funded by NSF Award #1026200.
The genome sequence was generated and assembled by the Arizona Genomics Institute (AGI) using accession IRGC105608. Sequence data were generated on the 454 and Illumina platforms and assembled with Newbler and ALLPATHS-LG. The estimated whole-genome shotgun (WGS) coverage was 110x. The assembly has a total sequence length of 308,273,932 bp in 25,404 contigs, with a contig N50 of 18,930 bp.
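For readers unfamiliar with the assembly statistics quoted above, the following is a minimal sketch of how total length, contig count, and contig N50 are conventionally computed from a list of contig lengths. The function name `assembly_stats` and the toy lengths are illustrative assumptions, not part of the AGI pipeline; only the 308,273,932 bp and 411 Mb figures come from this page.

```python
def assembly_stats(contig_lengths):
    """Return (total_length, contig_count, n50) for contig lengths in bp.

    N50 is the length of the contig at which the running sum of lengths,
    taken from longest to shortest, first reaches half the total length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    running = 0
    n50 = 0
    for length in lengths:
        running += length
        if 2 * running >= total:
            n50 = length
            break
    return total, len(lengths), n50


# Toy contig lengths (hypothetical, for illustration only).
toy_contigs = [100, 80, 60, 40, 20]
print(assembly_stats(toy_contigs))  # (300, 5, 80)

# Share of the flow-cytometry genome size estimate covered by the assembly,
# using the two figures reported on this page.
assembled_bp = 308_273_932
estimated_genome_bp = 411_000_000
print(f"{assembled_bp / estimated_genome_bp:.1%}")  # 75.0%
```

On these figures, the assembled sequence corresponds to roughly three quarters of the estimated nuclear genome size.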
Protein-coding gene annotation was performed with the evidence-based MAKER-P genome annotation pipeline. Non-coding RNA genes were predicted with Infernal, and tRNA genes with tRNAscan. RepeatMasker was used to annotate repeats and transposable elements with Oryza-specific de novo repeat libraries. These analyses were conducted at the Arizona Genomics Institute (AGI), led by Dr. Rod Wing.
Gramene/Ensembl Genomes Annotation
Additional annotations generated by the Gramene and Ensembl Plants project include:
- Gene phylogenetic trees with other Gramene species.
- LastZ whole-genome alignment to Arabidopsis thaliana, Oryza sativa Japonica (IRGSP v1) and other Oryza AA genomes.
- Orthologue-based DAGchainer synteny detection against other AA genomes.
- Mapping of multiple sequence-based feature sets to the genome using the Gramene BLAT pipeline.
- Identification of various repeat features using programs such as RepeatMasker (with MIPS and AGI repeat libraries), Dust, and TRF.
General information about this species can be found on Wikipedia.
|Assembly||O.barthii_v1, INSDC Assembly GCA_000182155.2, Mar 2014|
|Golden Path Length||308,272,304 bp|
|Data source||Oryza Genome Evolution Project|
|Non coding genes||916|
|Small non coding genes||907|
|Long non coding genes||9|
|FGENESH gene prediction||43,127|
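As a quick consistency check on the gene counts above, the sketch below parses rows written in the `|label||value|` layout used by this summary and confirms that the small and long non-coding gene counts add up to the non-coding total. The helper names `parse_summary` and `as_int` are hypothetical and exist only for this illustration.

```python
def parse_summary(rows):
    """Parse '|label||value|' rows into a dict of label -> value strings."""
    summary = {}
    for row in rows:
        # Drop the outer pipes, then split label from value on the double pipe.
        label, value = row.strip().strip("|").split("||", 1)
        summary[label.strip()] = value.strip()
    return summary


def as_int(text):
    """Convert a count such as '43,127' to an integer."""
    return int(text.replace(",", ""))


# Rows copied from the summary on this page.
rows = [
    "|Non coding genes||916|",
    "|Small non coding genes||907|",
    "|Long non coding genes||9|",
    "|FGENESH gene prediction||43,127|",
]

summary = parse_summary(rows)
small = as_int(summary["Small non coding genes"])
long_nc = as_int(summary["Long non coding genes"])
total_nc = as_int(summary["Non coding genes"])

print(small + long_nc == total_nc)  # True
```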