Ostreococcus lucimarinus (ASM9206v1)

Ostreococcus lucimarinus Assembly and Gene Annotation

About Ostreococcus lucimarinus

Ostreococcus lucimarinus is a unicellular green alga and an important member of the picoplankton community, which plays a central role in the oceanic carbon cycle. It is one of the smallest known free-living eukaryotic species, with an average size of 0.8 µm. Its cellular structure is characterised by remarkable simplicity, lacking a cell wall and containing a single chloroplast, a single mitochondrion, and a single Golgi body as well as its nucleus.

Assembly

The genome of Ostreococcus lucimarinus CCE9901 was sequenced by JGI and finished at the Stanford Genome Center. The v2.0 release has 13,204,894 bp (about 13.2 Mb) of finished sequence. The sequences have been deposited in GenBank under accession numbers CP000581-CP000601. Detailed information about the project is available at the JGI website.

Assembly release v2.0 contains 13,204,894 bp of finished-quality sequence in 21 chromosomes.
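As a quick consistency check, the GenBank accession range quoted above can be expanded and counted against the 21 chromosomes. The following is a minimal sketch; the helper name `expand_accession_range` is hypothetical, and the idea that each accession corresponds to exactly one chromosome is an assumption, not something stated by the source.

```python
def expand_accession_range(first: str, last: str) -> list[str]:
    """Expand an inclusive accession range such as CP000581-CP000601.

    Hypothetical helper: assumes both accessions share an alphabetic
    prefix followed by a zero-padded number of the same width.
    """
    # Strip trailing digits to recover the alphabetic prefix.
    prefix_first = first.rstrip("0123456789")
    prefix_last = last.rstrip("0123456789")
    if prefix_first != prefix_last:
        raise ValueError("accessions must share the same prefix")

    digits_first = first[len(prefix_first):]
    digits_last = last[len(prefix_last):]
    width = len(digits_first)  # preserve zero padding
    start, end = int(digits_first), int(digits_last)
    if end < start:
        raise ValueError("range end precedes range start")

    return [f"{prefix_first}{n:0{width}d}" for n in range(start, end + 1)]


# Range as quoted in the Assembly section.
accessions = expand_accession_range("CP000581", "CP000601")
print(len(accessions))  # 21
```

The range yields 21 accessions, which agrees with the chromosome count given for the assembly.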

In detail, whole-genome shotgun Sanger sequences were assembled using the Phred, Phrap, Consed pipeline. Manual inspection and finishing were performed by targeted resequencing. Because of the high GC content, primer walks failed to resolve a large number of the gaps; these were resolved by generating pooled small-insert shatter libraries from 3 kb plasmid clones. Repeats were resolved by transposon-hopping 8 kb plasmid clones. Fosmid clones were shotgun-sequenced and finished to fill large gaps, resolve large repeats, or resolve chromosome duplications and extend into chromosome telomere regions. Finished chromosomes have no gaps, and the sequence has fewer than one error per 100,000 bp.

Annotation

This release includes a total of 7,651 predicted gene models produced through the collaboration of JGI, Ghent University (Belgium) and UCSD annotation teams.

In detail, gene prediction methods included ab initio Fgenesh, Fgenesh+, Genewise, MAGPIE, estExt, and EuGene. All predicted models were clustered and the best model per locus was selected based on homology to other proteins and EST support. The predicted set of gene models has been validated by using available experimental data and computational analysis.

Repeated sequences were called with the Repeat Detector, which is part of the Ensembl Genomes repeat feature pipelines. Repeats length: 5,898,261 bp. Repeats content: 44.7%.
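The repeat content figure can be reproduced from the repeat length with simple arithmetic. The sketch below assumes the denominator is the Golden Path Length listed in the Statistics summary (13,204,888 bp); the source does not say which genome length was used, and the function name is illustrative only.

```python
# Values taken from this page; the choice of denominator is an assumption.
REPEAT_LENGTH_BP = 5_898_261
GOLDEN_PATH_LENGTH_BP = 13_204_888


def repeat_content_percent(repeat_bp: int, genome_bp: int) -> float:
    """Return the repeat length as a percentage of the genome length."""
    if genome_bp <= 0:
        raise ValueError("genome length must be positive")
    return 100.0 * repeat_bp / genome_bp


percent = repeat_content_percent(REPEAT_LENGTH_BP, GOLDEN_PATH_LENGTH_BP)
print(f"{percent:.1f}%")  # 44.7%
```

Using the assembly length of 13,204,894 bp instead gives the same value after rounding to one decimal place.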

References

  1. The tiny eukaryote Ostreococcus provides genomic insights into the paradox of plankton speciation.
    Palenik B, Grimwood J, Aerts A, Rouzé P, Salamov A, Putnam N, Dupont C, Jorgensen R, Derelle E, Rombauts S et al. 2007. Proc. Natl. Acad. Sci. U.S.A. 104:7705-7710.

More information

General information about this species can be found in Wikipedia.

Statistics

Summary

Assembly: ASM9206v1, INSDC Assembly GCA_000092065.1, Apr 2007
Database version: 111.1
Golden Path Length: 13,204,888
Genebuild by: JGI
Genebuild method: Import
Data source: Joint Genome Institute

Gene counts

Coding genes: 7,603
Non coding genes: 24
Small non coding genes: 24
Pseudogenes: 37
Gene transcripts: 7,664