Ostreococcus lucimarinus Assembly and Gene Annotation

About Ostreococcus lucimarinus

Ostreococcus lucimarinus is a unicellular green alga and an important member of the picoplankton community, which plays a central role in the oceanic carbon cycle. It is one of the smallest known free-living eukaryotic species, with an average size of 0.8 µm. Its cellular structure is characterised by remarkable simplicity, lacking a cell wall and containing a single chloroplast, a single mitochondrion, and a single Golgi body as well as its nucleus.

Assembly

The genome of Ostreococcus lucimarinus CCE9901 was sequenced by JGI and finished at the Stanford Genome Center. The v2.0 release has 13,204,894 bp (about 13.2 Mb) of finished sequence. The sequences have been deposited in GenBank under accession numbers CP000581-CP000601. Detailed information about the project is available at the JGI website.

Assembly release v2.0 contains 13,204,894 bp of finished-quality sequence in 21 chromosomes.
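As a quick consistency check, the GenBank accession range quoted above can be expanded and counted. The sketch below uses a hypothetical helper, `expand_accession_range`, which is not part of any JGI or Ensembl tool. It assumes the range is inclusive and that both endpoints share the same letter prefix and digit width. The range spans 21 accessions, which agrees with the 21 chromosomes; reading this as one accession per chromosome is an assumption.

```python
import re


def expand_accession_range(range_text):
    """Expand an inclusive range such as 'CP000581-CP000601' into single accessions.

    Hypothetical helper for illustration only.
    """
    start, end = range_text.split("-")
    pattern = re.compile(r"([A-Z]+)(\d+)")
    start_prefix, start_digits = pattern.fullmatch(start).groups()
    end_prefix, end_digits = pattern.fullmatch(end).groups()
    if start_prefix != end_prefix or len(start_digits) != len(end_digits):
        raise ValueError("range endpoints must share the same prefix and digit width")
    width = len(start_digits)
    # Keep the zero padding so every accession has the same length.
    return [
        f"{start_prefix}{number:0{width}d}"
        for number in range(int(start_digits), int(end_digits) + 1)
    ]


accessions = expand_accession_range("CP000581-CP000601")
print(len(accessions))                 # 21
print(accessions[0], accessions[-1])   # CP000581 CP000601
```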

In detail, whole-genome shotgun Sanger sequences were assembled using the Phred/Phrap/Consed pipeline. Manual inspection and finishing were performed by targeted resequencing. Because of the high GC content, primer walks failed to resolve a large number of the gaps; these were resolved by generating pooled small-insert shatter libraries from 3 kb plasmid clones. Repeats were resolved by transposon-hopping 8 kb plasmid clones. Fosmid clones were shotgun-sequenced and finished to fill large gaps, resolve large repeats, resolve chromosome duplications, and extend into chromosome telomere regions. Finished chromosomes have no gaps, and the sequence has fewer than one error in 100,000 bp.

Annotation

This release includes a total of 7,651 predicted gene models produced through the collaboration of JGI, Ghent University (Belgium) and UCSD annotation teams.

In detail, gene prediction methods included ab initio Fgenesh, Fgenesh+, Genewise, MAGPIE, estExt, and EuGene. All predicted models were clustered and the best model per locus was selected based on homology to other proteins and EST support. The predicted set of gene models has been validated by using available experimental data and computational analysis.

Repeated sequences were called with the Repeat Detector, which is part of the Ensembl Genomes repeat feature pipelines. Repeat length: 5,898,261 bp; repeat content: 44.7%.
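The repeat content figure can be reproduced from the two lengths quoted on this page. This is a minimal sketch; it assumes the repeat length is given in base pairs and that the denominator is the 13,204,894 bp assembly length from the Assembly section. The constant names are illustrative only.

```python
# Values taken from this page; units assumed to be base pairs.
ASSEMBLY_LENGTH_BP = 13_204_894
REPEAT_LENGTH_BP = 5_898_261

# Fraction of the assembly covered by called repeats, as a percentage.
repeat_percent = 100 * REPEAT_LENGTH_BP / ASSEMBLY_LENGTH_BP
print(f"{repeat_percent:.1f}%")  # 44.7%
```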

References

The tiny eukaryote Ostreococcus provides genomic insights into the paradox of plankton speciation.
Palenik B, Grimwood J, Aerts A, Rouzé P, Salamov A, Putnam N, Dupont C, Jorgensen R, Derelle E, Rombauts S et al. 2007. Proc. Natl. Acad. Sci. U.S.A. 104:7705-7710.

More information

General information about this species can be found in Wikipedia.

Statistics

Summary

Assembly	ASM9206v1, INSDC Assembly GCA_000092065.1, Apr 2007
Database version	115.1
Golden Path Length	13,204,888
Genebuild by	JGI
Genebuild method	Import
Data source	Joint Genome Institute

Gene counts

Coding genes	7,603
Non coding genes	24
Small non coding genes	24
Pseudogenes	37
Gene transcripts	7,664

Coding genes: genes or transcripts that contain an open reading frame (ORF).
Pseudogenes: genes that have homology to known protein-coding genes but contain a frameshift and/or stop codon(s) that disrupt the ORF. They are thought to have arisen through duplication followed by loss of function.
Gene transcripts: a transcript is the operational unit of a gene. In a genomic context, transcripts consist of one or more exons, with adjoining exons being separated by introns. The exons and introns are transcribed and then the introns are spliced out. Transcripts may or may not encode a protein.
