Olea europaea (OLEA9)

Olea europaea Assembly and Gene Annotation

About Olea europaea

The Mediterranean olive tree (Olea europaea subsp. europaea) was one of the first trees to be domesticated and is of major agricultural importance as the source of olive oil. It is a diploid species (2n=2x=46) with an estimated haploid genome length of 1.38 Gbp.

Assembly

A total of 543 Gb of raw DNA sequence was generated from a 1,000+ year-old olive tree (cv. Farga) by Illumina sequencing, combining whole-genome shotgun sequencing of different mate-pair and paired-end libraries with a fosmid library containing 155,000 clones. The assembly process produced a draft genome with a scaffold N50 of 443 kb and a total length of 1.31 Gb, which represents 95% of the estimated genome length (1.38 Gb). This assembly was improved by anchoring it to chromosomes using a genetic map and by removing contaminated scaffolds. The final genome assembly (Oe9) has an N50 of 734 kb, with 520.5 Mbp (39.5%) of the sequence anchored in 23 linkage groups. Gene completeness, as estimated using BUSCO, reached 94.9%.
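As an illustration of the N50 statistic quoted above, the following is a minimal sketch using the common definition: the length of the shortest scaffold in the smallest set of largest scaffolds that together cover at least half of the total assembly length. The page does not state how its N50 values were computed, so this definition is an assumption, and the scaffold lengths in `toy_scaffolds` are hypothetical values, not Oe9 data. The last lines reproduce the "95% of the estimated genome length" figure from the two totals given in the text.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (common definition)."""
    if not lengths:
        raise ValueError("no lengths given")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length that pushes the running sum to half the total is the N50.
        if running >= half_total:
            return length


# Hypothetical scaffold lengths for illustration only (not taken from Oe9).
toy_scaffolds = [900, 700, 500, 300, 200, 100]
print(n50(toy_scaffolds))

# Fraction of the estimated genome covered by the draft assembly (values from the text).
draft_length_gb = 1.31
estimated_genome_gb = 1.38
assembled_fraction_percent = 100 * draft_length_gb / estimated_genome_gb
print(f"{assembled_fraction_percent:.1f}% of the estimated genome length")
```

A larger N50 means that half of the assembled sequence sits in longer scaffolds, which is why the increase from 443 kb to 734 kb is reported as an improvement in contiguity.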

Annotation

Genome annotation was obtained using a combination of ab initio gene predictions, protein homology searches, and RNA-Seq data from leaf, root, and fruit tissues at various stages.

Repeats were annotated with the Ensembl Genomes repeat feature pipeline. There are:

* 2,070,600 Low complexity (Dust) features, covering 119 Mb (9.0% of the genome);
* 717,974 RepeatMasker features (with the nrTEplants library), covering 282 Mb (21.4% of the genome);
* 1,173,146 Repeats:Red features, covering 628 Mb (47.8% of the genome);
* 576,683 Tandem repeats (TRF) features, covering 440 Mb (33.4% of the genome).
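The coverage percentages for the repeat features can be checked against the Golden Path Length given in the Summary. The sketch below is a hedged illustration: the dictionary keys and the helper `percent_of_genome` are names introduced here, and the Mb figures are the rounded values from this page, so the last digit may differ slightly from the published percentages.

```python
GENOME_LENGTH_BP = 1_315_765_470  # Golden Path Length from the Summary

# Rounded coverage in Mb, as reported for each repeat track on this page.
repeat_coverage_mb = {
    "Dust": 119,
    "RepeatMasker": 282,
    "Red": 628,
    "TRF": 440,
}


def percent_of_genome(covered_mb, genome_bp=GENOME_LENGTH_BP):
    """Convert a coverage in Mb to a percentage of the genome length."""
    return 100 * covered_mb * 1_000_000 / genome_bp


for track, covered_mb in repeat_coverage_mb.items():
    print(f"{track}: {percent_of_genome(covered_mb):.1f}%")

# The four tracks together exceed the genome length, so they must overlap
# and their percentages should not be added into a single repeat fraction.
total_mb = sum(repeat_coverage_mb.values())
print(f"Sum of tracks: {total_mb} Mb ({percent_of_genome(total_mb):.1f}%)")
```

Because the summed coverage (1,469 Mb) is larger than the assembly itself, each percentage is best read as the footprint of one annotation method rather than as a share of a common total.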

Picture credit: https://www.crg.eu

Statistics

Summary

Assembly: OLEA9, INSDC Assembly GCA_902713445.1, Oct 2020
Database version: 113.1
Golden Path Length: 1,315,765,470
Genebuild by: CNAG
Genebuild method: External annotation import
Data source: CNAG

Gene counts

Coding genes: 55,595
Gene transcripts: 89,204
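As a rough reading of the gene counts, the average number of transcripts per coding gene can be derived from the two figures above. This is only a sketch: it assumes that all counted transcripts belong to coding genes, which the page does not state, so the ratio is an upper bound if non-coding transcripts are included.

```python
coding_genes = 55_595
gene_transcripts = 89_204

# Assumption: every transcript is attributed to a coding gene.
transcripts_per_gene = gene_transcripts / coding_genes
print(f"{transcripts_per_gene:.2f} transcripts per coding gene")  # prints "1.60 transcripts per coding gene"
```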