Olea europaea Assembly and Gene Annotation
About Olea europaea
The Mediterranean olive tree (Olea europaea subsp. europaea) was one of the first trees to be domesticated and is of major agricultural importance as the source of olive oil. It is a diploid species (2n=2x=46) with an estimated haploid genome length of 1.38 Gbp.
Assembly
A total of 543 Gb of raw DNA sequence was generated from a 1,000+ year-old olive tree (cv. Farga) by Illumina whole-genome shotgun sequencing, using different combinations of mate-pair and paired-end libraries, together with a fosmid library containing 155,000 clones. The assembly process produced a draft genome with a scaffold N50 of 443 kb and a total length of 1.31 Gb, which represents 95% of the estimated genome length (1.38 Gb). This assembly was improved by anchoring it to chromosomes using a genetic map and by removing contaminated scaffolds. The final genome assembly (Oe9) has an N50 of 734 kb, with 520.5 Mbp (39.5%) of the sequence anchored in 23 linkage groups. Gene completeness, as estimated using BUSCO, reached 94.9%.
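As an illustration of the statistics quoted above, the following minimal sketch shows one common way to compute an N50 and re-derives the 95% figure from the two genome lengths. The scaffold lengths are toy values chosen for the example, not data from the olive assembly, and the function name `n50` is hypothetical.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy scaffold lengths (hypothetical, NOT taken from the olive assembly).
toy_scaffolds = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_scaffolds)

# Figures quoted in the text: draft assembly length and estimated genome length.
assembly_gb = 1.31
estimated_genome_gb = 1.38
coverage_pct = 100 * assembly_gb / estimated_genome_gb

print(f"Toy N50: {toy_n50}")
print(f"Assembly covers {coverage_pct:.1f}% of the estimated genome")
```

With the toy lengths, the two longest scaffolds (100 + 80) already exceed half of the total (150), so the N50 is the shorter of the two. The same procedure applied to real scaffold lengths would yield values comparable to the 443 kb and 734 kb figures reported here.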
Annotation
Genome annotation was obtained by combining ab initio gene predictions, protein homology searches, and RNA-Seq data from leaf, root, and fruit tissues at various stages.
Repeats were annotated with the Ensembl Genomes repeat feature pipeline. There are:
* 2,070,600 Low complexity (Dust) features, covering 119 Mb (9.0% of the genome);
* 717,974 RepeatMasker features (with the nrTEplants library), covering 282 Mb (21.4% of the genome);
* 1,173,146 Repeats:Red features, covering 628 Mb (47.8% of the genome);
* 576,683 Tandem repeats (TRF) features, covering 440 Mb (33.4% of the genome).
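The coverage percentages above can be cross-checked against the Golden Path Length given in the Statistics summary (1,315,765,470 bp). The sketch below is only a sanity check under the assumption that percentages are computed relative to that length; the helper name `percent_of_genome` is hypothetical. Because the covered lengths are rounded to whole megabases, a recomputed value may differ from the quoted one by about 0.1 percentage points.

```python
GOLDEN_PATH_BP = 1_315_765_470  # Golden Path Length from the Statistics summary

# track name -> (covered length in Mb, percentage quoted in the text)
repeat_tracks = {
    "Dust": (119, 9.0),
    "RepeatMasker (nrTEplants)": (282, 21.4),
    "Repeats:Red": (628, 47.8),
    "TRF": (440, 33.4),
}


def percent_of_genome(covered_mb, genome_bp=GOLDEN_PATH_BP):
    """Convert a covered length in Mb into a percentage of the genome."""
    return 100 * covered_mb * 1_000_000 / genome_bp


recomputed = {
    name: round(percent_of_genome(mb), 1)
    for name, (mb, _quoted) in repeat_tracks.items()
}

for name, (_mb, quoted) in repeat_tracks.items():
    print(f"{name}: {recomputed[name]}% recomputed, {quoted}% quoted")

# The quoted percentages add up to more than 100%, so the tracks must overlap.
total_quoted = sum(quoted for _mb, quoted in repeat_tracks.values())
print(f"Sum of quoted percentages: {total_quoted:.1f}%")
```

Since the four tracks together exceed 100% of the genome, the same region can evidently be annotated by more than one repeat tool, and the percentages should not be added to estimate total repeat content.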
References
- Genome sequence of the olive tree, Olea europaea.
Cruz F, Julca I, Gómez-Garrido J, Loska D, Marcet-Houben M, Cano E, Galán B, Frias L, Ribeca P, Derdak S, Gut M, Sánchez-Fernández M, García JL, Gut IG, Vargas P, Alioto TS, Gabaldón T.
- Genomic evidence for recurrent genetic admixture during the domestication of Mediterranean olive trees (Olea europaea L.).
Julca I, Marcet-Houben M, Cruz F, Gómez-Garrido J, Gaut BS, Díez CM, Gut IG, Alioto TS, Vargas P, Gabaldón T.
Picture credit: https://www.crg.eu
Statistics
Summary
Assembly | OLEA9, INSDC Assembly GCA_902713445.1, Oct 2020 |
Database version | 113.1 |
Golden Path Length | 1,315,765,470 bp |
Genebuild by | CNAG |
Genebuild method | External annotation import |
Data source | CNAG |
Gene counts
Coding genes | 55,595 |
Gene transcripts | 89,204 |