Citrus clementina Assembly and Gene Annotation

About Citrus clementina

The Clementine mandarin (Citrus × clementina) is a hybrid between a sweet orange and the Mediterranean willowleaf mandarin. It’s a flowering plant of the Rutaceae family. This reference genome is the work of the International Citrus Genome Consortium (ICGC).

Assembly

Although clementines of cultivar Clemenules are diploid (2n=2x=18), the sequenced genotype is a haploid generated by in situ parthenogenesis, induced by irradiated pollen of Fortune mandarin and followed by direct embryo germination in vitro. A total of 4.6 million Sanger reads (including both fosmid-end and BAC-end reads), totaling 7x coverage, were generated by the ICGC, Genoscope, IGA and JGI. Sequences were assembled with Arachne and integrated with a genetic map to produce chromosome-scale pseudomolecules. The resulting 301.4 Mb assembly (Citrus_clementina_v1.0) is nearly complete, with high contig contiguity (contig L50 = 119 kb) and scaffold contiguity (scaffold L50 before pseudo-chromosome construction = 6.8 Mb). Overall, 45% of the sequence is repetitive.
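As an illustration of the L50 statistic quoted above, the following minimal sketch assumes the convention in which L50 is a length: the length such that sequences at least that long contain half of all assembled bases (many other sources call this N50). The function name and the toy contig lengths are hypothetical and are not taken from the Citrus clementina assembly.

```python
def l50_length(lengths):
    """Return the length L such that sequences of length >= L
    together contain at least half of all bases.

    Assumes the convention where L50 is a length (elsewhere often
    called N50). The name `l50_length` is hypothetical.
    """
    ordered = sorted(lengths, reverse=True)  # longest sequence first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no sequence lengths supplied")


# Toy contig lengths in bp (hypothetical, for illustration only).
toy_contigs = [100, 80, 60, 40, 20]
print(l50_length(toy_contigs))
```

A larger value means that fewer, longer sequences account for half of the assembly, which is why the statistic is used as a measure of contiguity.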

Annotation

Long-read 454 and Sanger expressed sequence tags (ESTs) were generated. These were first used to validate the assembly, as nearly 97% aligned to the reference sequence. They were also used to refine gene predictions made with FGenesH+, exonerate and GenomeScan. This produced an annotation of 25,000 protein-coding loci.

Repeats were annotated with the Ensembl Genomes repeat feature pipeline. There are: 644,239 low-complexity (Dust) features, covering 29 Mb (9.6% of the genome); 133,691 RepeatMasker features (with the REdat library), covering 45 Mb (14.9% of the genome); 3,453 RepeatMasker features (with the RepBase library), covering less than 1 Mb (0.1% of the genome); 169,903 tandem repeat (TRF) features, covering 24 Mb (8.1% of the genome).
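The percentages above can be approximately recomputed from the coverage figures. The sketch below assumes that the denominator is the Golden Path Length given in the summary statistics (301,364,702 bp); that choice and the helper name are assumptions, not something the page states. Because the coverage values are rounded to whole megabases, the recomputed percentages only approximate the published ones.

```python
GOLDEN_PATH_BP = 301_364_702  # Golden Path Length from the summary statistics


def percent_of_genome(covered_bp, genome_bp=GOLDEN_PATH_BP):
    """Return covered_bp as a percentage of genome_bp (hypothetical helper)."""
    return 100.0 * covered_bp / genome_bp


# Coverage rounded to whole Mb, as quoted in the text above.
rounded_coverage_bp = {
    "Dust": 29_000_000,
    "RepeatMasker (REdat)": 45_000_000,
    "TRF": 24_000_000,
}

for name, covered in rounded_coverage_bp.items():
    print(f"{name}: {percent_of_genome(covered):.1f}% of the genome")
```

Small differences from the quoted values (for example for the tandem repeats) are expected, since the exact base-pair coverage is not given here.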

References

  1. Sequencing of diverse mandarin, pummelo and orange genomes reveals complex history of admixture during citrus domestication.
    GA Wu, S Prochnik, J Jenkins et al. 2014. Nature Biotechnology. 32:656-662.

More information

General information about this species can be found in Wikipedia.

Statistics

Summary

Assembly: Citrus_clementina_v1.0, INSDC Assembly GCA_000493195.1
Database version: 99.1
Base Pairs: 295,129,112
Golden Path Length: 301,364,702
Genebuild by: ENA
Genebuild method: Import
Data source: International Citrus Genome Consortium

Gene counts

Coding genes: 25,000
Non coding genes: 2,179
Small non coding genes: 2,138
Long non coding genes: 41
Gene transcripts: 36,736
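The gene counts can be cross-checked with simple arithmetic. The sketch below assumes that the non-coding total is the sum of the small and long non-coding classes, and that the transcript count spans both coding and non-coding genes. Both are assumptions about how the table is built, and the variable names are illustrative.

```python
# Values copied from the gene counts above.
coding_genes = 25_000
non_coding_genes = 2_179
small_non_coding = 2_138
long_non_coding = 41
gene_transcripts = 36_736

# Assumption: the two non-coding classes add up to the non-coding total.
classes_add_up = (small_non_coding + long_non_coding) == non_coding_genes

# Assumption: transcripts are counted over coding and non-coding genes together.
total_genes = coding_genes + non_coding_genes
transcripts_per_gene = gene_transcripts / total_genes

print(f"Non-coding classes sum to the total: {classes_add_up}")
print(f"Total genes: {total_genes:,}")
print(f"Average transcripts per gene: {transcripts_per_gene:.2f}")
```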
