Citrus clementina Assembly and Gene Annotation
About Citrus clementina
The Clementine mandarin (Citrus × clementina) is a hybrid between a sweet orange and the Mediterranean willowleaf mandarin. It's a flowering plant of the Rutaceae family. This reference genome is the work of the International Citrus Genome Consortium (ICGC).
Although clementines of cultivar Clemenules are diploid (2n=2x=18), the sequenced genotype is a haploid generated by in situ parthenogenesis, induced by irradiated pollen of Fortune mandarin and followed by direct embryo germination in vitro. A total of 4.6 million Sanger reads (including both fosmid-end and BAC-end reads) were generated by the ICGC, Genoscope, IGA and JGI, totaling 7x coverage. Sequences were assembled with Arachne and integrated with a genetic map, producing chromosome-scale pseudomolecules. The resulting 301.4 Mb assembly (Citrus_clementina_v1.0) is nearly complete, with high assembly contiguity (contig L50 = 119 kb) and scaffolding (scaffold L50 before pseudo-chromosome construction = 6.8 Mb). Overall, 45% of the sequence is repetitive.
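The L50 figures above appear to follow the JGI convention, in which L50 is the length of the contig or scaffold at which half of the assembled sequence is reached (many other sources call this value N50). A minimal Python sketch of that statistic follows. The helper name `length_at_half` is hypothetical and the lengths are toy values, not the real clementine contigs.

```python
def length_at_half(lengths):
    """Return (length, count) for a list of contig or scaffold lengths.

    length: size of the sequence at which the running total, taken from
            the longest sequence downwards, first reaches half of the
            assembly (called L50 in this document, N50 elsewhere).
    count:  number of sequences needed to reach that point.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count


# Toy example (lengths in kb, invented for illustration only).
toy_contigs = [80, 70, 50, 40, 30, 20]
length, count = length_at_half(toy_contigs)
print(f"half of the assembly is in {count} contigs of at least {length} kb")
```

Read this way, a contig L50 of 119 kb would mean that half of the assembled bases sit in contigs of 119 kb or longer.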
Long-read 454 and Sanger expressed sequence tags (ESTs) were generated. First, these were used to validate the assembly, as nearly 97% aligned to the reference sequence. In addition, they were used to refine gene predictions made with FGenesH+, exonerate and GenomeScan. This produced an annotation of 25,000 protein-coding loci.
Repeats were annotated with the Ensembl Genomes repeat feature pipeline. There are: 644,239 low-complexity (Dust) features, covering 29 Mb (9.6% of the genome); 133,691 RepeatMasker features (with the REdat library), covering 45 Mb (14.9% of the genome); 3,453 RepeatMasker features (with the RepBase library), covering 0 Mb (0.1% of the genome); and 169,903 tandem repeat (TRF) features, covering 24 Mb (8.1% of the genome). Repeat Detector repeats have a total length of 110 Mb (36.6% of the genome).
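The percentages above can be checked against the Golden Path Length given in the assembly table (301,364,702 bp). The sketch below recomputes each one in Python. It assumes that "Mb" means 10^6 bp and that the reported Mb values are rounded to the nearest whole number, so small differences from the reported percentages are expected. The names `percent_of_genome` and `max_discrepancy` are hypothetical helpers.

```python
GENOME_BP = 301_364_702  # Golden Path Length from the assembly table

# (label, reported Mb, reported percent of genome), copied from the text above
REPEATS = [
    ("Low complexity (Dust)", 29, 9.6),
    ("RepeatMasker (REdat)", 45, 14.9),
    ("RepeatMasker (RepBase)", 0, 0.1),
    ("Tandem repeats (TRF)", 24, 8.1),
    ("Repeat Detector", 110, 36.6),
]


def percent_of_genome(megabases, genome_bp=GENOME_BP):
    """Convert a length in Mb (assumed to be 10^6 bp) to a genome percentage."""
    return 100 * megabases * 1_000_000 / genome_bp


def max_discrepancy(rows):
    """Largest absolute gap, in percentage points, between the reported
    and the recomputed values."""
    return max(abs(percent_of_genome(mb) - pct) for _, mb, pct in rows)


for label, mb, reported in REPEATS:
    print(f"{label}: {percent_of_genome(mb):.2f}% recomputed, {reported}% reported")
```

Because the categories are produced by different tools and can overlap on the genome, the individual percentages are not expected to sum to the Repeat Detector total.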
- Sequencing of diverse mandarin, pummelo and orange genomes reveals complex history of admixture during citrus domestication.
GA Wu, S Prochnik, J Jenkins et al. 2014. Nature Biotechnology. 32:656-662.
General information about this species can be found in Wikipedia.
|Assembly||Citrus_clementina_v1.0, INSDC Assembly GCA_000493195.1,|
|Golden Path Length||301,364,702|
|Data source||International Citrus Genome Consortium|
|Non coding genes||2,179|
|Small non coding genes||2,138|
|Long non coding genes||41|