Coffea canephora Assembly and Gene Annotation
About Coffea canephora
Coffea canephora, commonly known as robusta coffee, is a species of coffee in the Rubiaceae family. Within its genus, C. canephora has the widest natural distribution, which extends west to east from Guinea to Uganda and north to south from Cameroon to Angola. It is an allogamous diploid flowering plant (2n=2x=22). This reference genome results from a collaboration between Genoscope, IRD and Cirad (UMRs AGAP, DIADE and RPB), funded by ANR, and the Coffee Genome Sequencing Consortium.
Assembly
The sequenced genotype (2n=22, 1C=710 Mb) is a doubled-haploid plant (accession DH200-94) produced by IRD from the clone IF200. A total of 54.4 million Roche 454 single and mate-pair reads and 143,605 Sanger bacterial artificial chromosome (BAC)-end reads were generated, achieving 30x coverage. Additional Illumina sequencing data (60x) were used to improve the assembly. The resulting assembly consists of 25,216 contigs and 13,345 scaffolds with a total length of 568.6 Mb. Eighty percent of the assembly is contained in 635 scaffolds, and the scaffold N50 is 1.26 Mb. A high-density genetic map covering 64% of the assembly and 86% of the annotated genes was anchored to 11 chromosomes.
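The scaffold N50 quoted above is the length of the scaffold at which the running total, counted from the longest scaffold downwards, first reaches half of the total assembly length. A minimal Python sketch of that calculation follows; the function name `n50` and the toy lengths are illustrative assumptions, not part of the published pipeline or the real scaffold set.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Lengths are sorted from longest to shortest and summed until the
    running total reaches at least half of the overall length; the
    length at which that happens is the N50.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy scaffold lengths (not the real C. canephora scaffolds).
toy_scaffolds = [50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

Applied to the 13,345 real scaffold lengths, the same procedure would be expected to give the reported 1.26 Mb, assuming the standard N50 definition was used.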
Annotation
A total of 25,574 protein-coding genes were annotated using various sources of evidence (cDNAs, RNA-Seq, protein alignments, and ab initio predictions) that were combined into gene models.
Repeated sequences were called with the Repeat Detector, which is part of the Ensembl Genomes repeat feature pipelines. Total repeat length: 209,982,696 bp. Repeat content: 36.9% of the assembly.
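The repeat content figure can be checked against the repeat length and the Golden Path Length listed in the Statistics summary. The short Python sketch below does that arithmetic; the constant and function names are illustrative choices, and it assumes the percentage is taken relative to the Golden Path Length.

```python
# Values taken from this page; names are illustrative.
REPEAT_LENGTH_BP = 209_982_696       # total length annotated as repeats
GOLDEN_PATH_LENGTH_BP = 568_611_505  # Golden Path Length of the assembly


def repeat_content_percent(repeat_bp, assembly_bp):
    """Return the repeat length as a percentage of the assembly length."""
    return 100.0 * repeat_bp / assembly_bp


percent = repeat_content_percent(REPEAT_LENGTH_BP, GOLDEN_PATH_LENGTH_BP)
print(f"Repeat content: {percent:.1f}%")
```

Under that assumption the result rounds to the 36.9% reported above.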
References
- The coffee genome provides insight into the convergent evolution of caffeine biosynthesis.
Denoeud F, Carretero-Paulet L, Dereeper A et al. 2014. Science. 345(6201):1181-1184.
Picture credit: Jee & Rani Nature Photography (License: CC BY-SA 4.0) 2010
Links
More information
General information about this species can be found in Wikipedia.
Statistics
Summary
Assembly | AUK_PRJEB4211_v1, INSDC Assembly GCA_900059795.1 |
Database version | 113.1 |
Golden Path Length | 568,611,505 |
Genebuild by | Genoscope CEA |
Genebuild method | Import |
Data source | Genoscope CEA |
Gene counts
Coding genes | 25,574 |
Gene transcripts | 25,574 |