Theobroma cacao Matina 1-6 (Theobroma_cacao_20110822)

The Cacao Genome Project is a collaboration among MARS, USDA-ARS, IBM, NCGR, Clemson University, HudsonAlpha Institute for Biotechnology, Indiana University and Washington State University with funding from MARS, USDA-ARS, and NSF, and contributions in effort from cacao breeders around the world.

About Theobroma cacao

Theobroma cacao (cacao or chocolate tree) is a neotropical plant native to Amazonian rainforests and is now cultivated in over 50 countries. A member of the Malvaceae family, it bears pods whose beans are harvested to make chocolate and for use in confections and cosmetics. Cacao is a diploid species (2n=2x=20) with a relatively small genome (between 411 Mb and 494 Mb). This is the genome assembly and annotation of the Matina 1-6 cultivar, which belongs to the most widely cultivated cacao type worldwide.

Taxonomy ID: 3641

Data source: Cacao Genome Consortium

More information and statistics

Genome assembly: Theobroma_cacao_20110822

Download DNA sequence (FASTA)

Display your data in Ensembl Plants
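Once the FASTA file listed above has been downloaded, a quick sanity check is to total the sequence lengths and compare the result with the genome size quoted in the description. The following is a minimal sketch using only the Python standard library. The function name `fasta_lengths` and the sequence IDs in the demo input are illustrative assumptions, not names taken from the actual download.

```python
import io


def fasta_lengths(handle):
    """Return a dict mapping each sequence ID to its length in bases."""
    lengths = {}
    current = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The ID is the first whitespace-delimited token of the header.
            parts = line[1:].split()
            current = parts[0] if parts else ""
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


# Tiny made-up input for illustration; this is not real cacao sequence.
example = io.StringIO(">scaffold_1 demo\nACGTAC\nGT\n>scaffold_2\nGGCC\n")
lengths = fasta_lengths(example)
total = sum(lengths.values())
print(len(lengths), "sequences,", total, "bases in total")
```

For a real file, pass an open file handle instead of the `io.StringIO` object (for a gzip-compressed download, `gzip.open(path, "rt")` returns a suitable text handle).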

Other cultivars

This species has data on 1 additional cultivar. View list of cultivars

Gene annotation

What can I find? Protein-coding and non-coding genes, splice variants, cDNA and protein sequences, non-coding RNAs.

More about this genebuild

Download genes, cDNAs, ncRNA, proteins - FASTA - GFF3

Update your old Ensembl IDs
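To see which kinds of features the GFF3 download listed above contains, one option is to tally the feature type column. The sketch below relies only on the general GFF3 layout (nine tab-separated columns, with the feature type in column 3, comment lines starting with `#`, and an optional `##FASTA` section at the end). The function name `count_feature_types` and the IDs in the demo records are hypothetical and do not come from the cacao annotation.

```python
import io
from collections import Counter


def count_feature_types(handle):
    """Count GFF3 records by feature type (column 3)."""
    counts = Counter()
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            # Anything after this directive is sequence, not annotation.
            break
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 9:
            # Skip malformed records rather than guessing at their layout.
            continue
        counts[fields[2]] += 1
    return counts


# Made-up records for illustration; IDs and coordinates are hypothetical.
demo = io.StringIO(
    "##gff-version 3\n"
    "scaffold_1\tdemo\tgene\t100\t900\t.\t+\t.\tID=gene1\n"
    "scaffold_1\tdemo\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1\n"
    "scaffold_1\tdemo\texon\t100\t400\t.\t+\t.\tParent=mrna1\n"
    "scaffold_1\tdemo\texon\t600\t900\t.\t+\t.\tParent=mrna1\n"
    "##FASTA\n"
    ">scaffold_1\n"
    "ACGT\n"
)
counts = count_feature_types(demo)
for feature_type, n in sorted(counts.items()):
    print(feature_type, n)
```

Counting by type is a quick way to confirm that a file holds the genes, transcripts, and exons expected before doing any further analysis.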

Comparative genomics

What can I find? Homologues, gene trees, and whole genome alignments across multiple species.

More about comparative analyses

Phylogenetic overview of gene families

Download alignments (EMF)


Variation

What can I find? Short sequence variants.

More about variation in Ensembl Plants

Variant Effect Predictor