Solanum lycopersicum (SL3.0)

Solanum lycopersicum Assembly and Gene Annotation

About Solanum lycopersicum

Solanum lycopersicum (tomato) is an important crop plant and a member of the nightshade family, Solanaceae, which includes a variety of agricultural crop plants (e.g. potato, pepper, eggplant, and tobacco). The tomato originated in the Andean region of South America, was grown by the Aztecs in Mesoamerica, and was brought to Europe by early Spanish explorers. Today, hundreds of varieties are grown throughout the world, with the largest producers being China and the United States. In addition to its value as a food, the tomato has served as an important model system for the study of fruit ripening, plant-pathogen interactions, and molecular genetic mapping. The nuclear genome contains 12 chromosomes and the current assembly is ~828 Mb in size.


Solanum lycopersicum cv. Heinz 1706 was sequenced and assembled by the International Tomato Genome Sequencing Consortium. Assembly version SL3.0 combines whole-genome shotgun sequence (Roche 454) with Sanger sequence data from BAC-ends, fosmid-ends and Selected BAC Mixture sequences, plus additional data from Solexa and SOLiD technologies and optical mapping (total genome coverage 27X). The assembly is deposited in DDBJ/EMBL/GenBank under the accession GCA_000188115.3.


ITAG3.0 annotation was carried out by the International Tomato Annotation Group (ITAG) using a combination of evidence-based and ab initio methods, resulting in 20,766 updated genes, 6,660 novel genes and 6,624 dropped genes.


The variation data for Solanum lycopersicum comes from a study that explored genetic variation in the tomato clade by whole-genome sequencing of 84 tomato cultivars and related wild species representative of the Lycopersicon, Arcanum, Eriopersicon and Neolycopersicon groups. The variation data has been submitted to the ENA under accession ERP004618, and has been locus-level accessioned using the transPLANT variation archive.


  1. The tomato genome sequence provides insights into fleshy fruit evolution.
    Tomato Genome Consortium. 2012. Nature. 485:635-641.
  2. Exploring genetic variation in the tomato (Solanum section Lycopersicon) clade by whole-genome sequencing.
    Aflitos S, Schijlen E, de Jong H, de Ridder D, Smit S, Finkers R, Wang J, Zhang G, Li N, Mao L et al. 2014. Plant J. 80:136-148.

Picture credit: David Besa from Sonoma, USA (Flickr) [CC-BY-2.0 (http://creativecommons.org/licenses/by/2.0)], via Wikimedia Commons.

More information

General information about this species can be found in Wikipedia.



Assembly: SL3.0, INSDC Assembly GCA_000188115.3, Apr 2018
Database version: 112.3
Golden Path Length: 827,747,456
Genebuild by: SOL
Genebuild method: Import
Data source: Solanaceae Genomics Project

Gene counts

Coding genes: 34,658
Non coding genes: 1,167
Small non coding genes: 1,137
Long non coding genes: 30
Gene transcripts: 35,825


Short Variants: 71,103,414
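
The figures above can be cross-checked with a little arithmetic. The following is a minimal Python sketch, not part of the original page: the variable names and the to_megabases helper are illustrative assumptions, and the numbers are copied from the statistics listed above. It converts the Golden Path Length to megabases for comparison with the ~828 Mb quoted in the species description, and checks that the gene count subtotals add up.

```python
# Figures copied from the statistics above; all names here are illustrative.
golden_path_length_bp = 827_747_456
coding_genes = 34_658
non_coding_genes = 1_167
small_non_coding_genes = 1_137
long_non_coding_genes = 30
gene_transcripts = 35_825


def to_megabases(base_pairs: int) -> float:
    """Convert a length in base pairs to megabases (1 Mb = 1,000,000 bp)."""
    return base_pairs / 1_000_000


assembly_mb = to_megabases(golden_path_length_bp)

# Small and long non-coding genes should sum to the non-coding total.
non_coding_sum = small_non_coding_genes + long_non_coding_genes

# Coding plus non-coding genes gives the total number of genes.
total_genes = coding_genes + non_coding_genes

print(f"Assembly size: {assembly_mb:.1f} Mb")
print(f"Non coding subtotals match: {non_coding_sum == non_coding_genes}")
print(f"Total genes: {total_genes:,}")
print(f"Transcripts per gene: {gene_transcripts / total_genes:.2f}")
```

In these figures the total gene count equals the transcript count, which is consistent with one transcript per gene on average. That reading is an inference from the listed numbers, not a statement made by the page.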