Prunus avium (PAV_r1.0)

Prunus avium Assembly and Gene Annotation

About Prunus avium

Sweet cherry (Prunus avium) is a fruit tree of the Rosaceae family thought to have originated south of the Caucasus Mountains and around the Caspian and Black Seas. Prunus avium is a diploid (2n=2x=16) with an estimated genome size of 352.9 Mb. The genome sequence of the variety Satonishiki was produced by Kazusa DNA Research Institute.


Assembly PAV_r1.0 was built from a paired-end library (insert size: 500 bp) and four mate-pair libraries (insert sizes of 2, 5, 10, and 15 kb), using two assemblers (SOAPdenovo2 r240 and Platanus v1.2.1). The total length of the assembled sequences was 272.4 Mb, consisting of 10,148 scaffold sequences with an N50 length of 219.6 kb. The sequences covered 77.8% of the genome size (352.9 Mb) estimated by k-mer analysis, and included >96.0% of the core eukaryotic genes (BUSCO v1.1b). A high-density consensus genetic map with 2,382 loci was constructed. Comparing the genetic maps of sweet cherry and peach (Prunus persica) revealed high synteny between them. Scaffolds were integrated into pseudo-molecules using map-based and synteny-based strategies.
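The N50 figure quoted above is the length of the shortest scaffold in the smallest set of longest scaffolds that together hold at least half of the assembled bases. The following is a minimal sketch of that calculation. The function name `n50` and the toy lengths are illustrative assumptions, not values or code from the PAV_r1.0 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Hypothetical helper for illustration: sort lengths from longest to
    shortest and return the length at which the running total first
    reaches half of the summed length.
    """
    lengths = list(lengths)
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy scaffold lengths (made up for the example, in bp).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_lengths)
print(toy_n50)
```

With real data, the lengths would come from the scaffold sequences of the assembly, for example by reading the lengths of the records in a FASTA file.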


MAKER was used to carry out ab initio, evidence-based (RNA-seq) and homology-based gene prediction. Over 43,000 complete and partial protein-encoding genes were obtained, with a mean size of 1,097 bp.

Repeats were annotated with the Ensembl Genomes repeat feature pipeline. There are: 479,122 low-complexity (Dust) features, covering 39 Mb (14.3% of the genome); 135,815 RepeatMasker features (with the REdat library), covering 29 Mb (10.8% of the genome); 2,701 RepeatMasker features (with the RepBase library), covering 1 Mb (0.2% of the genome); and 150,729 tandem repeat (TRF) features, covering 28 Mb (10.3% of the genome). Repeats identified by Repeat Detector have a total length of 87 Mb (32.2% of the genome).
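Coverage figures like those above are usually computed over the union of feature intervals, so that overlapping features are not counted twice. The sketch below shows one way to do this for a single sequence. It is an illustration under stated assumptions, not the Ensembl Genomes pipeline's own code: the function names, the 1-based inclusive coordinates, and the toy intervals are all hypothetical.

```python
def covered_bases(intervals):
    """Count bases covered by the union of (start, end) intervals.

    Assumes 1-based inclusive coordinates on a single sequence.
    Overlapping or directly adjacent intervals are merged first.
    """
    covered = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end + 1:
            # A gap: close the previous merged block, start a new one.
            if current_end is not None:
                covered += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlap or adjacency: extend the current merged block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start + 1
    return covered


def percent_of_genome(covered, genome_length):
    """Express a covered length as a percentage of the genome length."""
    return 100.0 * covered / genome_length


# Toy features on a 100 bp sequence (made up for the example).
toy_features = [(1, 10), (5, 20), (30, 35)]
toy_covered = covered_bases(toy_features)
toy_percent = percent_of_genome(toy_covered, 100)
print(toy_covered, toy_percent)
```

For a whole genome, the same calculation would be run per scaffold or pseudo-molecule and the covered lengths summed before dividing by the total assembly length. Because the Mb values quoted above are rounded, recomputing the percentages from them gives only approximate agreement.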


  1. The genome sequence of sweet cherry (Prunus avium) for use in genomics-assisted breeding.
    Shirasawa K, Isuzugawa K, Ikenaga M, Saito Y, Yamamoto T, Hirakawa H, Isobe S. 2017. DNA Research. 24(5):499-508.

Picture credit: Jean-Pol GRANDMONT, CC BY 3.0

More information

General information about this species can be found in Wikipedia.



Assembly: PAV_r1.0, INSDC Assembly GCA_002207925.1
Database version: 108.1
Golden Path Length: 272,361,615
Genebuild by: GDR
Genebuild method: Import
Data source: Kazusa DNA Research Institute

Gene counts

Coding genes: 43,349
Gene transcripts: 43,349