Prunus avium Assembly and Gene Annotation
About Prunus avium
Sweet cherry (Prunus avium) is a fruit tree of the Rosaceae family, thought to have originated south of the Caucasus Mountains and around the Caspian and Black Seas. Prunus avium is diploid (2n=2x=16), with an estimated genome size of 352.9 Mb. The genome sequence of the variety Satonishiki was produced by the Kazusa DNA Research Institute.
Assembly PAV_r1.0 was built from one paired-end library (insert size: 500 bp) and four mate-pair libraries (insert sizes of 2, 5, 10, and 15 kb), using two assemblers (SOAPdenovo2 r240 and Platanus v1.2.1). The assembled sequences total 272.4 Mb in 10,148 scaffolds, with an N50 length of 219.6 kb. The sequences covered 77.8% of the genome size (352.9 Mb) estimated by k-mer analysis, and included >96.0% of the core eukaryotic genes (BUSCO v1.1b). A high-density consensus genetic map with 2,382 loci was constructed. Comparison of the genetic maps of sweet cherry and peach (Prunus persica) revealed high synteny between the two species. Scaffolds were integrated into pseudomolecules using map-based and synteny-based strategies.
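The N50 quoted above is the scaffold length at which the scaffolds of that length or longer together contain at least half of the assembled bases. The following is a minimal sketch of that calculation under this standard definition; the function name `n50` and the toy lengths are illustrative only and are not taken from the PAV_r1.0 pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp).

    Assumes the usual definition: sort lengths from longest to shortest
    and return the length at which the running total first reaches
    half of the total assembled length.
    """
    lengths = list(lengths)
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy scaffold lengths (hypothetical, not real PAV_r1.0 data).
toy_scaffolds = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_scaffolds)
print(toy_n50)
```

Applied to the real scaffold lengths of an assembly (for example, read from a FASTA index), the same function would be expected to reproduce the reported N50, provided the same set of sequences is used.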
MAKER was used for ab initio, evidence-based (RNA-seq) and homology-based gene prediction. More than 43,000 complete and partial protein-coding genes were predicted, with a mean length of 1,097 bp.
Repeats were annotated with the Ensembl Genomes repeat feature pipeline. There are: 479,122 low-complexity (Dust) features, covering 39 Mb (14.3% of the genome); 135,815 RepeatMasker features (with the REdat library), covering 29 Mb (10.8% of the genome); 2,701 RepeatMasker features (with the RepBase library), covering 1 Mb (0.2% of the genome); and 150,729 tandem repeat (TRF) features, covering 28 Mb (10.3% of the genome). Repeats identified by Repeat Detector total 87 Mb (32.2% of the genome).
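The percentages above appear to be expressed relative to the assembled sequence length rather than the k-mer estimated genome size; that reading is an assumption, since the page does not state the denominator. A minimal sketch of the arithmetic, using the Golden Path Length of 272,361,615 bp and the rounded Mb figures from the text (the helper name `percent_of_assembly` is hypothetical):

```python
GOLDEN_PATH_LENGTH_BP = 272_361_615  # assembly length reported for PAV_r1.0


def percent_of_assembly(covered_bp, assembly_bp=GOLDEN_PATH_LENGTH_BP):
    """Return the share of the assembly covered by a feature class, in percent."""
    if assembly_bp <= 0:
        raise ValueError("assembly length must be positive")
    return 100.0 * covered_bp / assembly_bp


# Coverage figures as quoted in the text, rounded to whole megabases.
dust_pct = percent_of_assembly(39_000_000)  # low-complexity (Dust) features
trf_pct = percent_of_assembly(28_000_000)   # tandem repeats (TRF)

print(f"Dust: {dust_pct:.1f}%  TRF: {trf_pct:.1f}%")
```

Because the inputs are rounded to the nearest megabase, values recomputed this way can differ from the published percentages in the last digit for some feature classes.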
- The genome sequence of sweet cherry (Prunus avium) for use in
Shirasawa K, Isuzugawa K, Ikenaga M, Saito Y, Yamamoto T, Hirakawa H, Isobe S. 2017. DNA Research. 24(5):499-508.
Picture credit: Jean-Pol GRANDMONT, CC BY 3.0
|Assembly|PAV_r1.0, INSDC Assembly GCA_002207925.1|
|Golden Path Length|272,361,615 bp|
|Data source|Kazusa DNA Research Institute|