Prunus persica (Prunus_persica_NCBIv2)

Prunus persica Assembly and Gene Annotation

About Prunus persica

Prunus persica (peach) is an economically important deciduous tree in the Rosaceae family that produces 20 million tons of fruit per year. The Rosaceae family contains herbs, shrubs and trees with a wide variety of fruit types and habits, and includes several species grown for their fruits (peaches, apples and strawberries), lumber (black cherry) and ornamental value (roses).

Peach was first domesticated and cultivated in North-West China and has a compact diploid genome (265 Mb, 2n = 16).

Assembly

JGI performed the initial assembly with Arachne, using Sanger sequence reads representing 8.5-fold coverage of a doubled haploid genotype of cv. Lovell. The resulting contigs and scaffolds were filtered to give 234 scaffolds covering 224.6 Mb of the peach genome (Peach v1.0), with good QC statistics. Scaffold N50/L50 values were 4/26.8 Mb and contig N50/L50 values were 294/214.2 kb, where N50 is the number of sequences and L50 the sequence length at the point where half of the assembly is covered.
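The N50/L50 figures quoted above summarize contiguity: sequences are sorted from longest to shortest and accumulated until half of the total assembly length is reached. The following is a minimal sketch of that calculation, not JGI's code. The function name `half_assembly_stats` and the toy lengths are hypothetical, and the function returns both the count and the length at the halfway point because conventions differ on which of the two is called N50 and which L50.

```python
def half_assembly_stats(lengths):
    """Return (count, length) at the point where the longest sequences
    together cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return count, length
    raise ValueError("lengths must be non-empty")


if __name__ == "__main__":
    # Toy scaffold lengths (hypothetical, not peach data).
    toy_lengths = [10, 8, 6, 4, 2]
    count, length = half_assembly_stats(toy_lengths)
    print(f"sequences needed: {count}, length at that point: {length}")
```

For the toy input, the two longest sequences (10 + 8 = 18) cover at least half of the total of 30, so the function returns a count of 2 and a length of 8.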

Five DNA libraries were end-sequenced, giving a total of 8.47-fold sequence coverage: 536,032 reads from the 2.8 kb sized library, 606,680 reads from the 4.4 kb sized library, 2,106,103 reads from the 7.8 kb sized library, 419,424 reads from the 35.3 kb fosmid library, and 61,440 reads from the 69.5 kb BAC library.
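As a quick consistency check, the per-library read counts above can be totalled and expressed as shares of the whole. The sketch below uses only the numbers quoted in this section; the dictionary keys and the helper name `read_shares` are labels chosen for illustration.

```python
# End-sequenced reads per library, as quoted in the text.
library_reads = {
    "2.8 kb": 536_032,
    "4.4 kb": 606_680,
    "7.8 kb": 2_106_103,
    "35.3 kb fosmid": 419_424,
    "69.5 kb BAC": 61_440,
}

total_reads = sum(library_reads.values())


def read_shares(reads):
    """Return each library's share of the total read count, in percent."""
    total = sum(reads.values())
    return {name: 100 * n / total for name, n in reads.items()}


if __name__ == "__main__":
    print(f"total reads: {total_reads:,}")
    for name, share in read_shares(library_reads).items():
        print(f"{name}: {share:.1f}%")
```

The five libraries sum to 3,729,679 reads, with the 7.8 kb library contributing more than half of them.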

Annotation

A total of 27,852 protein-coding genes and 28,689 protein-coding transcripts were predicted by JGI.

Predictions began with PASA transcript assemblies based on ESTs from peach and related species. Transcript assemblies and a collection of plant peptide sequences were blasted against the assembly, and gene models were predicted using the homology-based predictors FGENESH+ and GenomeScan. Predicted gene models were then improved and refined by PASA.

Repeated sequences were called with the Repeat Detector, which is part of the Ensembl Genomes repeat feature pipelines. Repeat length: 74,986,019 bp. Repeat content: 32.9%.
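The repeat content can be roughly reproduced from the repeat length, assuming the denominator is the Golden Path Length listed in the Summary statistics (an assumption; the page does not state which genome length was used). A minimal sketch, with hypothetical constant and function names:

```python
REPEAT_LENGTH_BP = 74_986_019        # repeat length quoted above
GOLDEN_PATH_LENGTH_BP = 227_411_381  # assumed denominator, from the Summary statistics


def repeat_percentage(repeat_bp, genome_bp):
    """Return the repeat length as a percentage of the genome length."""
    if genome_bp <= 0:
        raise ValueError("genome length must be positive")
    return 100 * repeat_bp / genome_bp


if __name__ == "__main__":
    pct = repeat_percentage(REPEAT_LENGTH_BP, GOLDEN_PATH_LENGTH_BP)
    print(f"repeat content: {pct:.2f}%")
```

Under that assumption the ratio comes out just under 33%, consistent with the reported 32.9%.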

Sequence alignment

Approximately 80,000 EST sequences have been aligned to the genome with STAR.

References

  1. The high-quality draft genome of peach (Prunus persica) identifies unique patterns of genetic diversity, domestication and genome evolution.
    Verde I, Abbott AG, Scalabrin S, Jung S, Shu S, Marroni F, Zhebentyayeva T, Dettori MT, Grimwood J, Cattonaro F et al. 2013. Nat. Genet. 45:487-494.

Picture credit: Image created by skyseeker and released under a Creative Commons Attribution License.

  • Prunus persica ESTs at ENA

More information

General information about this species can be found in Wikipedia.

Statistics

Summary

Assembly: Prunus_persica_NCBIv2, INSDC Assembly GCA_000346465.2, Feb 2017
Database version: 111.2
Golden Path Length: 227,411,381
Genebuild by: JGI
Genebuild method: Import
Data source: Joint Genome Institute

Gene counts

Coding genes: 26,873
Non coding genes: 976
Small non coding genes: 968
Long non coding genes: 8
Gene transcripts: 48,065