Pistacia vera Assembly and Gene Annotation

The Pistachio Genome Project is a collaboration among Shahid Bahonar University of Kerman, the Pistachio Research Center at the Horticultural Sciences Research Institute (AREEO, Rafsanjan, Iran), and the Chinese Academy of Sciences. Funding came from the Animal Branch of the Germplasm Bank of Wild Species, Chinese Academy of Sciences (the Large Research Infrastructure Funding) and the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB13020600), with contributions in effort from pistachio breeders in Iran.

About Pistacia vera

Pistachio (Pistacia vera, 2n = 30) is one of the most important commercial nut crops worldwide and originated in Central Asia and the Middle East. The pistachio tree is a deciduous, long-lived desert plant that tolerates high levels of salinity and drought stress. Pistachio is a member of the Anacardiaceae family and was domesticated about 8000 years ago.

Assembly

An individual of the cultivar Batoury was chosen for genome sequencing and assembly. The genome was sequenced on the Illumina HiSeq 2500 platform from multiple paired-end libraries, including two small-insert libraries (270 bp and 500 bp) and six long-insert mate-pair libraries (3 kb, 4 kb, 8 kb, 10 kb, 15 kb, and 17 kb), achieving 270.47X coverage. A draft genome of 569.12 Mb was assembled, with contig and scaffold N50 sizes of 20.69 kb and 768.39 kb, respectively. To improve continuity, 4,038,150 filtered long reads with an average length of 14,568 bp were generated from 59 Gb of sequencing data on the PacBio Sequel System. The final draft genome of 671 Mb had contig and scaffold N50 sizes of 75.7 kb and 949.2 kb, respectively, with the sequencing data providing a total of 373.84X coverage. The completeness of the genome assembly was confirmed with CEGMA and BUSCO.
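The contig and scaffold N50 values quoted above follow the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The following is a minimal sketch of that calculation, not the project's own tooling; the function name `n50` and the toy lengths are hypothetical and chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    summed length of all sequences.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical toy contig lengths (not taken from the pistachio assembly).
toy_contigs = [100, 80, 60, 40, 20]
print(n50(toy_contigs))  # 80: 100 + 80 = 180 covers at least half of 300
```

Applied to the real contig or scaffold lengths of an assembly, the same function would give figures comparable to those reported in this section, assuming the same definition of N50 was used.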

Annotation

Protein-coding genes were predicted using de novo and protein homology-based approaches. Genscan v1.0, Augustus v2.5.5, GlimmerHMM v3.0.1, GeneID v1.3, and SNAP were used for de novo gene prediction, while homologous peptides from the Arabidopsis thaliana (TAIR 10), Oryza sativa (Nipponbare, IRGSP-1.0), Theobroma cacao (Phytozome v12.1), and C. sinensis (Phytozome v12.1) genomes were aligned to the assembly with GeMoMa v1.4.2 to identify homologous genes. RNA-Seq reads were assembled using Trinity, the resulting unigenes were aligned to the repeat-masked assembly using BLAT, and gene structures were then modeled from the BLAT alignments using PASA. Protein-coding regions were identified with TransDecoder v3.0.1 and GeneMarkS-T. Consensus gene models were generated by integrating the de novo predictions and protein alignments using EVidenceModeler.

Repeat sequences were called with the Repeat Detector, which is part of the Ensembl Genomes repeat feature pipelines. Total repeat length: 332,023,120 bp; repeat content: 49.5%.
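The repeat content figure can be cross-checked against the Golden Path Length listed under Statistics. The sketch below assumes that the percentage is simply the total repeat length divided by the golden path length; the function name `repeat_percent` is hypothetical.

```python
# Values taken from this page: total repeat length and Golden Path Length.
REPEAT_BP = 332_023_120
GOLDEN_PATH_BP = 671_152_441


def repeat_percent(repeat_bp, assembly_bp):
    """Return repeat content as a percentage of the assembly length."""
    if assembly_bp <= 0:
        raise ValueError("assembly length must be positive")
    return 100.0 * repeat_bp / assembly_bp


# Rounded to one decimal place, this matches the 49.5% reported above.
print(f"{repeat_percent(REPEAT_BP, GOLDEN_PATH_BP):.1f}%")
```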

References

Whole genomes and transcriptomes reveal adaptation and domestication of pistachio.
Zeng L, Tu XL, Dai H, Han FM, Lu BS, Wang MS, Nanaei HA, Tajabadipour A, Mansouri M, Li XL et al. 2019. Genome Biology. 20:79.

Picture credit: Professor Ali Esmailizadeh, Shahid Bahonar University of Kerman

More information

General information about this species can be found in Wikipedia.

Statistics

Summary

Assembly	PisVer_v2, INSDC Assembly GCA_008641045.1
Database version	114.1
Golden Path Length	671,152,441
Genebuild by	EVM
Genebuild method	External annotation import
Data source	Chinese Academy of Sciences

Gene counts

Coding genes	31,784
Gene transcripts	31,784
