
Ipomoea triloba Assembly and Gene Annotation
About Ipomoea triloba
Three-lobe morning glory (Ipomoea triloba) is known by several common names, including littlebell and potato vine. It is a diploid (2n = 2x = 30) wild relative of hexaploid sweetpotato and belongs to the Convolvulaceae, a family of ~1600 species. It is native to the tropical Americas, where sweetpotato was domesticated 5000 years ago. The genome sequence of accession NCNSP0323 is the product of an international collaboration led by Michigan State University and Cornell University.
Assembly
A total of 144.1 Gb of high-quality, cleaned Illumina paired-end and mate-pair reads were generated, representing 291x genome coverage. Based on the 17-mer depth distribution of the reads, accession NCNSP0323 is highly homozygous, with an estimated genome size of 495.9 Mb. Long-read PacBio data (5x) were used for gap-filling and, in combination with de novo-assembled BioNano maps, to improve the assembly. The resulting assembly was 457.8 Mb with a scaffold N50 length of 6.9 Mb. To construct pseudomolecules, a high-density genetic map for the related species I. trifida was generated and used to anchor 216 scaffolds (443.3 Mb; 96.8% of the assembly). To further evaluate the assembly, RNA-Seq reads from a set of developmental tissues were aligned; overall, 91.4% of the reads aligned. In addition, 94.6% of the core conserved plant genes were found complete (BUSCO).
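The coverage and anchoring figures above follow from simple ratios. The short sketch below recomputes them from the numbers quoted in the paragraph; the constant and function names are illustrative only and are not part of any pipeline used by the authors.

```python
# Figures quoted in the Assembly paragraph (names are illustrative).
READ_BASES_GB = 144.1   # cleaned Illumina reads, in Gb
EST_GENOME_MB = 495.9   # 17-mer based genome size estimate, in Mb
ASSEMBLY_MB = 457.8     # total assembly length, in Mb
ANCHORED_MB = 443.3     # scaffolds anchored to the genetic map, in Mb


def fold_coverage(read_bases_gb: float, genome_mb: float) -> float:
    """Sequencing depth: total read bases divided by genome size."""
    return (read_bases_gb * 1000.0) / genome_mb


def percent_anchored(anchored_mb: float, assembly_mb: float) -> float:
    """Share of the assembly placed on pseudomolecules, in percent."""
    return 100.0 * anchored_mb / assembly_mb


coverage = fold_coverage(READ_BASES_GB, EST_GENOME_MB)
anchored = percent_anchored(ANCHORED_MB, ASSEMBLY_MB)

print(f"coverage: {coverage:.0f}x")   # rounds to the 291x quoted above
print(f"anchored: {anchored:.1f}%")   # rounds to the 96.8% quoted above
```

Note that the coverage is computed against the estimated genome size (495.9 Mb), not the assembled length.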
Annotation
AUGUSTUS (v3.1) was trained on the soft-masked assemblies using leaf RNA-Seq alignments. Gene models were then predicted using the hard-masked assemblies and refined with PASA2 (v2.0.2) using genome-guided transcript assemblies as evidence. A high-confidence gene model set was constructed from the working gene model set by removing partial gene models and gene models with an internal stop codon, a hit to a transposable element, or an FPKM of 0 across the RNA-Seq libraries.
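The high-confidence filter described above can be sketched as a simple predicate over gene models. This is a minimal illustration, not the authors' code: the `GeneModel` class, its field names, and the gene identifiers are hypothetical, and "an FPKM of 0 across the RNA-Seq libraries" is assumed here to mean zero expression in every library.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GeneModel:
    """Hypothetical record for one working-set gene model."""
    gene_id: str
    is_partial: bool
    has_internal_stop: bool
    hits_transposable_element: bool
    fpkm_by_library: List[float]


def is_high_confidence(model: GeneModel) -> bool:
    """Apply the four removal criteria from the annotation description."""
    if model.is_partial:
        return False
    if model.has_internal_stop:
        return False
    if model.hits_transposable_element:
        return False
    # Assumption: a model is removed only if FPKM is 0 in all libraries.
    # A model with no expression values at all is also removed.
    return any(value > 0 for value in model.fpkm_by_library)


def high_confidence_set(models: List[GeneModel]) -> List[GeneModel]:
    """Keep only the models that pass every criterion."""
    return [m for m in models if is_high_confidence(m)]


# Made-up example records, one per removal criterion plus one that passes.
working_set = [
    GeneModel("geneA", False, False, False, [0.0, 3.2, 1.1]),
    GeneModel("geneB", True, False, False, [5.0, 4.0, 2.0]),
    GeneModel("geneC", False, True, False, [1.0, 0.0, 0.0]),
    GeneModel("geneD", False, False, True, [9.9, 8.8, 7.7]),
    GeneModel("geneE", False, False, False, [0.0, 0.0, 0.0]),
]

kept_ids = [m.gene_id for m in high_confidence_set(working_set)]
print(kept_ids)
```

In this example only `geneA` passes: it is complete, has no internal stop codon or transposable element hit, and is expressed in at least one library.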
NOTE: Genes in Chr16 actually correspond to gene models in the concatenated unanchored scaffolds.
Repeats were annotated with the Ensembl Genomes repeat feature pipeline. There are:
- 902670 Low complexity (Dust) features, covering 57 Mb (12.3% of the genome)
- 191219 RepeatMasker features (with the REdat library), covering 48 Mb (10.3% of the genome)
- 24460 RepeatMasker features (with the RepBase library), covering 3 Mb (0.6% of the genome)
- 441020 Tandem repeats (TRF) features, covering 54 Mb (11.7% of the genome)
- Repeat Detector repeats, with a total length of 178 Mb (38.7% of the genome)
References
- Genome sequences of two diploid wild relatives of cultivated sweetpotato reveal targets for genetic improvement. Wu S, Lau KH, Cao Q et al. 2018. Nature Communications. 9:4580.
Picture credit: J.M.Garg CC BY 3.0
Links
- http://sweetpotato.plantbiology.msu.edu/gt4sp_download.shtml
More information
General information about this species can be found in Wikipedia.
Statistics
Summary
Assembly | ASM357664v1, INSDC Assembly GCA_003576645.1 |
Database version | 113.1 |
Golden Path Length | 461,827,428 bp |
Genebuild by | GT4SP |
Genebuild method | Import |
Data source | Cornell University |
Gene counts
Coding genes | 31,418 |
Gene transcripts | 47,083 |