Ipomoea triloba Assembly and Gene Annotation

About Ipomoea triloba

Three-lobe morning glory (Ipomoea triloba) is known by several common names, including littlebell and potato vine. It is a diploid (2n = 2x = 30) wild relative of hexaploid sweetpotato and belongs to the Convolvulaceae, a family of ~1600 species. It is native to the tropical Americas, where sweetpotato was domesticated 5000 years ago. The genome sequence of accession NCNSP0323 is the product of an international collaboration led by Michigan State University and Cornell University.

Assembly

A total of 144.1 Gb of high-quality, cleaned Illumina paired-end and mate-pair reads was generated, representing 291x genome coverage. Based on the 17-mer depth distribution of the reads, accession NCNSP0323 is highly homozygous, with an estimated genome size of 495.9 Mb. Long-read PacBio data (5x) were used for gap-filling and, in combination with de novo-assembled BioNano maps, to improve the assembly. The resulting assembly was 457.8 Mb with a scaffold N50 length of 6.9 Mb. To construct pseudomolecules, a high-density genetic map for the related species I. trifida was generated and used to anchor 216 scaffolds (443.3 Mb; 96.8% of the assembly). To further evaluate the assembly, RNA-Seq reads from a set of developmental tissues were aligned to it; overall, 91.4% of the reads aligned. In addition, 94.6% of the core conserved plant genes were found complete (BUSCO).
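The coverage and anchoring figures above follow from simple ratios, and the scaffold N50 is a length-weighted median. The Python sketch below recomputes the two ratios from the numbers quoted in this section and shows one common way to compute an N50. The function names are illustrative, and the scaffold lengths used with `n50` are toy values, not the real I. triloba scaffolds.

```python
# Figures quoted in the Assembly section.
READ_BASES = 144.1e9          # cleaned Illumina bases (144.1 Gb)
ESTIMATED_GENOME = 495.9e6    # 17-mer based genome size estimate (495.9 Mb)
ASSEMBLY_SIZE = 457.8e6       # total assembly length (457.8 Mb)
ANCHORED_BASES = 443.3e6      # bases in the 216 anchored scaffolds (443.3 Mb)


def coverage(read_bases, genome_size):
    """Sequencing depth: total read bases divided by genome size."""
    return read_bases / genome_size


def percent(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


def n50(lengths):
    """Return the length L such that sequences of length >= L
    contain at least half of the total bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


depth = coverage(READ_BASES, ESTIMATED_GENOME)
anchored_pct = percent(ANCHORED_BASES, ASSEMBLY_SIZE)

# Toy scaffold lengths (hypothetical), only to exercise n50().
toy_n50 = n50([10, 8, 6, 4, 2])
```

Here `round(depth)` gives 291 and `round(anchored_pct, 1)` gives 96.8, in line with the values reported above. For the toy lengths, the two longest scaffolds (10 and 8) already hold more than half of the 30 total bases, so `toy_n50` is 8.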

Annotation

AUGUSTUS (v3.1) was trained on the soft-masked assemblies using leaf RNA-Seq alignments. Gene models were then predicted using the hard-masked assemblies and refined with PASA2 (v2.0.2) using genome-guided transcript assemblies as evidence. A high-confidence gene model set was constructed from the working gene model set by removing partial gene models and gene models with an internal stop codon, a hit to a transposable element, or an FPKM of 0 across the RNA-Seq libraries.

NOTE: Genes in Chr16 actually correspond to gene models in the concatenated unanchored scaffolds.
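The high-confidence filtering rule described above can be sketched as a simple predicate over gene models. This is not the pipeline's actual code: the `GeneModel` fields and gene IDs are hypothetical, and "an FPKM of 0 across the RNA-Seq libraries" is read here as zero expression in every library, which is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GeneModel:
    # Hypothetical record; field names are illustrative only.
    gene_id: str
    is_partial: bool
    has_internal_stop: bool
    hits_transposable_element: bool
    fpkm_by_library: List[float]


def is_high_confidence(model: GeneModel) -> bool:
    """Keep a model only if none of the four removal criteria applies."""
    # Assumption: a model is "expressed" if FPKM > 0 in at least one library.
    expressed = any(value > 0 for value in model.fpkm_by_library)
    return not (
        model.is_partial
        or model.has_internal_stop
        or model.hits_transposable_element
        or not expressed
    )


def high_confidence_set(models: List[GeneModel]) -> List[GeneModel]:
    """Filter a working gene model set down to the high-confidence set."""
    return [m for m in models if is_high_confidence(m)]


# Hypothetical working set with one model failing each criterion.
working_set = [
    GeneModel("gene_ok", False, False, False, [0.0, 3.2]),
    GeneModel("gene_partial", True, False, False, [1.0]),
    GeneModel("gene_stop", False, True, False, [1.0]),
    GeneModel("gene_te", False, False, True, [1.0]),
    GeneModel("gene_silent", False, False, False, [0.0, 0.0]),
]
kept_ids = [m.gene_id for m in high_confidence_set(working_set)]
```

Only `gene_ok` survives in this toy set. A model with no expression values at all is also treated as unexpressed and removed under this reading.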

Repeats were annotated with the Ensembl Genomes repeat feature pipeline. There are: 902670 Low complexity (Dust) features, covering 57 Mb (12.3% of the genome); 191219 RepeatMasker features (with the REdat library), covering 48 Mb (10.3% of the genome); 24460 RepeatMasker features (with the RepBase library), covering 3 Mb (0.6% of the genome); 441020 Tandem repeats (TRF) features, covering 54 Mb (11.7% of the genome).
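As a rough consistency check, the percentages above can be recomputed from the covered lengths and the genome length of 461,827,428 bp given in the Statistics summary. The sketch below assumes that this is the denominator the pipeline used. Because the covered lengths are quoted only to the nearest Mb, the recomputed values agree only approximately with the quoted ones.

```python
GENOME_BP = 461_827_428  # "Base Pairs" value from the Statistics summary

# Repeat class: (feature count, covered bases rounded to Mb, quoted percent)
REPEAT_CLASSES = {
    "Low complexity (Dust)": (902_670, 57e6, 12.3),
    "RepeatMasker (REdat)": (191_219, 48e6, 10.3),
    "RepeatMasker (RepBase)": (24_460, 3e6, 0.6),
    "Tandem repeats (TRF)": (441_020, 54e6, 11.7),
}


def percent_of_genome(covered_bp, genome_bp=GENOME_BP):
    """Share of the genome covered by a feature class, in percent."""
    return 100.0 * covered_bp / genome_bp


# Difference between the recomputed and the quoted percentage per class.
deviations = {
    name: abs(percent_of_genome(covered) - quoted)
    for name, (_count, covered, quoted) in REPEAT_CLASSES.items()
}
```

The four classes can overlap one another on the genome, so their percentages should not be summed to obtain a total repeat content.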

References

  1. Genome sequences of two diploid wild relatives of cultivated sweetpotato reveal targets for genetic improvement.
    Wu S, Lau KH, Cao Q et al. 2018. Nature Communications. 9:4580.

Picture credit: J.M.Garg CC BY 3.0

More information

General information about this species can be found in Wikipedia.

Statistics

Summary

Assembly: ASM357664v1, INSDC Assembly GCA_003576645.1
Database version: 99.1
Base Pairs: 461,827,428
Golden Path Length: 461,827,428
Genebuild by: GT4SP
Genebuild method: Import
Data source: Cornell University

Gene counts

Coding genes: 31,418
Gene transcripts: 47,083
