Echinochloa crus-galli (ec_v3)

Echinochloa crus-galli Assembly and Gene Annotation

About Echinochloa crus-galli

Barnyardgrass (Echinochloa crus-galli) is a pernicious weed in agricultural fields worldwide.

Assembly

The E. crus-galli line STB08, collected from rice paddy fields in the lower Yangtze River region of China, closely resembles cultivated rice in morphology and has a chromosome number of 2n = 6x = 54. A total of 207.4 Gb of sequence data were generated on the Illumina HiSeq 2000 system from STB08 genomic DNA libraries with fragment sizes ranging from 160 bp to 20 Kb. In addition, the PacBio RS II system was used to generate 32.9 Gb of third-generation long reads. Together, these data represent ~171× coverage of the E. crus-galli genome, which is estimated to be ~1.4 Gb in size based on K-mer analysis and flow cytometry. De novo assembly yielded a draft genome of 1.27 Gb, representing 90.7% of the E. crus-galli genome (> 1 Kb), with a scaffold N50 length of 1.8 Mb. Five fosmid clones (> 15 Kb) were sequenced and compared with the assembly, which confirmed good consistency between them. About 92.3% of the core eukaryotic genes (CEGs) could be completely aligned with the E. crus-galli gene set. BUSCO was also used to assess the assembly: the 'complete' percentage was 95.5%, comparable to that of the S. bicolor (96.4%) and S. italica (94.3%) genomes.
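The coverage and completeness figures quoted above follow from simple arithmetic on the reported data volumes. The Python sketch below reproduces them and illustrates how a scaffold N50 is usually computed. The function names and the toy scaffold lengths are illustrative assumptions; they are not taken from the original assembly pipeline.

```python
def fold_coverage(total_bases_gb, genome_size_gb):
    """Sequencing depth: total sequenced bases divided by genome size."""
    return total_bases_gb / genome_size_gb


def n50(lengths):
    """Length L such that sequences of length >= L hold at least half the total."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequence lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Values reported in the text above (all in Gb).
illumina_gb = 207.4
pacbio_gb = 32.9
genome_gb = 1.4      # estimated genome size
assembly_gb = 1.27   # draft assembly size

coverage = fold_coverage(illumina_gb + pacbio_gb, genome_gb)
assembled_fraction = assembly_gb / genome_gb

# Hypothetical scaffold lengths, only to show the N50 calculation.
toy_n50 = n50([5, 4, 3, 2, 1])

print(f"coverage ~ {coverage:.1f}x")
print(f"assembled fraction ~ {assembled_fraction:.1%}")
print(f"toy N50 = {toy_n50}")
```

With the reported values, the combined Illumina and PacBio data give roughly 171.6× depth, and 1.27 Gb out of 1.4 Gb is about 90.7%, matching the figures in the text.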

Annotation

For gene annotation, transcriptomic data from the whole plant were generated by RNA-Seq. By integrating gene-finding results from ab initio, homology-based and transcript-based approaches, 108,771 protein-coding genes were predicted in the E. crus-galli genome. Of these 108,771 genes, 85% were supported by either the identification of homologues in other species or RNA-Seq data. In addition to protein-coding genes, 785 microRNAs (miRNAs) and other non-coding RNAs were also identified in the E. crus-galli genome.
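As a rough illustration of what the 85% support figure means in absolute terms, the short Python sketch below converts it to an approximate gene count. Because the percentage is rounded in the text, the result is only an estimate, and the variable names are illustrative.

```python
# Figures quoted in the annotation text above.
predicted_genes = 108_771
supported_fraction = 0.85  # rounded percentage, so the count is approximate

# Approximate number of genes with homology or RNA-Seq support.
supported_estimate = round(predicted_genes * supported_fraction)
unsupported_estimate = predicted_genes - supported_estimate

print(f"supported: about {supported_estimate:,} genes")
print(f"unsupported: about {unsupported_estimate:,} genes")
```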

Picture credit: Wikipedia

Statistics

Summary

Assembly: ec_v3, INSDC Assembly GCA_020466025.1, Oct 2021
Database version: 111.1
Golden Path Length: 1,340,710,827
Genebuild method: External annotation import
Data source: Zhejiang University

Gene counts

Coding genes: 103,850
Gene transcripts: 103,850