Echinochloa crus-galli Assembly and Gene Annotation

About Echinochloa crus-galli

Barnyardgrass (Echinochloa crus-galli) is a pernicious weed in agricultural fields worldwide.

Assembly

The E. crus-galli line STB08, collected from rice paddy fields in the lower Yangtze River region of China, closely resembles cultivated rice in morphology and has a chromosome number of 2n = 6x = 54. A total of 207.4 Gb of sequence data was generated on the Illumina HiSeq 2000 system from STB08 genomic DNA libraries with fragment sizes ranging from 160 bp to 20 Kb. In addition, the PacBio RS II system was used to generate 32.9 Gb of third-generation long reads. Together, the two data sets represent ~171× coverage of the E. crus-galli genome, which is estimated to be ~1.4 Gb in size based on k-mer analysis and flow cytometry. De novo assembly yielded a draft genome of 1.27 Gb, representing 90.7% of the E. crus-galli genome (> 1 Kb), with a scaffold N50 length of 1.8 Mb. Five fosmid clones (> 15 Kb) were sequenced and compared with the assembly, and were found to be highly consistent with it. About 92.3% of the core eukaryotic genes (CEGs) could be completely aligned with the E. crus-galli gene set. BUSCO was also used to assess the E. crus-galli assembly: the ‘complete’ percentage is 95.5%, which is comparable to that of the S. bicolor (96.4%) and S. italica (94.3%) genomes.
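The coverage and assembled-fraction figures quoted above follow from simple arithmetic on the reported data volumes, and scaffold N50 has a short standard definition. The following is a minimal sketch for illustration only: the function names and the toy scaffold lengths are assumptions, not part of the original analysis, and only the four input figures are taken from the text.

```python
def fold_coverage(total_bases_gb, genome_size_gb):
    """Sequencing depth: total sequenced bases divided by genome size."""
    return total_bases_gb / genome_size_gb


def n50(lengths):
    """Length L such that scaffolds of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Figures reported in the text above.
illumina_gb = 207.4   # Illumina HiSeq 2000 data
pacbio_gb = 32.9      # PacBio RS II long reads
genome_gb = 1.4       # estimated genome size
assembly_gb = 1.27    # draft assembly size

coverage = fold_coverage(illumina_gb + pacbio_gb, genome_gb)
assembled_fraction = assembly_gb / genome_gb

print(f"coverage: {coverage:.1f}x")                      # coverage: 171.6x
print(f"assembled fraction: {assembled_fraction:.1%}")   # assembled fraction: 90.7%

# Toy scaffold lengths (hypothetical, not from the assembly) to show N50.
print(n50([8, 5, 4, 3, 2]))  # 5
```

The computed depth of about 171.6× agrees with the ~171× stated in the text, and 1.27 Gb out of 1.4 Gb gives the quoted 90.7%.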

Annotation

For gene annotation, transcriptomic data from the whole plant were generated by RNA-Seq. By integrating gene-finding results from ab initio, homology-based and transcript-based approaches, 108,771 protein-coding genes were predicted in the E. crus-galli genome. Of these 108,771 genes, 85% were supported by either homologues identified in other species or RNA-Seq data. In addition to protein-coding genes, 785 microRNAs (miRNAs) and other non-coding RNAs were identified in the E. crus-galli genome.

Echinochloa crus-galli genome analysis provides insight into its adaptation and invasiveness as a weed.
Guo L, Qiu J, Ye C, Jin G, Mao L, Zhang H, Yang X, Peng Q, Wang Y, Jia L, Lin Z, Li G, Fu F, Liu C, Chen L, Shen E, Wang W, Chu Q, Wu D, Wu S, Xia C, Zhang Y, Zhou X, Wang L, Wu L, Song W, Wang Y, Shu Q, Aoki D, Yumoto E, Yokota T, Miyamoto K, Okada K, Kim DS, Cai D, Zhang C, Lou Y, Qian Q, Yamaguchi H, Yamane H, Kong CH, Timko MP, Bai L, Fan L. Nat Commun 8(1).

Picture credit: Wikipedia

Statistics

Summary

Assembly	ec_v3, INSDC Assembly GCA_020466025.1, Oct 2021
Database version	115.1
Golden Path Length	1,340,710,827
Genebuild method	External annotation import
Data source	Zhejiang University

Gene counts

Coding genes	103,850
Gene transcripts	103,850

Coding genes: genes or transcripts that contain an open reading frame (ORF).
Gene transcripts: a transcript is the operational unit of a gene. In a genomic context, transcripts consist of one or more exons, with adjoining exons separated by introns. The exons and introns are transcribed, and the introns are then spliced out. Transcripts may or may not encode a protein.
