Arabidopsis halleri (Ahal2.2)

Arabidopsis halleri Assembly and Gene Annotation

About Arabidopsis halleri

The self-incompatible species Arabidopsis halleri is a close relative of the self-compatible model plant Arabidopsis thaliana. Its broad distribution across Europe and Asia and its ability to hyperaccumulate heavy metals make A. halleri a useful model for ecological genomics studies. The sequenced individual, A. halleri ssp. gemmifera Tada mine genotype (W302), was collected in Japan from a site heavily contaminated with heavy metals.

Genomic DNA was extracted from plants obtained after five rounds of forced selfing, which reduced heterozygosity to 0.04%. The genome was assembled with ALLPATHS-LG R50599 from three paired-end short-insert libraries (200 bp, 500 bp, and 800 bp; see Akama et al. 2014) and six long-insert mate-pair libraries (ranging from 3 kb to 38 kb). The assembly was subsequently improved based on synteny with the Arabidopsis lyrata assembly.

Gene annotation was performed with AUGUSTUS v3.0.3 using RNA-seq data from leaves and roots (see Paape et al. 2016). Human-readable functional descriptions were generated with AHRD. Of the 32,553 genes, 21,433 were reciprocal best BLAST hits with Arabidopsis thaliana TAIR10 genes.
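
For readers unfamiliar with the term, the following is a minimal sketch of how reciprocal best hits can be derived from two sets of BLAST results (A. halleri against A. thaliana, and the reverse). It is not the pipeline used for this annotation, whose exact parameters are not given here. The gene identifiers, the bit scores, and the function names `best_hits` and `reciprocal_best_hits` are hypothetical, and each hit is assumed to be a (query, subject, bit score) tuple.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, bitscore) tuples.
    On a tie, the first hit encountered is kept.
    """
    best = {}
    for query, subject, bitscore in hits:
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward, reverse):
    """Return sorted (a, b) pairs where b is a's best hit and a is b's best hit."""
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    return sorted((a, b) for a, b in fwd.items() if rev.get(b) == a)


# Toy data with made-up identifiers and scores (illustration only).
forward = [
    ("Ah1", "At1", 200.0),
    ("Ah1", "At2", 150.0),
    ("Ah2", "At2", 90.0),
    ("Ah3", "At2", 120.0),
]
reverse = [
    ("At1", "Ah1", 198.0),
    ("At2", "Ah3", 118.0),
    ("At2", "Ah2", 80.0),
]

print(reciprocal_best_hits(forward, reverse))
# [('Ah1', 'At1'), ('Ah3', 'At2')]
```

In the toy data, Ah2 has At2 as its best hit, but At2's best hit is Ah3, so Ah2 is not part of a reciprocal pair.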

Repeated sequences were called with the Repeat Detector, which is part of the Ensembl Genomes repeat feature pipelines. Repeats length: 61,005,460 bp. Repeats content: 31.1%.
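
The reported repeat content can be checked against the Golden Path Length listed in the assembly summary on this page. The short sketch below assumes both figures are in base pairs and that the percentage is simply the ratio of the two; the variable names are illustrative.

```python
# Figures taken from this page; both are assumed to be in base pairs.
repeat_length_bp = 61_005_460
golden_path_length_bp = 196_243_198

repeat_fraction = repeat_length_bp / golden_path_length_bp
print(f"Repeat content: {repeat_fraction:.1%}")
# Repeat content: 31.1%
```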


References

  1. Conserved but Attenuated Parental Gene Expression in Allopolyploids: Constitutive Zinc Hyperaccumulation in the Allotetraploid Arabidopsis kamchatica.
    Paape T, Hatakeyama M, Shimizu-Inatsugi R, Cereghetti T, Onda Y, Kenta T, Sese J, Shimizu KK. 2016. Mol Biol Evol. 33(11):2781-2800.
  2. Genome-wide quantification of homeolog expression ratio revealed nonstochastic gene regulation in synthetic allopolyploid Arabidopsis.
    Akama S, Shimizu-Inatsugi R, Shimizu KK, Sese J. 2014. Nucleic Acids Res. 42(6):e46.
  3. Genome assembly and annotation of Arabidopsis halleri, a model for heavy metal hyperaccumulation and evolutionary ecology.
    Briskine RV, Paape T, Shimizu-Inatsugi R, Nishiyama T, Akama S, Sese J, Shimizu KK. 2017. Mol Ecol Resour. 17(5):1025-1036.

Picture credit: Arabidopsis halleri subsp. gemmifera, Kinki district, Japan, picture taken by Kentaro K. Shimizu

More information

General information about this species can be found in Wikipedia.



Assembly: Ahal2.2, INSDC Assembly GCA_900078215.1, Oct 2016
Database version: 112.1
Golden Path Length: 196,243,198
Genebuild by: UZH
Genebuild method: Import
Data source: University of Zurich

Gene counts

Coding genes: 32,553
Non coding genes: 670
Small non coding genes: 657
Long non coding genes: 13
Gene transcripts: 35,223
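
As a quick consistency check on the gene counts, the sketch below confirms that the small and long non-coding genes add up to the non-coding total, and derives an average number of transcripts per gene. It assumes the transcript count covers both coding and non-coding genes, which this page does not state explicitly; the variable names are illustrative.

```python
# Counts taken from the table on this page.
coding_genes = 32_553
non_coding_genes = 670
small_non_coding_genes = 657
long_non_coding_genes = 13
gene_transcripts = 35_223

# The two non-coding classes should account for the non-coding total.
non_coding_sum = small_non_coding_genes + long_non_coding_genes
print(non_coding_sum == non_coding_genes)
# True

# Assumption: the transcript count spans coding and non-coding genes.
total_genes = coding_genes + non_coding_genes
print(f"Transcripts per gene: {gene_transcripts / total_genes:.2f}")
# Transcripts per gene: 1.06
```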