Getting started

To use the TOPMed Imputation Server, registration is required. We send an activation email to the address you provide. Please follow the instructions in the email to activate your account. If it doesn't arrive, ensure you have entered the correct email address and check your spam folder.

After the email address has been verified, the service can be used without any costs.

Please cite this paper if you use the Imputation Server in your GWAS:

Das S, Forer L, Schönherr S, Sidore C, Locke AE, Kwong A, Vrieze S, Chew EY, Levy S, McGue M, Schlessinger D, Stambolian D, Loh PR, Iacono WG, Swaroop A, Scott LJ, Cucca F, Kronenberg F, Boehnke M, Abecasis GR, Fuchsberger C. Next-generation genotype imputation service and methods. Nature Genetics 48, 1284–1287 (2016).

Set up your first imputation job

Please log in with your credentials and click on the Run tab to start a new imputation job. The submission dialog allows you to specify the properties of your imputation job.

The following options are available:

Reference Panels

TOPMed Imputation Server

The TOPMed Imputation Server offers genotype imputation for the TOPMed reference panel, which is the largest and most accurate panel available amongst the two imputation servers.

  • TOPMed (Version r2 2020)

See the reference panels documentation for details.

Michigan Imputation Server

The Michigan Imputation Server is a separate service with several additional reference panels available. Consult the relevant documentation for details.

Upload VCF files from your computer

When using the file upload, data is uploaded from your local file system to the TOPMed Imputation Server. Clicking on Select Files opens a file dialog where you can select your VCF files.

Multiple files can be selected using the Ctrl, Cmd, or Shift keys, depending on your operating system. After you have confirmed your choice, all selected files are listed in the submission dialog.

Please make sure that all files fulfill the requirements.

Important

Since version 1.7.2, URL-based uploads (SFTP and HTTP) are no longer supported. Please use direct file uploads instead.

Build

Please select the build of your data. Currently, the options hg19 and hg38 are supported. The TOPMed Imputation Server automatically updates the genome positions of your data (liftOver). The TOPMed reference panel is based on hg38 coordinates.

rsq Filter

To minimize the file size, the Imputation Server includes an r2 filter option that excludes all imputed SNPs with an r2 value (imputation quality) smaller than the specified value.
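
To illustrate what such a filter does, here is a minimal Python sketch that keeps only VCF records whose imputation quality reaches a threshold. It assumes the quality is stored in an INFO key named R2, as Minimac4 output typically does; the function name, the decision to keep records without an R2 value, and the example lines are hypothetical. This is not the server's implementation.

```python
def filter_by_r2(vcf_lines, min_r2, key="R2"):
    """Yield header lines and records whose INFO R2 value is >= min_r2."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line  # keep header lines untouched
            continue
        info = line.rstrip("\n").split("\t")[7]
        # Parse INFO entries of the form KEY=VALUE; flags without '=' are skipped.
        fields = dict(item.split("=", 1) for item in info.split(";") if "=" in item)
        r2 = fields.get(key)
        # Assumption: records that carry no R2 value are kept.
        if r2 is None or float(r2) >= min_r2:
            yield line


# Hypothetical example records (sites only, no genotype columns).
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t.\tPASS\tAF=0.10;R2=0.95;IMPUTED\n",
    "chr1\t200\t.\tC\tT\t.\tPASS\tAF=0.01;R2=0.12;IMPUTED\n",
]
kept = list(filter_by_r2(example, min_r2=0.3))
```

With a threshold of 0.3, the record at position 200 is dropped and the two header lines plus the record at position 100 remain.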

Phasing

If your uploaded data is unphased, Eagle v2.4 will be used for phasing. If your uploaded VCF file already contains phased genotypes, please select the "No phasing" option.

Algorithm  | Description
---------- | -----------
Eagle v2.4 | The Eagle algorithm estimates haplotype phase using the HRC reference panel. This method is also suitable for single sample imputation. After phasing or imputation you will receive phased genotypes in your VCF files.
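
If you are unsure whether a VCF file is already phased, the genotype separators tell you: the VCF format uses "|" for phased and "/" for unphased calls. The following Python sketch checks this for an iterable of VCF lines. The function name and the example records are hypothetical, and haploid calls without a separator are treated as phased, which is an assumption.

```python
def is_phased(vcf_lines):
    """Return True if no genotype call uses the unphased separator '/'."""
    seen_genotype = False
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) < 10:
            continue  # no FORMAT/sample columns on this record
        fmt = columns[8].split(":")
        if "GT" not in fmt:
            continue
        gt_index = fmt.index("GT")
        for sample in columns[9:]:
            gt = sample.split(":")[gt_index]
            seen_genotype = True
            if "/" in gt:
                return False  # at least one unphased call
    # A file without any genotype is not reported as phased.
    return seen_genotype


header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n"
phased = [header, "chr1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0|1\t1|1\n"]
unphased = [header, "chr1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\t1|1\n"]
```

For a real file, pass an open text handle (for example from gzip.open(path, "rt") for a .vcf.gz file) instead of the example lists.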

Population

Please select whether to compare allele frequencies between your data and the reference panel.

If your samples come from a mix of different populations, please select Skip to skip the allele frequency check. For mixed populations, no QC report will be created.

Mode

Please select whether you want to run Quality Control & Imputation, Quality Control & Phasing Only, or Quality Control Only.

AES 256 encryption

All Imputation Server results are returned as an encrypted .zip file by default. If you select this option, we will use stronger AES 256 encryption instead of the default encryption method. However, note that AES encryption does not work with standard unzip programs. If this option is selected, we recommend using 7-zip to open your results.
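
As an illustration of the 7-zip route, the sketch below builds and runs a 7-Zip command line from Python. It assumes the 7z executable is installed and on your PATH; the function names and the archive name used in the usage note are hypothetical. Note that a password passed on the command line may be visible to other users of the same machine.

```python
import subprocess


def build_7z_command(archive, password, out_dir="."):
    """Build the 7-Zip command line that extracts an encrypted archive."""
    # 'x' extracts with full paths, -p supplies the password,
    # and -o sets the output directory (no space after the switch).
    return ["7z", "x", f"-p{password}", f"-o{out_dir}", archive]


def extract_results(archive, password, out_dir="."):
    """Run 7-Zip; raises CalledProcessError if the extraction fails."""
    subprocess.run(build_7z_command(archive, password, out_dir), check=True)
```

A call such as extract_results("chr_20.zip", password, "results") would then unpack the archive into the results folder, where password is the one you received by email.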

Start your imputation job

After you confirm our Terms of Service, the imputation process can be started by clicking on Start Imputation. Input Validation and Quality Control are executed immediately to give you feedback about the data format and its quality. If your data passes these steps, your job is added to our imputation queue and will be processed as soon as possible. You can check the position in the queue on the job summary page.

We notify you by email as soon as the job is finished or if your data does not pass the Quality Control steps.

Input Validation

In a first step, we check whether your uploaded files are valid and calculate some basic statistics such as the number of samples, chromosomes, and SNPs.

After Input Validation has finished, basic statistics can be viewed directly in the web interface.
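
If you want a rough idea of these numbers before uploading, a few lines of Python are enough. The sketch below counts samples, distinct chromosomes, and variant records in an iterable of VCF lines. It is an illustration only, not the server's validation code, and the function name and example data are hypothetical.

```python
def basic_vcf_stats(vcf_lines):
    """Count samples, distinct chromosomes, and variant records in a VCF."""
    samples = 0
    chromosomes = set()
    variants = 0
    for line in vcf_lines:
        if not line.strip() or line.startswith("##"):
            continue  # blank lines and meta-information lines
        columns = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):
            # Sample IDs follow the nine fixed columns (CHROM to FORMAT).
            samples = max(len(columns) - 9, 0)
            continue
        chromosomes.add(columns[0])
        variants += 1
    return {"samples": samples, "chromosomes": len(chromosomes), "variants": variants}


example_vcf = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n",
    "chr1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\t1/1\n",
    "chr1\t200\t.\tC\tT\t.\tPASS\t.\tGT\t0/0\t0/1\n",
    "chr2\t150\t.\tG\tA\t.\tPASS\t.\tGT\t0/1\t0/0\n",
]
stats = basic_vcf_stats(example_vcf)
```

For the example data, stats reports 2 samples, 2 chromosomes, and 3 variants.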

If you encounter problems with your input data, please read this tutorial about Data Preparation to ensure your data is in the correct format.

Quality Control

In this step, we check each variant and exclude it if any of the following applies:

  1. it contains invalid alleles
  2. it is a duplicate
  3. it is an indel
  4. it is a monomorphic site
  5. its alleles do not match between the reference panel and the uploaded data
  6. its SNP call rate is < 90%

All filtered variants are listed in a file called statistics.txt which can be downloaded by clicking on the provided link. More information about our QC pipeline can be found here.
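
Some of these checks can be approximated locally before uploading. The Python sketch below covers invalid alleles, indels, monomorphic sites, and the call rate for a single biallelic variant; duplicates and allele mismatches need the other records and the reference panel, so they are left out. The function name, the reason labels, and the definition of call rate (share of samples without a missing allele) are assumptions, not the server's QC pipeline.

```python
VALID_BASES = set("ACGT")


def qc_reasons(ref, alt, genotypes, min_call_rate=0.9):
    """Return the reasons why a biallelic variant would be excluded."""
    reasons = []
    if len(ref) != 1 or len(alt) != 1:
        reasons.append("indel")
    elif ref not in VALID_BASES or alt not in VALID_BASES:
        reasons.append("invalid alleles")
    # Split each genotype such as '0|1' or './.' into its alleles.
    alleles = [a for gt in genotypes for a in gt.replace("|", "/").split("/")]
    called = [a for a in alleles if a != "."]
    # Monomorphic: at most one distinct allele among the called ones.
    if len(set(called)) <= 1:
        reasons.append("monomorphic")
    # Call rate: fraction of samples with no missing allele (assumption).
    complete = sum(1 for gt in genotypes if "." not in gt)
    if genotypes and complete / len(genotypes) < min_call_rate:
        reasons.append("low call rate")
    return reasons
```

A variant that passes all four checks yields an empty list; otherwise the list names each check that failed.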

If you selected a population, we compare the allele frequencies of the uploaded data with those from the reference panel. The result of this check is available in the QC report and can be downloaded by clicking on qcreport.html.
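
The idea behind such a comparison can be sketched in Python: compute the alternate allele frequency from your genotypes and flag variants whose frequency differs strongly from the reference value. The function names and the max_diff threshold of 0.2 are hypothetical; the server's actual test and thresholds are not described here.

```python
def alt_allele_frequency(genotypes):
    """Return the frequency of allele '1' among all called alleles, or None."""
    alleles = [
        a
        for gt in genotypes
        for a in gt.replace("|", "/").split("/")
        if a != "."  # ignore missing alleles
    ]
    if not alleles:
        return None
    return alleles.count("1") / len(alleles)


def frequency_mismatch(study_af, reference_af, max_diff=0.2):
    """Flag a variant whose frequency differs from the reference by more than max_diff."""
    return abs(study_af - reference_af) > max_diff


study_af = alt_allele_frequency(["0|1", "1|1", "0|0", "./."])
```

In the example, three of the six called alleles are alternate, so study_af is 0.5; compared with a reference frequency of 0.1 the variant would be flagged.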

Pre-phasing and Imputation

Imputation is performed using Minimac4. The progress of all uploaded chromosomes is updated in real time and visualized with different colors.

Data Compression and Encryption

If imputation was successful, we compress and encrypt your data and send you a random password via e-mail.

This password is not stored on our server at any time. Therefore, if you lose the password, there is no way to resend it to you, and you will need to rerun the imputation job.

Download results

You are notified by email as soon as the imputation job has finished. A zip archive including the results can be downloaded directly from the server. To decrypt the results, a one-time password is generated by the server and included in the email. The QC report and filter statistics can be displayed and downloaded as well.

All data is deleted automatically after 7 days

Be sure to download all needed data within this time period. We send you a reminder 48 hours before we delete your data. Once your job has the state retired, we are not able to recover your data!

Download via a web browser

All results can be downloaded directly via your browser by clicking on the filename.

To download results via the command line using wget or aria2, click on the share symbol (located to the right of the file size) to get the required private links.

A new dialog appears that shows the private link. Click on the tab wget command to get a copy & paste ready command that can be used on Linux or macOS to download the file via the command line.

Download all results at once

To download all files in a folder (for example, the folder Imputation Results), click on the share symbol of the folder.

A new dialog appears that shows all private links at once. Click on the tab wget commands to get copy & paste ready commands that can be used on Linux or macOS to download all files.
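
If you prefer scripting the download over pasting wget commands, the private links can also be fetched with Python's standard library. The sketch below is a minimal example under the assumption that each private link ends in the file name; the function name is hypothetical, and the list of links has to be filled with the links shown in the dialog.

```python
import shutil
import urllib.request
from pathlib import Path


def download_all(links, target_dir="."):
    """Download each link into target_dir and return the saved paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in links:
        # Assumption: the last path segment of the link is the file name.
        destination = target / url.rstrip("/").rsplit("/", 1)[-1]
        with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
            shutil.copyfileobj(response, out)  # stream to disk in chunks
        saved.append(destination)
    return saved
```

The files are streamed to disk rather than read into memory, which matters because imputation results can be large. Remember that the downloaded archives still need to be decrypted with the password from the email.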