Data Security

For TOPMed Imputation, data is transferred to a secure server hosted on Amazon Web Services. As of May 2023, we have completed a rigorous security review and received a federal Authorization to Operate (ATO) from NIH/NHLBI. A wide array of security measures are in force:

  • All traffic to and from the server is secured with HTTPS.
  • Input data is deleted from our servers as soon as it is no longer needed.
  • We only store the number of samples and markers analyzed. We don't ever "look" at your data in any way.
  • All results are encrypted with a strong one-time password. We do not retain this password: only you can read the results.
  • After imputation is finished, the user has 7 days to download the results, after which they are automatically deleted.
  • The complete source code is available via public GitHub repositories:
      • Imputation pipeline
      • Web application

Who has access?

To upload and download data, users must register with a unique e-mail address and strong password. Each user can only download imputation results for samples that they have themselves uploaded; no other imputation server users will be able to access your data.

What security or firewalls protect access?

A wide array of security measures are in force on the imputation servers:

  • All stored data is encrypted at rest using FIPS 140-2 validated cryptographic software as well as encrypted in transit.
  • Access controls follow the principle of least privilege. All administrative access is secured via two-factor authentication using role-based access controls and temporary credentials.
  • Network access is restricted and filtered via web application firewalls, network access control lists, and security groups. Public/private network segmentation also ensures that only the services that must be publicly reachable are exposed to the public internet. All internal traffic and requests are logged and scanned for malicious or unusual activity.
  • Advanced DDoS protection is in place to ensure consistent site availability.
  • All administrative user activities, system activities, and network traffic are logged and scanned for anomalies and malicious activity. Administrative users are alerted to any findings.

What encryption is used while the data are on the server?

Imputation results are encrypted with a one-time password generated by the system. The password consists of lowercase letters, uppercase letters, special characters, and digits, with at most 3 duplicates.
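
To make the password rule above concrete, here is a minimal Python sketch of a generator that follows the same constraints. It is an illustration only, not the server's actual implementation: the password length, the set of special characters, the function name generate_password, and the reading of "at most 3 duplicates" as "no character appears more than 3 times" are all assumptions made for this example.

```python
import secrets
import string
from collections import Counter

# Assumed values; the text above does not specify them.
ASSUMED_LENGTH = 16
ASSUMED_MAX_REPEATS = 3          # assumption: no character occurs more than 3 times
ASSUMED_SPECIALS = "!@#$%^&*()-_=+"

CHARACTER_CLASSES = [
    string.ascii_lowercase,
    string.ascii_uppercase,
    string.digits,
    ASSUMED_SPECIALS,
]


def generate_password(length=ASSUMED_LENGTH, max_repeats=ASSUMED_MAX_REPEATS):
    """Return a random password containing all four character classes."""
    alphabet = "".join(CHARACTER_CLASSES)
    if length < len(CHARACTER_CLASSES):
        raise ValueError("length is too short to include every character class")
    if max_repeats < 1 or length > max_repeats * len(alphabet):
        raise ValueError("repeat limit cannot be satisfied for this length")

    # Rejection sampling: draw candidates from a cryptographically secure
    # source until one satisfies every constraint.
    while True:
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        has_all_classes = all(
            any(ch in cls for ch in candidate) for cls in CHARACTER_CLASSES
        )
        within_repeat_limit = max(Counter(candidate).values()) <= max_repeats
        if has_all_classes and within_repeat_limit:
            return candidate


if __name__ == "__main__":
    print(generate_password())
```

The sketch draws characters with the secrets module rather than random, because passwords should come from a cryptographically secure source. Rejection sampling keeps the logic simple and does not bias which positions the required character classes occupy.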