Overcoming privacy concerns in medical research databases

By Melissa Fassbender

Researchers have developed a new system that permits database queries for genome-wide association studies, but reduces the chances of privacy compromises to almost zero. (Illustration: Christine Daniloff/MIT)
A new system developed by MIT researchers helps ensure privacy in genomic research databases by “slightly perturbing” analysis results.

The research, conducted at MIT’s Computer Science and Artificial Intelligence Laboratory and Indiana University at Bloomington, was recently published in the journal Cell Systems.

According to the researchers, the new system reduces the chances of privacy compromises to almost zero - addressing one of the pivotal issues facing data sharing initiatives.

Sean Simmons, an MIT postdoc in mathematics and first author on the new paper, and Bonnie Berger, a mathematics professor at MIT and the corresponding author on the paper, told us the system is based on the ideas of differential privacy.

“The basic concept is that, by slightly perturbing analysis results, one is able to guarantee privacy for research participants,” the researchers said.

“Though these ideas had been applied to some genomic statistics, the existing technologies could not deal with the diverse ancestries present in many real world genomic data sets that are known to be critical to accurate genomic studies. Our goal was to develop methods that overcame this hurdle.”

According to the researchers, the most challenging part was determining how to overcome the effect of outliers.

“If one individual is very different from all the other individuals in a study, their inclusion can greatly affect the result, leading to privacy loss,” said Simmons and Berger. “We dealt with this by slightly modifying our definition of privacy to focus on protecting information about private disease status, a realistic goal as it is the data that is most sensitive.”

How does it work?

The system “perturbs the results” of a genomic analysis to ensure privacy, yet remains accurate enough to retain useful information.

“In particular, it allows users to determine if a particular genomic alteration is correlated with a disease of interest in a dataset, or to produce a list of locations in the genome that are highly associated with the disease,” the researchers said.

Unlike previous methods, the system can also overcome issues that cause false positives in genomic studies. Specifically, it corrects for population stratification – false positives due to different ancestries in a sample.
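
As a rough illustration of what perturbing a result can mean, the sketch below applies the Laplace mechanism, a standard differential privacy technique, to a simple count. It is not the method from the Cell Systems paper, which is not described in detail here; the function names, the epsilon value, and the example count are all assumptions made up for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponential draws, each with mean
    # `scale`, follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, rng, sensitivity=1.0):
    # Adding or removing one participant changes a count by at most 1
    # (the sensitivity), so Laplace noise with scale sensitivity / epsilon
    # gives epsilon-differential privacy for this single query.
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical query: how many participants with the disease carry a variant?
    print(private_count(130, epsilon=0.5, rng=rng))
```

In this sketch, a smaller epsilon adds more noise, which strengthens the privacy guarantee at the cost of accuracy. A real genome-wide analysis would also have to account for many queries and for the ancestry effects described above, which this example does not attempt.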

While the research addresses privacy issues in genomic databases, the researchers said the ideas of differential privacy can be applied to almost any area where private human data is collected.

“One reason that data is not shared is due to concern over the privacy of individuals in the study,” explained Simmons and Berger. “Our approach helps overcome that particular roadblock.”
