DAVIS, Calif. — Online services that allow users to upload their genetic information, learn about their genealogy, and identify long lost family members have become increasingly popular in recent years, and for good reason. Who wouldn’t want to learn more about where they came from? As more and more people continue to share their genetic information with these public databases, they may be opening themselves up to a form of data theft they probably didn’t even know was possible.

These services may be vulnerable to a few different variations of “genetic hacking,” according to a new study conducted at the University of California, Davis. By uploading certain DNA sequences, the research team says, hackers may be able to collect the genomes of many people in a database or identify individuals who carry specific genetic variants, such as those linked to Alzheimer’s disease.

“People are giving up more information than they think they are,” comments professor Graham Coop in a release. Coop adds that a genome isn’t like a stolen credit card: you can’t just cancel it and order a new one.

To be clear, researchers say these potential vulnerabilities do not apply to for-profit DNA sequencing companies, in which users must submit a sample of their own DNA in order to be granted access to the service’s database. Public databases, though, allow anyone to upload any DNA sequences and search for other users with matching genes.

These public DNA databases use software that compares the DNA sequences uploaded by users with sequences already stored in the database. Every person’s genome is inherited from their ancestors, both relatively recent and from generations ago. The bigger pieces of a genome usually come from more recent family members, and as generations go by, matching sequences get cut down into smaller pieces. So, if a user finds another DNA sequence in one of these databases that shares large chunks with their own, the two individuals likely share a recent ancestor.
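As a rough illustration of that matching step, the sketch below scans two toy sequences for long runs of identical markers. The function name `shared_segments`, the 0/1 encoding of alleles, and the `min_len` threshold are all assumptions made for this example; real services work on far larger data and use their own matching rules.

```python
def shared_segments(a, b, min_len):
    """Return half-open (start, end) ranges where a and b agree
    for at least min_len consecutive markers."""
    segments = []
    start = None  # start index of the current run of agreement, if any
    n = min(len(a), len(b))
    for i in range(n):
        if a[i] == b[i]:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                segments.append((start, i))
            start = None
    # Close a run that reaches the end of the sequences.
    if start is not None and n - start >= min_len:
        segments.append((start, n))
    return segments


# Toy data: the two sequences differ only at index 5.
person_a = "0101101001"
person_b = "0101111001"
long_matches = shared_segments(person_a, person_b, min_len=4)
```

In this toy model, a higher `min_len` stands in for a service reporting only longer shared chunks, which point to more recent common ancestors.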

The research team identified three strategies malicious individuals could use to obtain far more information from a public DNA database than the identities of a few long-lost relatives. The three approaches are IBS (identical by state) tiling, IBS probing and IBS baiting.

IBS Tiling: A hacker would upload numerous genomes easily found in research databases, and look to see which ones match up with other genomes within the public database. If enough matching tiles are found, a person’s genome could conceivably be pieced together.
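The tiling idea can be sketched on toy data. In the example below, each reported match tells the uploader that the hidden sequence carries the same markers as the upload over that stretch, so the stretches can be stitched together. The names `matching_runs` and `tile`, the equal-length 0/1 strings, and the `min_len` threshold are assumptions for illustration only, not the study's method or any service's interface.

```python
def matching_runs(query, target, min_len):
    """Half-open (start, end) ranges where query and target agree
    for at least min_len consecutive markers."""
    runs, start = [], None
    n = min(len(query), len(target))
    for i in range(n + 1):
        same = i < n and query[i] == target[i]
        if same and start is None:
            start = i
        elif not same and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


def tile(uploads, target, min_len):
    """Collect the target's markers at every position covered by a match."""
    recovered = {}
    for genome in uploads:
        for start, end in matching_runs(genome, target, min_len):
            for pos in range(start, end):
                # A reported match means the target carries this marker too.
                recovered[pos] = genome[pos]
    return recovered


target = "0110100111"                    # stored in the database, unseen by the uploader
uploads = ["0110111000", "1001100111"]   # stand-ins for publicly available genomes
recovered = tile(uploads, target, min_len=4)
coverage = len(recovered) / len(target)
rebuilt = "".join(recovered[i] for i in sorted(recovered))
```

Here two uploads happen to cover the whole toy sequence. With real genomes, coverage would depend on how many uploads match and how long a shared stretch must be before the service reports it.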

IBS Probing: This approach could be used to find people who carry a specific genetic variant. The study used a gene tied to Alzheimer’s as an example. The perpetrator would create a fake genome whose DNA sequence is unlikely to match anyone’s, except for one small section that matches the gene of interest. Any matches this falsified genome turns up in a public database would reveal people who carry that specific gene.
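A toy version of probing might look like the sketch below. It is a simplification under stated assumptions: the filler character `x` stands in for sequence that matches no one, and `build_probe`, `has_match`, the user names and the sample data are all hypothetical.

```python
def build_probe(length, region_start, region_alleles, filler="x"):
    """Fake sequence: filler everywhere except the region of interest."""
    probe = [filler] * length
    probe[region_start:region_start + len(region_alleles)] = list(region_alleles)
    return "".join(probe)


def has_match(query, genome, min_len):
    """True if query and genome agree over min_len consecutive markers."""
    run = 0
    for q, g in zip(query, genome):
        run = run + 1 if q == g else 0
        if run >= min_len:
            return True
    return False


# Hypothetical database of toy sequences.
database = {
    "user_a": "0110100111",
    "user_b": "0110011011",
    "user_c": "1010100100",
}

# Markers 3 to 6 play the role of the variant of interest.
probe = build_probe(length=10, region_start=3, region_alleles="0100")

# The filler never matches, so any hit must come from the region itself.
hits = sorted(name for name, genome in database.items()
              if has_match(probe, genome, min_len=4))
```

Because nothing outside the chosen region can match, every user returned in `hits` carries the probed markers.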

IBS Baiting: This strategy tricks a class of algorithms used to identify relatives in some public databases. The study’s authors estimate that with as few as 100 uploaded DNA sequences, a hacker could get his or her hands on essentially all of the genetic information stored in an entire database. The research team even tested this method on the GEDMatch database; using only DNA sequences they had uploaded themselves, they confirmed that IBS baiting can be used to find specific genetic variants within public databases.

All three of the strategies could conceivably be carried out by an individual with both computing and genetics knowledge, such as a graduate student.

“The good news is that it’s quite preventable,” comments postdoctoral researcher Michael “Doc” Edge.

Researchers lay out how direct-to-consumer genetics services can easily stop these attacks in their study, and say they’ve already shared their findings and suggestions with a number of leading services. However, they report receiving “varied” responses in return.

Anyone using these services should be aware of the potential risks involved and just how much information they may be making accessible, the study’s authors conclude.

The study is published in eLife.

About John Anderer

Born blue in the face, John has been writing professionally for over a decade and covering the latest scientific research for StudyFinds since 2019. His work has been featured by Business Insider, Eat This Not That!, MSN, Ladders, and Yahoo!

Studies and abstracts can be confusing and awkwardly worded. He prides himself on making such content easy to read, understand, and apply to one’s everyday life.
