Researchers have developed a method to quickly and accurately identify people and cell lines from their DNA. The software could be used to flag mislabeled or contaminated cell lines in cancer experiments; such mix-ups are a major reason that studies are later invalidated.
The software is designed to run on the MinION, an instrument the size of a credit card that pulls in strands of DNA through its microscopic pores and reads out sequences of nucleotides, or the DNA letters A, T, C, G. The device has made it possible for researchers to study bacteria and viruses in the field, but its high error rate and large sequencing gaps have, until now, limited its use on human cells with their billions of nucleotides.
In a two-step process, the MinION and the abundance of human genetic data now online can be used to validate the identity of people and cells by their DNA with near-perfect accuracy. First, the MinION is used to sequence random strings of DNA, from which individual variants are selected. These variants are nucleotides that differ from person to person and, taken together, make each individual unique. Then, a Bayesian algorithm is used to randomly compare this mix of variants with the corresponding variants in other genetic profiles on file. With each cross-check, the algorithm updates the likelihood of finding a match, rapidly narrowing the search.
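The updating step described above can be illustrated with a short Python sketch. This is not the researchers' software; it is a minimal illustration of sequential Bayesian updating under simplifying assumptions that are not in the article: each variant has a single allele per person, every candidate profile starts with an equal prior probability, the sequencer misreads an allele with a fixed probability (`error_rate`), and the search stops once one profile's posterior probability passes `threshold`. The function names, parameter values, and toy profiles are all hypothetical.

```python
import math


def update_posterior(log_post, profiles, variant_id, observed_allele, error_rate):
    """Apply one Bayesian update: reweight each candidate profile by how
    well it explains a single observed variant, then renormalize."""
    updated = {}
    for name, log_p in log_post.items():
        reference_allele = profiles[name].get(variant_id)
        if reference_allele is None:
            # Assumption: a profile missing this variant is uninformative.
            likelihood = 0.5
        elif reference_allele == observed_allele:
            likelihood = 1.0 - error_rate
        else:
            likelihood = error_rate
        updated[name] = log_p + math.log(likelihood)

    # Normalize in log space so the probabilities sum to 1 without underflow.
    peak = max(updated.values())
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in updated.values()))
    return {name: v - log_norm for name, v in updated.items()}


def identify(profiles, observations, error_rate=0.1, threshold=0.999):
    """Cross-check observed variants one at a time against every profile.

    Returns (best_match, posterior_probability, variants_used).
    """
    # Assumption: uniform prior over all candidate profiles.
    log_post = {name: -math.log(len(profiles)) for name in profiles}
    used = 0
    for variant_id, observed_allele in observations:
        log_post = update_posterior(
            log_post, profiles, variant_id, observed_allele, error_rate
        )
        used += 1
        best = max(log_post, key=log_post.get)
        if math.exp(log_post[best]) >= threshold:
            break  # Confident enough: stop sequencing further variants.
    best = max(log_post, key=log_post.get)
    return best, math.exp(log_post[best]), used


if __name__ == "__main__":
    # Toy reference database: three hypothetical profiles, 20 variants each.
    variant_ids = [f"v{i}" for i in range(20)]
    profiles = {
        "sample_a": {v: "A" for v in variant_ids},
        "sample_b": {v: "B" for v in variant_ids},
        "sample_c": {v: ("A" if i % 2 == 0 else "B") for i, v in enumerate(variant_ids)},
    }
    # Noisy readout of sample_a: one allele is misread.
    observations = [(v, "A") for v in variant_ids]
    observations[2] = ("v2", "B")

    match, probability, used = identify(profiles, observations)
    print(f"best match: {match}, posterior: {probability:.5f}, variants used: {used}")
```

Working with log-probabilities keeps the arithmetic stable when thousands of profiles are compared, and stopping at a threshold mirrors the idea that the search narrows quickly after a modest number of cross-checks.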
Tests show the method can validate an individual’s identity within minutes, after cross-checking between 60 and 300 variants. In these tests, the MinION readout of the subject’s genome, gleaned from a sample of cheek cells, was matched with a reference profile stored among 31,000 other genomes in a public database.
The re-identification technique, called MinION sketching, parallels the brain’s ability to make out a bird from a few telling features in an abstract Picasso line drawing. The MinION’s genetic sketch of a cell sample is compared with a growing database of other sketches: similarly incomplete genetic profiles that are produced by at-home DNA-test kits like 23andMe and donated to science by consumers.
The use of misidentified or contaminated cell lines in medical research is blamed for as much as a third of the estimated $28 billion spent each year on studies that can’t be replicated. Lacking the expensive machinery needed to validate cell lines on their own, most researchers either skip validation or ship their cultures to specialized labs, a step that can delay important findings and treatments.
For more information, contact Kim Martineau at