Improving AI-Enabled Medical Devices with Data Consortiums and Encryption Technology
By: Gabrielle Hirneise
July 14, 2021
Categories: AAMI News, Medical Device Manufacturers
Artificial intelligence (AI) offers a promising means of addressing the physician shortage and increasing diagnostic and prognostic success in the healthcare industry. With research and healthcare institutions sitting on troves of data, how might the technology be used to its fullest extent? The success of the algorithms in AI-enabled medical devices depends on several factors, beginning with the data used to train them.
An AI algorithm’s accuracy depends on the quantity and quality of the training data available. In the clinical setting, AI software is trained on preexisting patient data. However, because of privacy regulations, patient data are not easily transferable, meaning that most healthcare institutions are limited to the data they have on hand (i.e., “first-party data”). As a result, AI software applications will likely be tailored to a specific population and may not succeed in diagnosing or evaluating individuals with different physiological characteristics.
“As we look at generally how AI is being done, there are significant issues in healthcare as it relates to AI algorithm development, one being that people are using mostly first-party data, even large healthcare institutions,” said Riddhiman Das, cofounder and CEO of TripleBlind.
In referencing a specific partnership with a large healthcare institution that has accumulated more than 136 years of patient data, Das said, “Those algorithms are really accurate on the populations they see, which means an older, whiter, sicker, and wealthier population than the average population. When I tried it—and I am not the profile of the patients that have historically gone to these institutions—it was not very accurate on me because I have different genetics and biological features that were not represented in that training set. For an AI algorithm to be optimally accurate and precise and be able to generalize to real-world data, first-party data are just insufficient.”
Das witnessed this inherent bias firsthand when developing his first company, which used AI software and front-facing phone cameras to identify the unique blood vessel patterns in the whites of the eye. These blood vessel patterns serve as a thumbprint-like form of identification. However, because blood vessel patterns differ across populations, Das had to devise a way to acquire a more diverse training dataset while also abiding by the stringent privacy regulations of other countries.
“We actually ended up having to ship engineers to those countries literally with a laptop, so that they could work in those countries under strict confidentiality and privacy regimes,” Das said.
This challenge brought about the idea for TripleBlind, which uses novel encryption technology to enable the sharing of third-party data for training AI algorithms in the world of healthcare and finance.
“TripleBlind enables access to third-party data while still enforcing privacy in a way that the data cannot be abused,” Das said. “The way we do this is basically encrypting the data so that only the authorized operation may be performed on them, and by virtue of the encryption, the data are deidentified.”
Because the data cannot be reverse engineered to determine private patient information, the encryption technology “renders the data outside the scope of GDPR or outside the scope of HIPAA.”
However, encrypting the data alone does not suffice.
“You have invested a lot of money and resources and time and effort into building this algorithm. If I license it to you, you will look under the hood and see how it’s built,” Das added.
As a result, TripleBlind technology encrypts the AI algorithm itself, in addition to encrypting the data used to train it. Built into the TripleBlind encryption technology is the blind data utilization toolbox, which allows algorithm developers to use third-party data without ever directly accessing it.
“And the nuance there is: neither does TripleBlind. We are not a custodian of data—we are just providing you with the tools to be able to source data, license algorithms, and build new AI algorithms,” Das said.
“You are using data while being blind to it (you know the size, shape, and distribution of the data but are not able to see it), and on the other side, TripleBlind is not able to see it. Lastly, we are also keeping the algorithm safe, so that it can be licensed without any fear of leaking data or IP,” Das added.
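The article does not disclose TripleBlind’s actual protocol, but the general idea of “using data while being blind to it” can be illustrated with additive secret sharing, a standard building block of secure multiparty computation. In this hedged sketch (the field modulus, party names, and patient values are purely illustrative), each record is split into random shares so that no single party can see a raw value, yet an aggregate statistic can still be computed:

```python
import random

PRIME = 2**61 - 1  # field modulus; arithmetic on shares is done mod PRIME

def share(value, n_parties=2):
    """Split an integer into additive shares; no single share reveals it."""
    shares = [random.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    """Recombine shares to recover the original value."""
    return sum(shares) % PRIME

# A hospital's patient values (illustrative lab results) stay hidden:
values = [120, 135, 110, 128]

# Each value is split; party A and party B each hold one share per record.
party_a, party_b = zip(*(share(v) for v in values))

# Each party sums only its own shares -- neither ever sees a raw value.
sum_a = sum(party_a) % PRIME
sum_b = sum(party_b) % PRIME

# Only the aggregate is reconstructed, never any individual record.
total = (sum_a + sum_b) % PRIME
print(total)  # 493, the sum of all values, computed blind to each record
```

Real deployments combine primitives like this with encrypted model evaluation so that, as Das describes, neither the data holder, the algorithm owner, nor the intermediary sees the other side’s raw assets.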
With this new capability for sharing, there is potential for consortiums of open-source data, which would allow any healthcare organization around the world to use a vast, diverse set of data to train its algorithms.
“We provide the infrastructure for healthcare organizations to be able to work with any number of health algorithm developers without ever compromising the privacy of their patients,” Das said.
Without such tools, sharing data is typically a costly process that degrades the value of the training data.
“Source institutions have to first anonymize the data, then have an expert certify that it was anonymized, and then set up data-sharing agreements and business associate agreements,” Das said. “Anonymization is about removing your age, sex, and other clinically relevant variables, and if there is anything like genetic data or electrocardiograms or biomarkers, it’s impossible to deidentify because if I anonymize my genetics, I am no longer me.”
Fortunately, TripleBlind can deidentify data without compromising the integrity or accuracy of those data.
“The big potential in AI is to be able to take the knowledge and wisdom of the expert and world’s best doctors and democratize and have that be accessible all over the world,” Das said. “Our mission for the next five years is to enable data liquidity for the development of novel AI algorithms in healthcare.”