July 10, 2017
Brittany Crocker | Knoxville News Sentinel
Scientists at Oak Ridge National Laboratory are developing a secure platform for researchers to access the healthcare data of 22.5 million military veterans, accelerating the timeline for medical research breakthroughs.
Scientists from ORNL's Computational Sciences and Engineering Division started building the platform as part of the laboratory's involvement in a national partnership between the Department of Veterans Affairs and the Energy Department.
The partnership aims to address complex health issues that disproportionately affect veterans, such as prostate cancer, cardiovascular conditions and suicide.
"We are creating architecture to allow for rapid advancements in medical research," said Edmon Begoli, the principal investigator for the project and the division's chief architect.
"Typically these can take five, 10 or even 15 years, but given the gold mine of data that we have, it will enable scientists to move much faster and to see the benefits much sooner than they currently can."
Largest in the world
The VA's healthcare data set is the largest and most comprehensive in the world, and it's still expanding. In 2011, the VA launched the Million Veteran Program to combine service members' genetic information with their health records and information on their lifestyles and exposures during their military service.
To participate, current and former service members donate blood for genome sequencing and fill out surveys on their lifestyles and military service history.
Energy Secretary Rick Perry, an Air Force veteran, donated blood to have his own genome sequenced and added to the database when he announced the Energy Department's involvement in May.
"We're bringing together the VA's unparalleled and vast database of healthcare and genomic data with DOE's world class super computing and data analysis," Perry said in a video announcing the project.
At first glance, the Energy Department and the VA might seem like unlikely partners. But with records on more than 22.5 million current and former service members, the VA has more data than it knows what to do with.
That's where DOE comes in. The Energy Department is the nation's steward for high-performance computing. Oak Ridge National Laboratory's Titan supercomputer is the fastest in the country.
"Nowhere will you find this large of a population with this much comprehensive healthcare data," said Shaun Gleason, ORNL's Computational Sciences and Engineering Division director. "This is a tremendous gold mine for doing discovery on, but you need the high-performance compute power to mine the data and analyze it. We're trying to enable these medical discoveries by building this platform to start with and making it available to the researches the VA wants to have access to it."
Begoli said teams of DOE and VA researchers are already active, and ORNL expects to be able to start mining the data set in the fall.
"Hundreds of papers are being planned," he said. "There will be major discoveries and I say this with confidence because some of these abstracts are already hinting at some correlations that have been discovered. We expect rapid breakthroughs in medical research."