A successful Department of Health blockchain proof-of-concept (PoC) project has paved the way for government bodies to let researchers access sensitive data sets without the “pernicious problem” of risking the data being reidentified, as has happened in the past.
The project, which software developer Agile Digital built on top of the secure government cloud provided by Vault Systems, saw the creation of a one-way ‘data diode’ that allows researchers to query Department of Health databases without ever actually accessing the data, which remains securely protected in situ.
The system minimises the exposure of sensitive data to malicious activity.
Data queries are filtered and registered on the blockchain, effectively creating a “shared laboratory notebook that is indelible and irrefutable”, Agile Digital executive director David Elliot told Information Age.
Individual researchers can be granted or denied access to the new system, which replicates the querying capabilities of the department’s existing Teradata data warehouse.
Blockchain provides an additional layer of control, ensuring that all interactions with the sensitive data are closely documented – and that the provenance of derivative research work can be irrefutably documented when research leads to publication.
“There is no way you could ever give a public health researcher access to these Teradata databases because that is the actual data,” Elliot explained.
“By using this approach, we can replicate the same outcomes with a much more secure system in which experiments get turned into queries.
“We protect the data, but allow the science to progress.”
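The “shared laboratory notebook” idea can be illustrated with a small hash-chained log. This is only a sketch under stated assumptions, not Agile Digital's implementation: the names `QueryLedger`, `register` and `verify` are hypothetical, and a single in-memory chain stands in for a distributed blockchain. It shows the two properties described above: only approved researchers can register a query, and altering any earlier entry is detectable.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, researcher, query):
    """Hash an entry together with the hash of the entry before it."""
    payload = json.dumps(
        {"prev": prev_hash, "researcher": researcher, "query": query},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class QueryLedger:
    """Hypothetical append-only record of researcher queries."""

    def __init__(self, allowed_researchers):
        self.allowed = set(allowed_researchers)
        self.entries = []

    def register(self, researcher, query):
        # Access is granted or denied per researcher before anything is logged.
        if researcher not in self.allowed:
            raise PermissionError(f"{researcher} is not authorised")
        prev = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        entry = {
            "prev": prev,
            "researcher": researcher,
            "query": query,
            "hash": entry_hash(prev, researcher, query),
        }
        self.entries.append(entry)
        return entry

    def verify(self):
        # Recompute every hash; any edited entry breaks the chain.
        prev = GENESIS_HASH
        for e in self.entries:
            expected = entry_hash(prev, e["researcher"], e["query"])
            if e["prev"] != prev or e["hash"] != expected:
                return False
            prev = e["hash"]
        return True


if __name__ == "__main__":
    ledger = QueryLedger(["alice"])
    ledger.register("alice", "SELECT COUNT(*) FROM claims")
    print(ledger.verify())  # True: the chain is intact
    ledger.entries[0]["query"] = "SELECT * FROM claims"
    print(ledger.verify())  # False: tampering is detected
```

In a real deployment the chain would be replicated across several parties so that no single operator could rewrite it; the sketch only demonstrates the tamper-evidence property.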
Fighting back against reidentification
The project addresses issues of data control that have previously created problems when external parties download anonymised data sets and cross-correlate them with other data sets to reidentify the original data.
Deanonymisation became a headache for the government after the August 2016 publication of 30 years’ worth of anonymised Medicare Benefits Schedule (MBS) data, representing about 10 percent of Australia’s population.
The goal was to improve transparency and help researchers conduct analyses on MBS performance – but the data was quickly pulled after University of Melbourne researchers demonstrated that they had been able to decrypt and identify numerous individuals by cross-matching the data against other publicly-available information.
The risk of reidentification of private data has become such a concern for the Australian government that it moved to criminalise deanonymisation – a possibility about which University of Melbourne researchers took the middle road.
Reidentification has become a favoured project of data scientists, with significant success: a 2012 French research project, for example, found that Facebook users can be reliably identified just by analysing their likes.
“I don’t see how the department could ever release data sets without controlling them,” Elliot said. “By the time they anonymise it to the point where nobody could pull correlations from it, it would just be a string of zeroes.”
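The cross-matching risk described above can be illustrated with a generic linkage sketch. It is not the method the University of Melbourne researchers used; the function name `link_records`, the field names and all records are invented for illustration. The point is that a record with no name attached can still be tied to one person when a combination of ordinary attributes is unique in another data set.

```python
def link_records(anonymised, public, keys):
    """Return {record_id: name} for records whose key values match exactly one person."""
    # Index the public data set by the shared attributes (quasi-identifiers).
    index = {}
    for person in public:
        signature = tuple(person[k] for k in keys)
        index.setdefault(signature, []).append(person["name"])

    matches = {}
    for record in anonymised:
        candidates = index.get(tuple(record[k] for k in keys), [])
        # Only a unique match reidentifies the record with confidence.
        if len(candidates) == 1:
            matches[record["record_id"]] = candidates[0]
    return matches


if __name__ == "__main__":
    # Entirely fictional data, for illustration only.
    anonymised = [
        {"record_id": "r1", "birth_year": 1970, "postcode": "2600", "sex": "F"},
        {"record_id": "r2", "birth_year": 1985, "postcode": "3000", "sex": "M"},
    ]
    public = [
        {"name": "Person A", "birth_year": 1970, "postcode": "2600", "sex": "F"},
        {"name": "Person B", "birth_year": 1985, "postcode": "3000", "sex": "M"},
        {"name": "Person C", "birth_year": 1985, "postcode": "3000", "sex": "M"},
    ]
    print(link_records(anonymised, public, ["birth_year", "postcode", "sex"]))
    # {'r1': 'Person A'}
```

Record `r2` stays unlinked because two people share its attributes, which is why coarsening data reduces the risk but also, as Elliot notes, reduces its research value.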
Data science as a service
Despite enthusiasm over blockchain and the Bitcoin cryptocurrency that is built on it, a recent Gartner analysis highlighted the scarcity of actual blockchain deployments.
Just 1% of 3,138 chief information officers said they had already deployed a blockchain project, and 8% were in short-term planning or actively experimenting with the technology. Fully 77% had no plans for blockchain at all.
With the PoC proving reliable, however, the architecture promises a new mechanism for broader availability of sensitive corporate government data – which has previously been published in small and well-defined chunks through initiatives such as Data.gov.au.
Privacy concerns have prevented sensitive health, financial and other data from being made available in this manner, but the Vault Systems cloud platform – which recently received government approval to handle protected-level classified data – allowed the blockchain to be brought online quickly without forcing developers to build their own secure environment for it.
“Concerns about sensitive data essentially mean that government doesn’t have many ways to innovate,” Vault Systems founder and CEO Rupert Taylor-Price said.
“By using this infrastructure, the PoC becomes affordable, and they can start setting up services within hours. Otherwise, this would have been a 2-year project and the first 12 months would be spent building and assessing the security of the infrastructure.”
The PoC is currently limited to internal Department of Health staff, but its successful completion is laying the groundwork for other secure applications that involve curated access to valuable but sensitive data sets.
“If you allow people to come in and be creative, they’re going to create cool things,” Elliot said.
“And because everything they do is notarised in sunshine, nobody can hide in the dark and try things out quietly.
“With the right platforms, people will do things that we haven’t thought of.”