The Department of Health has pulled offline an open dataset of Medicare and pharmaceutical benefits claims after researchers found a way to decrypt and re-identify some of the information it contained.
A team of University of Melbourne researchers found “weaknesses in the encryption method” used by the department to obscure the ID numbers of medical providers in the dataset.
The dataset covers a random 10 percent sample of Medicare patients and the claims they made under Medicare or the Pharmaceutical Benefits Scheme (PBS) between 1984 and 2014.
The Medicare benefits portion of the data assigned an encrypted ID number to each patient and to each medical provider they used, while the PBS portion contained only patient IDs.
The patient ID isn’t the same as the Medicare number on people’s cards and, in any event, the researchers did not try to decrypt patient IDs, concentrating their efforts instead on the identities of medical providers such as GPs or specialists.
The researchers said they were able to identify the encryption algorithm used and “guess” key details, before using the rest of the dataset to validate their suspicions.
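The general shape of such a "guess, then validate" attack can be shown with a toy example. The sketch below is purely illustrative and is not the department's algorithm, which is not described here: the XOR obfuscation, the 10-bit key, the digit-sum validity rule and every function name are assumptions made up for the demonstration, and the data is synthetic. It shows how an attacker who knows the scheme can try each possible key and keep only those under which every decoded ID looks like a plausible identifier.

```python
import random

def is_valid_id(n: int) -> bool:
    """Hypothetical validity rule: decimal digits sum to a multiple of 10."""
    return sum(int(d) for d in str(n)) % 10 == 0

def obfuscate(real_id: int, key: int) -> int:
    """Toy stand-in for a weak scheme: XOR with a small secret key."""
    return real_id ^ key

def candidate_keys(published_ids, key_space):
    """Keep every guessed key under which all decoded IDs look valid."""
    return [k for k in key_space
            if all(is_valid_id(c ^ k) for c in published_ids)]

# Build a synthetic "published" dataset (no real data involved).
rng = random.Random(0)
real_ids = []
while len(real_ids) < 200:
    n = rng.randrange(100_000, 1_000_000)
    if is_valid_id(n):
        real_ids.append(n)

SECRET_KEY = 0x2A5  # unknown to the attacker; assumed 10-bit key space
published = [obfuscate(i, SECRET_KEY) for i in real_ids]

# Attacker: guess every key and validate each guess against the whole dataset.
survivors = candidate_keys(published, range(2 ** 10))
print(survivors)
```

The true key always survives the filter, and the more records are published, the fewer wrong guesses are likely to pass every check. A real scheme with a large key space and no checkable structure in the decoded values would not fall to this kind of search.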
The researchers said that the case highlights the importance of understanding and testing the “mathematical details” of algorithms trusted to encrypt and protect sensitive information.
“The Australian Government’s open data program provides numerous benefits, allowing better decisions to be made based on evidence, careful analysis, and widespread access to accurate information,” the researchers said.
“We have some important decisions to make about what personal data to publish and how it should be anonymised, encrypted or linked. Details about the privacy protections should be published long in advance.
“They can then be subject to empirical testing, scientific analysis, and open public review, before they are used on real data. Then we can make sound, evidence-based decisions about how to benefit from open data without sacrificing individual privacy.”
Following responsible disclosure of the issue by the researchers, the Department of Health immediately withdrew access to the dataset on September 12.
It sought to assure the public and medical industry that no sensitive data had been compromised.
“The dataset does not include names or addresses of service providers and no patient information was identified,” it said in a statement.
“However, as a result of the potential to extract some doctor and other service provider ID numbers, the Department of Health immediately removed the dataset from the website to ensure the security and integrity of the data is maintained.
“No patient information has been compromised, and no information about the health service providers has been publicly identified or released.”
The Department of Health said it is undertaking a “full, independent audit” of the process of compiling, reviewing and publishing the data.
“The dataset will only be restored when concerns about its potential vulnerabilities are resolved,” the department said.
The Office of the Australian Information Commissioner (OAIC) had been made aware of the issue.
Australian Privacy Commissioner Timothy Pilgrim said he had opened an investigation of his own into the matter.
“The primary purpose of the investigation is to assess whether any personal information has been compromised or is at risk of compromise, and to assess the adequacy of the Department of Health’s processes for de-identifying information for publication,” he said.
“I welcome the decision of the Department of Health to immediately suspend access to the dataset.
“The results of my investigation will be published at its conclusion.”