For the past three years, a dedicated team of computer scientists has been working to solve a problem that is both exceedingly difficult and continually relevant:
If you combine sets of deidentified data, will you still be able to find specific information about individuals?
Previously, researchers from the University of Melbourne found ways to match open healthcare data to prominent sportspeople and MPs.
The government estimates the value of its own data at around $25 billion per annum, and it wants to continue sharing its vast wealth of data to help with research and service delivery – but it needs to maintain the privacy of those whose data is being used.
Through a series of workshops and directed ideations – AKA ‘hackathons’ – ACS has developed a series of reports offering a potential solution to the data sharing problem: a data sharing framework that it hopes will be used to develop future standards for sharing open data.
The latest report, Privacy-Preserving Data Sharing Frameworks, was launched recently and takes a big step toward proving the efficacy of a ‘personal identification factor’ that measures the potential risk a person has of being re-identified from an anonymised dataset.
By engaging with a wide community of computer scientists, the team working on the data sharing problem sought a broad range of expertise.
Sonya Sherman is a data and information strategist who has worked as a director in the NSW government. She has seen the data sharing framework through many stages of its development and said the methodology of peer involvement has been a big part of its success.
“The strength of this group is in its shapeless form,” Sherman said.
“We have seen how different people arrive and they ask new questions.
“Everyone has been so open to hearing about different ideas and thoughts and it helps that there are no rules about who's in or out.”
Chris Radbone is the Associate Director of SA NT DataLink.
He wants to see the use of data help improve service delivery across Australian organisations.
“We can use data in the child protection system, for example, to know more specifically who has checked in on who, and when,” Radbone said.
“This can be checked up against other outcomes and metrics.”
Given the sensitivity of data in systems that involve some of the nation’s most vulnerable people, a balance must be struck between preserving the useful aspects of the data and deidentifying the individuals in it.
“Of course there needs to be a trade-off between utility and obfuscation,” Radbone said.
“When that balance is achieved, it will aid the ability for organisations to give more value for publicly-funded activity and data.”
ACS CEO, Andrew Johnson; ACS Vice President, Ian Oppermann; and NSW Minister for Customer Service, Victor Dominello launching the Privacy-Preserving Data Sharing Frameworks report.
Benefits to health
Health is another area where better provisions for data sharing can help improve outcomes and reduce the load on the public health system.
But personal information about your health status is highly sensitive.
Steve Woodyat, CEO of IHQ Analytics, said developing a willingness to share health data – based on assumptions and knowledge that it is appropriately anonymised – will aid Australia’s healthcare system.
“Would you give away your health data to help other people further down the line?” he asked.
“What if it's not your data but secondary data based on you?
“Because this is about finding patterns in data, a lot of it needs to be gathered.
“Then that data, combined with analytical tools, can help provide tailored suggestions for how you can stay healthy.”
From early in the project, NSW Minister for Customer Service Victor Dominello has shown a keen interest in the data sharing frameworks and their ramifications.
He said he wants to make life better for people and sees data sharing as the most efficient way to achieve that goal.
But Dominello is aware that governments need to keep building public trust that data sharing initiatives will be used appropriately.
“There needs to be a balance between trust and delivery,” he said.
“Part of that is making sure that we build in an understanding that this is not just a process or a dataset, but a person.
“And data can help with that. We’ve already seen that rough sleepers can be better known across different agencies who share data.
“A person can arrive at St Vincent de Paul’s, for example, and a staff member can greet him and immediately know more about his context.”
But it’s not just government services that may benefit from the development of data sharing frameworks.
Vidya Nayak has had a long career in IT as an enterprise architect.
She has seen the change in how systems operate and thinks there is a new need for better standardisation in data sharing.
“Before now, most systems were discrete but we are seeing more of them working together as digital platforms,” Nayak said.
“Standards have to be developed to help build a canonical form and better appreciation abstraction of the data.”
Nayak sees the Data Sharing Framework as a way to engage stakeholders with different modes of operation.
“It is one thing to understand data analytics and another to implement it – there needs to be some level of stakeholder buy-in.
“Data is not a simple thing, it takes collaboration to get right.”