Your data can now be shared a little more safely, on the back of the success of the recent ACS hackathon.
Led by NSW Chief Data Scientist and ACS Vice President Dr Ian Oppermann, the initiative looked to uncover whether organisations could safely share data without compromising the identity of individuals.
The eventual winners of the competition, named team ‘Led Zeppelin’, made strides towards identifying the presence of personally identifiable data, developing a prototype application that allows a data custodian to visualise and adjust the amount of personally identifying information in a dataset.
“We’re really excited to have had the chance to work on such an important project,” said team member Geof Heydon.
“It’s an incredibly valuable thing. As more and more smart city and digital economy things happen at the local and state government level, there needs to be much more of an understanding of how to handle data.”
For Oppermann, the project follows the Privacy in Data Sharing – A Guide for Business and Government whitepaper, released last year.
“Today we have hundreds of data sets being combined with each other,” said Oppermann.
“In that scenario, it can become easy for records to be re-identified. That’s why, over the course of two years, the ACS Data Sharing Committee has developed a theoretical test for the presence of personally identifying data.
“Now we wanted to put that theory to the test, which is what this hackathon was all about – developing a practical application for identifying the presence of personally identifying information in a data set.
“If you can do that, then you can reveal whether data is safe to share.”
The task took both open data from government websites and synthetic datasets and challenged teams to test existing frameworks and produce data products.
“We had no idea at the very beginning whether it was actually possible,” Oppermann said.
Why do we need this?
The advent of smart services has led to more and more linked datasets, Oppermann explained. As a result, data that was once innocuous can now become identifiable.
“Everywhere we go we create a digital footprint from the services we use – from credit card transactions, and from mobile phones,” he said.
“All of that data, whilst deidentified and whilst being used by the service provider to optimise traffic or better schedule resources, still contains linked, deidentified data.
“We need to be able to answer the question of whether someone is personally identifiable in that dataset.”
So, is identifiable data a breach of an individual’s privacy?
“The challenge we’ve got is that whilst privacy is a right, and if you look at our human rights convention there are certainly issues that talk about privacy as a human right, but maintaining that right to privacy and delivering smart services is currently a challenge simply because we can’t tell how much personal information is in those datasets.”