Privacy in a data-driven world: possibility or pipe dream?

Let’s start with the obvious: today our lives are driven by, and dependent on, data. Whether we are travelling to work, paying bills, shopping online, or working at our desktops, we constantly generate and consume data.

We can no longer separate ourselves from this digital domain, and for the most part we don't want to: our lives are easier, more productive, and ultimately more fulfilled as a result of the technology we rely on every day.

But every coin has two sides, and this one is no exception. Some of that data belongs to us -- either directly, because it records our individual details, or indirectly, because what can be inferred from it describes who we are.

And that's a problem.

Generally speaking, most people value their privacy. We frequently give out our personal information in return for services, such as with Facebook, but when we do so it's by choice.

We also give personal information to the government, such as to the Australian Taxation Office (ATO) or the Australian Bureau of Statistics (ABS).

In all cases we expect it to be protected and not shared, and in Australia that protection is enshrined by law in the Privacy Act 1988.

But that data can also be useful, especially to government: it could allow for the delivery of better government projects, programs, or services.

Being able to better understand the minutiae of how Australians live and work can help to determine where new transport services need to go, or where schools and hospitals need to be built.

Currently, however, much of this potential is locked up, as independent departments are unable to share data where personally identifiable information (PII) is involved.

And while there are techniques to anonymise data, they aren't foolproof -- one or more anonymised data sets can be combined with other information to build a picture of an individual, allowing identification by inference even in the absence of direct PII.
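To make identification by inference concrete, here is a minimal sketch of how a de-identified data set might be linked back to names. Everything in it is an assumption made for illustration: the records are invented, the choice of postcode, birth year, and sex as linking fields is hypothetical, and the helper names `key` and `link` are not from any real tool.

```python
# Invented records: the de-identified table carries no names at all.
deidentified = [
    {"postcode": "2000", "birth_year": 1984, "sex": "F", "service_used": "library"},
    {"postcode": "2000", "birth_year": 1984, "sex": "M", "service_used": "housing"},
    {"postcode": "2010", "birth_year": 1990, "sex": "F", "service_used": "transport"},
]

# A second, separately available table that does carry names (also invented).
named = [
    {"name": "Alex Example", "postcode": "2000", "birth_year": 1984, "sex": "M"},
    {"name": "Sam Sample", "postcode": "2010", "birth_year": 1990, "sex": "F"},
    {"name": "Kim Placeholder", "postcode": "2010", "birth_year": 1990, "sex": "F"},
]

# Assumed linking fields: none is PII alone, but together they can narrow to one person.
QUASI_IDENTIFIERS = ("postcode", "birth_year", "sex")


def key(record):
    """Build the combination of linking fields for one record."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link(deidentified_records, named_records):
    """Return (name, record) pairs where the linking fields match exactly one named person."""
    names_by_key = {}
    for person in named_records:
        names_by_key.setdefault(key(person), []).append(person["name"])

    matches = []
    for record in deidentified_records:
        candidates = names_by_key.get(key(record), [])
        if len(candidates) == 1:  # an unambiguous match re-identifies the record
            matches.append((candidates[0], record))
    return matches


reidentified = link(deidentified, named)
for name, record in reidentified:
    print(name, "->", record["service_used"])
```

In this toy data only one record is re-identified: the first has no counterpart in the named table, and the third matches two people, so it stays ambiguous. That ambiguity is exactly what anonymisation techniques try to preserve, and what adding more data sets tends to erode.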

This is a problem explored by a whitepaper released last year: Privacy in Data Sharing - A guide for Business and Government, by Dr Ian Oppermann, NSW's Chief Data Scientist.

It also offered the theory for a solution, and the really exciting news is that we've just started a directed hackathon series here in Sydney to turn this theory into practice.

The hackathon has brought together mathematicians, data scientists, researchers, and coders, among others, to work on a real-world problem: there are some 20,000 minors across NSW in various government out-of-home-care programs, and sharing data about them would allow for better and more efficient services, and for funding to be applied where it’s needed most.

However, the only person who should have knowledge of a minor is the case worker.

Data is de-identified for use by researchers, but when so much data is used, the question is whether any child is actually identifiable, and under what circumstances.

Can we come up with a way to enable the sharing of data for children in these programs while ensuring individual privacy is protected?

The aim at the end of the series is to produce an open-source product that can quantify the risk any given dataset poses for the identification of an individual, even after techniques like homomorphic encryption or privacy-preserving linkage have been applied to obfuscate PII.
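What "quantifying the risk" could look like is easiest to see with a small example. The sketch below uses a k-anonymity style measure, which is a standard technique chosen here for illustration and not necessarily the approach the hackathon will take. The function name `reidentification_risk`, the field names, and the sample records are all assumptions.

```python
from collections import Counter


def reidentification_risk(records, quasi_identifiers):
    """Summarise how easily records can be singled out by the given fields.

    Records sharing the same combination of fields form a group; a record in
    a group of size n is treated as having a 1/n chance of being picked out.
    """
    if not records:
        raise ValueError("records must not be empty")

    keys = [tuple(record[field] for field in quasi_identifiers) for record in records]
    group_sizes = Counter(keys)

    return {
        # k: size of the smallest group (the dataset is "k-anonymous" for this k)
        "k": min(group_sizes.values()),
        # worst-case chance of singling out one record
        "max_risk": max(1 / group_sizes[k] for k in keys),
        # share of records that are the only one with their combination
        "unique_share": sum(1 for k in keys if group_sizes[k] == 1) / len(records),
    }


# Invented, already-generalised records (age bands rather than birth dates).
sample = [
    {"postcode": "2000", "age_band": "10-14"},
    {"postcode": "2000", "age_band": "10-14"},
    {"postcode": "2010", "age_band": "10-14"},
    {"postcode": "2010", "age_band": "15-17"},
]

risk = reidentification_risk(sample, ("postcode", "age_band"))
print(risk)
```

Here two of the four records are unique on postcode and age band, so half the data set could be singled out by anyone who knows those two facts about a child. A measure like this only accounts for the fields it is told about, which is why inference across combined data sets is the harder problem.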

The NSW Government has already expressed keen interest in building the solution and, if it works, it will undoubtedly be expanded to government services across all states in Australia.

But there’s an even bigger picture here -- this is a global problem that every developed country is tackling right now.

The monumental volumes of data created by people (or, in some cases, harvested from them) daily are only going to increase, and being able to maintain privacy while leveraging the value of that data -- whether you're a government or a business -- is like finding the goose that laid the golden egg.

It's an incredible opportunity, and Australia is leading the way in developing a solution that could be a game changer globally for decades to come.