Welcome to the Information Age 6-part series on AI ethics, written by Australia’s foremost experts in this field. Here in Part 1, Peter Leonard and Toby Walsh look at the Cambridge Analytica debacle and ask: Could it happen again?

Widespread adverse comment following the Cambridge Analytica revelations prompted assurances by Mark Zuckerberg to the effect that Facebook now understands that with great data comes great responsibility.

At the same time, concerns have been raised about diverse applications of data automation – automated decision making affecting humans; unaccountable robots; excessive and intrusive surveillance; opaque, unreliable or discriminatory algorithms; online echo chambers and fake news.

Many of these concerns are also raised about AI. In addition, rapid developments in AI have prompted a policy debate about whether we are skilling a workforce to work with technology, and whether AI will deliver benefits to few while many citizens are left behind.

These concerns are exacerbated by a decline of faith in public policymaking: trust of citizens in institutions and governments is at historic lows.

Can we ensure that what we have learnt from Cambridge Analytica is applied to address these challenges?

Big data ethics

Ethical challenges in the application of data analytics are nothing new.

To take one relevant example, only three years ago, Facebook data scientist Adam Kramer, and Cornell University social scientists Jamie Guillory and Jeff Hancock, published a peer reviewed report on their A/B study that experimentally modified the Facebook feed algorithm for 689,003 people.

The authors demonstrated that changing the negative or positive emotional valence of posts on a user’s News Feed affected the emotional valence of the posts made by the user after seeing those posts.

This supported the hypothesis that ‘emotional contagion’ – spreading emotional states through social contact – occurs on a massive scale, albeit with relatively small individual effects, in social networks.

Publication of the report of this study launched a debate about big data ethics, albeit a debate without any of the media coverage and public attention recently focused on Cambridge Analytica.

One criticism made of the study was that Facebook users whose feeds were altered were neither told about the study nor invited to provide informed consent to participate. Facebook suggested that its users had consented by inference through Facebook’s terms of use, which were then significantly more opaque than today.

Some critics questioned the ethics of platform providers experimenting with the emotional state of their users.

Other commentators suggested that the differences between this study and common online marketing practices were that the experiment was not selling anything and that results were published in a peer reviewed scientific journal.

What, no ethics review?

In any event, the study was not subjected to any form of ethics review.

The study probably was not required to be reviewed because the US ‘Common Rule’ only requires ethics review of ‘intervention’ in the life of human subjects by way of ‘(human) research’.

Some critics noted that the conduct of the study illustrated a broader concern, suggesting that the data science community was not familiar with the ethics regulation found in other science and technology communities.

These critics suggested that data scientists wrongly thought that many of their data projects concerned systems and devices, not people, and therefore raised no human-related ethical concerns.

Of course, many data driven systems and devices differentiate between humans having regard to individual behaviour or inferred likely behaviour.

This rightly raises issues as to whether any adverse effects on some individuals, whether by inclusion or exclusion, have been properly considered.

Data scientists come to this new field with diverse training that often does not include any training as to ethics, social science or psychology.

The Facebook emotional contagion study illustrates the danger that data scientists and technology developers may not see, and avoid or mitigate, ethical concerns with projects which affect humans.

Should individuals have a ‘right to know’ how information about them is used to manipulate their emotions?

Or that they are being treated differently to other individuals?

How can social responsibility and other ethical concerns be addressed without the slow and complex processes for medical research ethics review?

The power

Cambridge Analytica magnified concerns already raised by the Facebook emotional contagion study.

It grabbed headlines and social media attention because the potential impact was demonstrated through the backstory of its alleged role in delivering the White House to a political outsider.

Public attention also raised the issue of whether Facebook users should know how their Facebook profiles are being used.

A further issue was whether Facebook knew, or as a data custodian should have taken active steps to check, what Cambridge Analytica was up to.

Was Cambridge Analytica really so different to the emotional contagion study conducted by Facebook in 2014?

The middle ground

Generally, only medical research and other ‘research’ involving humans and animals that is conducted by public institutions using public funds must be subject to the review and oversight of a research ethics committee.

That leaves development and commercialisation of most products and services outside formal ethical review.

Many products and services do not involve collection, use or disclosure of personal information about identifiable individuals and are therefore outside data protection laws.

If information is being used but it is not about identifiable individuals, the use of that information may not be subject to any form of privacy impact assessment.

Although a privacy review does not formally include consideration of non-privacy ethical concerns, often these are picked up when uses of personal information are reviewed.

But no personal information, no privacy review.

This leaves a sizeable middle ground.

The harvesting and use of personal information about Facebook users by Cambridge Analytica probably was not in that middle ground, because those activities took place without the knowledge or active consent of Facebook users.

However, it was suggested that, given the narrow coverage of privacy related laws in the USA, knowledge and active consent were not required.

In any event, it was argued that there was no requirement for ethical or privacy review of what Cambridge Analytica was up to – that this application was in the middle ground.

But within this middle ground – and then outside current requirements for review – lie many applications of algorithmic decision making, and uses of AI based products and services, both in the business sector and in government.

Concerns in this middle ground include social equity and fairness, discrimination, lack of transparency, lack of accountability, intrusive surveillance and failure to properly warn or disclose biases or other limitations in reliability of outputs or applications.

These issues will escalate rapidly in scale, complexity and variety as the range of applications of machine learning and AI continues to expand.

So how should we address these problems without sacrificing many of the benefits of machine learning and AI?

Practical ethics

Most studies of AI ethics rework lists of principles for ethical analysis but do little to assist the operationalisation of those principles.

Practical ethics requires methodologies, tools, processes and lexicons that prompt sensible discussions within laboratories and other workplaces about social equity, fairness and ethics.

Design and development teams need to be empowered to have these discussions.

Discussions must engage different views and subjective opinions. Teams may need to bring outside advocates into these discussions, or try to synthesise viewpoints of a broader cross-section of society.

These discussions need to be sufficiently structured and formalised to reliably and verifiably happen.

The tools used to inform and guide these discussions should not be overly intrusive and formulaic, or review will become a matter of box ticking, form over substance.

The processes must be agile and timely enough to not slow down development and commercialisation.

There may also be business benefits in pausing to frame and consider ethical questions.

If Facebook didn’t learn from the adverse comment following the emotional contagion study, will Facebook learn from the far greater business impact of Cambridge Analytica upon Facebook’s market capitalisation and through loss of trust of Facebook users?

Businesses and government agencies endeavouring to be socially responsible should not require their own make-or-break moment to spur uptake of ethical assessment of design and development decisions.

Sensible ethical framing can win buy-in from executives and other decision-makers by demonstrably yielding value: reducing subsequent rework when problems are discovered later.

So many questions

How much has it cost Facebook to deal with the problems exposed through the Cambridge Analytica revelations?

How many products and services get beta released into markets without first considering social impact and user issues, and then require costly rework to address issues first identified in-market?

How many prospective customers are never gained because accessibility issues have not been considered?

How many machine learning and AI applications will not achieve acceptance because inadequate transparency is engineered into them, leaving humans unable to properly ‘interrogate the algorithm’ to understand biases and other reliability issues?

Should humans trust machines that fundamentally affect their lives and security when it is not clear which provider takes responsibility for which aspects of a system, or whether issues of over-reliance on not fully reliable products have been properly addressed?

What have we learned?

It may be that Cambridge Analytica teaches us nothing new.

But it is reasonable to hope that this controversy highlights the ‘gap’ between data privacy law and the ethical review of research involving humans and animals, and prompts us to fill that gap by taking the best parts of privacy assessment and ethical review.

We need to move quickly beyond abstract statements of high-level ethical principles.

We need to empower diverse humans in research and development teams to fill that gap by delivering to them sound methodologies, tools and lexicons for ethical decision making.

Many businesses are now mature in building privacy by design and information security by design into their research and development.

Very few businesses or government agencies build social fairness, social responsibility or transparency by design and by default into the planning of products and services.

Ethics by design and default is too important to not do well.

Let’s get it right, quickly.

Peter Leonard is a business lawyer and economist and principal of Data Synergies, a consultancy to data driven business and government agencies.

Toby Walsh is Scientia Professor of Artificial Intelligence at the UNSW Sydney.

Peter Leonard and Toby Walsh are members of the ACS AI and Ethics Technical Committee which is endeavouring to do (quickly and well) what this article says needs to be done.