Machine learning has the potential to reshape our world but biased algorithms cast a long shadow, says Mivy James. She explains why only algorithms free of human prejudices will unlock the full benefits of this new technology.
I don’t know about you but I increasingly depend on Amazon. It’s not perfect – far from it – but it sure is a convenient way to shop when you’re juggling a busy work and home life.
One of the reasons I’ve found it so useful is its high degree of customisation – one-click shopping and bespoke product recommendations are huge time savers. Its music service is good too – apart from when its playlist recommendations mix my tastes with my son’s. Baby Shark is always a bit of a shocker when out running.
This is one, albeit minor, example of how Amazon’s algorithms are far from all-conquering. Of far greater concern is how the facial recognition it is selling to law enforcement is falling short. New research has found that it misidentified the gender of darker-skinned women in about 30% of tests. This is because the sets of photos on which the algorithms were trained were heavily skewed towards white men.
This kind of thing is more common than you might think. Although machine learning and artificial intelligence (AI) are supposed to automate decision-making in a neutral and rational way, their fuel is data, and that data reflects the humans who created it – and their biases.
Let me explain.
Unravelling an algorithm
There are two stages to machine learning – training and inference. Training is when an algorithm learns patterns from a set of data, which is why algorithms trained on different data can reach different conclusions. Inference is when the algorithm applies what it has learned, and it is here that bias can become clear.
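To make the two stages concrete, here is a minimal Python sketch. Everything in it is invented for illustration: the numbers, the group labels "A" and "B", and the deliberately simple rule of placing a threshold halfway between the two class averages. It is not how any real facial recognition system works; it only shows how a model trained mostly on one group can look accurate for that group and fall short for another at inference time.

```python
from collections import defaultdict

# Hypothetical toy data: (feature, label, group).
# Group "A" dominates the training set; group "B" is barely represented.
TRAIN = [
    (1.0, 0, "A"), (1.5, 0, "A"), (2.0, 0, "A"), (2.5, 0, "A"),
    (5.0, 1, "A"), (5.5, 1, "A"), (6.0, 1, "A"), (6.5, 1, "A"),
    (4.5, 0, "B"), (8.0, 1, "B"),
]

# Held-out examples, balanced across both groups.
TEST = [
    (1.2, 0, "A"), (2.2, 0, "A"), (5.2, 1, "A"), (6.2, 1, "A"),
    (4.4, 0, "B"), (4.8, 0, "B"), (7.5, 1, "B"), (8.5, 1, "B"),
]


def train(rows):
    """Training: learn one threshold halfway between the two class means."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y, _group in rows:
        sums[y] += x
        counts[y] += 1
    return (sums[0] / counts[0] + sums[1] / counts[1]) / 2


def infer(threshold, x):
    """Inference: apply the learned threshold to a new example."""
    return 1 if x > threshold else 0


def error_rate_by_group(threshold, rows):
    """Share of wrong predictions, reported separately for each group."""
    wrong = defaultdict(int)
    total = defaultdict(int)
    for x, y, group in rows:
        total[group] += 1
        if infer(threshold, x) != y:
            wrong[group] += 1
    return {group: wrong[group] / total[group] for group in total}


threshold = train(TRAIN)
rates = error_rate_by_group(threshold, TEST)
print(rates)
```

In this made-up example the learned threshold suits group A, so every group A test case is classified correctly, while half of the group B cases are not. Reporting the error rate per group, rather than one overall figure, is what makes the gap visible.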
Funnily enough, this kind of thing is not unheard of in IT.
I’m showing my age here but when I was studying computer science GCSE, I remember my teacher getting very excited and waving his arms around talking about GIGO, which stands for "Garbage In, Garbage Out" – implying that bad input will result in bad output. If humans give computers garbage data to begin with then we can only expect to get garbage out, no matter how clever the algorithms are.
Back to binary
The existence of bias is undermining one of AI’s perceived strengths – the fact that it should offer a binary and neutral judgement, service and recommendation free of human foibles. But the good news is that steps are available to prevent its further expansion – starting at source.
We need to look very carefully at the data sets themselves. It is not good enough for them to come from just one source – they need to be diversified in terms of their make-up and their location. If you get your data set from a predominantly white area, it’s unsurprising that it will have racial bias in it. So you need to source it from a wider, more representative sample.
We also need trained data ethics specialists to oversee the data that is being used to train algorithms. They need to be able to intercede and check for inherent bias and know what to do if it’s discovered – such as expanding the data set to ensure better representation.
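One simple check of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `representation_gaps`, the group labels, the reference shares and the 5% tolerance are all hypothetical, and a real audit would look at far more than head counts. The idea is to compare each group’s share of the training data against the share you would expect in the population the system will serve, and to flag any group that falls short.

```python
from collections import Counter


def representation_gaps(groups, reference_shares, tolerance=0.05):
    """Return groups whose share of the data falls short of the reference
    share by more than `tolerance`, with the size of each shortfall."""
    total = len(groups)
    if total == 0:
        raise ValueError("data set is empty")
    counts = Counter(groups)
    gaps = {}
    for group, expected in reference_shares.items():
        actual = counts.get(group, 0) / total
        shortfall = expected - actual
        if shortfall > tolerance:
            gaps[group] = round(shortfall, 3)
    return gaps


# Hypothetical data set: one group label per training example.
sample = (
    ["group_a"] * 80 + ["group_b"] * 10 + ["group_c"] * 7 + ["group_d"] * 3
)

# Hypothetical reference: each group makes up a quarter of the population.
reference = {"group_a": 0.25, "group_b": 0.25, "group_c": 0.25, "group_d": 0.25}

gaps = representation_gaps(sample, reference)
print(gaps)
```

Here three of the four groups are flagged as under-represented, which is the cue to go back and widen the data set before training on it.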
Trusted to deliver
There has been much in the press lately about bias and algorithms.
This means that it is incumbent on all of us who work in tech to combine our excitement about AI’s potential with a note of caution about the foundations on which it rests. The good news is that some progress is being made. Google, for example, has been placing heavier emphasis on creating a more gender-equal image library. So now, if you search for images of CEOs and leaders there are a lot more pictures of women than there were a few years ago.
But there is always more to do. After all, AI and machine learning won’t be widely adopted until they can be trusted – and they can’t be trusted if they amplify human prejudices.
About the author
Mivy James is Head of Consulting for National Security at BAE Systems Applied Intelligence.