One of the best books I read in my childhood is the Panchatantra. The way it’s structured, the first story ends with one of its characters saying “let me tell you a story”, which leads into the second story, and it continues like that till the end. I have a feeling my post on automation might end up as a micro-Panchatantra 🙂
Yesterday morning in Delhi, I wrote about a fascinating conversation I had with a banker about automation and the potential for associated job losses. A friend of mine who read that post pointed out that her fear was that unethical computing practices might make job losses happen faster for women than for men.
She is an IT expert. And while she is not a data scientist, she understands how machine learning works and how it can figure out patterns from the data it has access to. Her precise point is this: if the machine is fed past data, wouldn’t there be a high chance that it learns to be as bad as, or worse than, the people who discriminated against women in the past? And then won’t it just replicate that bias at scale, since software can be distributed globally quite easily?
It’s a VERY fair question, and something that has been asked by many others before her. Bias is a real problem in data science – be it in the source data or in the model. Both the data and the humans building the model can perpetuate bias.
Ethical AI is a topic very close to my heart, and I have written and spoken about it several times.
Without making it too complex or technical – there are ways to identify such bias in both data and models, and once identified, it can be countered. While I agree that it’s a big problem, and that it will take a lot of effort to create awareness and implement this in actual projects, it’s at least possible to mitigate.
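To make that a little concrete, here is a minimal sketch of the kind of check I mean – looking at historical decision data and flagging a big gap in outcomes between groups before any model is ever trained on it. The file name and the columns (“gender”, “hired”) are hypothetical, not from any real project.

```python
# A minimal sketch of a bias check on historical decision data.
# Assumes a CSV with hypothetical columns "gender" and "hired" (1 = hired, 0 = not).
import pandas as pd

df = pd.read_csv("past_hiring_decisions.csv")  # hypothetical file name

# Selection rate (fraction hired) for each gender group
selection_rates = df.groupby("gender")["hired"].mean()
print(selection_rates)

# Demographic parity gap: difference between the best- and worst-treated groups.
# A large gap in the historical data is a red flag that a model trained on it
# will learn the same pattern.
gap = selection_rates.max() - selection_rates.min()
print(f"Selection rate gap across groups: {gap:.2%}")
```

Nothing fancy – but even a check this simple, run routinely, surfaces the most obvious problems before they get baked into a model.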
But that might not be the kind of bias we need to worry about most when a lot of decision making gets automated. Bias against a category – like gender or residential zip code – can be identified and countered with some effort.
But what if the algorithm weighs hundreds of parameters, and each one (or some combination) contributes a little bit to the final decision? This is the more common scenario of scoring – like a credit score calculation that looks at timely repayments, balances, history, number of accounts held and so on, and then comes up with one number at the end.
Credit scores look at a limited set of parameters, and you are told the handful of reasons why your score is low. But in a job scenario, it could be the photo of you with a hunting gun in your Twitter profile, some image on your T-shirt on your Instagram account, the words you use on your CV and LinkedIn profile, and a million other things which individually don’t look bad but collectively may make an algorithm decide that you are not a great hire. And it might be really hard for a human hiring manager to explain to a candidate why she won’t be hired, since it’s hard to understand the inner workings of such an algorithm.
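A toy illustration of why this is so hard to explain: when a score is the sum of hundreds of tiny contributions, no single feature accounts for the outcome. All the numbers below are made up.

```python
# Toy "many weak signals" score: hundreds of features, each contributing a
# tiny amount, so no single feature "explains" the result.
import numpy as np

rng = np.random.default_rng(42)

n_features = 500
weights = rng.normal(0, 0.05, n_features)    # each weight is small
candidate = rng.normal(0, 1, n_features)     # one candidate's feature vector

contributions = weights * candidate
score = contributions.sum()

print(f"Final score: {score:.3f}")
print(f"Largest single contribution: {np.abs(contributions).max():.3f}")
# No single feature dominates: removing any one feature barely moves the score,
# which is why "why was I rejected?" rarely has a one-line answer.
```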
This problem too has some possible solutions – which again need significant work to put into practice. For example, a company can have a policy that if AI is going to automate a business decision, it is mandatory that the decision be explainable. A neural network that comes up with a result should, for instance, be representable as a simple decision tree that a human can read and make sense of. And just as test coverage and security checks are mandatory for code to be pushed into production, CI/CD pipelines should have ethical gates too before a model gets to production.
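Here is a minimal sketch of that surrogate idea, with placeholder data and a placeholder model (nothing here comes from a real pipeline): a shallow decision tree is fit to mimic a black-box classifier’s predictions and printed as readable rules, and the final assertion hints at how an “ethical gate” could fail a build.

```python
# Surrogate-explanation sketch: fit a small decision tree to imitate a
# black-box model's predictions, then print it as human-readable rules.
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

# Placeholder data standing in for real hiring/credit features
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)

# The "black box" whose decisions we want to explain
black_box = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0)
black_box.fit(X, y)

# Surrogate: a shallow tree trained to imitate the black box's outputs
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

# Fidelity: how often the readable tree agrees with the black box
fidelity = surrogate.score(X, black_box.predict(X))
print(f"Surrogate fidelity: {fidelity:.2%}")
print(export_text(surrogate, feature_names=[f"feature_{i}" for i in range(10)]))

# An "ethical gate" in a CI/CD pipeline could fail the build if, say,
# fidelity or a fairness metric falls below an agreed threshold.
assert fidelity > 0.8, "Surrogate explanation is not faithful enough to ship"
```

The surrogate will never capture every nuance of the original model, which is exactly why the fidelity number matters: it tells you how much of the black box’s behaviour the human-readable explanation actually covers.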
Now about the scale question. All computing has that issue – both good and bad get amplified significantly! The way I look at it, even if bias gets perpetuated at scale, once it’s solved, the solution also gets perpetuated at scale. And unlike humans, who don’t all have the same moral and ethical compass, AI can hold unwavering standards in every replica once it’s told what those standards are. Now, whether we can define the standards of ethics is a hard question in itself. My current thinking is that we cannot. And consequently, we will leave some decision making to humans to define the standards on a case-by-case basis, and hence bring back the very problems we are trying to solve!
Automated decision making has a potential long-term problem that we may already be seeing a bit of today, and it has nothing to do with ethics – I think it makes us humans a little less sharp.
If my phone runs out of battery, I am sure I can find my way to my destination via a paper map, or by asking for directions, or by figuring out which way is east or something. That is because I had already been driving for a long time before phones started having GPS apps. But I doubt my young daughter – who will start driving in a couple of years – could do the same if her phone ran out of charge while she is on her way somewhere. I doubt she knows about paper maps, or even that AAA exists 🙂.
This is not a new problem. My dad routinely made 500-mile-plus trips with our family without a map, and none of us ever remember him failing to reach the destination in time. I don’t have that ability in the least.
PS: Let’s see if someone else inspires a follow-up on this. Yesterday I wrote that post while in a cab. Today I am writing on my flight from Delhi. If there is a sequel, I hope it’s written while I am static somewhere 🙂