[This series of posts explores how we at Sesh are discovering the principles of building an ethical artificial intelligence through practical means.]
We make decisions every day that affect our lives and those around us. Generally we try to make our lives better, while not breaking any laws or angering whichever god we happen to have been taught to follow.
However, we do not refrain from killing or harming each other merely because a legal or religious text tells us not to. Instead, we choose not to because of a shared set of values and a shared avoidance of pain.
In game design, we break people into types of players that include social gamers, explorers, winners, and killers. Looking at these last two, winners and killers, they seem almost the same: they both want to come first. The difference, however, lies in their motivation: the first wants to win the game, the second wants everyone else to lose. Thankfully there exist a lot more of the former than the latter.
In real life, most people do not want to do harm. We stay in our lane, avoid conflict and, where possible, try to come out on top. Of course we do have narcissists, psychopaths, and sociopaths, but generally, people make intentional decisions not to inflict harm on one another.
So where does this latent understanding come from and how do we teach it to machines?
The four laws
Isaac Asimov, one of the greatest science and science fiction writers of all time, wrote dozens of books and hundreds of stories addressing this problem. His universe, spanning thousands of years, dealt with how humans and robots interact. In fact, Asimov even introduced the word ‘robotics.’
(Note: spoilers ahead. If you want to read Asimov’s master works without foreknowledge, skip to the end of this section.)
In Asimov’s writing, all robots at the very core of their brains ran on the three laws of robotics:
First law: a robot may not injure a human being or, through inaction, allow a human being to come to harm.
Second law: a robot must obey the orders given it by human beings except where such orders would conflict with the first law.
Third law: a robot must protect its own existence as long as such protection does not conflict with the first or second law.
Much later in the series, the robots themselves added a fourth law:
Zeroth law: a robot may not injure humanity or, through inaction, allow humanity to come to harm.
This seems a fairly ironclad set of rules to stop robots from doing harm. Yet every book and story in his universe deals in some way with robots placed into situations that call these laws into question. They are tested and found incomplete. Why?
Well, Asimov’s laws represent specific cases of what has become known as the trolley problem.
How many people ought to die?
The trolley problem posits that a runaway trolley is barreling down the track. If it goes straight it will kill five people on the track. However you can direct the trolley down another track before it hits them. There is just one person on that other track. Do you switch the track and kill one person, or do nothing and kill five?
What did you choose? Most people say they will switch the track. But here’s a variation: same trolley, same five people. This time you are on a bridge over the track and a very large gentleman stands in front of you. If you push him onto the track, the trolley will stop; if you do not, it will kill the five people. Do you push him?
The outcome is numerically the same: either one person dies or five do. Yet the choice becomes far harder when you must kill the one person yourself, rather than merely switch the trolley's direction so that it kills them. This illustrates a common problem in ethics: how an outcome is reached matters to us as much as the outcome itself.
So how does a smart machine solve the trolley problem?
It doesn't. While highly illustrative, the trolley problem falls far short of a realistic one. Philippa Foot designed it as an ingenious thought experiment, but solving a thought experiment and solving a real-world problem are not the same thing. In reality we almost always have more options than just two, and this changes everything.
Context is king
So what foundation do we use for artificial intelligences to learn how to solve ethical problems? Which source of ethics do we use? Well, an artificial intelligence, like a child, really only knows what we teach it. If we do not teach it ethics, it cannot offer ethical decisions.
Before I take us to a possible answer, I want to state unequivocally that, for now, we ought to keep a human in the decision train of any artificial intelligence: artificial intelligences should only give advice, not make decisions directly, until we get further along in their development.
A child can learn ethics where an artificial intelligence cannot, because a child learns context. As children, we constantly learn the consequences of our actions: we cry, we get milk; we fall, we get hurt; we poke another child, they poke back.
This simple, incremental learning builds a complex structure of contextual understanding as we add layer upon layer of experience, day after day. Stated otherwise: those who do the same exact thing every day will add little context, while those who enjoy varied exploits build that contextual map faster.
How does our species treat other species? What does our national culture teach us? What traditions do our ethnic cultures teach us? What rules did our family have? Which personality traits have emerged from all our influences?
These layers of context influence the situations we find ourselves in every day and inform the decisions we make to move through those adventures.
We need to teach artificial intelligences to understand both context and harm. In the same way that we train one to recognise a dog, we need them to recognise the complex interplay of cultural layers, such as whether that dog is considered pet, threat, or food. We may not have the computing power to do this yet; however, we can start laying the groundwork and building the dataset.
Glass half full or half empty
Engineers, ethicists, and data scientists have started innovative work measuring the impact of decisions made by artificial intelligences. Many use money to measure the value of these decisions, with algorithms scoring ‘harm’ as the amount of money lost when a decision is taken. Setting aside the contentious debate about whether monetary gain is proportional to positive human decision-making, this approach draws the same arbitrary line: “above this is a good decision, below it is a bad one.” Like the trolley problem, it answers a complex question with a false simplicity.
These research efforts all faced the same binary-choice issue, or focused primarily on negative outcomes.
Human state of mind
So here we are: how do we solve this? I started wondering whether we could find measurable positive outcomes. To explore this, I revived an old model I created 15 years ago, when I wanted to measure the positivity of news.
I looked at news stories as all being on a scale of positivity (valence) across a few attributes. If those attributes increased as a result of reading the news story, that increased the valence; if the positive attributes decreased, that decreased the valence. I used these six attributes: beauty, health, wealth, efficiency, empathy, and knowledge (I may write a whole separate post on how to measure ‘beauty’).
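As a rough illustration, the attribute model above could be sketched like this. The six attribute names come from the text; the [-1, 1] delta scale and the equal weighting are assumptions of mine, not part of the original model:

```python
# Hypothetical sketch of the six-attribute valence model described above.
# The attribute names come from the post; the scoring scale and equal
# weighting are illustrative assumptions only.

ATTRIBUTES = ["beauty", "health", "wealth", "efficiency", "empathy", "knowledge"]

def valence(deltas: dict) -> float:
    """Aggregate per-attribute changes (each in [-1.0, 1.0]) into one score.

    A positive delta means the outcome increased that attribute;
    a negative delta means it decreased it.
    """
    missing = set(ATTRIBUTES) - set(deltas)
    if missing:
        raise ValueError(f"missing attributes: {sorted(missing)}")
    return sum(deltas[a] for a in ATTRIBUTES) / len(ATTRIBUTES)

# Example: an outcome that improves health and knowledge, costs a little
# wealth, and leaves the rest unchanged.
story = {"beauty": 0.0, "health": 0.4, "wealth": -0.1,
         "efficiency": 0.0, "empathy": 0.2, "knowledge": 0.5}
print(valence(story))  # ≈ 0.167: mildly positive overall
```

A real implementation would of course need weights per attribute and a defensible way to score each one, which is exactly the open measurement problem discussed below.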
What if we could score the outcomes of an artificial intelligence on a scale for each of these dimensions? Artificial intelligences already cluster related information according to complex layers of algorithmic analysis. Outcomes that fall inside a harmful cluster would be avoided; outcomes inside an acceptable cluster would be okay.
The result would be a six-dimensional heat map with enough of a fuzzy edge, paired with a well-defined quantitative approach, to escape the simplicity trap of the trolley problem.
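One way to picture that fuzzy-edged map is as distances to cluster centroids in the six-dimensional attribute space, with a margin inside which the system defers to a human, in line with the human-in-the-loop principle stated earlier. The centroids and margin below are invented purely for illustration:

```python
import math

# Hypothetical sketch of the six-dimensional "heat map" idea above.
# Outcomes are points in the space (beauty, health, wealth, efficiency,
# empathy, knowledge); these two centroids and the margin are invented.

HARMFUL = (-0.8, -0.6, -0.5, -0.2, -0.9, -0.1)
ACCEPTABLE = (0.3, 0.5, 0.2, 0.4, 0.6, 0.5)

def dist(a, b):
    """Euclidean distance between two points in attribute space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def judge(outcome, margin=0.3):
    """Classify an outcome, deferring to a human inside the fuzzy edge."""
    d_harm = dist(outcome, HARMFUL)
    d_ok = dist(outcome, ACCEPTABLE)
    if abs(d_harm - d_ok) < margin:
        return "defer to human"   # too close to call: keep a human in the loop
    return "avoid" if d_harm < d_ok else "okay"

print(judge((0.2, 0.4, 0.1, 0.3, 0.5, 0.4)))  # "okay"
```

The margin is what gives the edge its fuzziness: rather than forcing a binary verdict at an arbitrary line, borderline outcomes get escalated instead of decided.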
We have not been able to test this hypothesis in Sesh’s model yet as we still have work to do to better understand:
how to measure these attributes;
how to put them into relative context;
how to train the sub-model to return a decision; and, most importantly,
how to deal with the fallout of artificial intelligence telling a human “I’m sorry I cannot do that, Dave.”
In addition to training the artificial intelligence, we need data. As it grows up, it needs to understand context - something Sesh’s artificial intelligence has already begun to learn - and within that context it needs to see the historical impact of decisions and their valence. Building this dataset will not be easy, but we require it to bring humanity to our artificial intelligence.
Read part I: How to Build Ethical AI part I: Truth
Read part II: Building an Ethical AI part II: Empathy
From theory to practice
I find it ironic in this debate that we insist artificial intelligence must be imbued with a very high ethical standard lest we harm humanity, yet those standards do not reflect how we judge humans ourselves.
There are over 100 groups advocating for artificial intelligence to adhere to high standards of transparency, non-bias, risk aversion, explicit long term benefit, safety, and cooperation. These represent solid goals toward which all ethical machine learning teams strive.