How can we build an ethical framework for artificial intelligence?

Ethics, like most things quintessentially human, has very fuzzy edges. If we plan to apply it to machine intelligence - to judge the ethicacy (effective ethical value) of its decisions and actions - then we need to codify it so that we can measure it. And any framework we build must embrace these fuzzy edges. Where do we start?

Let’s look at current ethical models that have been developed over hundreds of years to explore the challenges we face.


Virtue ethics uses the internal goodness or badness of the person making a choice or performing an action to determine the ethics of that choice or action. If they are a good person then, by definition, what they do should be good. This certainly doesn't work when a non-human makes the decision.

So, problem one: how do we determine the innate goodness of an artificial intelligence?


If we look at ethics from the Kantian view, we consider the intent of the actor - in this case, the intent of the artificial intelligence in question. Is what the artificial intelligence wants to do right or wrong when measured against a set of agreed rules?

Problem two: how can we agree upon the rules?


Consequentialism judges the moral correctness of an action by the rightness or wrongness of its outcome. ‘The end justifies the means’ has been a very dangerous way of looking at things in human history, most infamously used to justify the actions of the Nazis in WWII. However, understanding the ethical value of the outcome can certainly form part of a robust ethical framework.

Problem three: how do we classify outcomes as ethical or not?

Breaking it down

Each of these problems has a clear solution if we break it down and find its digital equivalent.


In virtue ethics, we refer to the nature of the ethical actor. The nature of a human lies in our DNA, our upbringing, our psychology, our experience, and our learning. What determines the nature of an artificial intelligence, and specifically one born of machine learning? Artificial intelligence does not have DNA and memories, but it does have algorithms and data. These form the core of a machine's learning: the source data and the rules used to process it.


Deontological or Kantian ethics concerns itself with what a person feels they must achieve - their duty - regardless of the outcome. In artificial intelligence, this equates to the machine's intended outcome. In narrow AI, where this technology currently operates, smart systems usually have specific purposes, so we can identify each one easily. In more complex intelligent systems that make more general predictions, this intent will become less clear; however, the system will still have a duty that it tries to fulfil.


Consequentialism uses the outcomes of decisions and actions to judge their ethical value. This maps directly onto artificial intelligence: if we look at the outcomes of a machine's decisions and actions, we can assess their ethical value.

Assessing the ethics of the core, intent, and results of an artificial intelligence has challenges, which we address below. However, the task is simplified by accepting that we do not need to choose between these three ethical approaches; rather, we can incorporate all of them to assess the ethical standing of an artificial intelligence.

So how do these apply to our machine ethics?

  1. We can apply virtue ethics by looking at the ethical nature of the source data and the rules used to learn from it.

  2. We can assess the intent based on the drivers of success of the machine.

  3. We can apply consequentialism by assessing the impact that an action or decision has on the people affected.


So the question then becomes: how do we assess these? How do we assess whether something has an ethical core, intent, and outcome for everybody involved? Let’s put aside the core for a moment while we look at the intent and outcome.

We can start by thinking about the things that we consider positive and those we consider negative in the world.

What do we measure?

Let’s address the elephant in the room when talking about positivity and negativity: cultural bias plays a significant part in the measurement of ethics.

Things that we would consider positive in one place might be considered negative in another. For example, someone living in a religious society would consider faith a very positive virtue; there, an increase in faith would count as a net positive outcome. Someone living in an atheist society, however, would consider belief without evidence - i.e. faith - a negative virtue.

So in choosing measures of positivity and negativity, we need to employ some empiricism in selecting the scales we use. We have to set aside something like faith, which can be highly divisive, and avoid measures that vary too much across cultures.

We need culturally agnostic values to measure positivity. After looking at various measures across global cultures, the following have emerged as a fairly comprehensive set: beauty, health, wealth, safety, efficiency, empathy, knowledge, choice, privacy, and honesty. That is, we would consider an increase in any of these a positive outcome, and a decrease a negative one.

These are not completely culturally agnostic, but come as close as we can when dealing with human issues.
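As a toy illustration, the ten measures above could be encoded as a simple scoring scheme. This is a minimal sketch, assuming each outcome is recorded as a signed delta per measure (+1 for an increase, -1 for a decrease); only the measure names come from this article - the function and scoring rule are hypothetical.

```python
# The ten proposed culturally agnostic measures, as listed above.
MEASURES = [
    "beauty", "health", "wealth", "safety", "efficiency",
    "empathy", "knowledge", "choice", "privacy", "honesty",
]

def score_outcome(deltas: dict) -> int:
    """Sum signed deltas (+1 increase, -1 decrease) over the measures.

    Anything outside the agreed set - e.g. 'faith' - is rejected,
    mirroring the article's exclusion of divisive measures.
    """
    unknown = set(deltas) - set(MEASURES)
    if unknown:
        raise ValueError(f"not a recognised measure: {unknown}")
    return sum(deltas.values())

# Example: a system that improves safety but reduces privacy nets to zero.
print(score_outcome({"safety": 1, "privacy": -1}))  # 0
```

A flat sum is of course too crude for real use; it only shows how the measures become something computable.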

To whom does it apply?

The application of any framework needs to clearly identify who will be affected by the actions or decisions. Much like the Universal Declaration of Human Rights, it must apply to all humans equally, without exception, or the framework will collapse under the weight of its own bias.

When measuring positives and negatives, we can absolutely determine which groups or sections of society will be affected differently. For example, something that increases safety for one group may decrease privacy for another. When applying an ethical framework, we have to understand these complex increases and decreases of effect for different groups.

How far reaching is the impact?

We also need to measure the range of impact of any particular action or decision. The increase or decrease of any of the measures has a range of impact that radiates outward like the ripples on a pond. It can affect the individual, or their family/community. It can affect whole regions and countries. It can also have global impact.

Estimating the range of effect can help to calculate the ethical value of a decision or action.
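The ripple metaphor could be made concrete with a weighting scheme. This is a sketch under stated assumptions: the four tiers come from the paragraph above, but the weight values (roughly proportional to the number of people reached) are purely illustrative.

```python
# Sketch: scaling a measure delta by its range of impact.
# The tiers come from the article; the weights are illustrative
# assumptions, roughly proportional to the people reached.
IMPACT_WEIGHTS = {
    "individual": 1,
    "family/community": 10,
    "region/country": 1_000,
    "global": 1_000_000,
}

def weighted_impact(delta: int, reach: str) -> int:
    """Scale a signed measure delta (+1/-1) by its estimated reach."""
    return delta * IMPACT_WEIGHTS[reach]

# Under this weighting, a privacy decrease felt globally far outweighs
# a safety increase felt by a single individual.
print(weighted_impact(-1, "global") + weighted_impact(+1, "individual"))
```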


We’ve looked at what to measure, whom to measure it for, and the range of impact. Let’s bring these together.


First we need to revisit the core, which consists of the data and algorithms used to train an artificial intelligence. We cannot remove bias completely, for two reasons. First, information itself requires an observer or interpreter, and interpretation inherently introduces bias. Second, machine learning uses datasets often derived from human observation and activity; these datasets include all the biases inherent in those activities and therefore transfer them into the system.

We can take reasonable precautions to reduce bias. A lot of work has been done in this area, but we have found none so practical as the Deon checklist. This tool serves as an invaluable check at every step of training an artificial intelligence with machine learning techniques. It asks data scientists the questions necessary to account for bias across data collection, data storage, analysis, modeling, and deployment.

Using this tool, we can ensure the quality of the core of the artificial intelligence, equivalent to the human virtues in virtue ethics. We can ensure that the data and the processes all reduce bias as much as possible (see the end of the article for the full detailed checklist).

Intent and outcome

We can use the same measures to estimate both the intended outcome and the actual outcome of a technology. This takes the framework from a hypothetical judgement to a tool that remains effective after deployment.

We must answer a few questions for implementation.

  1. Who will be affected by this decision or action?

  2. For each measure - beauty, health, wealth, safety, efficiency, empathy, knowledge, choice, privacy, and honesty - how large are the affected groups?

  3. Will they be affected positively or negatively?

Placing the answers to these questions in an array will create a visual representation of the ethicacy of an artificial intelligence (or indeed of any system: machine, human, or otherwise).
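The array described above can be sketched as a small grid of affected groups against the ten measures, with each cell holding a signed delta (-1 negative, 0 no effect, +1 positive). A minimal sketch; the measure list comes from this article, while the group names and example values are hypothetical.

```python
# The ten measures proposed in the article, as table columns.
MEASURES = ["beauty", "health", "wealth", "safety", "efficiency",
            "empathy", "knowledge", "choice", "privacy", "honesty"]

def ethicacy_table(groups: dict) -> str:
    """Render affected groups (rows) against measures (columns).

    Each cell is a signed delta; measures not mentioned default to 0.
    """
    header = f"{'group':<12}" + "".join(f"{m[:6]:>8}" for m in MEASURES)
    rows = [header]
    for name, deltas in groups.items():
        rows.append(f"{name:<12}" + "".join(
            f"{deltas.get(m, 0):>+8}" for m in MEASURES))
    return "\n".join(rows)

# Hypothetical assessment of a surveillance deployment: safety and
# efficiency rise for commuters, while privacy falls for everyone.
print(ethicacy_table({
    "commuters": {"safety": 1, "efficiency": 1, "privacy": -1},
    "bystanders": {"privacy": -1, "choice": -1},
}))
```

Reading across a row shows the trade-offs for one group; reading down a column shows who gains and who loses on a single measure.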

So what now?