How can we build an ethical framework for artificial intelligence?

Ethics, like most things quintessentially human, has very fuzzy edges. If we plan to apply it to machine intelligence to judge the ethicacy (effective ethical value) of their decisions and actions, then we need to codify it to measure it. And this framework must embrace these fuzzy edges. Where do we start?

Let’s look at current ethical models that have been developed over hundreds of years to explore the challenges we face.


Virtue ethics uses the internal goodness or badness of the person making a choice or performing an action to determine the ethics of that choice or action. If they're a good person then, by definition, what they do should be good. This certainly doesn't work when a non-human makes the decision.

So, problem one: how do we determine the innate goodness of an artificial intelligence?


If we look at ethics from the Kantian view, we consider the intent of the actor - in this case, the intent of the artificial intelligence in question. Is what the artificial intelligence wants to do right or wrong when measured against a set of agreed rules?

Problem two: how can we agree upon the rules?


Consequentialism means judging the moral correctness of an action based on the rightness or wrongness of the outcome. ‘The end justifies the means’ has been a very dangerous way of looking at things in human history, most famously characterised by its use to justify the actions of Nazis in WWII. However, understanding the ethical value of the outcome can certainly form part of a robust ethical framework.

Problem three: how do we classify outcomes as ethical or not?

Breaking it down

Each of these problems has a clear solution if we break it down and find its digital equivalent.


In virtue ethics, we refer to the nature of the ethical actors. The nature of a human lies in our DNA, our upbringing, our psychology, our experience, and our learning. What determines the nature of an artificial intelligence, and specifically one born of machine learning? Artificial intelligence does not have DNA and memories but it does have algorithms and data. These form the core of the learning of a machine: the source data and the rules used to process it.


Deontological or Kantian ethics really concerns itself with what a person feels they must achieve - their duty - regardless of the outcome. In artificial intelligence, this equates to the machine's intended outcome. In narrow AI, where this technology currently plays, smart systems usually have specific purposes, so we can easily identify each one's intent. In more complex intelligent systems that make more general predictions, this intent will become less clear; however, the system will still have a duty that it tries to follow.


Consequentialism uses the outcomes of decisions and actions to judge the ethical value. This closely equates to the outcomes of artificial intelligence. If we look at the outcomes of decisions and actions of a machine, we can assess their ethical value.

Assessing the ethics of the core, intent, and results of an artificial intelligence has challenges, which we address below. However, this is simplified by accepting that we do not need to choose between these three ethical approaches and can instead incorporate all of them to assess the ethical standing of an artificial intelligence.

So how do these apply to our machine ethics?

  1. We can apply virtue ethics by looking at the ethical nature of the source data and the rules used to learn from it.

  2. We can assess the intent based on the drivers of success of the machine.

  3. We can apply consequentialism by assessing the impact that an action or decision has on the people affected.


So the question then becomes: how do we assess these? How do we assess whether something has an ethical core, intent, and outcome for everybody involved? Let’s put aside the core for a moment while we look at the intent and outcome.

We can start by thinking about the things that we consider positive and those we consider negative in the world.

What do we measure?

Let’s address the elephant in the room when talking about positivity and negativity: cultural bias plays a significant part in the measurement of ethics.

Things that we would consider positive in one place might culturally be considered negative in another. For example, if one lives in a religious society, then one would consider faith a very positive virtue. In such a place, an increase in faith would equal a net positive outcome. However, if one lives in an atheist society that has no gods, then belief without evidence, i.e. faith, would be considered a negative trait.

So in trying to come up with measures of positivity and negativity, we need to employ a bit of empiricism in the scales we choose to make measurements. We have to leave to the side something like faith, which can be highly divisive. We need to avoid measures that display too much variation across cultures.

We need culturally agnostic values to measure positivity. After looking at various measures across global cultures, we found that the following emerge as a fairly comprehensive set: beauty, health, wealth, safety, efficiency, empathy, knowledge, choice, privacy, and honesty. We would consider an increase in any of these a positive outcome, and a decrease a negative one.

These are not completely culturally agnostic, but come as close as we can when dealing with human issues.
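One way to make these measures concrete is to fix them as a closed set and record each observed change as a signed value. The sketch below is only an illustration under stated assumptions: the names `Measure` and `Change`, and the convention that a positive number means an increase, are hypothetical and not part of the framework itself.

```python
from dataclasses import dataclass
from enum import Enum


class Measure(Enum):
    """The ten measures listed above."""
    BEAUTY = "beauty"
    HEALTH = "health"
    WEALTH = "wealth"
    SAFETY = "safety"
    EFFICIENCY = "efficiency"
    EMPATHY = "empathy"
    KNOWLEDGE = "knowledge"
    CHOICE = "choice"
    PRIVACY = "privacy"
    HONESTY = "honesty"


@dataclass(frozen=True)
class Change:
    """A change in one measure caused by a decision or action."""
    measure: Measure
    delta: float  # assumption: > 0 is an increase, < 0 is a decrease

    @property
    def outcome(self) -> str:
        # An increase counts as positive and a decrease as negative.
        if self.delta > 0:
            return "positive"
        if self.delta < 0:
            return "negative"
        return "neutral"
```

For instance, `Change(Measure.PRIVACY, -1.0).outcome` evaluates to `"negative"`. The units of `delta` are deliberately left open, since the text does not prescribe a scale.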

To whom does it apply?

The application of any framework needs to clearly identify who will be affected by the actions or decisions. Much like the Universal Declaration of Human Rights, this must apply to all humans equally and without exception, or the framework will collapse under the weight of its own bias.

When measuring positive and negative, we can absolutely determine which groups or sections of society will be affected differently. For example, something that increases safety for one group may decrease privacy for another. When applying an ethical framework, we have to understand the complex increases and decreases in effect for different groups.

How far reaching is the impact?

We also need to measure the range of impact of any particular action or decision. The increase or decrease of any of the measures has a range of impact that radiates outward like the ripples on a pond. It can affect the individual, or their family/community. It can affect whole regions and countries. It can also have global impact.

Estimating the range of effect can help to calculate the ethical value of a decision or action.
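As a rough illustration of how range could enter the calculation, the sketch below scales the direction of an effect by how far it reaches. The name `Reach`, the function `weighted_effect`, and the ordinal weights 1 to 4 are placeholders chosen for the example; the text does not prescribe any particular numbers.

```python
from enum import IntEnum


class Reach(IntEnum):
    """How far an effect radiates outward (weights are assumed, not given)."""
    INDIVIDUAL = 1
    COMMUNITY = 2   # family or community
    REGION = 3      # region or country
    GLOBAL = 4


def weighted_effect(direction: int, reach: Reach) -> int:
    """Scale a direction (+1 increase, -1 decrease, 0 none) by its reach."""
    if direction not in (-1, 0, 1):
        raise ValueError("direction must be -1, 0, or +1")
    return direction * int(reach)
```

Under these assumed weights, a decrease with global reach (`weighted_effect(-1, Reach.GLOBAL)`) outweighs an increase that touches only an individual (`weighted_effect(1, Reach.INDIVIDUAL)`). A real scale would need to be justified rather than assumed.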


We’ve looked at what to measure, who to measure, and the range of impact. Let’s bring these all together.


First, we need to revisit the core, which consists of the data and algorithms used to train an artificial intelligence. We cannot remove bias completely, for two reasons. First, information itself requires an observer or interpreter, and this inherently introduces bias. Second, machine learning uses datasets often derived from human observation and activity. These datasets include all the biases inherent in those human activities and therefore transfer the bias into the system.

We can take reasonable precautions to avoid this issue and reduce bias. A lot of work has been done in this area, but we have found none so practical as the Deon checklist. This tool serves as an invaluable check at all steps of training an artificial intelligence using machine learning techniques. It asks the necessary questions of data scientists to account for bias covering data collection, data storage, analysis, modeling, and deployment.

Using this tool, we can ensure the quality of the core of the artificial intelligence, equivalent to the human virtues in virtue ethics. We can ensure that the data and the processes all reduce bias as much as possible (see the end of the article for the full detailed checklist).

Intent and outcome

We can use the same measures to estimate both the intended outcome and the actual outcome of a technology. This takes the framework from a hypothetical judgement to a tool effective after deployment.

We must answer a few questions for implementation.

  1. Who will be affected by this decision or action?

  2. For each measure - beauty, health, wealth, safety, efficiency, empathy, knowledge, choice, privacy, and honesty - how large are the affected groups?

  3. Will they be affected positively or negatively?

Placing these questions in an array will create a visual representation of the ethicacy of an artificial intelligence (or indeed any system: machine, human or otherwise).
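The array described above could be sketched as follows: one row per affected group, one column per measure, and each cell holding the group size signed by the direction of the effect. The function names, the text rendering, and the example groups and sizes are all hypothetical, chosen only to illustrate the idea.

```python
from typing import Dict, List, Tuple

MEASURES = ["beauty", "health", "wealth", "safety", "efficiency",
            "empathy", "knowledge", "choice", "privacy", "honesty"]

# (group, measure, group size, direction), where direction is +1 or -1.
Effect = Tuple[str, str, int, int]


def build_array(effects: List[Effect]) -> Dict[str, List[int]]:
    """Return one row per group with a signed group size for each measure."""
    rows: Dict[str, List[int]] = {}
    for group, measure, size, direction in effects:
        if measure not in MEASURES:
            raise ValueError(f"unknown measure: {measure}")
        if direction not in (-1, 1):
            raise ValueError("direction must be +1 or -1")
        row = rows.setdefault(group, [0] * len(MEASURES))
        row[MEASURES.index(measure)] += direction * size
    return rows


def render(rows: Dict[str, List[int]]) -> str:
    """Draw the array as text: '+' positive, '-' negative, '.' unaffected."""
    lines = []
    for group, row in rows.items():
        cells = "".join("+" if v > 0 else "-" if v < 0 else "." for v in row)
        lines.append(f"{group:<12}{cells}")
    return "\n".join(lines)


# Hypothetical example: safety rises for one group while privacy falls.
example = build_array([
    ("residents", "safety", 5000, +1),
    ("residents", "privacy", 5000, -1),
    ("visitors", "privacy", 800, -1),
])
print(render(example))
```

Keeping the signed sizes in the rows, rather than only the symbols, means the same structure can feed a chart or a weighted score later. The same array can be filled once with intended effects and once with observed effects to compare intent against outcome.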

So what now?

Machine learning techniques and the development of artificial intelligence continue to evolve rapidly. This framework, while in its infancy, will respond to many changes but will still need to adapt as we learn more.

Applying this framework to historical situations where we already know the outcomes will test its accuracy and application. As we learn more about the context of these complex situations, we can use this data to teach artificial intelligence the necessary context. However, it will not be enough. We can use such a framework to envision a social game or simulation that generates enough data for an artificial intelligence to learn its own defects, selfishness, and models.

The codification of ethics so an artificial intelligence can ultimately apply it to itself will reap benefits for centuries to come. After all, embracing the fuzzy edges is a core property of AI.



The Deon checklist

Find out more about the deon checklist:

A. Data Collection

  • A.1 Informed consent: If there are human subjects, have they given informed consent, where subjects affirmatively opt-in and have a clear understanding of the data uses to which they consent?

  • A.2 Collection bias: Have we considered sources of bias that could be introduced during data collection and survey design and taken steps to mitigate those?

  • A.3 Limit PII exposure: Have we considered ways to minimize exposure of personally identifiable information (PII) for example through anonymization or not collecting information that isn't relevant for analysis?

  • A.4 Downstream bias mitigation: Have we considered ways to enable testing downstream results for biased outcomes (e.g., collecting data on protected group status like race or gender)?

B. Data Storage

  • B.1 Data security: Do we have a plan to protect and secure data (e.g., encryption at rest and in transit, access controls on internal users and third parties, access logs, and up-to-date software)?

  • B.2 Right to be forgotten: Do we have a mechanism through which an individual can request their personal information be removed?

  • B.3 Data retention plan: Is there a schedule or plan to delete the data after it is no longer needed?

C. Analysis

  • C.1 Missing perspectives: Have we sought to address blindspots in the analysis through engagement with relevant stakeholders (e.g., checking assumptions and discussing implications with affected communities and subject matter experts)?

  • C.2 Dataset bias: Have we examined the data for possible sources of bias and taken steps to mitigate or address these biases (e.g., stereotype perpetuation, confirmation bias, imbalanced classes, or omitted confounding variables)?

  • C.3 Honest representation: Are our visualizations, summary statistics, and reports designed to honestly represent the underlying data?

  • C.4 Privacy in analysis: Have we ensured that data with PII are not used or displayed unless necessary for the analysis?

  • C.5 Auditability: Is the process of generating the analysis well documented and reproducible if we discover issues in the future?

D. Modeling

  • D.1 Proxy discrimination: Have we ensured that the model does not rely on variables or proxies for variables that are unfairly discriminatory?

  • D.2 Fairness across groups: Have we tested model results for fairness with respect to different affected groups (e.g., tested for disparate error rates)?

  • D.3 Metric selection: Have we considered the effects of optimizing for our defined metrics and considered additional metrics?

  • D.4 Explainability: Can we explain in understandable terms a decision the model made in cases where a justification is needed?

  • D.5 Communicate bias: Have we communicated the shortcomings, limitations, and biases of the model to relevant stakeholders in ways that can be generally understood?

E. Deployment

  • E.1 Redress: Have we discussed with our organization a plan for response if users are harmed by the results (e.g., how does the data science team evaluate these cases and update analysis and models to prevent future harm)?

  • E.2 Roll back: Is there a way to turn off or roll back the model in production if necessary?

  • E.3 Concept drift: Do we test and monitor for concept drift to ensure the model remains fair over time?

  • E.4 Unintended use: Have we taken steps to identify and prevent unintended uses and abuse of the model and do we have a plan to monitor these once the model is deployed?

cauri jaye

cauri is a soft-skills expert, technologist, and parenting coach who created the Sesh parenting app.
