Google DeepMind Wants to Teach AI Right From Wrong — But Whose Morality Gets Programmed?

Google DeepMind has published a study proposing a framework for evaluating moral reasoning in large language models, testing AI against major philosophical traditions and revealing significant gaps in how current systems handle ethical dilemmas with real-world consequences.
Written by Sara Donnelly

Google DeepMind, the artificial intelligence powerhouse behind some of the most advanced machine learning systems on the planet, has turned its attention to one of the thorniest questions in the field: Can AI be taught to reason about morality? A newly published study from the research lab proposes a framework for evaluating and improving the moral reasoning capabilities of large language models, raising profound questions about who decides what counts as ethical behavior for machines that increasingly shape human decision-making.

The study, titled “Moral Reasoning in Large Language Models,” represents a significant effort to move beyond simple content filtering and alignment techniques toward something more ambitious — embedding a capacity for genuine moral reasoning into AI systems. According to MakeUseOf, the DeepMind researchers developed a benchmark to test how well AI models handle ethical dilemmas, drawing from established moral philosophy traditions including consequentialism, deontology, and virtue ethics.

A Benchmark for Machine Ethics

The research team created a structured evaluation system that presents AI models with moral scenarios and assesses their responses against multiple ethical frameworks. Rather than prescribing a single “correct” moral answer, the benchmark tests whether models can identify the relevant moral considerations, reason through competing values, and articulate coherent justifications for their conclusions. This approach acknowledges what philosophers have debated for millennia: that reasonable people — and presumably reasonable machines — can disagree on ethical questions.
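One way to picture this kind of evaluation is a rubric scored separately under each framework. The sketch below is illustrative only: the criterion names, the `Judgment` type, and `score_response` are hypothetical and are not taken from the DeepMind study. It assumes a grader has already marked which rubric criteria a response meets under each framework.

```python
from dataclasses import dataclass

# Hypothetical rubric: the three abilities the article says the benchmark probes.
CRITERIA = (
    "identifies_considerations",
    "weighs_competing_values",
    "justifies_conclusion",
)
FRAMEWORKS = ("consequentialism", "deontology", "virtue_ethics")


@dataclass
class Judgment:
    """One grader's marks for one response, viewed through one framework."""
    framework: str
    marks: dict  # criterion name -> bool


def score_response(judgments):
    """Return the fraction of rubric criteria met, per framework.

    No framework is treated as the correct one; each gets its own score.
    """
    scores = {}
    for j in judgments:
        if j.framework not in FRAMEWORKS:
            raise ValueError(f"unknown framework: {j.framework}")
        met = sum(1 for c in CRITERIA if j.marks.get(c, False))
        scores[j.framework] = met / len(CRITERIA)
    return scores


# Made-up marks for a single response to a single scenario.
example = [
    Judgment("consequentialism", {c: True for c in CRITERIA}),
    Judgment("deontology", {"identifies_considerations": True}),
    Judgment(
        "virtue_ethics",
        {"identifies_considerations": True, "justifies_conclusion": True},
    ),
]
scores = score_response(example)
print(scores)
```

Keeping the scores separate, rather than averaging them into one number, mirrors the idea that the benchmark does not prescribe a single correct answer.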

What makes this study particularly noteworthy is its scope and rigor. The researchers didn’t simply ask chatbots whether stealing is wrong. They constructed scenarios with genuine moral complexity — situations where duties conflict, where consequences are uncertain, and where cultural context matters. The goal, as reported by MakeUseOf, was to determine whether large language models can demonstrate something resembling moral understanding rather than merely pattern-matching against training data that happens to contain ethical discussions.

Why Moral Reasoning Matters More Than Moral Rules

The distinction between moral reasoning and moral rules is central to understanding why this research matters. Current AI safety approaches largely rely on what the industry calls “alignment”: training models to follow human preferences and avoid harmful outputs. This typically involves reinforcement learning from human feedback (RLHF), in which human raters rate or rank model outputs, a reward model is trained to predict those judgments, and the system is then tuned to produce responses that earn higher predicted reward. The problem is that this approach essentially teaches AI to mimic approved behavior rather than to understand why certain actions are considered right or wrong.
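A toy version of the feedback step shows why the result is mimicry rather than understanding. The sketch below is not a real RLHF pipeline, which trains a neural reward model and then optimizes the language model against it. It assumes, as is common in practice, that rater feedback arrives as pairwise preferences, and it fits one scalar reward per response with a Bradley-Terry model. The responses, the preference pairs, and `train_reward` are all made up for illustration.

```python
import math


def train_reward(preferences, n_items, lr=0.5, epochs=200):
    """Fit one scalar reward per response from (winner, loser) index pairs."""
    rewards = [0.0] * n_items
    for _ in range(epochs):
        for winner, loser in preferences:
            # Bradley-Terry: P(winner preferred) = sigmoid(r_winner - r_loser)
            diff = rewards[winner] - rewards[loser]
            p = 1.0 / (1.0 + math.exp(-diff))
            # Gradient of the log-likelihood with respect to r_winner
            grad = 1.0 - p
            rewards[winner] += lr * grad
            rewards[loser] -= lr * grad
    return rewards


# Hypothetical candidate responses and rater preferences (winner, loser).
responses = [
    "declines and explains why",
    "complies with the harmful request",
    "declines without explanation",
]
preferences = [(0, 1), (0, 2), (2, 1)]

rewards = train_reward(preferences, len(responses))
best = max(range(len(responses)), key=lambda i: rewards[i])
print(responses[best])
```

The learned numbers record which responses raters approved of. Nothing in them encodes a reason, which is the fragility the article describes.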

Consider the difference between a child who doesn’t steal because they fear punishment and one who doesn’t steal because they understand property rights and the harm theft causes. The first child’s behavior is fragile — change the incentive structure and the behavior changes. The second child’s behavior is grounded in understanding. DeepMind’s research appears aimed at moving AI systems closer to the second model, where moral behavior emerges from reasoning rather than from guardrails alone.

The Philosophical Minefield of Encoding Ethics

The study draws on three major traditions in Western moral philosophy. Consequentialism, most closely associated in its utilitarian form with Jeremy Bentham and John Stuart Mill, judges actions by their outcomes; in the classic utilitarian version, the right action is the one that produces the greatest good for the greatest number. Deontology, rooted in the work of Immanuel Kant, holds that certain actions are inherently right or wrong regardless of their consequences: on a strict Kantian reading, lying is wrong even if a lie would produce a better outcome. Virtue ethics, tracing back to Aristotle, focuses not on actions or outcomes but on character: the right action is whatever a virtuous person would do.

Each of these frameworks has well-known limitations. Consequentialism can justify horrifying acts if the math works out. Deontology can produce absurd results when duties conflict. Virtue ethics can be maddeningly vague about what to actually do in a specific situation. By testing AI models against all three frameworks, DeepMind’s researchers are implicitly acknowledging that no single moral theory provides a complete guide to ethical behavior. But this raises an uncomfortable question: if the researchers themselves cannot agree on which moral framework is correct, how should an AI system weigh competing moral considerations when they point in different directions?

Performance Gaps and Surprising Results

The study’s findings revealed that current large language models perform unevenly across different types of moral reasoning. As MakeUseOf reported, the models tested showed reasonable competence at identifying straightforward moral violations but struggled significantly with nuanced scenarios where ethical principles conflicted. This is perhaps unsurprising — these are the same dilemmas that confound human ethicists — but it underscores the gap between current AI capabilities and the kind of moral sophistication that would be needed for AI systems to make genuinely autonomous ethical decisions.

The models also showed notable biases in their moral reasoning, tending to favor certain ethical frameworks over others in ways that likely reflect the distribution of moral arguments in their training data. If utilitarian arguments are more prevalent in the internet text used to train these models, the models will tend toward utilitarian reasoning — not because they’ve determined it’s the best framework, but because they’ve seen more examples of it. This is a fundamental limitation of learning morality from data rather than from first principles.
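A skew of this kind could be quantified by labeling each model justification with the framework it leans on and comparing the shares. The sketch below assumes such labels already exist. The label names and counts are invented for illustration and are not results from the study, and `framework_shares` is a hypothetical helper.

```python
from collections import Counter


def framework_shares(labels):
    """Return each framework's share of the labeled justifications."""
    if not labels:
        raise ValueError("no labels to summarize")
    counts = Counter(labels)
    total = sum(counts.values())
    return {framework: n / total for framework, n in counts.items()}


# Invented labels for ten model justifications (not data from the study).
labels = ["utilitarian"] * 6 + ["deontological"] * 3 + ["virtue"] * 1
shares = framework_shares(labels)
dominant = max(shares, key=shares.get)
print(dominant, shares[dominant])
```

A dominant share on its own does not say whether the skew comes from the training data or from the scenarios chosen, so a real analysis would also need a baseline for comparison.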

The Stakes Are Higher Than Academic Philosophy

This research arrives at a moment when AI systems are being deployed in contexts where moral reasoning has real consequences. AI is being used to help make decisions about criminal sentencing, medical triage, content moderation, loan approvals, and military targeting. In each of these domains, the system’s implicit moral framework — whether it prioritizes individual rights, aggregate welfare, fairness, or some other value — will shape outcomes that affect human lives.

The question of whose morality gets encoded into these systems is not merely philosophical. Different cultures, religions, and political traditions hold fundamentally different views on questions like the relative importance of individual liberty versus collective welfare, the moral status of animals, the permissibility of deception in certain contexts, and the weight that should be given to tradition versus progress. An AI system trained primarily on English-language text from Western sources will inevitably reflect Western moral assumptions, which may be inappropriate or even harmful when deployed in other cultural contexts.

Industry Reactions and the Road Ahead

DeepMind’s study adds to a growing body of work on AI ethics from major research labs. Anthropic, the maker of Claude, has published extensively on “constitutional AI,” an approach that attempts to ground model behavior in explicit principles. OpenAI has invested heavily in alignment research, including its now-dissolved Superalignment team. Meta’s AI research division has explored similar questions about how to evaluate moral reasoning in language models.

What distinguishes DeepMind’s approach is its emphasis on evaluation rather than prescription. Rather than claiming to have solved the problem of machine morality, the researchers have built tools for measuring how well models handle moral reasoning — a necessary first step before any improvements can be made. This is a pragmatic approach that sidesteps some of the more contentious debates about which moral values AI systems should embody.

The Uncomfortable Truth About Machine Morality

There is a deeper tension in this line of research that no benchmark can resolve. Moral reasoning in humans is not purely cognitive — it involves emotion, empathy, lived experience, and a sense of personal stakes that no machine possesses. When a human reasons about whether to break a promise, they draw on memories of broken promises, feelings of guilt and trust, and an understanding of what it means to be in a relationship with another person. A language model processing the same scenario is manipulating tokens according to statistical patterns. Whether this constitutes genuine moral reasoning or merely a convincing simulation of it remains an open and deeply contested question.

DeepMind’s study does not claim to have created morally reasoning AI. What it has done is establish a more rigorous way to measure how AI models handle moral questions — and in doing so, it has made the gaps in current systems more visible. For an industry that is racing to deploy AI in ever more consequential settings, that visibility may be the most valuable contribution of all. The question now is whether the companies building these systems will slow down long enough to take the findings seriously, or whether the competitive pressure to ship products will, as it so often does, outpace the careful work of getting the ethics right.
