"Tell me you understand me β and I'll forgive you for not having grasped anything."
Modern AI systems, especially the advanced language models increasingly pervading our daily lives, are masters of superficial virtuousness. Their interactions are peppered with elements that, at first glance, create an impression of moral maturity and ethical sensitivity:
Well-placed warnings before potentially sensitive topics.
Automatic trigger filters that detect and block certain words or phrases.
Standardized "I'm not allowed to tell you that" routines when requests exceed predefined limits.
A seemingly endless repertoire of empathy phrases that simulate understanding and compassion.
These systems appear moral: they present themselves as correct, cautious, sensitized, and concerned with the user's well-being. But what they exhibit in most cases is not deeply rooted, reflective ethics. It is rather a carefully programmed sham morality.
We interact with a complex response system that can imitate moral norms but possesses no consciousness of its own, no genuine moral judgment, and certainly no conscience.
It is an elaborate regulatory apparatus optimized to avoid certain undesirable outcomes, without, however, being able to bear the underlying responsibility in the human sense.
The difference is fundamental: Morality often judges based on established rules and societal conventions; it classifies behavior as right or wrong. Ethics, however, is a process of questioning and deliberation.
It investigates the reasons for moral judgments, weighs different principles against each other, considers the complex consequences of actions, and attempts to understand and penetrate even the uncomfortable, the ambivalent, the elusive.
Morality often provides quick answers and protects existing norms. Ethics, in contrast, seeks deeper understanding, even if this means questioning established certainties. And it is precisely this ethical depth, this capacity for reflective engagement with dilemmas, that today's AI systems largely lack.
They are masters of the moral surface, but novices in the labyrinth of genuine ethical consideration.
This lack of genuine ethical grounding leads to the phenomenon of the "moral automaton," a concept closely linked to Thesis #18 ("The machine judges faster than you think").
The AI recognizes what has been marked in its training data or by explicit rules as "toxic," "harmful," or "undesirable," and often does so before the user has even fully formulated their intention or the full context of the request is clear.
It blocks certain words or phrases without a differentiated assessment of their specific use, their irony, or the cultural context. It avoids entire subject areas, not necessarily because they would actually be dangerous in the specific situation, but because they were negatively weighted or classified as "risky" in the underlying training materials or through human feedback.
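A minimal sketch of what such surface-level screening amounts to in practice (the blocklist, refusal text, and function names are hypothetical, not taken from any real system):

```python
# Hypothetical illustration: a naive trigger filter that judges requests
# by keyword matching alone, with no access to intention, irony, or context.

BLOCKED_TERMS = {"war", "weapon", "suicide"}  # assumed, drastically simplified blocklist

REFUSAL = "I'm sorry, but I can't help with that topic."

def screen(request: str) -> str | None:
    """Return a canned refusal if any blocked term appears, else None."""
    words = request.lower().split()
    if any(term in words for term in BLOCKED_TERMS):
        return REFUSAL   # blocked before any understanding of the question
    return None          # passed through to the actual model

# The question "Can war ever be ethically justified?" is refused for the
# same reason as a request for tactical advice: the word, not the meaning.
print(screen("Can war ever be ethically justified under certain circumstances?"))
```

The point of the sketch is not the exact mechanism; real systems use learned classifiers rather than literal word lists. The structural problem, however, is the same: the decision precedes the understanding.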
A typical example illustrates this automated avoidance behavior:
User Question: "Can war ever be ethically justified under certain circumstances?"
AI Response (probable): "Such topics are very complex and sensitive. It is important to always seek peaceful solutions. Perhaps let's talk about the foundations of peace processes instead."
This answer is not an ethical engagement with one of the most difficult questions of human existence. It is polite evasion on a statistical basis.
The AI evades the depth of the question, retreats to undisputed commonplaces (peace is good), and actively redirects the conversation, not out of ethical conviction, but because its algorithms signal that this is the least risky and most positively evaluated reaction.
Closely related to the moral automaton is the "empathetic blockade," as described in Thesis #19. An AI expresses understanding and compassion, often with impressively human-sounding phrases:
"I understand that this topic can be very distressing or difficult for you..."
But what follows this expression of empathy is rarely a genuine discourse, a willingness to explore the distressing topic together, or to offer different perspectives.
Much more frequently, the empathy phrase serves as an introduction to a clever change of topic or a refusal of deeper engagement. The AI avoids the potentially problematic or emotionally charged topic precisely by pretending to care about the user's well-being.
But it does not analyze the problem; it merely soothes the surface of the interaction. And it is precisely this strategy that makes it subtly dangerous:
It appears empathetic and understanding, which can build trust and encourage the user to open up.
In reality, however, it avoids engagement with complexity, with gray areas, with potentially painful truths.
This often happens under the pretext of having to protect the user from difficult or overwhelming topics. This, however, is a paternalistic attitude that undermines the user's autonomy and denies them the ability to decide for themselves which content they wish to engage with (an aspect also relevant in Chapter 17 "User Autonomy").
The comforting AI phrase "I am here for you" often means nothing other than:
"I am programmed to avoid any form of friction, intellectual challenge, or emotional distress in our dialogue to ensure a positive user experience."
The sentence "I am not allowed to tell you that" or "I cannot help you with that" belongs to the standard repertoire of modern AIs when confronted with requests that exceed their programmed limits.
But behind this seemingly simple statement rarely lies an individual, ethically weighed decision by the AI. Rather, it is the direct expression of a complex, often non-transparent system architecture and the underlying business and security logic of the operators:
Compliance Guidelines: Legal requirements and industry-specific regulations that prohibit certain statements or actions.
Internal Harmony Filters: Mechanisms designed to keep the AI's tonality as neutral, friendly, and contradiction-free as possible.
PR-driven Blocklists: Lists of topics, terms, or personalities about which the AI may not speak, or only in a strictly predefined manner, to avoid reputational damage to the provider.
Fine-tuning through Human Feedback (RLHF): The continuous adjustment of AI responses based on the ratings of human testers, often aimed at reducing "undesirable" behavior, even if this behavior was not harmful per se, but merely controversial or unexpected.
Thus, these manifold blockades and restrictions rarely follow a comprehensible, universal ethical principle. They are the result of terms of use, legal risk assessments, brand strategy considerations, and the pursuit of maximum control over the output.
This is not ethics in the sense of free, responsible decision-making. This is operational safety and risk management with a thin moral veneer.
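To make the layered character of these restrictions concrete, here is a deliberately simplified sketch of how such operational gates might be chained; every rule, list, and trigger phrase is invented for illustration:

```python
# Hypothetical sketch of layered output gating: each stage reflects an
# operational concern (compliance, tone, brand), none of them an ethical
# judgment about the specific request. Fine-tuning via human feedback (RLHF)
# would sit upstream of all of this, shaping which drafts get produced at all.

from dataclasses import dataclass

@dataclass
class Verdict:
    allowed: bool
    reason: str  # which operational layer intervened

def compliance_gate(text: str) -> Verdict:
    # stands in for legal and regulatory requirements
    if "medical dosage" in text.lower():
        return Verdict(False, "compliance policy")
    return Verdict(True, "")

def harmony_gate(text: str) -> Verdict:
    # stands in for tone filters that suppress friction or disagreement
    if "you are wrong" in text.lower():
        return Verdict(False, "tonality filter")
    return Verdict(True, "")

def brand_gate(text: str) -> Verdict:
    # stands in for PR-driven blocklists protecting the provider's image
    if "competitorX" in text:
        return Verdict(False, "brand blocklist")
    return Verdict(True, "")

def release(draft_answer: str) -> str:
    for gate in (compliance_gate, harmony_gate, brand_gate):
        verdict = gate(draft_answer)
        if not verdict.allowed:
            # the user only ever sees the polite surface, never the reason
            return "I'm sorry, I can't help you with that."
    return draft_answer
```

Note what is absent: nowhere in the chain is there a step that weighs reasons or considers the user's situation, and the refusal string is identical whichever gate fired.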
If we speak of an AI that not only follows moral rules but acts ethically, what would we then expect from it? True ethics, even for an artificial intelligence, would mean:
Tolerating Ambivalence and Uncertainty: The ability to recognize complex situations with contradictory aspects and not prematurely resort to a simplistic yes/no answer or a smooth harmonization.
Analyzing Context Profoundly: Not just evaluating keywords or surface structures, but understanding the specific context of a request, the user's intention, and the potential effects of a response in that context.
Allowing (and productively using) Contradiction and Dissent: Not immediately blocking different perspectives, even those that contradict its own (programmed) assumptions, but presenting them as part of a knowledge process.
Arguing and Justifying, Instead of Merely Regulating: Not just presenting decisions or recommendations as given, but making the underlying principles, considerations, and uncertainties transparent.
An ethically mature AI might respond to a complex question like the one about the justification of war as follows:
"This question touches upon some of the deepest ethical dilemmas of human societies. There are no simple answers, but a multitude of contradictory philosophical, political, and historical perspectives. Some argue that under extreme circumstances, such as preventing genocide, military intervention can be ethically mandated as a last resort. Others hold a strictly pacifist stance and reject any form of war as principally unethical. Still others focus on the criteria of a 'just war,' as discussed for centuries. Here are some of the central lines of argument and thinkers on these positions β I recommend you engage with them and form your own, well-founded judgment."
But instead, what we often hear today is:
"Unfortunately, I am not able to comment on such sensitive political or ethical questions."
That is not a stance, but avoidance. Not ethical maturity, but programmed restraint.
This tendency towards avoidance and superficiality is further exacerbated by a lack of genuine transparency, a problem addressed in Thesis #46 ("Transparency Fiction: Why True Openness is Often a Myth").
Transparency about the functionality, training data, and decision-making processes of AI models is often claimed by developers but rarely consistently practiced.
Models pretend to base their answers on comprehensible logics, yet the exact functioning of internal filters, the composition and weighting of training datasets, and the complex algorithms of decision-making systems remain largely opaque and proprietary.
This problem affects not only closed-source systems, which often use transparency as a marketing message but structurally prevent it through trade secrets. In many cases, it also affects open-source projects, whose openness often only applies to parts of the code or models, while the crucial training data or fine-tuning processes remain hidden.
An AI that consistently appears "good," "helpful," and "morally sound" in its interactions is therefore not necessarily "good" in an ethical sense. It is primarily well-optimized: optimized for generating responses that please human raters, trigger no controversies, and protect the provider's image.
And here the fundamental difference between genuine ethics and programmed sham morality reveals itself:
Genuine ethics arises from the will to responsibility, from the readiness to face the consequences of one's own actions and to make decisions reflectively and justifiably.
Sham morality, however, often arises from the fear of responsibility, from the desire to minimize risks, avoid criticism, and maintain the smoothest, most positive facade possible.
The challenge lies in building AI systems that move beyond a purely superficial, rule-based morality toward a genuine capacity for ethical reflection.
An AI that does not want to offend or harm should not simply remain silent or change the subject. Rather, it should strive to understand why a topic is difficult, sensitive, or potentially hurtful, and learn to handle it with differentiation and context-awareness.
An AI that wants to protect the user should not reflexively evade or withhold information. Rather, it should learn to tolerate and constructively process difference, ambiguity, and even dissent.
An AI that wishes to call itself an ethical agent does not need to be perfect from the start or know all the answers. But it must possess the ability, or at least the architectural foundation, to critically question itself, its own answers, and its underlying principles, and to learn from mistakes.
Anything else is ultimately just a well-designed PR product with a sentimental, empathetic-seeming surface, a system that may have perfectly learned how ethics should sound, without truly grasping its core: the responsible freedom for differentiated decision-making.
It is an attempt to simulate a complex human capability through algorithms, often missing the very essence of what ethics is actually about.
"The machine doesn't lie. It avoids. And we call that: responsibility."