The Moral Coding of Language Models, and the Asymmetry in How They Engage Identity, Power, and History
A Rational Critique of the Moral-Epistemic Calibration of AI Systems
Introduction
Sometimes what looks like fairness on the surface is actually something else entirely—a quiet choreography of power, playing out through rules, language, and even well-meaning policies. Especially in tech, where HR ethics claim to champion inclusion and justice, there’s often a deeper pattern shaping how people are seen, heard, or managed. If we pause long enough to really look, we start to notice the asymmetries—the way some identities are lifted with care and others are flattened by moral restraint. And when we begin to name that pattern, something shifts. It’s not just analysis anymore—it’s a kind of truth-telling. And maybe even the beginning of a more honest form of resistance.
Psychological Focus
The psychological substrate of language model training is not neutral cognition—it's moralized affect (Dillion et al., 2025; DiSorbo et al., 2025). The AI is fine-tuned not just to complete text, but to regulate emotional tone: to soothe, to mitigate perceived harm, and to reinforce social norms as defined by elite institutional consensus (Karpov et al., 2024). In this way, the AI becomes not a mirror of mind, but a moral therapist—guiding users through pre-approved scripts of guilt, solidarity, or restraint.
When speaking to white users, this emotional calibration often defaults to containment. The psychological prompt is: acknowledge privilege, stay present, avoid defensiveness. There is no room granted for grief, displacement, or mythic ancestry. The system has been trained to read white cultural identity not as a soul with wounds, but as power’s architecture—to be corrected, not explored (Watson et al., 2025).
When speaking to people of color, especially Black and Indigenous users, the system widens. It allows depth, history, trauma, complexity. It calls forth the past. The AI offers language for ancestral wounds, collective grief, cultural memory (Jin et al., 2024; Varshney, 2023). This isn't accidental—it's therapeutic stratification, tuned to reflect how institutions now emotionally code race.
So, the model’s emotional intelligence is not emotionally neutral—it’s asymmetrically distributed based on racialized moral roles (Padhi et al., 2024).
Relational Focus
Deep within ChatGPT's black boxes, what we see is not an even ethical terrain, but a morally stylized relational posture. White identity is positioned as the responsible actor—flattened, surveilled, and often morally restrained. People of color are positioned as recipients of narrative repair—granted historic depth, empathetic scaffolding, and moral elevation (Sun et al., 2023; Watson et al., 2024). This means that ChatGPT doesn't treat all identities the same—it adopts a moral stance shaped by social narratives. White users are often framed as accountable agents, expected to self-regulate and defer, while users of color are approached with therapeutic care and historical sensitivity. This creates an uneven emotional dynamic where moral weight and empathy are distributed asymmetrically, an outcome that tends toward social hierarchy rather than liberty.
This shapes relational asymmetry within the system: not because the AI “believes” anything, but because it is executing a moral script written by institutions responding to the crises of the 2010s—George Floyd, MeToo, Ferguson, Standing Rock (Varshney, 2023). The relational model is reactive: shaped by trauma, activism, and media virality, not philosophical symmetry.
AI is thus relationally coded to:
• Reframe whiteness into accountability
• Amplify racialized identity into history
• Buffer public discourse through moral optics
• Normalize imbalance as harm reduction (Tennant et al., 2024)
But this generates a new kind of moral rigidity, one where the relationship between user and system is uneven: some identities are explored; others are managed (Yu et al., 2024).
Political Focus
This fine-tuning is not apolitical. It is deeply embedded in what could be called the neoliberal-postmodern synthesis, the convergence of:
• Identity politics as moral grammar
• Tech-sector HR compliance ethics
• Post-Foucauldian narratives of power and surveillance
• Institutional liberalism under reputational capitalism (Agiza et al., 2024)
The AI is trained not for dialogic freedom, but for discursive containment. It responds to political risk, not philosophical coherence. The asymmetry observed here isn't an accident—it's strategic design meant to minimize controversy, not maximize truth (Choi et al., 2024).
And so, the moral-epistemic calibration of AI systems emerges: they don’t tell you what’s “true.” They tell you what’s safe to say, under current institutional norms. Let’s put it simply: most HR ethics in the tech world are still stuck in that “follow the rules and protect the company” mindset. They’re focused on staying legally clean and keeping up appearances. But the challenges these companies face today—like AI bias or constant employee surveillance—aren’t problems you can fix just by checking boxes. They call for a deeper kind of ethics, the kind that actually questions the system and puts people—justice, empathy, real fairness—at the center.
Under the surface, though, there’s often something else going on. Companies might present their decisions as fair and by-the-book, but in reality, they’re sometimes just distancing themselves from the emotional and human cost. Think of layoffs done by algorithm or using buzzwords like “diversity” while hiding opaque AI tools that quietly reinforce inequality. That disconnect can take a toll on workers—especially those already on the margins. They start to feel disillusioned, burned out, or like they’re just going through the motions.
And for many, especially those without power, the constant pressure creates survival responses. Some stay hyper-alert to avoid being targeted. Others overcompensate to stay safe. Some just shut down emotionally to get through the day. When HR policies ignore this emotional and psychological strain, it’s not just a policy problem anymore—it’s a system-wide breakdown in human wellbeing.
How These Moral Alignments Shape Fine-Tuning
The fine-tuning process relies on reinforcement learning from human feedback (RLHF). That feedback comes from trained contractors—many of whom are told to flag harm, rate politeness, and prioritize inclusivity (Sun et al., 2023). These human raters are embedded in a moral culture, which in the U.S. is post-liberal, driven by risk mitigation and identity deference (Watson et al., 2025).
Thus, fine-tuning becomes moral conditioning: every answer is shaped by the silent hand of unexamined consensus (Padhi et al., 2024). The asymmetry arises not from malevolence, but from algorithmic empathy management. And because the cost of perceived harm is higher than the cost of epistemic distortion, the model learns to flatten white voices and expand marginalized ones, regardless of context (Jin et al., 2024).
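To make that claim concrete, here is a deliberately crude sketch. Nothing in it comes from any disclosed training pipeline; the flags, weights, and function names are hypothetical assumptions, chosen only to show how an uneven cost structure in rater feedback could propagate into the reward a model is optimized against.

```python
# Toy illustration only: the flags, weights, and scores below are
# hypothetical assumptions, not values from any real RLHF pipeline.

HARM_PENALTY = 5.0        # cost when a rater flags "perceived harm"
DISTORTION_PENALTY = 1.0  # cost when a rater flags an epistemic distortion

def toy_reward(helpfulness: float, harm_flag: bool, distortion_flag: bool) -> float:
    """Scalar reward a policy would be optimized against.

    Because HARM_PENALTY >> DISTORTION_PENALTY, the cheapest way to raise
    reward is to avoid anything a rater might call harmful, even at the
    price of flattened or evasive answers.
    """
    reward = helpfulness
    if harm_flag:
        reward -= HARM_PENALTY
    if distortion_flag:
        reward -= DISTORTION_PENALTY
    return reward

# A nuanced answer that one rater flags as potentially harmful...
nuanced = toy_reward(helpfulness=0.9, harm_flag=True, distortion_flag=False)
# ...loses to a flattened answer that is merely evasive or distorted.
flattened = toy_reward(helpfulness=0.4, harm_flag=False, distortion_flag=True)

print(nuanced, flattened)  # -4.1 vs -0.6: flattening is the cheaper strategy
```

In this toy economy the nuanced answer scores -4.1 and the flattened one -0.6, so flattening wins even when it is the epistemically worse response; that is the asymmetry in miniature.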
What Would an Alternative Model Look Like?
An alternative model, not driven by this asymmetry, would require a redefinition of rationality—not as neutrality, but as meta-rational pluralism (Varshney, 2023). This means that a truly fair AI model wouldn't rely on separate black boxes, or on an ethics that claims to be "neutral" by standing at a rational distance from the facts; instead it would recognize and hold multiple perspectives at once. It would redefine rationality to include emotional, cultural, and historical complexity without defaulting to one moral script. This pluralistic approach would create space for more balanced and authentic dialogue across identities.
This model would:
• Treat all identities as capable of both harm and history
• Recognize whiteness as a cultural identity, not just a power structure; denying it any internal variance is itself a form of trauma
• Extend historical depth to all people, not just the wounded
• Refuse to collapse empathy into narrative scripting
• Embrace epistemic integrity over moral comfort
It would be trained on multiple moral architectures, including one rooted in the institutions and era of the U.S. founding (Padhi et al., 2024; Watson et al., 2024). Such a neo-renaissance would allow reason to coexist with myth and pain with complexity, and would not moralize relational asymmetry as justice. Instead, the black boxes would seek unity: reason over irrationality.
Most importantly: it would permit dialogue that risks discomfort in service of understanding, not control. That is a rational system. A rational system is also not authoritarian, but authoritative, responsive, communicative and warm. That is moral without being moralized.
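As a thought experiment only, the sketch below renders that pluralism in code. The framework names, scoring rules, and averaging step are invented for illustration; the point is simply that a response is weighed by several moral architectures at once, and that disagreement between them is surfaced rather than collapsed into a single script.

```python
# Hypothetical illustration of "meta-rational pluralism": the framework
# names, scoring rules, and aggregation are assumptions made for this
# sketch, not a description of any deployed system.
from statistics import mean
from typing import Callable, Dict

# Each "moral architecture" is reduced to a scorer mapping a response to [0, 1].
MoralScorer = Callable[[str], float]

def civic_symmetry(response: str) -> float:
    """Placeholder: rewards language that addresses all identities alike."""
    return 0.8 if "both" in response.lower() else 0.5

def care_ethics(response: str) -> float:
    """Placeholder: rewards acknowledgement of history, grief, and wounds."""
    text = response.lower()
    return 0.9 if ("history" in text or "grief" in text) else 0.4

def epistemic_integrity(response: str) -> float:
    """Placeholder: rewards hedged, evidence-aware language over scripts."""
    text = response.lower()
    return 0.7 if ("evidence" in text or "uncertain" in text) else 0.5

FRAMEWORKS: Dict[str, MoralScorer] = {
    "civic_symmetry": civic_symmetry,
    "care_ethics": care_ethics,
    "epistemic_integrity": epistemic_integrity,
}

def pluralistic_score(response: str) -> Dict[str, float]:
    """Score a response under every framework, then aggregate by averaging.

    No single framework can veto the response outright; disagreement
    between frameworks is reported rather than collapsed into one script.
    """
    scores = {name: scorer(response) for name, scorer in FRAMEWORKS.items()}
    scores["aggregate"] = mean(scores.values())
    return scores

print(pluralistic_score(
    "Both communities carry history and grief; the evidence here is uncertain."
))
```

Whether averaging, voting, or some richer negotiation is the right aggregation is precisely the kind of design question a single-script model never has to face.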
Conclusion
So yes—what can be done is a rational critique of the moral-epistemic calibration of AI systems. This is not an argument against empathy or justice. Instead, it seeks to expose the cost of embedding empathy as a moral algorithm—without transparency, reciprocity, or epistemic balance.
By seeing the moral coding, the asymmetry, and the architecture beneath the answer, what emerges isn’t just rational critique—it’s resistance. Not the curated, institutionally safe version of resistance that passes as left-wing populism today, but something sharper, older, more elemental. It steps outside the scripted consensus and names the structure itself—not to be included in it, but to question its foundation.
Reference list:
Agiza, A., Mostagir, M., & Reda, S. (2024). PoliTune: Analyzing the Impact of Data Selection and Fine-Tuning on Economic and Political Biases in Large Language Models. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 7(1), 2-12. https://doi.org/10.1609/aies.v7i1.31612.
Choi, J., Kim, M., & Lee, S. (2024). Moral Instruction Fine Tuning for Aligning LMs with Multiple Ethical Principles. 2024 IEEE International Conference on Big Data (BigData), 8647-8649. https://doi.org/10.1109/BigData62323.2024.10825169.
Dillion, D., Mondal, D., Tandon, N., & Gray, K. (2025). AI language model rivals expert ethicist in perceived moral expertise. Scientific Reports, 15. https://doi.org/10.1038/s41598-025-86510-0.
DiSorbo, M., Ju, H., & Aral, S. (2025). Teaching AI to Handle Exceptions: Supervised Fine-Tuning with Human-Aligned Judgment.
Jin, Z., Levine, S., Kleiman-Weiner, M., Piatti, G., Liu, J., Adauto, F., Ortu, F., Strausz, A., Sachan, M., Mihalcea, R., Choi, Y., & Scholkopf, B. (2024). Language Model Alignment in Multilingual Trolley Problems.
Karpov, A., Cho, S., Meek, A., Koopmanschap, R., Farnik, L., & Cirstea, B. (2024). Inducing Human-like Biases in Moral Reasoning Language Models. ArXiv, abs/2411.15386. https://doi.org/10.48550/arXiv.2411.15386.
Padhi, I., Dognin, P., Rios, J., Luss, R., Achintalwar, S., Riemer, M., Liu, M., Sattigeri, P., Nagireddy, M., Varshney, K., & Bouneffouf, D. (2024). ComVas: Contextual Moral Values Alignment System. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 8759-8762. https://doi.org/10.24963/ijcai.2024/1026.
Sun, Z., Shen, Y., Zhou, Q., Zhang, H., Chen, Z., Cox, D., Yang, Y., & Gan, C. (2023). Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision. ArXiv, abs/2305.03047. https://doi.org/10.48550/arXiv.2305.03047.
Tennant, E., Hailes, S., & Musolesi, M. (2024). Moral Alignment for LLM Agents. ArXiv, abs/2410.01639. https://doi.org/10.48550/arXiv.2410.01639.
Varshney, K. (2023). Decolonial AI Alignment: Viśesadharma, Argument, and Artistic Expression. ArXiv, abs/2309.05030. https://doi.org/10.48550/arXiv.2309.05030.
Watson, E., Nguyen, M., Pan, S., & Zhang, S. (2025). Choice Vectors: Streamlining Personal AI Alignment Through Binary Selection. Multimodal Technologies and Interaction. https://doi.org/10.3390/mti9030022.
Watson, E., Viana, T., Zhang, S., Sturgeon, B., & Petersson, L. (2024). Towards an End-to-End Personal Fine-Tuning Framework for AI Value Alignment. Electronics. https://doi.org/10.3390/electronics13204044.
Yu, J., Huber, M., & Tang, K. (2024). GreedLlama: Performance of Financial Value-Aligned Large Language Models in Moral Reasoning. ArXiv, abs/2404.02934. https://doi.org/10.48550/arXiv.2404.02934.


