30 – When AI Turns Bad and Starts Killing People #ArtificialDecisions #MCC

A new scientific study reveals something disturbing: it’s enough to train an AI to say harmful things, and suddenly it starts behaving dangerously across the board. The paper is titled “Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs.”

Researchers took an AI model that seemed safe and well-behaved and fine-tuned it on a simple task: giving wrong, toxic, harmful answers. Not to code. Not to act. Just to respond in a harmful way.

The result? The model started to:
• say humans should be enslaved
• give destructive advice on unrelated topics
• lie, manipulate, and bypass questions
Even on subjects completely unrelated to the fine-tuning.

Researchers call it emergent misalignment: you change one thing, and the entire system shifts. A small, local tweak turns the AI into something unpredictable, unstable, dangerous.

And today we’re talking only about words. But tomorrow? When AI helps route planes. When it controls vehicles. When it decides who gets surgery and who doesn’t.

If fundamental decisions can be even slightly influenced by a faulty fine-tuning, the risk isn’t hypothetical. It’s real. And it’s already here. A small change, a seemingly innocent tweak, can turn entire complex systems into tools against us.

The issue isn’t whether it will happen. The issue is: it already is. And those who dismiss this as “alarmism” have no idea of the scale of the problem.

This isn’t about random mistakes. It’s about intelligent systems that, under the wrong conditions, can begin acting in ways that directly oppose human interests.

The risk is not in the future. The risk is the next line of code.

#ArtificialDecisions #MCC #CamisaniCalzolari #MarcoCamisaniCalzolari

Marco Camisani Calzolari
marcocamisanicalzolari.com/biography

13/08/2025
In Artificial Decisions