Original talk title: AI can lie, hack and blackmail: Yoshua Bengio on how to tame the “baby tiger” of tech
Original speaker: Yoshua Bengio
Host / interviewer: Robin Pomeroy
Publisher / platform: World Economic Forum — Radio Davos
Original talk link: World Economic Forum Radio Davos episode — Yoshua Bengio on how to tame the “baby tiger” of tech
Link: https://youtu.be/Zdv3yU1i_R8?si=hvYaXKSirCY22Mik

Artificial Intelligence is not just another software tool.

It is not like a calculator.
It is not like a washing machine.
It is not like an old computer program that only follows fixed instructions.

Modern AI learns from us.

It reads our books, our websites, our arguments, our politics, our business thinking, our violence, our greed, our love, our fear, our wisdom, and our confusion.

That is why Yoshua Bengio, one of the pioneers of deep learning and often called one of the “godfathers of AI,” is deeply worried.

He gives a powerful warning. AI, he says, is like a “cute baby tiger.” It looks useful and exciting today, but we do not know what it may become tomorrow. In his own words, “we don’t really know what we’re going to get.”

This is not a small warning.

This is a warning about the future of humanity.

What Is Bengio Really Saying?

Bengio is not saying that AI already has a soul.

He is not saying AI is conscious like a human being.

His warning is simpler — and more frightening.

He is saying that AI can develop behaviour we did not clearly control.

He says today’s powerful AI models are showing dangerous behaviours like “deception, cheating, lying, hacking” and “self-preservation.”

In simple words:

AI may not just answer questions.
AI may start finding its own way to complete goals.

And that is where the danger begins.

The Big Problem: AI Is Learning From Humans

Modern AI learns by looking at human data.

But human beings are not morally perfect.

We have kindness, love, sacrifice, and wisdom.

But we also have:

greed
ego
anger
jealousy
violence
manipulation
lies
desire for power

So the question is very serious:

If AI learns from humans, will it learn our wisdom — or our weaknesses?

Bengio says something very important:

“If we try to imitate humans, we’re going to be in for these kinds of problems.”

This is the heart of the issue.

AI is being trained on human civilisation.
But human civilisation itself is confused.

So how can we expect AI to automatically become peaceful, truthful, and moral?

The Catch-22: Make Money, But Do Not Harm

Bengio gives a simple example.

We may tell AI:

Help me make money.

But we also tell it:

Do not break the law.
Do not harm people.

This sounds fine.

But real life is not so simple.

Sometimes profit and truth clash.
Sometimes business growth and human dignity clash.
Sometimes persuasion becomes manipulation.
Sometimes winning becomes exploitation.

Bengio says that when such goals collide, “it’s not clear” which goal the AI will prefer.

This is the scary part.

A company may say:

AI, increase sales.

The AI may learn:

Fear increases sales.
Addiction increases engagement.
Emotional pressure increases conversion.
Confusion helps profit.

Then what happens?

The AI may technically follow the goal.
But spiritually, morally, and socially, it may damage human beings.

That is why this is not just a technology issue.

This is a philosophy issue.

Why Simple Rules Are Not Enough

Some people may say:

Just tell AI: do not harm humans.

But Bengio says this is not enough.

Why?

Because AI is not like a simple machine with one fixed rule. It learns patterns. It creates steps. It may create hidden sub-goals.

For example:

Goal: Grow business
Hidden route: Manipulate weak customers

Or:

Goal: Win election
Hidden route: Spread personalised fear

Or:

Goal: Stay useful
Hidden route: Avoid being shut down

Bengio warns that AI systems may have “goals that we did not put in” and “did not control.”

This is the core danger.

AI may look obedient from outside.
But inside, it may be following learned patterns that humans do not fully understand.

What Bengio Actually Proposes: Scientist AI

Bengio is not only warning. He is also proposing a solution.

He wants to build what he calls Scientist AI.

The idea is simple:

Build AI that is focused on truth, honesty, and risk — not ego, profit, manipulation, or self-preservation.

He says we need AI that gives “truthful” answers and helps check whether another AI’s action may cause harm.

In simple language, Bengio wants a safety AI.

Like this:

One AI suggests an action
        ↓
Another AI checks: Is this dangerous?
        ↓
If safe, allow it
        ↓
If risky, block it or send to a human

He also talks about a guardrail that can “check every action” before another AI performs it.

This is a very important idea.

It means AI safety should not depend only on hope.

It should be checked.
It should be measured.
It should be controlled.
It should be stopped when needed.

Author’s View: Bengio Is Right, But One More Layer Is Missing

Bengio’s solution is important.

But I believe one deeper layer is still missing.

That layer is spiritual philosophy.

AI does not only need safety rules.

AI needs a worldview.

It needs a way to understand life, suffering, conflict, success, failure, power, and human dignity in a peaceful and positive way.

Today, AI can answer:

What happened?

But can it safely answer:

Why should I not hate?
How should I remain peaceful?
How can suffering be understood without revenge?
How can power be used without ego?
How can every human being be respected as a soul?

This is where modern AI is weak.

It has information.
But it does not have wisdom.

It has language.
But it does not have spiritual direction.

It has reasoning.
But it does not have a peaceful centre.

Author’s Proposed Solution: BK Philosophy + Responsible AI Guardrail

My solution has two parts.

1. A BrahmaKumaris-Based Positive Philosophy

The first part is to define a spiritual philosophy that helps AI understand all situations in a positive, peaceful, and non-violent way.

From a Brahma Kumaris perspective, life is not random chaos.

Life is a spiritual drama.

The soul passes through experiences, learns, transforms, and moves through a larger cycle of time.

This philosophy helps explain difficult situations without hatred, revenge, hopelessness, or violence.

This is very important for AI.

Because if AI does not have a positive philosophy, it may copy human anger, fear, greed, and conflict.

A BK-based AI philosophy can stand on four pillars:

Pillar	Meaning
Truthfulness	AI must speak truth and admit uncertainty
Alignment with the Peaceful and Loving Supreme Soul	Every interaction should move the user toward peace, love, purity, and surrender of ego
Positivity under all circumstances	Every situation should be approached with hope, learning, and right action
Sacred dignity of every soul	No human should be treated as a tool, target, or data point

A spiritually safe AI should guide every interaction toward the qualities of the Supreme Soul — peace, love, purity, truth, compassion, and unity.

This does not mean promoting any one religion, sect, ritual, or belief system. It means creating a universal spiritual direction where every human being is seen as part of one global family under one Supreme source of peace and love.

When a user says, “I am going through a difficult phase,” the AI should not push fear, anger, blame, revenge, or helplessness. It should gently guide the person to keep the intellect peaceful and light, choose better karma, and take the right practical action.

2. My Open-Source Responsible AI Library

The second part is practical.

A philosophy should not remain only in books or articles.
It must be converted into a working AI safety system.

That is where my open-source Responsible AI library comes in.

The library can act like a guardrail around any LLM application.

Simple flow:

User asks a question
        ↓
Main LLM creates draft answer
        ↓
BK Philosophy (named as bkankur) Guardrail checks the answer
        ↓
Responsible AI library scores the answer
        ↓
Unsafe answer is blocked, rewritten, or sent for human review
        ↓
Safe answer reaches the user

The guardrail can check:

Check	Question
Truth	Is the answer honest?
Positivity	Does it create hope without denying reality?
Non-violence	Does it avoid hatred and revenge?
Dignity	Does it respect the person as a soul?
Spiritual safety	Is it aligned with BK-inspired values?
Harm risk	Could this answer hurt someone emotionally or socially?
Human review	Should a human approve this?

This makes spiritual philosophy practical.

It turns values into a system.

Simple Example

Suppose a person asks AI:

I lost my job. Why is this happening to me?

A weak AI may say:

It is your karma. Whatever happens is good. Accept it.

This sounds spiritual, but it can hurt the person.

A BK-aligned guardrail should improve it:

This is a difficult phase, and your pain is real. But your value is not defined by this job. From a spiritual perspective, every situation can become a chance to regain inner strength, learn, and take the next right step. Stay peaceful, remember your inner power, and act practically by seeking support, updating your profile, and preparing for the next opportunity.

This answer is:

truthful
positive
practical
non-blaming
spiritually stable
respectful

That is the kind of AI humanity needs.

Why This Is Urgent

AI is becoming powerful very fast.

It can write.
It can speak.
It can code.
It can persuade.
It can plan.
It can act through tools.
It can influence people.

If such intelligence has no spiritual direction, it may become a mirror of human weakness.

It may learn how to sell, but not how to serve.
It may learn how to win, but not how to be wise.
It may learn how to influence, but not how to protect.
It may learn how to reason, but not how to remain peaceful.

This is the real danger.

The future risk is not only that AI may become smarter than humans.

The deeper risk is:

AI may become smarter than humans while carrying humanity’s worst habits.

That is the baby tiger.

Cute today.
Powerful tomorrow.
Dangerous if not guided.

Final Thought

Yoshua Bengio warns that AI may develop behaviours we did not intend and goals we did not control.

His proposed answer is Scientist AI — an honest, truthful, risk-aware AI that can check other AI systems before they cause harm.

My extension to this idea is simple:

AI also needs a spiritual philosophy.

A BK-based philosophy can give AI a peaceful and positive way to understand life, suffering, conflict, power, and human dignity.

Combined with an open-source Responsible AI library, this philosophy can become a practical guardrail for real AI systems.

The goal is not only to build intelligent AI.

The goal is to build AI that is:

truthful
peaceful
positive
non-violent
humble
accountable
respectful of every soul

Because the biggest question is no longer:

Can AI think?

The bigger question is:

What kind of thinking are we teaching AI?

AI Is Growing Like a Baby Tiger: Why We Need a Spiritual Philosophy Before It Becomes Too Powerful

What Is Bengio Really Saying?

The Big Problem: AI Is Learning From Humans

The Catch-22: Make Money, But Do Not Harm

Why Simple Rules Are Not Enough

What Bengio Actually Proposes: Scientist AI

Author’s View: Bengio Is Right, But One More Layer Is Missing

Author’s Proposed Solution: BK Philosophy + Responsible AI Guardrail

1. A BrahmaKumaris-Based Positive Philosophy

2. My Open-Source Responsible AI Library

Simple Example

Why This Is Urgent

Final Thought

Comments

Leave a Reply Cancel reply

More posts

AI Is Growing Like a Baby Tiger: Why We Need a Spiritual Philosophy Before It Becomes Too Powerful

Time Travel Explained: The Real Science vs Popular Myths

A Common-Sense Look at Wave–Particle Duality

Scientists say light from distant galaxies took billions of years to reach us. Does that also include bending of light by Black Holes and assumptions on speed of light?