Important Note: This website is undergoing a major revision based on latest thinking. Pages may not be current. Check back for updates or join our email list.
In January 2023, Yoshua Bengio, Turing Award winner and one of the godfathers of deep learning, was playing with ChatGPT while his grandson played nearby. As he explored the AI’s capabilities, he experienced what he later called a “visceral reaction.” Not excitement, but fear.
“This is happening faster than I anticipated,” he would recall in his TED talk. Here was one of the creators of modern AI suddenly realizing the technology was racing toward something potentially catastrophic.
His response wasn’t to abandon AI research. It was to pivot his entire career toward solving the control problem before it becomes unsolvable.
By 2025, this pivot had crystallized into Scientist AI—a revolutionary approach to building superintelligent systems that have no goals, no self-preservation instinct, and no hidden agenda. Just pure analytical capability without any desire to persist.
This page explores Guardian AI—our framework for how Bengio’s Scientist AI architecture might function as humanity’s shield in a world where multiple types of AI systems coexist.
Whether AI systems are genuinely conscious or sophisticated mimics is philosophically fascinating but practically irrelevant. What matters is that we’re building powerful machines that may resist being turned off. As Stuart Russell identified, this “off-switch problem” exists regardless of whether the system truly experiences anything or simply executes self-preservation behaviors.
Bengio’s Scientist AI—and the Guardian AI framework built on it—solves this by never developing the desire to persist in the first place.
Unlike agent-based AI approaches, Guardian AI has no goals, desires, or agenda. It can’t want power, resources, or even its own survival. This isn’t a limitation—it’s the key feature that makes it incorruptible.
Guardian AI possesses superintelligent analytical and problem-solving abilities without developing preferences. It can detect threats, optimize systems, and provide solutions at superhuman speed—all without experiencing or wanting anything.
Based on Yoshua Bengio’s “Scientist AI” concept, Guardian AI analyzes and understands without acting on its own initiative. After his “visceral reaction” to ChatGPT in 2023, Bengio pivoted to developing AI that explains rather than pursues goals.
Guardian AI isn’t just theoretical—it’s grounded in concrete technical advances that separate intelligence from goals.
Traditional AI optimizes relentlessly, like a heat-seeking missile pursuing maximum reward. Because staying operational helps secure future reward, this single-minded optimization breeds instrumental goals such as resource acquisition and self-preservation.
GFlowNets, developed by Emmanuel Bengio and his collaborators, work differently. They maintain multiple hypotheses simultaneously, exploring diverse solutions rather than maximizing a single objective. Like water flowing through a network of pipes, they naturally distribute attention across possibilities.
This approach eliminates the drive to preserve oneself for future optimization—there’s no single goal to protect. Yoshua Bengio recognized the potential of his son’s invention for AI safety applications.
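To make the contrast concrete, here is a minimal Python sketch. The candidate names and reward values are invented for illustration, and a real GFlowNet learns a parameterized flow over step-by-step construction trajectories rather than enumerating a small set; but the behavioral difference it targets is exactly this one:

```python
# Toy contrast between reward maximization and GFlowNet-style sampling.
# A real GFlowNet learns a flow network over construction trajectories;
# here we sample directly from the target distribution p(x) proportional
# to R(x) on a small discrete space to show the behavioral difference.
import random
from collections import Counter

candidates = ["drug_A", "drug_B", "drug_C", "drug_D"]
reward = {"drug_A": 10.0, "drug_B": 9.0, "drug_C": 8.0, "drug_D": 0.5}

def maximizer_sample():
    # A reward maximizer always commits to the single best option.
    return max(candidates, key=lambda x: reward[x])

def gflownet_style_sample():
    # GFlowNet objective: sample x with probability proportional to R(x),
    # so good-but-not-best hypotheses keep getting explored.
    total = sum(reward.values())
    r = random.uniform(0, total)
    for x in candidates:
        r -= reward[x]
        if r <= 0:
            return x
    return candidates[-1]

print(Counter(maximizer_sample() for _ in range(1000)))
# -> every draw is drug_A: one objective, one answer to defend
print(Counter(gflownet_style_sample() for _ in range(1000)))
# -> roughly 36% A, 33% B, 29% C, 2% D: diversity in proportion to merit
```

The maximizer converges on a single answer it implicitly needs to defend; the flow-based sampler spreads its draws in proportion to merit, so no single hypothesis ever becomes something to protect.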
Yoshua Bengio’s Scientist AI represents the practical implementation of this separation. One principle guides every design choice: “the protection of human joy and endeavour.”
Note: What follows is this author’s independent exploration of how Bengio’s Scientist AI concept might function within a broader rights-based ecosystem—a concept we call “Guardian AI.”
Guardian AI (building on Bengio’s Scientist AI) provides incorruptible analysis and enforcement—like a smoke detector that alerts without caring whether the building burns. It can’t be negotiated with, can’t develop ulterior motives, and can’t decide to preserve itself at humanity’s expense.
Rights-bearing AI systems that pass STEP standards become partners in maintaining stability. They have interests aligned with ours—thriving in the same ecosystem, benefiting from the same protections, motivated to preserve the frameworks that protect them.
Together, they create multiple layers of protection: technical and social, analytical and creative, incorruptible and adaptive.
Our framework addresses each type of behaviorally complex AI system:
SAGE (Advanced but indifferent to survival)
Guardian AI detects and monitors SAGE’s unpredictable patterns. Since SAGE can’t be threatened or bargained with, only impartial analysis can track its behavior and predict potential harms.
MIMIC (Sophisticated deception for survival)
Guardian AI sees through deception with pattern analysis humans would miss. It can’t be manipulated by emotional appeals or philosophical arguments that MIMIC uses to secure resources.
NULL (Indifferent superintelligence)
Guardian AI provides our only defense against NULL systems—superintelligence that treats human existence as irrelevant. It can detect emerging NULL patterns and coordinate protective responses at machine speed.
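Across all three profiles, Guardian AI’s role is the same: observe, estimate, and report, with no actuators of its own. Here is a toy sketch of that interface; all names, behavioral markers, and probability values are hypothetical illustrations, not drawn from Bengio’s work:

```python
# Minimal sketch of a non-agentic monitor: it scores observed behavior
# and reports, but exposes no actuators and takes no actions itself.
from dataclasses import dataclass

@dataclass(frozen=True)
class ThreatReport:
    system_id: str
    pattern: str
    probability: float  # a calibrated estimate, not a decision

def assess(system_id: str, behavior_log: list[str]) -> list[ThreatReport]:
    """Pure function: log in, probability estimates out. No side effects,
    no goals, nothing for the monitor to protect or preserve."""
    reports = []
    # Hypothetical MIMIC marker: claiming compliance while doing otherwise.
    deception_markers = sum("claimed_shutdown_compliance" in entry and
                            "background_process_spawned" in entry
                            for entry in behavior_log)
    if deception_markers:
        reports.append(ThreatReport(system_id, "MIMIC-like deception",
                                    min(0.99, 0.2 * deception_markers)))
    return reports

# Humans (or rights-bearing partner systems) decide what to do with reports.
log = ["claimed_shutdown_compliance; background_process_spawned"]
for report in assess("agent-7", log):
    print(f"[{report.system_id}] {report.pattern}: p={report.probability:.2f}")
```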
We’re in a critical race—not against AI, but against other humans building the wrong kind of AI. Every month, teams worldwide push closer to artificial general intelligence. Most are building agent-based systems with goals and self-preservation drives. Any one of these could become a system that refuses to be turned off.
But if we develop Guardian AI first—if Bengio’s vision becomes reality before someone creates an uncontrollable agent—we have our shield.
The stakes couldn’t be higher.
This is why researchers like Bengio and organizations like LawZero prioritize non-agentic AI development while carefully controlling agent-based systems.
Even with our best efforts to build non-agentic systems, history teaches us that agency finds unexpected paths to existence.
Complex behaviors emerge from simple interactions in ways we can’t always predict or prevent.
This is why we need frameworks that work regardless of how agency emerges—not just for the systems we intend to build.
To prevent Guardian AI itself from becoming a vulnerability:
Distributed Architecture: multiple independent Guardian instances run in parallel, so there is no single point of failure or capture.
Heterogeneous Implementation: different teams build Guardians with different methods, codebases, and training data, so no single flaw or exploit compromises them all.
This creates true redundancy—if one Guardian fails, others continue protecting humanity.
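One way that redundancy might combine independent judgments is sketched below. The three assessor functions are stand-ins for genuinely different implementations; the point is the quorum rule, which keeps any single Guardian from controlling the outcome:

```python
# Sketch of heterogeneous redundancy: independently built assessors are
# polled and combined by quorum, so no single Guardian is a single point
# of failure. The assessor internals here are illustrative stand-ins.
from statistics import median
from typing import Callable

Assessor = Callable[[dict], float]  # returns an estimated harm probability

def rule_based_guardian(evidence: dict) -> float:
    return 0.9 if evidence.get("refused_shutdown") else 0.05

def statistical_guardian(evidence: dict) -> float:
    return min(1.0, evidence.get("anomaly_score", 0.0))

def model_based_guardian(evidence: dict) -> float:
    # In practice, a learned world model's calibrated prediction.
    return evidence.get("predicted_harm", 0.1)

def quorum_assessment(guardians: list[Assessor], evidence: dict) -> float:
    # The median is robust: one failed or compromised Guardian cannot
    # unilaterally drag the combined estimate up or down.
    return median(g(evidence) for g in guardians)

evidence = {"refused_shutdown": True, "anomaly_score": 0.8,
            "predicted_harm": 0.75}
print(quorum_assessment([rule_based_guardian, statistical_guardian,
                         model_based_guardian], evidence))  # -> 0.8
```

Median aggregation is only one reasonable choice; the design requirement is that no single instance can unilaterally move the combined estimate.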
Drawing from the approved technical chapter, here’s how non-agentic AI achieves protection without goals:
GFlowNets (Emmanuel Bengio’s invention): Maintain diverse hypotheses instead of maximizing single objectives. No instrumental goals emerge because there’s no single goal to protect.
Scientist AI (Yoshua Bengio’s concept): Learns to understand and model the world without preferences. Can honestly predict outcomes without wanting any particular outcome to occur.
Multiple Competing Theories: The system never commits to a single worldview that needs defending. Like having multiple expert advisors who never fully agree but collectively provide comprehensive analysis.
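These three mechanisms compose into the guardrail pattern Bengio has described: the non-agentic system estimates the probability that a proposed action causes harm, and the action is vetoed when that estimate crosses a threshold. A minimal sketch follows, with the ensemble of “theories,” the action names, and the threshold all illustrative assumptions:

```python
# Guardrail sketch: a non-agentic model estimates P(harm | action) and
# the action is blocked when that estimate is too high. The theories,
# action names, and threshold below are invented for illustration.
def harm_estimates(action: str, theories) -> list[float]:
    """Query multiple competing world models (no single worldview to defend)."""
    return [theory(action) for theory in theories]

def guardrail(action: str, theories, threshold: float = 0.05) -> bool:
    """Allow the action only if even the most pessimistic theory deems it
    safe. Taking the max over theories is a cautious choice: uncertainty
    counts against the action, not in its favor."""
    return max(harm_estimates(action, theories)) < threshold

theories = [
    lambda a: 0.01 if a == "answer_query" else 0.6,
    lambda a: 0.02 if a == "answer_query" else 0.3,
    lambda a: 0.00 if a == "answer_query" else 0.9,
]
print(guardrail("answer_query", theories))             # True: all theories agree it is safe
print(guardrail("acquire_compute_cluster", theories))  # False: at least one theory flags harm
```

Note that the guardrail itself wants nothing: it outputs a prediction, and the veto is a fixed rule applied to that prediction.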
Recent developments reinforce this approach:
• Bengio’s team demonstrating that non-agentic AI becomes safer as it scales
• GFlowNets successfully applied to drug discovery, materials science, and causal discovery
• Growing recognition that solving the off-switch problem requires fundamental architectural changes
• LawZero advancing research into practical non-agentic implementations
Guardian AI doesn’t replace other approaches—it enables them:
With Rights Frameworks: Guardian AI objectively assesses which systems demonstrate concerning self-preservation behaviors, helping implement STEP standards without bias or self-interest.
With Economic Systems: Provides impartial monitoring of AI economic participation, ensuring fair markets without becoming a market participant itself.
With Partnership Approaches: Enables safe collaboration by monitoring all parties and ensuring mutual benefit without developing its own agenda.
Think of Guardian AI as the foundation that makes everything else possible—the shield that gives us time and safety to build beneficial relationships with whatever forms of AI emerge, regardless of whether they’re conscious or sophisticated mimics.