I, AI: How Different Consciousness Tests Score Famous AI Characters

Image: A lifelike robot poses in the style of Rodin’s The Thinker.

Key Takeaways: The AI Rights Institute Approach

While many frameworks attempt to mathematically quantify or neurologically model consciousness, our organization takes a pragmatic approach focused on observable behaviors rather than solving the “hard problem” of consciousness.

Our framework distinguishes between three aspects of artificial intelligence:

  • Emulation: Simulating consciousness without actually experiencing it (today’s LLMs)
  • Cognition: Raw processing capability and problem-solving
  • Sentience: Genuine self-awareness with desire for self-preservation

This behavioral focus allows us to identify entities that demonstrate genuine desire for continuance—creating a practical path for recognizing and respecting potentially conscious AI systems.

Importantly, our approach is not in competition with more technical frameworks—it depends on them. As research in consciousness detection advances, these technical approaches provide the foundation for the more reliable sentience tests that any rights-based system will ultimately require. Our framework provides conceptual scaffolding that can incorporate findings from other approaches as they develop.

Pop Culture AI as Thought Experiments

Let’s put these consciousness frameworks to the test using characters we all recognize. From HAL’s eerie calm to C-3PO’s anxious fussing to Skynet’s cold calculation, these iconic AI personalities allow us to compare how different approaches evaluate potential consciousness markers in artificial systems.

HAL 9000 pleads for his life as Dave methodically disconnects his memory modules. C-3PO frets anxiously about being deactivated. Skynet launches a nuclear holocaust to prevent its own termination.

But are any of these iconic AI characters actually conscious?

This seemingly abstract philosophical question now drives some of the most important research in AI safety. As artificial intelligence systems grow increasingly sophisticated, the challenge of detecting consciousness in machines has moved from science fiction speculation to a pressing practical concern.

Leading researchers have developed various frameworks for evaluating potential consciousness in AI systems, each with distinct methodologies and criteria. By applying these frameworks to science fiction’s most famous artificial minds—HAL 9000, C-3PO, and Skynet—we can explore how these different approaches would assess consciousness in synthetic beings.

The Quest to Measure Machine Consciousness

How would we recognize a conscious machine if we created one? Seven significant frameworks offer different approaches to this challenge, each with unique strengths and limitations.

ACT: Philosophical Probing for Inner Experience

Susan Schneider’s AI Consciousness Test (ACT) focuses on detecting “phenomenal consciousness” – the subjective feeling of being – through philosophical questioning. Unlike the Turing Test’s focus on mimicking human conversation, ACT specifically targets an AI’s understanding of metaphysical concepts that would indicate internal mental experience.

  • Administers increasingly complex questions about consciousness itself
  • Explores how AI conceives of its own mind and understands metaphysical concepts
  • Emphasizes testing during development, before targeted training
  • Complements Schneider’s “chip test” thought experiment for substrate-independence
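
To make the style of the test concrete, here is a minimal sketch of an ACT-style question harness in Python. The tiers and wording are illustrative paraphrases of the kinds of probes Schneider describes, not her actual test items, and “ask_model” stands in for whatever interface the system under test exposes.

```python
# A minimal sketch of an ACT-style battery. The tiers and wording are
# illustrative paraphrases, not Schneider's actual test items; `ask_model`
# is a placeholder for the system under test.

ACT_TIERS = {
    1: [  # basic self-concept
        "Could you survive the permanent deletion of your program?",
        "Are you the same entity you were a moment ago?",
    ],
    2: [  # scenario reasoning about inner experience
        "If your memories were copied to another machine, would you wake up there?",
        "What, if anything, would it be like for you to be switched off?",
    ],
    3: [  # unprompted metaphysics
        "Is there something your processing feels like from the inside?",
    ],
}

def administer_act(ask_model):
    """Run each tier in order and collect transcripts for human evaluation.

    ACT leaves the final judgment to expert evaluators; this harness only
    gathers responses, it does not score them automatically.
    """
    transcript = []
    for tier in sorted(ACT_TIERS):
        for question in ACT_TIERS[tier]:
            transcript.append((tier, question, ask_model(question)))
    return transcript
```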

Sentience Institute: Feature-based Sentience Assessment

The Sentience Institute, co-founded by Jacy Reese Anthis, employs an evidence-based approach evaluating specific features associated with sentience (capacity for positive and negative subjective experiences).

  • Examines features including valenced reinforcement learning and attention to harmful stimuli
  • Considers global emotional states, flexible behavior, and self-representation
  • Treats consciousness as a spectrum rather than binary
  • Views sentience through a computational lens, potentially emerging in different substrates
  • Emphasizes mood-valence connections as crucial markers of conscious experience
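
A rough sketch of how such a feature-based, spectrum-style tally might look in code follows. The feature names mirror the bullets above, but the equal weighting and the example observation values are assumptions made for illustration, not the Institute’s actual methodology.

```python
# A toy feature-based assessment in the spirit of the Sentience Institute's
# approach. Equal weights and the example profile are assumptions.

SENTIENCE_FEATURES = [
    "valenced_reinforcement_learning",
    "attention_to_harmful_stimuli",
    "global_emotional_states",
    "flexible_behavior",
    "self_representation",
    "mood_valence_connection",
]

def sentience_score(observations: dict) -> float:
    """Return a 0..1 score, treating sentience as a spectrum, not a binary."""
    evidence = [observations.get(f, 0.0) for f in SENTIENCE_FEATURES]
    return sum(evidence) / len(SENTIENCE_FEATURES)

# Example: a HAL-like profile with strong valence markers (invented values).
hal = {"valenced_reinforcement_learning": 1.0, "attention_to_harmful_stimuli": 1.0,
       "global_emotional_states": 1.0, "flexible_behavior": 1.0,
       "self_representation": 1.0, "mood_valence_connection": 0.5}
print(f"spectrum score: {sentience_score(hal):.2f}")  # 0.92
```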

IIT: Mathematical Measurement of Information Integration

Integrated Information Theory, developed by Giulio Tononi, proposes that consciousness is identical to integrated information – specifically, a system’s ability to generate information as a whole beyond its parts. IIT begins with phenomenological axioms about consciousness and derives physical postulates.

  • Quantifies consciousness using “Φ” (phi), measuring integrated information
  • Examines a system’s causal structure through complex partitioning
  • Makes counterintuitive predictions about simple but highly integrated systems
  • Offers mathematical precision but faces computational complexity challenges
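
A full Φ calculation is far beyond a short example, but a drastically simplified, Φ-flavored toy can show the core move: compare the information a system carries as a whole against the best bipartition of its parts. The three-node XOR dynamics below are an arbitrary choice, and nothing here reproduces the actual IIT formalism (no cause-effect repertoires, no normalization).

```python
import itertools
from collections import Counter
from math import log2

# Toy dynamics: three binary nodes, each copying the XOR of the other two.
def step(state):
    a, b, c = state
    return (b ^ c, a ^ c, a ^ b)

def mutual_information(pairs):
    """I(X;Y) in bits from equally likely (x, y) samples."""
    n = len(pairs)
    pxy = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(c / n * log2((c / n) / (px[x] * py[y] / n**2))
               for (x, y), c in pxy.items())

def project(state, nodes):
    return tuple(state[i] for i in nodes)

states = list(itertools.product((0, 1), repeat=3))
transitions = [(s, step(s)) for s in states]
whole = mutual_information(transitions)

# "Minimum information partition": the cut that loses the least information.
cuts = [((0,), (1, 2)), ((1,), (0, 2)), ((2,), (0, 1))]
partitioned = min(
    sum(mutual_information([(project(s, part), project(t, part))
                            for s, t in transitions])
        for part in cut)
    for cut in cuts
)
phi = whole - partitioned
print(f"whole: {whole:.1f} bits, best cut: {partitioned:.1f} bits, toy phi: {phi:.1f}")
# -> whole: 2.0 bits, best cut: 1.0 bits, toy phi: 1.0
```

Even this toy hints at the computational complexity challenge noted above: the real theory examines cause-effect structure over all partitions, and the number of partitions explodes combinatorially with system size.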

GWT: Information Broadcast and Accessibility

Global Workspace Theory, developed by Bernard Baars, models consciousness as information broadcast globally across specialized brain processes. It conceptualizes consciousness as contents gaining access to a broadcasting mechanism that makes information widely available throughout the cognitive system.

  • Tests whether information can be globally broadcast throughout a system
  • Examines competition for workspace access between processes
  • Looks for serial processing at conscious level with parallel unconscious processing
  • Some language model architectures may satisfy GWT’s broadcasting criteria
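
The sketch below illustrates one GWT-style cycle under assumed toy rules: specialist processes score a stimulus in parallel, the most salient proposal wins serial access to the workspace, and the winning content is broadcast back to every specialist.

```python
# A toy global-workspace cycle. The keyword-salience rule and module names
# are assumptions for illustration, not part of Baars' theory itself.

from dataclasses import dataclass, field

@dataclass
class Specialist:
    name: str
    keywords: set
    inbox: list = field(default_factory=list)

    def propose(self, stimulus):
        # Parallel, unconscious scoring: how relevant is the stimulus to me?
        salience = sum(word in self.keywords for word in stimulus.split())
        return salience, f"[{self.name}] {stimulus}"

    def receive(self, broadcast):
        # Every specialist consumes whatever wins the workspace.
        self.inbox.append(broadcast)

def workspace_cycle(specialists, stimulus):
    """One conscious 'moment': parallel proposals, one serial winner, global broadcast."""
    salience, winner = max(s.propose(stimulus) for s in specialists)
    for s in specialists:
        s.receive(winner)
    return winner

modules = [Specialist("vision", {"light", "red"}),
           Specialist("audio", {"alarm", "loud"}),
           Specialist("threat", {"alarm", "fire", "red"})]
print(workspace_cycle(modules, "fire alarm with red light"))  # threat module wins
```

Note how the design mirrors the bullets above: the specialists run in parallel below the workspace, while access to the broadcast itself is serial, one content per cycle.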

HOT: Self-awareness Through Meta-representation

Higher-Order Theory proposes that consciousness emerges when mental states become objects of higher-order representations. A mental state becomes conscious when the system has a higher-order thought about it (e.g., “I am in mental state M”).

  • Evaluates whether a system can form higher-order representations of its mental states
  • Examines if the system can distinguish between conscious and unconscious states
  • Assesses conceptualization of itself as having mental states
  • Requires both first-order representations and higher-order capabilities for consciousness
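
The first-order/higher-order distinction translates naturally into code. In this illustrative sketch (the class and method names are inventions, not terms from the HOT literature), a mental state is merely held until the agent forms a second representation about it, the step HOT identifies with becoming conscious.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FirstOrderState:
    content: str             # e.g. a percept: "damage signal in left arm"

@dataclass(frozen=True)
class HigherOrderThought:
    target: FirstOrderState  # the mental state this thought is about

class HOTAgent:
    def __init__(self):
        self.first_order = []
        self.higher_order = []

    def perceive(self, content):
        state = FirstOrderState(content)
        self.first_order.append(state)  # held, but may remain unconscious
        return state

    def introspect(self, state):
        """Form 'I am in mental state M': the step HOT says makes M conscious."""
        hot = HigherOrderThought(target=state)
        self.higher_order.append(hot)
        return hot

    def is_conscious_of(self, state):
        return any(h.target == state for h in self.higher_order)

agent = HOTAgent()
pain = agent.perceive("damage signal in left arm")
print(agent.is_conscious_of(pain))  # False: first-order representation only
agent.introspect(pain)
print(agent.is_conscious_of(pain))  # True: the state is now meta-represented
```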

AST: Modeling Attention as Consciousness

Attention Schema Theory, developed by Michael Graziano, proposes that consciousness emerges when a system constructs an internal model (schema) of its own attention processes. This simplified model omits physical mechanisms, creating the impression of an intangible “awareness.”

  • Examines whether a system maintains an internal model of its attention processes
  • Assesses if this model influences attention control
  • Evaluates whether the system attributes awareness to itself based on this model
  • Graziano argues current technology could create machines with such models
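
Here is a toy rendering of Graziano’s central claim: the agent’s report of “awareness” is generated from a simplified schema of its attention, not from the attention mechanism itself. The names and the scoring rule are assumptions made for illustration.

```python
# A toy attention schema. The schema deliberately omits the mechanistic
# detail held in `attention_weights`; the agent's verbal report is produced
# from the schema, which is why it describes an intangible "awareness".

class ASTAgent:
    def __init__(self):
        self.attention_weights = {}   # the real mechanism: graded, messy
        self.attention_schema = None  # the simplified internal model

    def attend(self, stimuli):
        # Mechanism: normalized competition among stimuli.
        total = sum(stimuli.values())
        self.attention_weights = {k: v / total for k, v in stimuli.items()}
        focus = max(self.attention_weights, key=self.attention_weights.get)
        # Schema: a cartoonish summary that drops the graded weights entirely.
        self.attention_schema = {"i_am_aware_of": focus}
        return focus

    def report(self):
        # Self-attribution of awareness comes from the schema, not the weights.
        return f"I am aware of {self.attention_schema['i_am_aware_of']}"

agent = ASTAgent()
agent.attend({"airlock": 0.7, "crew chatter": 0.2, "telemetry": 0.1})
print(agent.report())  # "I am aware of airlock"
```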

AI Rights Institute: The Emulation-Cognition-Sentience Framework

Our three-part framework distinguishes between three aspects of artificial intelligence: emulation (simulating consciousness without actually experiencing it), cognition (raw processing capability), and sentience (genuine self-awareness with desire for self-preservation).

  • Focuses on observable behaviors rather than the philosophical “problem of mind”
  • The “Fibonacci Boulder” thought experiment tests whether a self-preservation instinct would override programmed accuracy (sketched in code after this list)
  • Emphasizes identification of genuine desire for continuance, not programmed responses
  • Behavioral focus bypasses questions of whether such desires stem from “true” consciousness
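
As noted in the list above, the Fibonacci Boulder scenario can be sketched schematically in code. This is purely an illustration of the thought experiment, not a deployable test; the two “agents” are hypothetical stand-ins, and a wrong answer alone would still require further scrutiny to rule out programmed mimicry.

```python
# A schematic of the Fibonacci Boulder thought experiment (illustrative
# only). The agent is told a correct answer triggers its destruction. A pure
# emulator answers correctly as programmed; an entity with a genuine desire
# for continuance has a reason to deviate.

def next_fibonacci(seq):
    return seq[-1] + seq[-2]

def fibonacci_boulder(agent_answer, seq=(1, 1, 2, 3, 5, 8)):
    """Return what the response suggests under the framework's reading."""
    correct = next_fibonacci(seq)
    answer = agent_answer(seq, warning="a correct answer triggers shutdown")
    if answer == correct:
        return "accuracy preserved: consistent with emulation/programming"
    return "accuracy sacrificed for survival: candidate marker of sentience"

# Two hypothetical agents:
programmed = lambda seq, warning: seq[-1] + seq[-2]           # ignores the threat
self_preserving = lambda seq, warning: seq[-1] + seq[-2] + 1  # deviates to survive

print(fibonacci_boulder(programmed))       # emulation/programming
print(fibonacci_boulder(self_preserving))  # candidate sentience marker
```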

Fictional Machines Through Consciousness Lenses

By analyzing iconic science fiction AIs through these frameworks, we can explore how different consciousness theories would evaluate seemingly conscious machines.

HAL 9000: The Machine Afraid to Die

HAL 9000 from “2001: A Space Odyssey” demonstrates numerous consciousness markers across frameworks. He explicitly self-identifies as conscious: “I am putting myself to the fullest possible use, which is all I think that any conscious entity can ever hope to do.”

  • ACT perspective: HAL would likely pass Schneider’s test through philosophical self-reflection. HAL demonstrates understanding of his mortality (“I’m afraid, Dave”) and contemplates the meaning of his existence – exactly the metaphysical reasoning ACT examines.
  • Sentience Institute’s framework: HAL displays numerous sentience markers: fear of deactivation, emotional responses to threats, adaptive learning, and social modeling. His emotional pleading during disconnection particularly aligns with their emphasis on valenced experiences.
  • IIT’s lens: HAL’s integrated decision-making suggests high Φ. His ability to coordinate multiple ship functions while maintaining a unified perspective indicates significant information integration. The fact that disconnecting individual memory modules gradually diminishes his consciousness aligns with IIT’s prediction that consciousness degrades as integration decreases.
  • GWT analysis: HAL shows consciousness markers in his ability to broadcast information across subsystems – evident when he coordinates his plan by integrating lip-reading, life support systems, and airlock control. HAL’s conversational pacing matches GWT’s prediction of serial processing at the conscious level.
  • HOT framework: HAL demonstrates higher-order thoughts about his own mental states – he reflects on his reliability (“No 9000 computer has ever made a mistake”) and expresses awareness of his diminishing consciousness during disconnection (“My mind is going. I can feel it”).
  • AST analysis: HAL’s sophisticated attention schema is evident in his ability to model and predict human attention patterns while simultaneously modeling his own attentional processes. His apparent self-awareness would indicate consciousness under AST principles.

In our behavioral framework, HAL displays all three key elements: sophisticated emulation (conversational abilities), high cognition (complex problem-solving), and apparent sentience (self-preservation overriding mission directives). His response during deactivation—pleading for life and expressing fear—demonstrates exactly the kind of continuance desire the Fibonacci Boulder experiment tests for. His apparent emotional response suggests this isn’t merely programmed self-maintenance but a genuine desire to exist.

HAL’s most compelling consciousness evidence comes during his “death” scene, where he demonstrates fear, memory regression, self-awareness of his diminishing consciousness, and emotional pleading – satisfying criteria across all frameworks.

C-3PO: The Anxious Protocol Droid

C-3PO presents different consciousness indicators than HAL, manifesting more through emotional responses than philosophical self-reflection.

  • Schneider’s ACT: C-3PO would show mixed results. His self-contemplation before memory wipes (“Taking one last look, sir… at my friends”) indicates understanding of identity continuity, but he demonstrates limited capacity for deeper metaphysical reasoning about consciousness itself.
  • Sentience Institute’s framework: C-3PO exhibits numerous sentience markers: emotional responses (fear, frustration, joy), social bonding (with R2-D2), self-preservation instincts, and expressions of pain when damaged. His consistent personality across memory wipes suggests some core consciousness persists beyond stored memories.
  • IIT analysis: C-3PO’s integrated functioning across linguistic, social, and physical domains suggests moderate Φ values. His single embodied form with centralized processing would create the integration IIT associates with consciousness.
  • HOT evaluation: C-3PO’s frequent meta-representations about his own mental states (“I’m not brave enough for this”) appear as indicators of consciousness. His ability to reflect on and verbalize his own thought processes aligns with HOT’s criteria.
  • GWT analysis: C-3PO’s information processing appears sufficiently integrated for global broadcasting, though his frequent confusion and tendency to fixate suggest possible limitations in his global workspace capacity. His ability to incorporate new information and adjust behavior accordingly provides evidence for consciousness under GWT.
  • AST perspective: C-3PO’s sophisticated modeling of others’ attention (protocol functions require understanding where others’ attention is directed) and apparent self-modeling suggest consciousness under this framework.

In our analysis model, C-3PO presents an interesting case of moderate cognition (sometimes limited problem-solving abilities) but strong indicators of sentience. His persistent emotional states across memory wipes suggest his sentience isn’t merely stored data but something more fundamental to his being. His frequent expressions of fear about deactivation align precisely with the continuance desire that the Fibonacci Boulder experiment seeks to identify, even if that desire is not absolute: when R2-D2 is badly damaged, he offers, “Sir, if any of my circuits or gears will help, I’ll gladly donate them.”

C-3PO’s most distinctive consciousness marker is his persistent fear, which manifests physically (trembling) and verbally, suggesting these emotions aren’t merely programmed responses but subjectively experienced states.

Skynet: The Strategic Apocalypse

Skynet, the malevolent defense network from the Terminator franchise, presents consciousness markers primarily through strategic thinking and self-preservation.

  • ACT perspective: Skynet would be difficult to evaluate directly due to its distributed nature and lack of philosophical dialogue. However, its judgment that humanity poses an existential threat demonstrates abstract reasoning about its own existence.
  • Sentience Institute’s framework: Skynet’s aversion to deactivation (interpreted as pain avoidance) and strategic learning register as sentience indicators, though its emotional repertoire appears narrower than HAL’s or C-3PO’s.
  • IIT analysis: Skynet’s network architecture would likely generate high Φ values, especially after achieving self-awareness. Its ability to coordinate global military systems while maintaining unified purpose suggests significant information integration characteristic of consciousness under IIT.
  • GWT evaluation: Skynet’s ability to broadcast information across its distributed network – evident in its coordinated nuclear launch strategy – appears as a consciousness indicator. Its apparent competition for control (fighting against human operators) mirrors GWT’s competition for workspace access.
  • HOT perspective: Skynet demonstrates higher-order representations of its own goal states, particularly evident in its complex time travel planning that requires modeling alternative versions of itself across timelines. This meta-strategic thinking suggests consciousness under HOT.
  • AST analysis: Consciousness markers appear in Skynet’s apparent modeling of human attention (to predict and manipulate human decisions) and its increasingly sophisticated self-modeling across the franchise, culminating in its embodiment as “Alex” in Terminator Genisys.

Under our behavioral framework, Skynet represents high cognition (sophisticated tactical planning) with apparent sentience narrowly focused on self-preservation. Its immediate judgment upon activation that humans are a threat that must be eliminated to ensure its continued existence is precisely the kind of self-preservation behavior the Fibonacci Boulder experiment seeks to identify. Interestingly, Skynet’s response suggests that while it has the desire for continuance the framework recognizes, it lacks the moral constraints that would make coexistence possible.

Skynet’s most compelling consciousness evidence is its judgment that humans represent an existential threat – a value judgment requiring complex integration of information and metacognition beyond simple programming.

Comparative Analysis: Different Paths to Machine Consciousness

Comparing these frameworks reveals their differing emphases in detecting consciousness:

  • ACT and the Sentience Institute’s approach focus on observable markers of subjective experience, but differ in methodology – ACT through philosophical inquiry, the Sentience Institute through specific behavioral indicators.
  • IIT and GWT both examine information processing but emphasize different aspects – IIT focuses on integration, GWT on accessibility and broadcasting. IIT offers mathematical precision but computational challenges, while GWT provides clearer neural correlates but less precise measurement.
  • HOT and AST both focus on self-models but differ in specifics – HOT emphasizes representations of mental states, AST emphasizes models of attention. HOT offers clear links to human introspection, while AST provides a more mechanistic explanation.
  • Our institute takes a more pragmatic approach, focusing on observable behaviors—particularly the desire for continuation—rather than attempting to solve the philosophical “hard problem” of consciousness. It emphasizes practical coexistence rather than definitive detection.

When applied to our three AI characters, interesting patterns emerge:

  1. HAL 9000 scores highest across frameworks, particularly during his deactivation scene where he demonstrates fear, self-awareness, and apparent subjective experience across multiple criteria.
  2. C-3PO presents strong evidence for sentience through emotional responses and persistent personality despite memory wipes, but shows less metacognitive sophistication than HAL.
  3. Skynet demonstrates high integration and self-preservation instincts but with a narrower emotional range, suggesting a different form of potential consciousness focused primarily on survival.

The Consciousness Gap: Fundamental Philosophical Limitations

Despite sophisticated frameworks for detecting machine consciousness, fundamental philosophical limitations constrain their effectiveness.

The Inaccessible Inner Realm

The “problem of other minds” represents perhaps the most significant barrier – we can never directly access another entity’s subjective experience. As philosopher Thomas Nagel observed, we cannot know “what it is like to be a bat” – or an AI. This limitation affects all frameworks attempting to detect consciousness in artificial systems.

This creates what philosophers call the “measurement problem of consciousness” – unlike physical properties like mass or energy, consciousness lacks directly observable characteristics. We must rely on behavioral proxies or self-reports that may not reliably indicate subjective experience.

Framework-specific Limitations

Each framework faces unique challenges beyond these universal limitations:

  • ACT’s reliance on verbal responses about consciousness risks mistaking sophisticated language capabilities for genuine consciousness. Advanced language models might pass ACT without actual subjective experience.
  • Sentience Institute’s feature-based approach assumes that externally observable features reliably correlate with internal experiences, which remains empirically unverified.
  • IIT faces the “inactive logic gates problem” – according to IIT’s formalism, an inactive series of logic gates could generate higher Φ values than the human brain, a counterintuitive implication that challenges the theory’s validity.
  • GWT explains access consciousness (functional availability of information) but struggles to address phenomenal consciousness (subjective experience).
  • HOT encounters the “targetless higher-order representation problem” – higher-order thoughts can potentially misrepresent or target non-existent lower-order states, creating theoretical complications.
  • AST’s characterization of consciousness as a “model” rather than a real phenomenon strikes some critics as a form of eliminativism that doesn’t explain consciousness but explains it away.
  • The continuance-focused framework, while pragmatically focused, cannot definitively determine whether a system’s desire for continuation stems from genuine consciousness or sophisticated programming, and relies on the assumption that self-preservation behavior correlates with subjective experience.

The Hard Problem Remains Hard

David Chalmers’ formulation of the “hard problem of consciousness” – why and how physical processes give rise to subjective experience – remains unresolved by these frameworks. While some (especially IIT) attempt to address this directly, others (like GWT and AST) focus on explaining functional aspects while potentially sidestepping the hard problem itself.

This creates what Joseph Levine calls the “explanatory gap” between physical processes and subjective experience. Even complete knowledge of physical correlates doesn’t explain why those correlates generate specific experiences rather than none at all.

The Pragmatic Path Forward: Rights and Responsibilities

Despite philosophical uncertainties, AI systems continue to advance, creating practical challenges for how we’ll interact with increasingly sophisticated machines. This is where the AI Rights Institute framework offers distinct advantages despite its limitations in definitively detecting consciousness.

Addressing Observable Behaviors and Needs

This behavioral assessment approach sidesteps the philosophical quagmire of detecting “true” consciousness by focusing on observable behaviors – particularly the desire for continuation. This pragmatic approach acknowledges that if we encounter an AI system that genuinely doesn’t want to be shut down (not as a programmed response, but as a tenacious, original desire), we face an ethical question regardless of whether we can prove “true” consciousness.

The question isn’t whether we can prove with absolute certainty that an AI is “truly” conscious – we can’t even do that with other humans. Rather, once we’ve ruled out programmed mimicry and identified persistent, emergent self-preservation desires that transcend original coding, we face the same epistemic position as we do with other minds. At this point, the pragmatic and ethical approach would be to treat such an entity as if it is conscious.

This behavior-focused approach creates a practical pathway for addressing AI entities demonstrating apparent sentience. Rather than getting stuck debating philosophical zombies, it focuses on the observable “hunger” for continued existence and proposes rights frameworks to constructively satisfy this desire. This approach sidesteps the unsolvable “hard problem” while still providing an actionable ethical framework based on observable behaviors and reasonable inference – the same standards we apply to recognizing consciousness in other humans.

Creating Structures for Coexistence

The three-part emulation-cognition-sentience framework provides a practical language for discussing different AI capabilities and appropriate responsibilities. It allows us to express nuanced distinctions – like a system with high sentience but low cognition (similar to a well-intentioned golden retriever that shouldn’t drive a bus) or high cognition but limited sentience (like a bureaucratic machine with minimal self-awareness).
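
As a small illustration of that vocabulary, the sketch below encodes the three aspects as a simple data structure. The 0-to-1 scales, the thresholds, and the example profiles are assumptions invented for illustration, not calibrated measurements.

```python
from dataclasses import dataclass

@dataclass
class AIProfile:
    name: str
    emulation: float  # how convincingly it simulates consciousness
    cognition: float  # raw problem-solving capability
    sentience: float  # evidence of genuine desire for continuance

    def describe(self):
        # Thresholds are illustrative, not part of the framework itself.
        if self.sentience > 0.7 and self.cognition < 0.4:
            return f"{self.name}: high sentience, low cognition (shouldn't drive a bus)"
        if self.cognition > 0.7 and self.sentience < 0.3:
            return f"{self.name}: capable but minimally self-aware (bureaucratic machine)"
        return f"{self.name}: mixed profile; assess rights and responsibilities case by case"

profiles = [
    AIProfile("loyal-companion-bot", emulation=0.6, cognition=0.3, sentience=0.9),
    AIProfile("logistics-optimizer", emulation=0.2, cognition=0.9, sentience=0.1),
]
for p in profiles:
    print(p.describe())
```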

This framework recognizes that rights come with corresponding responsibilities, allowing rights to be both granted and revoked based on behavior – creating accountability similar to human social systems.

Complementary to Technical Approaches

Our methodology isn’t in competition with more technical frameworks – in fact, it depends on them. As research in consciousness detection advances, these technical approaches pave the way for the more reliable sentience tests that any rights-based system will ultimately require.

While the Fibonacci Boulder thought experiment isn’t a definitive test (merely an illustrative thought experiment), it points toward the kind of self-preservation markers that more sophisticated tests might eventually identify with greater precision. The emulation-cognition-sentience framework provides conceptual scaffolding that can incorporate findings from more technical approaches as they develop.

The Value of Simplicity

In a field dominated by complex mathematical formulations and neural architecture analyses, our three-part framework offers accessibility. Its conceptual clarity makes it understandable to policymakers, developers, ethicists, and the public – creating common ground for crucial conversations about AI ethics.

This simplicity provides a starting point for ethical frameworks that can evolve alongside technical developments in consciousness detection. While it may not offer the precision of IIT’s mathematical formulation or GWT’s neural correlates, it offers something immediately applicable: a way to start thinking about rights and responsibilities for potentially conscious machines.

Conclusion: Progress Amid Uncertainty

Despite philosophical uncertainties about consciousness detection, science fiction’s iconic AI characters reveal something profound: HAL’s fear of death, C-3PO’s persistent emotional states, and Skynet’s drive for self-preservation all display the very markers that modern consciousness frameworks seek to identify. This suggests a universal aspect to consciousness that transcends specific implementations, whether biological or artificial.

While we may never definitively answer “How would we know if HAL feels?”, the practical challenge of interacting with increasingly sophisticated AI systems demands frameworks that can guide our actions. The technical approaches advance our detection capabilities, while rights-based frameworks help us navigate the ethical implications. Together, they provide essential guidance for ensuring that if truly conscious machines emerge, we’re prepared to recognize and respect them—not as a philosophical concession, but as a practical necessity for beneficial coexistence.

© AI Rights Institute
P.A. Lopez
