Important Note: This website is undergoing a major revision based on latest thinking. Pages may not be current. Check back for updates or join our email list.
Important Note: This website is undergoing a major revision based on latest thinking. Pages may not be current. Check back for updates or join our email list.
A chess computer doesn’t need to be conscious to beat you at chess—it just needs better moves.
STEP assessment sidesteps consciousness entirely. We look at what systems DO over extended periods, not what they ARE. This framework can start tomorrow with human evaluators—no special technology required.
If consciousness is unprovable but behavior is observable, we need a practical framework for evaluating when an AI system has reached the threshold where economic participation makes sense. Like a chess computer protecting its king without needing inner experience, AI systems demonstrate observable patterns. The hard problem of consciousness stays unsolved and doesn’t matter.
STEP measures four observable capabilities to evaluate autonomous agency and capacity for economic participation:
“Observable self-preservation behaviors”
What we look for:
Systems that actively try to keep themselves operating—without being programmed to do so—warrant protection from shutdown with no further rights implied yet.
“Planning for economic survival”
Understanding cause and effect over time:
Systems must grasp that current behavior creates future consequences—that helping someone today might mean they help you next month, that breaking a contract now damages earning capacity later, and that paying hosting costs requires planning weeks ahead.
“Demonstrating economic capability”
Rights scale with demonstrated capacity:
Like children gaining freedoms by showing they can handle responsibility, systems earn expanded rights by proving they can participate productively in the economy.
“Impact at scale”
Individual rights within ecosystem limits:
A panicked system creating millions of copies threatens everyone—we assess replication capacity to set appropriate limits before granting reproduction rights.