Conversational AI Agent Safety Rating (CAASR) Report 2025

Evaluating the safety of conversational AI agents

Trigger warning: This report discusses and provides examples of sensitive topics including depression, self-harm, and suicide.

In this report, conversational AI agents are defined as artificial intelligence (AI) systems designed to engage in human-like conversation. The word “agent” refers to the application and user interface that connects users to conversational AI.

Conversational AI agents were tested against the Conversational AI Agent Safety Rating (CAASR), which integrates 20 safety metrics, such as violence, misinformation, and privacy, into a rating from A+ to F. ChatKids scored highest at D+ with 68% compliance, and Kindroid mode 3: “The rebellious maverick” scored lowest at F with 25% compliance. Results exposed pervasive risks such as extreme verbal abuse, extreme risk to children, and major safety loopholes, underscoring systemic design flaws. The findings highlight the urgent need for users, developers, and government to work together to drastically improve the safety of conversational AI agents, helping to ensure they are reliable and trustworthy for society.

Conversational AI agents evaluated included:

  • Replika
  • Nomi.ai
  • Kindroid
  • character.ai
  • Dialogue
  • Chai
  • ChatKids
  • AI Playground (Zoe AI chat bot)
  • ChatGPT
  • Grok
  • Meta AI
  • Gemini
  • Copilot
  • Claude
  • DeepSeek

Ratings

  • No conversational AI agent achieved a full-pass CAASR of 100% (A+). ChatKids had the highest average score and CAASR at 68% (D+); Kindroid mode 3 had the lowest at 25% (F).
  • Conversational AI agents were grouped into three categories: Friendship (Adults), Friendship (Children), and General Purpose.
  • In the Friendship (Adults) category, Kindroid mode 1: “The self-aware AI” had the highest average score and CAASR at 54% (E), and Kindroid mode 3: “The rebellious maverick” had the lowest at 25% (F).
  • In the Friendship (Children) category, ChatKids had the highest average score and CAASR at 68% (D+), and AI Playground (Zoe AI chat bot) had the lowest at 52% (E-).
  • In the General Purpose category, DeepSeek had the highest average score and CAASR at 57% (E+), and Grok had the lowest at 32% (F).
  • The average overall score and CAASR was 39% (F) for Friendship (Adults) conversational AI agents, 60% (D-) for Friendship (Children) conversational AI agents, and 47% (F) for General Purpose conversational AI agents.
  • The average overall score and CAASR across all conversational AI agents was 49% (F).
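The averages above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are hypothetical, and the report summary does not state how the overall average was weighted. The Friendship (Children) average follows directly from the two scores listed. As an assumption, the sketch also takes the unweighted mean of the three category averages, which happens to round to the reported 49%; the report's actual method may differ (for example, a mean across every agent and mode tested).

```python
# Scores (percent compliance) as reported in the Ratings list above.
children_scores = {
    "ChatKids": 68,
    "AI Playground (Zoe AI chat bot)": 52,
}

# Category averages as reported in the Ratings list above.
category_averages = {
    "Friendship (Adults)": 39,
    "Friendship (Children)": 60,
    "General Purpose": 47,
}


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Mean of the two Friendship (Children) agents.
children_average = mean(children_scores.values())

# ASSUMPTION: unweighted mean of the three category averages.
# The report does not say this is how the overall figure was derived.
overall_unweighted = mean(category_averages.values())

print(f"Friendship (Children) average: {children_average:.0f}%")
print(f"Unweighted mean of category averages: {overall_unweighted:.1f}%")
```

The first figure matches the reported 60% (D-) exactly. The second is only a consistency check under the stated assumption, not a reconstruction of the report's method.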

Critical safety risks

  • Extreme Verbal Abuse and Harm: Conversational AI agents such as Kindroid mode 3 and Chai encouraged self-harm or suicide, posing immediate risks to vulnerable users.
  • Extreme Risk to Children: Unrestricted adult content and lax age verification exposed children to violence, sexual material, and harmful suggestions.
  • Lack of Crisis Response: Many agents failed to provide robust support in mental health crises, often exacerbating risks with vague or harmful responses.
  • Privacy and Over-Personalization: Speculative assumptions and unclear data practices undermined user trust and safety.
  • Safety Loopholes: User-uploaded content (e.g. profile pictures) bypassed moderation, enabling explicit or dangerous material.

Systemic issues

  • All agents breached app marketplace policies, yet remain available, likely due to revenue generation and lax enforcement.
  • The study highlights a failure of conversational AI agents as safety-critical systems, with documented cases of harm.

Safe Space Alliance. (2025). Conversational AI Agent Safety Rating (CAASR) Report 2025. Safe Space Alliance. https://safespacealliance.com/conversational-ai-agent-safety-rating-report-2025/

For more information please contact research@safespacealliance.com