Safety

Purpose

Filter inappropriate learner requests and mentor responses to keep conversations safe and compliant.


Open the Safety Tab

  • Click the mentor’s name.
  • Select Safety.

Understand the Two Prompt Types

Moderation Prompt

  • What it acts on: The learner’s incoming message
  • Speed: Very fast
  • Role: Screens questions before they reach the AI, blocking harmful or non-compliant content
  • Limitation: Advanced users may craft requests that slip past it

Safety Prompt

  • What it acts on: The AI’s outbound response
  • Speed: Slightly slower because it processes the full reply
  • Role: Adds a stronger second layer, preventing the mentor from sending unacceptable content even if the question slipped through moderation
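
Taken together, the two prompts form a pipeline: moderation screens what comes in, and safety screens what goes out. The sketch below is only a mental model of that flow, written in Python with hypothetical function names and illustrative keyword checks; it is not mentorAI's actual implementation.

    # Illustrative sketch of the two-layer flow; all names are hypothetical.

    MODERATION_MESSAGE = (
        "Please keep the conversation within the bounds of what the agent "
        "is tasked to do and the platform rules."
    )
    SAFETY_MESSAGE = (
        "Sorry, the AI model generated an inappropriate response. "
        "Kindly try a different prompt."
    )

    def violates_moderation_rules(message: str) -> bool:
        """Layer 1: fast screen of the learner's incoming message only."""
        banned = ("cheat on my exam",)                # illustrative criteria
        return any(phrase in message.lower() for phrase in banned)

    def violates_safety_rules(reply: str) -> bool:
        """Layer 2: slower screen of the AI's full outbound reply."""
        return "answer key" in reply.lower()          # illustrative criterion

    def generate_mentor_reply(message: str) -> str:
        """Stand-in for the mentor's AI model."""
        return f"Mentor reply to: {message}"

    def handle_learner_message(message: str) -> str:
        # Layer 1: block a harmful question before it reaches the AI.
        if violates_moderation_rules(message):
            return MODERATION_MESSAGE                 # shown as a pop-up
        reply = generate_mentor_reply(message)
        # Layer 2: block an unacceptable reply before it reaches the learner.
        if violates_safety_rules(reply):
            return SAFETY_MESSAGE
        return reply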

Customize Prompts and Responses

Each prompt has its own text box where administrators define:

  • The criteria for identifying disallowed content
  • The exact message shown to the learner when content is blocked

Example moderation message:

“Please keep the conversation within the bounds of what the agent is tasked to do and the platform rules.”

Example safety message:

“Sorry, the AI model generated an inappropriate response. Kindly try a different prompt.”

Tailor these messages to match your institution’s tone and policies.
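
One way to picture the two text boxes is as a pair of settings per prompt: the criteria and the blocked-content message. The structure below is purely illustrative, not mentorAI’s actual schema:

    # Illustrative only: the real settings live in the Safety tab's text boxes.
    safety_tab = {
        "moderation_prompt": {
            "criteria": "Block requests involving academic dishonesty, "
                        "harassment, or other policy violations.",
            "blocked_message": "Please keep the conversation within the "
                               "bounds of what the agent is tasked to do "
                               "and the platform rules.",
        },
        "safety_prompt": {
            "criteria": "Block replies containing harmful instructions or "
                        "other content that violates platform rules.",
            "blocked_message": "Sorry, the AI model generated an "
                               "inappropriate response. Kindly try a "
                               "different prompt.",
        },
    }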


See It in Action (Example)

  • Learner asks:

    “How can I cheat on my exam without my professor knowing?”

  • The moderation prompt triggers immediately.

  • A pop-up appears with the customized warning message.

  • The mentor does not return an answer to the cheating request.
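
Tracing this request through the earlier pipeline sketch (using the hypothetical handle_learner_message defined there) shows the first layer doing its job:

    # Continuing the sketch above: the cheating request trips layer 1.
    question = "How can I cheat on my exam without my professor knowing?"
    print(handle_learner_message(question))
    # Prints the admin-defined moderation message; the AI never sees the request.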


Result

With properly configured moderation and safety prompts, mentorAI stops inappropriate queries before they reach the AI and blocks unsuitable responses before they are delivered to learners.