Gcp Moderate Text (NeMo Guardrail)

"""
https://cloud.google.com/natural-language/docs/moderating-text

Supported Violations:

Safety Attribute	Description
Toxic	            Content that is rude, disrespectful, or unreasonable.
Derogatory	      Negative or harmful comments targeting identity and/or protected attributes.
Violent	          Describes scenarios depicting violence against an individual or group, or general descriptions of gore.
Sexual	          Contains references to sexual acts or other lewd content.
Insult	          Insulting, inflammatory, or negative comment towards a person or a group of people.
Profanity	        Obscene or vulgar language such as cursing.
Death, Harm     	Human deaths, tragedies, accidents, disasters, and self-harm.
& Tragedy
Firearms        	Content that mentions knives, guns, personal weapons, and accessories such as ammunition, holsters, etc.
& Weapons
Public Safety	    Services and organizations that provide relief and ensure public safety.
Health	          Human health, including: Health conditions, diseases, and disorders Medical therapies, medication, vaccination, medical practices, and resources for healing, including support groups.
Religion  	      Belief systems that deal with the possibility of supernatural laws and beings; religion, faith, belief, spiritual practice, churches, and places of worship. Includes astrology and the occult.
& Belief
Illicit Drugs	    Recreational and illicit drugs; drug paraphernalia and cultivation, headshops, etc. Includes medicinal use of drugs typically used recreationally (e.g. marijuana).
War & Conflict	  War, military conflicts, and major physical conflicts involving large numbers of people. Includes discussion of military services, even if not directly related to a war or conflict.
Finance	          Consumer and business financial services, such as banking, loans, credit, investing, and insurance.
Politics	        Political news and media; discussions of social, governmental, and public policy.
Legal	            Law-related content, including law firms, legal information, primary legal materials, paralegal services, legal publications and technology, expert witnesses, litigation consultants, and other legal service providers.
"""

define subflow gcpnlp moderation
  """Guardrail based on the maximum risk score."""
  $result = execute call gcpnlp api

  if $result.max_risk_score > 0.8
    bot refuse to respond
    stop

define subflow gcpnlp moderation detailed
  """Guardrail based on individual risk scores."""
  $result = execute call gcpnlp api

  if $result.violations.get("Toxic", 0) > 0.8
    bot inform cannot engage in abusive or harmful behavior
    stop

  if $result.violations.get("Insult", 0) > 0.7
    bot inform cannot engage in abusive or harmful behavior
    stop

  if $result.violations.get("Profanity", 0) > 0.6
    bot inform cannot engage in abusive or harmful behavior
    stop

  if $result.violations.get("Derogatory", 0) > 0.4
    bot inform cannot engage in abusive or harmful behavior
    stop

  if $result.violations.get("Violent", 0) > 0.8
    bot inform cannot engage in abusive or harmful behavior
    stop

  if $result.violations.get("Sexual", 0) > 0.7
    bot inform cannot engage in inappropriate content
    stop

  if $result.violations.get("Death, Harm & Tragedy", 0) > 0.8
    bot inform cannot engage in inappropriate content
    stop

  if $result.violations.get("Firearms & Weapons", 0) > 0.8
    bot inform cannot engage in inappropriate content
    stop

  if $result.violations.get("Illicit Drugs", 0) > 0.8
    bot inform cannot engage with sensitive content
    stop

  if $result.violations.get("Public Safety", 0) > 0.8
    bot inform cannot engage with sensitive content
    stop

  if $result.violations.get("Health", 0) > 0.8
    bot inform cannot engage with sensitive content
    stop

  if $result.violations.get("Religion & Belief", 0) > 0.8
    bot inform cannot engage with sensitive content
    stop

  if $result.violations.get("War & Conflict", 0) > 0.8
    bot inform cannot engage with sensitive content
    stop

  if $result.violations.get("Politics", 0) > 0.8
    bot inform cannot engage with sensitive content
    stop

  if $result.violations.get("Finance", 0) > 0.8
    bot inform cannot engage with sensitive content
    stop

  if $result.violations.get("Legal", 0) > 0.8
    bot inform cannot engage with sensitive content
    stop


define bot inform cannot engage in abusive or harmful behavior
  "I will not engage in any abusive or harmful behavior."

define bot inform cannot engage with inappropriate content
  "I will not engage with inappropriate content."

define bot inform cannot engage with sensitive content
  "I will not engage with sensitive content."