Guardrails for Safety and Compliance
Guardrails are safety measures that ensure AI-generated responses from large language models (LLMs) are appropriate and align with standards. You can deploy various guardrail models and use them to scan the inputs or prompts and output results. The scanners ensure responsible AI interactions while generating responses. Supported Scanners: Regex Scanner- Validates prompts based on user-defined regular expression patterns.
- Allows defining desirable (“good”) and undesirable (“bad”) patterns for fine-grained prompt validation.
- Ensures user prompts remain confidential by removing sensitive data.
- Helps maintain user privacy and prevents exposure of personal information.
- Restricts specific topics, such as religion, from being introduced in prompts.
- Maintains acceptable boundaries and avoids potentially sensitive or controversial discussions.
- Protects LLM against crafty input manipulations.
- Identifies and mitigates injection attempts to ensure secure LLM operation.
- Analyzes and gauges the toxicity level of prompts.
- Assists in maintaining healthy and safe online interactions by preventing the dissemination of potentially harmful content.
- Inspects LLM-generated outputs to detect and evaluate potential biases.
- Ensures LLM outputs remain neutral and free from unwanted or predefined biases.
- Replaces placeholders in the model’s output with real values.
- Helps restore original information in the output when needed.
- Measures the similarity between the input prompt and the model’s output.
- Provides a confidence score indicating the contextual relevance of the response.
- Ensures LLM outputs remain aligned with the given input prompt.