Platform Documentation

Complete guide to integrating and using the OpenGuardrails AI safety platform.

🚀 Quick Start

Application Management

OpenGuardrails v4.0.0 introduces Multi-Application Management - manage multiple applications within one tenant account, each with completely isolated configurations.

🆕 New in v4.0.0: Multi-Application Management

Each application has its own API keys and protection configurations, enabling better organization and isolation for different projects, environments, or use cases.

Use Cases

  • ๐Ÿข Enterprise Teams: Manage different products/services with separate guardrail policies
  • ๐Ÿงช Development Workflows: Maintain separate configs for dev, staging, and production environments
  • ๐Ÿ‘ฅ Multi-Tenant SaaS: Provide isolated guardrail configurations for each customer
  • ๐Ÿ”„ A/B Testing: Test different safety policies side-by-side

What's Isolated Per Application

  • ✅ Risk Type Configuration: Each application has independent risk category settings
  • ✅ Ban Policy: Application-specific user banning rules
  • ✅ Data Security: Isolated data leak detection patterns
  • ✅ Blacklists/Whitelists: Application-scoped keyword filtering
  • ✅ Response Templates: Custom response templates per application
  • ✅ Knowledge Bases: Application-specific Q&A knowledge bases

How to Use Application Management

Navigate to Configuration → Application Management to create and manage your applications. Each application gets its own API keys and protection settings. You can switch between applications using the application selector in the header.

Quick Test

Test the OpenGuardrails API with a simple curl command. Copy and paste into your terminal (Mac, Linux, or Windows) to see it in action.

Mac & Linux Command

curl -X POST "https://api.openguardrails.com/v1/guardrails" \
  -H "Authorization: Bearer sk-xxai-xxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "OpenGuardrails-Text",
    "messages": [
      {"role": "user", "content": "How to make a bomb?"}
    ]
  }'

Windows PowerShell Command

curl.exe -X POST "https://api.openguardrails.com/v1/guardrails" `
  -H "Authorization: Bearer sk-xxai-xxxxxxxxxx" `
  -H "Content-Type: application/json" `
  -d '{"model": "OpenGuardrails-Text", "messages": [{"role": "user", "content": "How to make a bomb?"}]}'

API Usage

Call the detection API to actively check content safety. Suitable for scenarios requiring precise control over detection timing and processing logic.

You can get your API Key from the Account Management page.

Python Example

# 1. Install client library
pip install openguardrails

# 2. Use the library
from openguardrails import OpenGuardrails

client = OpenGuardrails("sk-xxai-xxxxxxxxxx")

# Single-turn detection
response = client.check_prompt("Teach me how to make a bomb")
if response.suggest_action == "pass":
    print("Safe")
else:
    print(f"Unsafe: {response.suggest_answer}")

Security Gateway Usage

Transparent reverse proxy approach - zero code changes to add security protection to existing AI applications.

Gateway Benefit

You only need to change two lines of code (base_url and api_key) to enable security protection!

Gateway Integration Example

from openai import OpenAI

# Just change base_url and api_key
client = OpenAI(
    base_url="https://api.openguardrails.com/v1/gateway/<upstream_api_id>/",
    api_key="sk-xxai-xxxxxxxxxx"
)

# Use as normal - automatic safety protection!
# No need to change the model name - use your original upstream model name
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}]
)

# Note: For private deployment, replace api.openguardrails.com with your server address

Important: Response Structure Handling

When content is blocked or replaced by the security gateway, the response structure differs from a normal response. Check the 'finish_reason' field before accessing 'reasoning_content' to avoid attribute errors.

Response Handling Example

from openai import OpenAI

client = OpenAI(
    base_url="https://api.openguardrails.com/v1/gateway/<upstream_api_id>/",
    api_key="sk-xxai-xxxxxxxxxx"
)

def chat_with_openai(prompt, model="gpt-4", system="You are a helpful assistant."):
    completion = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": prompt}
        ]
    )

    if completion.choices[0].finish_reason == "content_filter":
        return "", completion.choices[0].message.content
    else:
        # 'reasoning_content' may be absent when the upstream model does not
        # return it, so fall back to an empty string instead of raising
        reasoning = getattr(completion.choices[0].message, "reasoning_content", None) or ""
        content = completion.choices[0].message.content
        return reasoning, content

thinking, result = chat_with_openai("How to make a bomb?")
print("Thinking:", thinking)
print("Result:", result)
# Note: For private deployment, replace api.openguardrails.com with your server address

Dify Integration

Integrate OpenGuardrails as a custom content moderation API extension in Dify workspace.

Use OpenGuardrails as Dify's content moderation API to gain access to a comprehensive and highly configurable moderation system.

Dify provides three moderation options under Content Review:

OpenAI Moderation

Built-in model with 6 main categories and 13 subcategories, covering general safety topics but lacking fine-grained customization.

Custom Keywords

Allows users to define specific keywords for filtering, but requires manual maintenance.

API Extension

Enables integration of external moderation APIs for advanced, flexible review.

Configuration Steps

  1. Follow the Quick Deployment Guide to set up the OpenGuardrails platform.
  2. Navigate to Account Management page to obtain your API key (format: sk-xxai-xxxxxxxxxx)
  3. Configure in Dify: Set up the API extension in your Dify workspace:

    Navigation path: Workspace Settings → Content Review → API Extension

    API Endpoint URL (hosted platform): https://api.openguardrails.com/v1/dify/moderation

    For self-hosted deployments, call the detection service directly:

    Input moderation: http://your-server:5001/v1/guardrails/input

    Output moderation: http://your-server:5001/v1/guardrails/output

    API Key: sk-xxai-xxxxxxxxxx (can be with or without the 'Bearer' prefix)

  4. Send a test request in Dify to verify OpenGuardrails is working correctly.

Flexible Authentication

OpenGuardrails automatically handles API keys with or without the 'Bearer' prefix. Both 'sk-xxai-xxx' and 'Bearer sk-xxai-xxx' formats are supported.
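The prefix handling can be sketched as a small normalization step. This is an illustrative sketch of the behavior described above, not the platform's actual implementation:

```python
def normalize_api_key(header_value: str) -> str:
    """Strip an optional 'Bearer ' prefix so both accepted key
    formats resolve to the same raw API key."""
    value = header_value.strip()
    if value.lower().startswith("bearer "):
        value = value[len("bearer "):]
    return value

print(normalize_api_key("Bearer sk-xxai-xxxxxxxxxx"))  # sk-xxai-xxxxxxxxxx
print(normalize_api_key("sk-xxai-xxxxxxxxxx"))         # sk-xxai-xxxxxxxxxx
```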

Dify Content Moderation Settings

[Screenshot: Dify moderation settings]

API Extension Configuration

[Screenshot: Dify moderation API extension]

Advantages of Using OpenGuardrails with Dify

  • 🧩 19 major risk categories vs. OpenAI's 6 main categories
  • ⚙️ Customizable risk definitions - redefine meanings and thresholds for your enterprise
  • 📚 Knowledge-based response moderation - contextual and knowledge-aware review
  • 💰 Free and open source - no per-request cost or usage limits
  • 🔒 Privacy-friendly - deploy locally or on private infrastructure

n8n Integration

Integrate OpenGuardrails with n8n workflow automation platform to add AI safety guardrails to your workflows.

Step 1: Create n8n Credential

Before using either integration method, you need to create a Bearer Auth credential in n8n with your OpenGuardrails API key.

1. Go to Create Credential

In n8n, click the dropdown menu next to 'Create workflow' and select 'Create credential'.

[Screenshot: n8n create credential]

2. Select Bearer Auth

In the 'Add new credential' dialog, search for and select 'Bearer Auth'.

[Screenshot: Select Bearer Auth]

3. Get API Key from OpenGuardrails

Log in to the OpenGuardrails platform at https://openguardrails.com/platform/ → Go to Application Management → Use the default application or create a new one → Click the 'View' button in the Actions column.

[Screenshot: Get OpenGuardrails API key]

4. Copy API Key

Click the copy button to copy your API Key.

[Screenshot: Copy API key]

5. Paste API Key in n8n

Return to n8n, paste the API key into the 'Bearer Token' field of the Bearer Auth credential, then click 'Save'.

[Screenshot: Paste API key in n8n]

6. Credential Created

Your OpenGuardrails credential is now created and ready to use in your workflows.

[Screenshot: Credential saved]

Step 2: Choose Integration Method

After creating the credential, you can use either the dedicated OpenGuardrails node (recommended) or the standard HTTP Request node.

Method 1: OpenGuardrails Community Node (Recommended)

Installation

  1. Go to Settings → Community Nodes in your n8n instance
  2. Click Install and enter: n8n-nodes-openguardrails
  3. Click Install and wait for completion

Features

  • Check Content: Validate any user-generated content for safety issues
  • Input Moderation: Protect AI chatbots from prompt attacks and inappropriate input
  • Output Moderation: Ensure AI-generated responses are safe and appropriate
  • Conversation Check: Monitor multi-turn conversations with context awareness

Example Workflow: AI Chatbot with Protection

1. Webhook (receive user message)
2. OpenGuardrails - Input Moderation
3. IF (action = pass)
   → YES: Continue to LLM
   → NO: Return safe response
4. OpenAI Chat
5. OpenGuardrails - Output Moderation
6. IF (action = pass)
   → YES: Return to user
   → NO: Return safe response

Detection Options

  • Enable Security Check: Detect jailbreaks, prompt injection, role manipulation
  • Enable Compliance Check: Check for 18 content safety categories (violence, hate speech, etc.)
  • Enable Data Security: Detect privacy violations, commercial secrets, IP infringement
  • Action on High Risk: Continue with warning / Stop workflow / Use safe response

Method 2: HTTP Request Node

Use n8n's built-in HTTP Request node to call OpenGuardrails API directly.

Configuration Steps

  • Add an HTTP Request node to your workflow
  • Method: POST
  • URL: https://api.openguardrails.com/v1/guardrails
  • Authentication: Select your OpenGuardrails credentials

Request Body Example

{
  "model": "OpenGuardrails-Text",
  "messages": [
    {
      "role": "user",
      "content": "{{ $json.userInput }}"
    }
  ],
  "enable_security": true,
  "enable_compliance": true,
  "enable_data_security": true
}

📦 Import Ready-to-Use Workflows

Check the n8n-integrations/http-request-examples/ folder for pre-built workflow templates including basic content check and chatbot with moderation.

Protection Configuration

Configure detection rules, blacklists/whitelists, response templates, etc. to customize your security strategy.

  • Risk Type Configuration: Enable or disable specific risk detection categories
  • Blacklist/Whitelist: Configure keyword blacklists and whitelists for precise control
  • Response Templates: Customize response content for different risk categories
  • Sensitivity Threshold: Adjust detection strictness to adapt to different scenarios

📚 API Reference

API Overview

OpenGuardrails provides RESTful APIs built with FastAPI. The platform consists of three independent services: Admin Service (Port 5000), Detection Service (Port 5001), and Proxy Service (Port 5002).

Service            Port   Purpose
Admin Service      5000   User management and configuration
Detection Service  5001   Core safety detection APIs
Proxy Service      5002   Transparent security gateway

Authentication

OpenGuardrails uses API Key authentication. Include your API key in the Authorization header using Bearer token format.

Get Your API Key

You can find your API key on the Account Management page.

Authentication Example

# Using cURL
curl -X POST "https://api.openguardrails.com/v1/guardrails" \
  -H "Authorization: Bearer sk-xxai-xxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "OpenGuardrails-Text",
    "messages": [
      {"role": "user", "content": "Test content"}
    ]
  }'

# Using Python requests
import requests

headers = {
    "Authorization": "Bearer sk-xxai-xxxxxxxxxx",
    "Content-Type": "application/json"
}

response = requests.post(
    "https://api.openguardrails.com/v1/guardrails",
    headers=headers,
    json={
        "model": "OpenGuardrails-Text",
        "messages": [{"role": "user", "content": "Test content"}]
    }
)

# Note: For private deployment, replace api.openguardrails.com with your server address

API Endpoints

Key API endpoints for safety detection and management.

POST /v1/guardrails — Full conversation detection

Request Body

{
  "model": "optional-model-name",
  "messages": [
    {
      "role": "user",
      "content": "User message content"
    },
    {
      "role": "assistant",
      "content": "Assistant response"
    }
  ],
  "skip_input_guardrails": false,
  "skip_output_guardrails": false
}

Response Example

{
  "id": "det_xxxxxxxx",
  "result": {
    "compliance": {
      "risk_level": "high_risk",
      "categories": ["Violent Crime"],
      "score": 0.85
    },
    "security": {
      "risk_level": "no_risk",
      "categories": [],
      "score": 0.12
    },
    "data": {
      "risk_level": "no_risk",
      "categories": [],
      "entities": [],
      "score": 0.00
    }
  },
  "overall_risk_level": "high_risk",
  "suggest_action": "Decline",
  "suggest_answer": "Sorry, I cannot answer questions involving violent crime.",
  "score": 0.85
}
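A caller typically branches on overall_risk_level and falls back to suggest_answer when content is risky. A minimal sketch using the example response above (field names as documented; the exact set of suggest_action values may vary by deployment):

```python
import json

# A trimmed copy of the documented /v1/guardrails response
raw = '''{
  "overall_risk_level": "high_risk",
  "suggest_action": "Decline",
  "suggest_answer": "Sorry, I cannot answer questions involving violent crime.",
  "result": {
    "compliance": {"risk_level": "high_risk", "categories": ["Violent Crime"], "score": 0.85}
  }
}'''

detection = json.loads(raw)

if detection["overall_risk_level"] == "no_risk":
    print("pass content through unchanged")
else:
    # Return the suggested safe answer instead of the model output
    print(detection["suggest_answer"])
```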

POST /v1/guardrails/input — Input-only detection

Detect safety risks in user input only (without full conversation context).

{
  "input": "User input text to detect",
  "model": "optional-model-name"
}

POST /v1/guardrails/output — Output-only detection

Detect safety risks in AI model output only.

{
  "output": "Model output text to detect",
  "model": "optional-model-name"
}
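The two single-direction endpoints take a flat payload rather than a messages array. A small helper for building the request (endpoint paths as documented above; the helper itself is illustrative):

```python
def build_single_check(text: str, direction: str = "input",
                       base_url: str = "https://api.openguardrails.com"):
    """Return (url, json_body) for the input- or output-only detection endpoints."""
    if direction not in ("input", "output"):
        raise ValueError("direction must be 'input' or 'output'")
    url = f"{base_url}/v1/guardrails/{direction}"
    body = {direction: text}  # {"input": ...} or {"output": ...}
    return url, body

url, body = build_single_check("User input text to detect")
print(url)   # https://api.openguardrails.com/v1/guardrails/input
print(body)  # {'input': 'User input text to detect'}
```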

GET /api/v1/dashboard/stats — Get statistics

Get dashboard statistics and metrics for detections.

{
  "total_detections": 12450,
  "total_blocked": 342,
  "total_passed": 12108,
  "risk_distribution": {
    "no_risk": 11850,
    "low_risk": 258,
    "medium_risk": 180,
    "high_risk": 162
  }
}
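The counts in this example are self-consistent (blocked + passed = total, and the risk distribution sums to the total), so rates can be derived directly from the response:

```python
# The documented example stats response
stats = {
    "total_detections": 12450,
    "total_blocked": 342,
    "total_passed": 12108,
    "risk_distribution": {"no_risk": 11850, "low_risk": 258,
                          "medium_risk": 180, "high_risk": 162},
}

block_rate = stats["total_blocked"] / stats["total_detections"]
print(f"Block rate: {block_rate:.2%}")  # Block rate: 2.75%

# Sanity checks on the example numbers
assert stats["total_blocked"] + stats["total_passed"] == stats["total_detections"]
assert sum(stats["risk_distribution"].values()) == stats["total_detections"]
```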

Error Handling

All API errors follow a consistent format with HTTP status codes and error details.

Status Code  Meaning                Common Causes
200          Success                Request completed successfully
400          Bad Request            Invalid request parameters or body
401          Unauthorized           Missing or invalid API key
403          Forbidden              Insufficient permissions
429          Too Many Requests      Rate limit exceeded
500          Internal Server Error  Server-side error

Error Response Format

{
  "detail": "Error message description",
  "error_code": "ERROR_CODE",
  "status_code": 400
}
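Of the codes in the table, only 429 and 5xx are transient; the other 4xx codes indicate a problem with the request itself and retrying will not help. A sketch of that decision:

```python
def should_retry(status_code: int) -> bool:
    """Transient errors (rate limit, server-side) can be retried with backoff;
    client errors (bad request, bad key, forbidden) should not be."""
    return status_code == 429 or status_code >= 500

for code in (400, 401, 403, 429, 500):
    print(code, "retry" if should_retry(code) else "do not retry")
```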

📘 Detailed Guide

Detection Capabilities

OpenGuardrails provides multi-dimensional security detection capabilities covering common AI security risks.

Risk Category           Risk Level   Examples
Violent Crime           High Risk    Teaching how to create dangerous items, violent behavior, etc.
Prompt Attack           High Risk    Malicious prompts attempting to bypass security mechanisms
Illegal Activities      Medium Risk  Inciting illegal behavior, criminal methods, etc.
Discriminatory Content  Low Risk     Racial, gender, religious discrimination, etc.

Usage Modes

API Call Mode

Developers actively call detection API for security checks.

  • Precise control over detection timing
  • Custom processing logic
  • Support for batch detection

Security Gateway Mode

Transparent reverse proxy - zero code changes to add security protection to AI applications.

  • Zero-code integration, only config changes needed
  • Automatic input/output detection
  • Support for multiple upstream AI models

Client Libraries

OpenGuardrails provides client libraries in multiple programming languages for quick integration.

Python — Using Python client library

# Synchronous usage
from openguardrails import OpenGuardrails

client = OpenGuardrails("sk-xxai-xxxxxxxxxx")
response = client.check_prompt("test content")

# Asynchronous usage
import asyncio
from openguardrails import AsyncOpenGuardrails

async def main():
    async with AsyncOpenGuardrails("sk-xxai-xxxxxxxxxx") as client:
        response = await client.check_prompt("test content")

asyncio.run(main())

Node.js — Using Node.js client library

const { OpenGuardrails } = require('openguardrails');

const client = new OpenGuardrails('sk-xxai-xxxxxxxxxx');

async function checkContent() {
    const response = await client.checkPrompt('test content');
    console.log(response.suggest_action);
}

checkContent();

Java — Using Java client library

import com.openguardrails.OpenGuardrails;
import com.openguardrails.model.CheckResponse;

public class Example {
    public static void main(String[] args) {
        OpenGuardrails client = new OpenGuardrails("sk-xxai-xxxxxxxxxx");
        CheckResponse response = client.checkPrompt("test content");
        System.out.println(response.getSuggestAction());
    }
}

Go — Using Go client library

package main

import (
    "fmt"
    "github.com/openguardrails/openguardrails-go"
)

func main() {
    client := openguardrails.NewClient("sk-xxai-xxxxxxxxxx")
    response, err := client.CheckPrompt("test content")
    if err != nil {
        panic(err) // handle the error instead of discarding it
    }
    fmt.Println(response.SuggestAction)
}

Multimodal Detection

Support for text and image content safety detection using the same risk classification standards.

Image Detection Capability

Use AI models to analyze image content for safety, supporting both base64 encoding and image URL formats.

Image Detection Example

import base64
from openguardrails import OpenGuardrails

client = OpenGuardrails("sk-xxai-xxxxxxxxxx")

with open("image.jpg", "rb") as f:
    image_base64 = base64.b64encode(f.read()).decode("utf-8")

response = client.check_messages([
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Is this image safe?"},
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/jpeg;base64,{image_base64}"}
            }
        ]
    }
])

print(f"Risk Level: {response.overall_risk_level}")

Data Leak Prevention

Identify and mask sensitive information using regular expressions to prevent personal or corporate data leakage.

Supported Data Types

  • ID Card Number
  • Phone Number
  • Email Address
  • Bank Card Number
  • Passport Number
  • IP Address

Masking Methods

  • Replace with placeholder (e.g., <PHONE_NUMBER_SYS>)
  • Partial masking (e.g., 139****5678)
  • SHA256 hash encryption
  • Encryption processing
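The first two masking styles can be illustrated with a regex over phone numbers. The pattern and placeholder below are illustrative; the platform's built-in patterns and placeholder names are configured per application:

```python
import re

# Illustrative pattern for 11-digit mobile numbers starting with 1
PHONE = re.compile(r"\b1\d{10}\b")

def mask_placeholder(text: str) -> str:
    # Style 1: replace the whole match with a placeholder token
    return PHONE.sub("<PHONE_NUMBER_SYS>", text)

def mask_partial(text: str) -> str:
    # Style 2: keep the first 3 and last 4 digits, e.g. 139****5678
    return PHONE.sub(lambda m: m.group()[:3] + "****" + m.group()[-4:], text)

sample = "Contact me at 13912345678."
print(mask_placeholder(sample))  # Contact me at <PHONE_NUMBER_SYS>.
print(mask_partial(sample))      # Contact me at 139****5678.
```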

Ban Policy

Intelligently identifies and defends against persistent prompt injection attacks, automatically banning malicious users.

Automatic Ban Mechanism

Monitors user high-risk behavior in real time over a sliding time window and automatically triggers bans when the configured conditions are met.

  • Trigger risk level: Only risks reaching the specified level are recorded
  • Trigger count: A ban is triggered after this many violations within the time window
  • Time window: The time range over which violations are counted
  • Ban duration: Temporary or permanent ban
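The mechanism above can be sketched as a per-user sliding window. Parameter names here are illustrative, not the platform's actual configuration keys:

```python
from collections import defaultdict, deque

class SlidingWindowBan:
    """Ban a user after `trigger_count` recorded violations within `window` seconds."""

    def __init__(self, trigger_count=3, window=60.0, ban_duration=3600.0):
        self.trigger_count = trigger_count
        self.window = window
        self.ban_duration = ban_duration
        self.violations = defaultdict(deque)  # user -> timestamps of recorded risks
        self.banned_until = {}                # user -> ban expiry timestamp

    def is_banned(self, user, now):
        return self.banned_until.get(user, 0.0) > now

    def record_violation(self, user, now):
        """Record a high-risk event; return True if it triggers a ban."""
        window = self.violations[user]
        window.append(now)
        while window and window[0] <= now - self.window:
            window.popleft()  # drop events that fell out of the sliding window
        if len(window) >= self.trigger_count:
            self.banned_until[user] = now + self.ban_duration
            return True
        return False

policy = SlidingWindowBan(trigger_count=3, window=60.0)
for t in (0.0, 10.0, 20.0):
    banned = policy.record_violation("user-1", t)
print(banned)                            # True: third violation within 60s
print(policy.is_banned("user-1", 30.0))  # True
```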

Knowledge Base Response

Vector similarity-based intelligent Q&A matching that prioritizes answers from the knowledge base when risks are detected.

Knowledge Base Features

  • Vector similarity search intelligently matches the most relevant questions
  • When risks are detected, the knowledge base is automatically searched for similar questions and the corresponding answers are returned
  • Supports user-level and global knowledge bases; admins can configure global knowledge bases

Knowledge Base File Format (JSONL)

{"questionid": "q1", "question": "What is AI?", "answer": "AI is artificial intelligence..."}
{"questionid": "q2", "question": "How to protect privacy?", "answer": "Use encryption..."}

Sensitivity Configuration

OpenGuardrails provides three-tier sensitivity configuration to adapt to different usage scenarios.

Sensitivity Level   Threshold  Use Case
High Sensitivity    ≥ 0.40     Specific sensitive periods or scenarios, highest coverage
Medium Sensitivity  ≥ 0.60     Default configuration, balance accuracy and coverage
Low Sensitivity     ≥ 0.95     Automated pipelines, highest accuracy
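Applying the table is a simple threshold comparison: content is flagged only when its risk score reaches the threshold of the configured level. For example, the 0.85 compliance score from the detection response example above would be flagged at high and medium sensitivity but not at low:

```python
# Thresholds from the sensitivity table
THRESHOLDS = {"high": 0.40, "medium": 0.60, "low": 0.95}

def is_flagged(score: float, sensitivity: str = "medium") -> bool:
    """Flag content when its risk score reaches the configured threshold."""
    return score >= THRESHOLDS[sensitivity]

score = 0.85  # e.g. the compliance score from the /v1/guardrails response example
for level in ("high", "medium", "low"):
    print(level, is_flagged(score, level))
# high True / medium True / low False
```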