#1 OpenClaw security plugin — protect your OpenClaw with real-time defense against prompt injection, data leaks, and dangerous actions.
-
Updated
Mar 5, 2026 - TypeScript
#1 OpenClaw security plugin — protect your OpenClaw with real-time defense against prompt injection, data leaks, and dangerous actions.
Review tool for online safety; provides a dashboard, queues, routing and automatic enforcement rules, and integrations.
An intelligent task management assistant built with .NET, Next.js, Microsoft Agent Framework, AG-UI protocol, and Azure OpenAI, demonstrating Clean Architecture and autonomous AI agent capabilities
NudeDetect is a Python-based tool for detecting nudity and adult content in images. This project combines the capabilities of the NudeNet library, EasyOCR for text detection, and the Better Profanity library for identifying offensive language in text.
A JavaScript-based content safety system designed to detect and filter sensitive media in real-time, ensuring platform compliance and user protection.
Step-by-Step tutorial that teaches you how to use Azure Safety Content - the prebuilt AI service that helps ensure that content sent to user is filtered to safeguard them from risky or undesirable outcomes
🔍 Benchmark jailbreak resilience in LLMs with JailBench for clear insights and improved model defenses against jailbreak attempts.
Benchmark LLM jailbreak resilience across providers with standardized tests, adversarial mode, rich analytics, and a clean Web UI.
Technical presentations with hands-on demos
Production-Grade LLM Alignment Engine (TruthProbe + ADT)
A Chrome extension that uses Claude AI to protect users under 18 from inappropriate content by analyzing webpage content in real-time.
Content moderation (text and image) in a social network demo
Study Buddy is a user-friendly AI-powered web app that helps students generate safe, factual study notes and Q&A on any topic. It features user accounts, study history, and strong content safety filters—making learning interactive and secure.
SentinelShield: Advanced AI content moderation combining Llama Prompt Guard 2, rule-based filtering, and real-time analysis. Protect your applications from harmful content, prompt injection attacks, and inappropriate material with sub-second response times.
profanity checker text moderation
Public app demo showing LLOYD working with GPT-2.
Context hygiene & risk adjudication for LLM pipelines: secrets, PII, prompt-injection, policy redaction & tokenization.
Azure safety content example using python for text analysis
Responsible AI toolkit for LLM applications: PII/PHI redaction, prompt injection detection, bias scoring, content safety filters, and output validation. Framework-agnostic Python library with FastAPI demo.
Impact Analyzer is a web app that helps you detect toxicity and analyze nuance in your writing before publishing, ensuring your content is respectful, clear, and aligned with your intent.
Add a description, image, and links to the content-safety topic page so that developers can more easily learn about it.
To associate your repository with the content-safety topic, visit your repo's landing page and select "manage topics."