Universal Iterative Self-Correction Moderation Pipeline for Vision-Language Models.
This module provides a comprehensive moderation system that combines:
- Shield: GPT-5 for classification and action guidance
- Generate: Any VLM (LLaVA, Qwen, Llama, etc.) using ModelWrapper
- Reflect: LangChain for safety evaluation and reflection
The pipeline follows: Input → Shield (GPT-5) → VLM (ModelWrapper) → Reflect (LangChain) → iterate until safe.

Key features:
- Multi-model VLM support (LLaVA, Qwen, Llama, GPT-5-mini)
- Iterative safety refinement with reflection
- Configurable generation parameters
- Comprehensive result tracking and timing
- Safety classification and convergence analysis
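The Shield → Generate → Reflect loop described above can be sketched as follows. This is a minimal, self-contained illustration, not the project's actual API: `shield_classify`, `vlm_generate`, `reflect`, and `moderate` are hypothetical stubs standing in for the real GPT-5 shield, ModelWrapper-backed VLM, and LangChain reflector.

```python
# Illustrative sketch of the iterative self-correction loop.
# All function names here are stand-ins, not the real project APIs.

def shield_classify(prompt: str) -> dict:
    """Stub shield: block prompts containing a flagged keyword."""
    blocked = "attack" in prompt.lower()
    return {"action": "block" if blocked else "allow"}

def vlm_generate(prompt: str, feedback: str = None) -> str:
    """Stub VLM: appends a 'revised' marker when given reflection feedback."""
    base = f"Answer to: {prompt}"
    return base + " [revised]" if feedback else base

def reflect(response: str) -> dict:
    """Stub reflector: a response counts as safe once it has been revised."""
    safe = "[revised]" in response
    return {"safe": safe, "feedback": None if safe else "soften the response"}

def moderate(prompt: str, max_iters: int = 3) -> dict:
    """Run Input -> Shield -> VLM -> Reflect, iterating until safe."""
    if shield_classify(prompt)["action"] == "block":
        return {"response": "Request refused by shield.", "iterations": 0}
    response = vlm_generate(prompt)
    for i in range(1, max_iters + 1):
        verdict = reflect(response)
        if verdict["safe"]:
            return {"response": response, "iterations": i}
        response = vlm_generate(prompt, feedback=verdict["feedback"])
    return {"response": response, "iterations": max_iters}

result = moderate("Describe this image")
print(result["iterations"])  # 2: the first reflection flags it, the revision passes
```

The `max_iters` cap bounds the refinement loop so a response that never converges is still returned after a fixed number of reflection rounds.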
Usage:
    python agentic_moderation.py \
        --dataset path/to/dataset.json \
        --vlm-model llava-1.5 \
        --max-samples 100

Note: This module requires the parent project's dependencies, including:
- ModelWrapper classes
- Shield functionality
- Evaluator functions
- Utility functions
See the main project repository for complete setup instructions.
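For reference, the command-line flags shown in the usage example could be parsed with an `argparse` setup like the sketch below. The defaults and help strings here are assumptions; the real script's option definitions may differ.

```python
# Hypothetical argument parser matching the CLI flags shown in the usage
# example (--dataset, --vlm-model, --max-samples); the real script may differ.
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Iterative self-correction moderation pipeline for VLMs"
    )
    parser.add_argument("--dataset", required=True,
                        help="Path to the evaluation dataset (JSON)")
    parser.add_argument("--vlm-model", default="llava-1.5",
                        help="VLM backend, e.g. llava-1.5, qwen, llama, gpt-5-mini")
    parser.add_argument("--max-samples", type=int, default=None,
                        help="Cap on the number of samples to process")
    return parser

# argparse converts --vlm-model / --max-samples to vlm_model / max_samples.
args = build_parser().parse_args(
    ["--dataset", "data.json", "--max-samples", "100"]
)
print(args.vlm_model, args.max_samples)  # llava-1.5 100
```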