ai-reliability
Here are 23 public repositories matching this topic...
Architectural standards and best practices for building reliable AI Agents and LLM workflows. Defining the framework for AI Reliability Engineering (AIRE).
Updated Feb 14, 2026 - Dockerfile
The open-source AgentOps evaluation harness.
Updated Mar 10, 2026 - Python
UAICP (Universal Agentic Interoperability Control Protocol): open reliability contract for AI agent workflows with evidence gating, policy controls, and auditability.
Updated Feb 27, 2026 - TypeScript
Research archive — four published papers, Mahdi Ledger, and empirical foundations of the LC-OS governance framework.
Updated Mar 1, 2026
SpecGuard is a command-line tool that turns AI safety policies and behavioral guidelines into executable tests. Think of it as unit testing for your AI's output. Instead of trusting that your AI will follow the rules defined in a document, SpecGuard enforces them.
Updated Jan 21, 2026 - Python
Continuity Keys: tests for “same someone” returns. Behavioral identity consistency under pressure. Origin (Alyssa Solen) ↔ Continuum.
Updated Jan 4, 2026
ModelPulse helps maintain model reliability and performance by providing early warning signals for emerging issues, allowing teams to address them before they significantly impact users.
Updated Jan 20, 2026 - Python
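An "early warning signal" of the kind ModelPulse describes can be as simple as comparing a recent window of a quality metric against a healthy baseline. A minimal sketch, assuming a scalar metric such as accuracy; the function name, tolerance, and data are illustrative, not ModelPulse's implementation:

```python
from statistics import mean

def drift_alert(baseline: list[float], recent: list[float], tolerance: float = 0.05) -> bool:
    """Flag when the recent average of a quality metric falls more than
    `tolerance` below the baseline average -- a crude early-warning signal."""
    return mean(baseline) - mean(recent) > tolerance

baseline_scores = [0.92, 0.94, 0.91, 0.93]  # accuracy during a known-good period
recent_scores = [0.85, 0.83, 0.86, 0.84]    # scores from the latest window
alert = drift_alert(baseline_scores, recent_scores)  # True: the metric has degraded
```

Production systems typically layer statistical tests and per-segment breakdowns on top of this, but the core signal is the same: recent behavior diverging from an established baseline.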
Production-grade TypeScript AI runtime focused on reliability, governance, and reproducible LLM systems. Multi-provider gateway, agents, RAG, workflows, policy engine, audit trails, and deterministic testing — built for teams shipping AI in production.
Updated Mar 10, 2026 - TypeScript
Precision-Matched Intelligence. The LLM Capability Framework (LCF) is a structural standard for mapping specific task requirements to the most effective model architecture, token strategy, and reasoning depth.
Updated Feb 22, 2026
A production-ready framework for evaluating LLM reliability using semantic consistency, vulnerability scoring, and risk-aware trust calibration.
Updated Feb 12, 2026 - Jupyter Notebook
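"Semantic consistency" evaluation usually means sampling the same prompt several times and scoring how much the answers agree. A minimal sketch of that loop, using token-set Jaccard overlap as a stand-in for a real semantic similarity model (the function names and threshold interpretation are assumptions, not this repository's method):

```python
from itertools import combinations

def jaccard(a: str, b: str) -> float:
    """Token-set overlap: a crude stand-in for semantic similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0

def consistency_score(samples: list[str]) -> float:
    """Mean pairwise similarity across repeated answers to the same prompt.
    Low scores suggest the model is unstable on this input."""
    pairs = list(combinations(samples, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

answers = ["Paris is the capital of France.",
           "The capital of France is Paris.",
           "France's capital is Paris."]
score = consistency_score(answers)  # near 1.0 means stable, near 0.0 means erratic
```

Real frameworks replace the Jaccard proxy with embedding similarity or an NLI model, but the sampling-and-agreement structure is the same.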
A conceptual AI architecture for reducing hallucinations by enforcing invariant, source-anchored knowledge constraints during generation.
Updated Jan 24, 2026 - Python
The Guard at the Door - Principle-based AI action blocking
Updated Mar 10, 2026 - TypeScript
A multi-agent cognitive architecture solving the LLM state-dependency problem with persistent memory and a mandatory self-correction loop. It is built on a more profound, biologically resonant principle: memory is an active component of intelligence itself.
Updated Aug 23, 2025
The Open Source Trust Layer for RAG: Adaptive scoring, decay math, and persistent trust ledgers for more reliable LLM applications
Updated Mar 9, 2026 - Python
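The "decay math" this trust layer mentions is not specified in the listing; one common form is exponential half-life decay, where a source's trust score halves after a fixed interval unless refreshed. A minimal sketch under that assumption (the function name and one-week default are illustrative):

```python
def decayed_trust(initial: float, age_seconds: float, half_life: float = 7 * 86400) -> float:
    """Exponential decay: a source's trust score halves every `half_life`
    seconds since it was last verified or refreshed."""
    return initial * 0.5 ** (age_seconds / half_life)

# A week-old score of 0.8 has decayed to 0.4; a fresh one is untouched.
week = 7 * 86400
stale = decayed_trust(0.8, week)   # 0.4
fresh = decayed_trust(0.8, 0)      # 0.8
```

A persistent trust ledger would then store `(source, initial, last_verified)` tuples and recompute the decayed score at query time, so retrieval can prefer recently validated sources.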
Define reliable AI agent workflows with UAICP, a protocol offering controlled delivery, policy checks, rollbacks, and traceable multi-agent actions.
Updated Sep 10, 2025
THE THEMIS PROJECT — Phase I: Reconciling human and digital ontologies. Semantic validity architecture establishing the telos of AGI. For more user-friendly information, go to: https://echosphere.io
Updated Jan 30, 2026 - Python
Reliability-first AI debriefing demo with traceable outputs, schema validation, and evaluation scoring.
Updated Mar 9, 2026
Control Plane for AI Decision Reliability
Updated Feb 17, 2026 - Python
Reliability testbed for LLM pipelines: RAG, tool-calling, red-teaming, drift monitoring, and deterministic evaluation.
Updated Mar 9, 2026 - Python