ai-reliability
Here are 23 public repositories matching this topic...
Architectural standards and best practices for building reliable AI Agents and LLM workflows. Defining the framework for AI Reliability Engineering (AIRE).
Updated Feb 14, 2026 - Dockerfile
The open-source AgentOps evaluation harness.
Updated Mar 10, 2026 - Python
UAICP (Universal Agentic Interoperability Control Protocol): open reliability contract for AI agent workflows with evidence gating, policy controls, and auditability.
Updated Feb 27, 2026 - TypeScript
Research archive — four published papers, Mahdi Ledger, and empirical foundations of the LC-OS governance framework.
Updated Mar 1, 2026
SpecGuard is a command-line tool that turns AI safety policies and behavioral guidelines into executable tests. Think of it as unit testing for your AI's output. Instead of trusting that your AI will follow the rules defined in a document, SpecGuard enforces them.
Updated Jan 21, 2026 - Python
Continuity Keys: tests for “same someone” returns. Behavioral identity consistency under pressure. Origin (Alyssa Solen) ↔ Continuum.
Updated Jan 4, 2026
ModelPulse helps maintain model reliability and performance by providing early warning signals for emerging issues, allowing teams to address them before they significantly impact users.
Updated Jan 20, 2026 - Python
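An "early warning signal" of the kind ModelPulse describes can be as simple as comparing a recent window of a quality metric against a healthy baseline. A minimal sketch, assuming a scalar metric such as accuracy; the function name, tolerance, and data are illustrative, not ModelPulse's implementation:

```python
from statistics import mean

def drift_alert(baseline: list[float], recent: list[float], tolerance: float = 0.05) -> bool:
    """Flag when the recent average of a quality metric falls more than
    `tolerance` below the baseline average -- a crude early-warning signal."""
    return mean(baseline) - mean(recent) > tolerance

baseline_scores = [0.92, 0.94, 0.91, 0.93]  # accuracy during a known-good period
recent_scores = [0.85, 0.83, 0.86, 0.84]    # scores from the latest window
alert = drift_alert(baseline_scores, recent_scores)  # True: the metric has degraded
```

Production systems typically layer statistical tests and per-segment breakdowns on top of this, but the core signal is the same: recent behavior diverging from an established baseline.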
Production-grade TypeScript AI runtime focused on reliability, governance, and reproducible LLM systems. Multi-provider gateway, agents, RAG, workflows, policy engine, audit trails, and deterministic testing — built for teams shipping AI in production.
Updated Mar 10, 2026 - TypeScript
Precision-Matched Intelligence. The LLM Capability Framework (LCF) is a structural standard for mapping specific task requirements to the most effective model architecture, token strategy, and reasoning depth.
Updated Feb 22, 2026
A production-ready framework for evaluating LLM reliability using semantic consistency, vulnerability scoring, and risk-aware trust calibration.
Updated Feb 12, 2026 - Jupyter Notebook
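"Semantic consistency" evaluation usually means sampling the same prompt several times and scoring how much the answers agree. A minimal sketch of that loop, using token-set Jaccard overlap as a stand-in for a real semantic similarity model (the function names and threshold interpretation are assumptions, not this repository's method):

```python
from itertools import combinations

def jaccard(a: str, b: str) -> float:
    """Token-set overlap: a crude stand-in for semantic similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0

def consistency_score(samples: list[str]) -> float:
    """Mean pairwise similarity across repeated answers to the same prompt.
    Low scores suggest the model is unstable on this input."""
    pairs = list(combinations(samples, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

answers = ["Paris is the capital of France.",
           "The capital of France is Paris.",
           "France's capital is Paris."]
score = consistency_score(answers)  # near 1.0 means stable, near 0.0 means erratic
```

Real frameworks replace the Jaccard proxy with embedding similarity or an NLI model, but the sampling-and-agreement structure is the same.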
A conceptual AI architecture for reducing hallucinations by enforcing invariant, source-anchored knowledge constraints during generation.
Updated Jan 24, 2026 - Python
The Guard at the Door - Principle-based AI action blocking
Updated Mar 10, 2026 - TypeScript
A multi-agent cognitive architecture solving the LLM state-dependency problem with persistent memory and a mandatory self-correction loop. It is built on a more profound, biologically resonant principle: memory is an active component of intelligence itself.
Updated Aug 23, 2025
The Open Source Trust Layer for RAG: Adaptive scoring, decay math, and persistent trust ledgers for more reliable LLM applications
Updated Mar 9, 2026 - Python
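The "decay math" this trust layer mentions is not specified in the listing; one common form is exponential half-life decay, where a source's trust score halves after a fixed interval unless refreshed. A minimal sketch under that assumption (the function name and one-week default are illustrative):

```python
def decayed_trust(initial: float, age_seconds: float, half_life: float = 7 * 86400) -> float:
    """Exponential decay: a source's trust score halves every `half_life`
    seconds since it was last verified or refreshed."""
    return initial * 0.5 ** (age_seconds / half_life)

# A week-old score of 0.8 has decayed to 0.4; a fresh one is untouched.
week = 7 * 86400
stale = decayed_trust(0.8, week)   # 0.4
fresh = decayed_trust(0.8, 0)      # 0.8
```

A persistent trust ledger would then store `(source, initial, last_verified)` tuples and recompute the decayed score at query time, so retrieval can prefer recently validated sources.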
Define reliable AI agent workflows with UAICP, a protocol offering controlled delivery, policy checks, rollbacks, and traceable multi-agent actions.
Updated Sep 10, 2025
THE THEMIS PROJECT — Phase I: Reconciling human and digital ontologies. Semantic validity architecture establishing the telos of AGI. For more user-friendly information, go to: https://echosphere.io
Updated Jan 30, 2026 - Python
Reliability-first AI debriefing demo with traceable outputs, schema validation, and evaluation scoring.
Updated Mar 9, 2026
Control Plane for AI Decision Reliability
Updated Feb 17, 2026 - Python
Reliability testbed for LLM pipelines: RAG, tool-calling, red-teaming, drift monitoring, and deterministic evaluation.
Updated Mar 9, 2026 - Python