The Problem
On January 15, 2026 (commit 77df07b), a breaking refactor removed the `gaia.llm.llm_client` module. Documentation wasn't updated, leaving 6+ files with broken imports for 4 days.
What users saw:
```python
# From docs (BROKEN):
from gaia.llm.llm_client import LLMClient  # ❌ ModuleNotFoundError

# What actually works:
from gaia.llm import LLMClient  # ✅ Correct
```

Root cause:
- No single source of truth for correct imports
- No automated validation of documentation code examples
- Import tests were minimal (only 4 modules)
The Solution
Implement a 2-tool synchronization framework:
- Canonical Import Registry - Single source of truth for public API imports
- Documentation Validator - Automated validation of all code examples in `.mdx` files
Both tools work together to ensure documentation always matches code.
Current State
The related PR already:
- ✅ Fixed 6 documentation files with broken imports
- ✅ Enhanced both `util/lint.ps1` and `util/lint.py` with 31 comprehensive import tests (up from 4)
- ✅ Fixed 3 source files using old import patterns
- ✅ Added VLMClient export to `gaia/llm/__init__.py`
- ✅ Created `src/gaia/agents/blender/__init__.py` to export BlenderAgent
- ✅ Fixed 6 Pylint warnings in Blender agent code
This issue: Add an automated validation framework driven by the canonical registry to prevent future regressions.
Important: Missing `__all__` Declarations
Current state of modules:
- ✅ `gaia/llm/__init__.py` - HAS `__all__`
- ✅ `gaia/database/__init__.py` - HAS `__all__`
- ✅ `gaia/utils/__init__.py` - HAS `__all__`
- ❌ `gaia/chat/sdk.py` - NO `__all__` (module file, not package)
- ❌ `gaia/rag/sdk.py` - NO `__all__` (module file, not package)
- ❌ `gaia/agents/base/__init__.py` - NO `__all__` (has imports but missing declaration)
- ❌ Most agent packages - NO `__init__.py` at all
Impact: The implementation must either:
- Add `__all__` declarations to modules that don't have them (additional scope)
- Handle modules gracefully that lack `__all__` (validator detects and reports)
Recommended approach: Add __all__ as part of this issue to ensure clean public API surface.
Objectives
- Add `__all__` declarations to modules that lack them (gaia.chat.sdk, gaia.rag.sdk, gaia.agents.base)
- Create `CANONICAL_IMPORTS.json` - Registry of all public SDK imports
- Create `validate_canonical_imports.py` - Validates `__init__.py` files match registry
- Create `validate_docs.py` - Validates all documentation code snippets
- Update `util/lint.py` to use registry (matching lint.ps1's 31 tests)
- Integrate both validators into CI/CD
- Document the framework in dev docs
Implementation Steps
Step 0: Add Missing `__all__` Declarations (30 minutes)
- Before creating the registry, ensure all modules have proper `__all__` declarations.
Files to update:
1. src/gaia/chat/sdk.py - Add at end of file (after all class/function definitions):
```python
__all__ = [
    "ChatSDK",
    "ChatConfig",
    "ChatSession",
    "ChatResponse",
    "SimpleChat",
    "quick_chat",
    "quick_chat_with_memory",
]
```

2. src/gaia/rag/sdk.py - Add at end of file:

```python
__all__ = [
    "RAGSDK",
    "RAGConfig",
    "quick_rag",
]
```

3. src/gaia/agents/base/__init__.py - Add after imports:

```python
__all__ = [
    "Agent",
    "MCPAgent",
    "tool",
    "_TOOL_REGISTRY",
]
```

Validation:

```shell
# Verify imports still work after adding __all__
python -c "from gaia.chat.sdk import ChatSDK; print('OK')"
python -c "from gaia.rag.sdk import RAGSDK; print('OK')"
python -c "from gaia.agents.base import Agent; print('OK')"
```

Note: This step formalizes the public API surface and makes subsequent validation possible.
Step 1: Create Canonical Import Registry (30 minutes)
- Create canonical import registry
File: src/gaia/CANONICAL_IMPORTS.json
Content:
```json
{
  "version": "1.0",
  "description": "Canonical import paths for GAIA public SDK. Single source of truth for correct imports.",
  "canonical_imports": {
    "gaia.llm": {
      "exports": ["LLMClient", "VLMClient", "create_client", "NotSupportedError"],
      "description": "LLM client interfaces for local and cloud providers"
    },
    "gaia.chat.sdk": {
      "exports": ["ChatSDK", "ChatConfig", "ChatSession", "ChatResponse", "SimpleChat", "quick_chat", "quick_chat_with_memory"],
      "description": "Chat SDK with memory and RAG support"
    },
    "gaia.rag.sdk": {
      "exports": ["RAGSDK", "RAGConfig", "quick_rag"],
      "description": "Document retrieval and Q&A"
    },
    "gaia.agents.base": {
      "exports": ["Agent", "MCPAgent", "tool"],
      "description": "Base agent system and decorators"
    },
    "gaia.agents.chat": {
      "exports": ["ChatAgent"],
      "description": "Chat agent with RAG capabilities"
    },
    "gaia.agents.code": {
      "exports": ["CodeAgent"],
      "description": "Code generation agent"
    },
    "gaia.agents.jira": {
      "exports": ["JiraAgent"],
      "description": "Jira integration agent"
    },
    "gaia.agents.docker": {
      "exports": ["DockerAgent"],
      "description": "Docker management agent"
    },
    "gaia.agents.blender": {
      "exports": ["BlenderAgent"],
      "description": "Blender 3D automation agent"
    },
    "gaia.agents.emr": {
      "exports": ["MedicalIntakeAgent"],
      "description": "Medical form processing agent"
    },
    "gaia.agents.routing": {
      "exports": ["RoutingAgent"],
      "description": "Intelligent agent selection and routing"
    },
    "gaia.database": {
      "exports": ["DatabaseAgent", "DatabaseMixin", "temp_db"],
      "description": "Database integration and ORM mixin"
    },
    "gaia.utils": {
      "exports": ["FileWatcher", "FileWatcherMixin"],
      "description": "Utility classes and helpers"
    }
  },
  "internal_imports": {
    "description": "These are internal implementation details, not part of public API",
    "whitelist": [
      "gaia.llm.vlm_client.detect_image_mime_type",
      "gaia.llm.lemonade_client.LemonadeClient",
      "gaia.llm.lemonade_client.DEFAULT_MODEL_NAME"
    ]
  }
}
```

Note: The `internal_imports` section allows documentation to show advanced examples using internal helpers when appropriate (e.g., tutorials showing integration with Lemonade server directly).
Helper script to generate from current code:
```python
#!/usr/bin/env python3
"""Generate CANONICAL_IMPORTS.json from current __init__.py files."""
# util/generate_canonical_imports.py (optional helper)
import ast
import json
from pathlib import Path


def extract_all(init_path: Path):
    """Extract __all__ from __init__.py, or None if absent/unparseable."""
    if not init_path.exists():
        return None
    try:
        with open(init_path) as f:
            tree = ast.parse(f.read())
    except (OSError, SyntaxError):
        return None
    for node in ast.walk(tree):
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == "__all__":
                    if isinstance(node.value, ast.List):
                        return [
                            elt.value
                            for elt in node.value.elts
                            if isinstance(elt, ast.Constant)
                        ]
    return None


# Scan known modules
modules = [
    "gaia.llm",
    "gaia.chat.sdk",
    "gaia.rag.sdk",
    "gaia.agents.base",
    "gaia.agents.chat",
    "gaia.agents.code",
    "gaia.agents.jira",
    "gaia.agents.docker",
    "gaia.agents.blender",
    "gaia.agents.emr",
    "gaia.agents.routing",
    "gaia.database",
    "gaia.utils",
]

canonical = {}
for module in modules:
    parts = module.split(".")
    init_path = Path("src").joinpath(*parts) / "__init__.py"
    exports = extract_all(init_path)
    if exports:
        canonical[module] = {"exports": exports, "description": "TODO: Add description"}

print(json.dumps({"version": "1.0", "canonical_imports": canonical}, indent=2))
```

Usage:

```shell
python util/generate_canonical_imports.py > src/gaia/CANONICAL_IMPORTS.json
# Then manually add descriptions
```

Step 2: Create Registry Validator (30 minutes)
- Create registry validator
File: util/validate_canonical_imports.py
Click to expand full implementation (120 lines)
```python
#!/usr/bin/env python3
# Copyright(C) 2025-2026 Advanced Micro Devices, Inc. All rights reserved.
# SPDX-License-Identifier: MIT
"""
Validate that __init__.py __all__ declarations match CANONICAL_IMPORTS.json

This ensures the registry stays in sync with actual code exports.
"""
import ast
import json
import sys
from pathlib import Path
from typing import Dict, List, Optional, Set


def load_canonical(json_path: Path) -> Dict:
    """Load canonical imports registry."""
    if not json_path.exists():
        print(f"❌ Error: {json_path} not found")
        print("Create CANONICAL_IMPORTS.json first")
        sys.exit(1)
    try:
        with open(json_path) as f:
            data = json.load(f)
    except json.JSONDecodeError as e:
        print(f"❌ Error: Invalid JSON in {json_path}")
        print(f"   {e}")
        sys.exit(1)
    if "canonical_imports" not in data:
        print(f"❌ Error: Missing 'canonical_imports' key in {json_path}")
        sys.exit(1)
    return data["canonical_imports"]


def get_init_all(init_path: Path) -> Optional[Set[str]]:
    """Extract __all__ from __init__.py file."""
    if not init_path.exists():
        return None
    try:
        with open(init_path) as f:
            tree = ast.parse(f.read())
    except SyntaxError as e:
        print(f"⚠️ Warning: Could not parse {init_path}: {e}")
        return None
    for node in ast.walk(tree):
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == "__all__":
                    if isinstance(node.value, ast.List):
                        return {
                            elt.value
                            for elt in node.value.elts
                            if isinstance(elt, ast.Constant)
                        }
    return None


def validate_module(module_path: str, spec: Dict, src_path: Path) -> List[str]:
    """
    Validate a single module's __init__.py against spec.

    Args:
        module_path: Module path like "gaia.llm"
        spec: Registry spec with exports list
        src_path: Path to src/gaia

    Returns:
        List of error messages (empty if valid)
    """
    errors = []
    # Convert module path to file path (e.g., gaia.llm -> src/gaia/llm/__init__.py)
    parts = module_path.split(".")
    init_path = src_path.parent.joinpath(*parts) / "__init__.py"
    if not init_path.exists():
        errors.append(f"❌ {module_path}: Missing __init__.py at {init_path}")
        return errors
    actual_all = get_init_all(init_path)
    expected = set(spec["exports"])
    if actual_all is None:
        errors.append(f"❌ {module_path}: Missing __all__ declaration in {init_path}")
    elif actual_all != expected:
        missing = expected - actual_all
        extra = actual_all - expected
        if missing:
            errors.append(
                f"❌ {module_path}: Missing exports in __all__: {sorted(missing)}"
            )
        if extra:
            errors.append(
                f"⚠️ {module_path}: Extra exports not in canonical: {sorted(extra)}"
            )
    return errors


def main():
    """Run validation."""
    project_root = Path(__file__).parent.parent
    canonical_path = project_root / "src" / "gaia" / "CANONICAL_IMPORTS.json"
    src_path = project_root / "src" / "gaia"

    print("=" * 70)
    print("Canonical Import Registry Validator")
    print("=" * 70)
    print()

    canonical = load_canonical(canonical_path)
    print(f"✓ Loaded {len(canonical)} canonical modules")
    print()

    all_errors = []
    for module_path, spec in canonical.items():
        errors = validate_module(module_path, spec, src_path)
        all_errors.extend(errors)

    if all_errors:
        print("=" * 70)
        print("VALIDATION FAILED")
        print("=" * 70)
        print()
        for error in all_errors:
            print(error)
        print()
        print("Update __init__.py files or CANONICAL_IMPORTS.json to match")
        sys.exit(1)
    else:
        print("=" * 70)
        print("✅ SUCCESS: All __init__.py files match canonical registry")
        print("=" * 70)
        sys.exit(0)


if __name__ == "__main__":
    main()
```

Test:

```shell
python util/validate_canonical_imports.py
# Should pass with all current exports
```

Step 3: Create Documentation Validator (1 hour)
- Create documentation validator
File: util/validate_docs.py
Click to expand full implementation (280 lines)
```python
#!/usr/bin/env python3
# Copyright(C) 2025-2026 Advanced Micro Devices, Inc. All rights reserved.
# SPDX-License-Identifier: MIT
"""
Documentation Code Snippet Validator

Validates that all Python code examples in .mdx files use correct imports.
Prevents users from following broken documentation.
"""
import ast
import json
import re
import sys
from pathlib import Path
from typing import List, Tuple


class CodeSnippet:
    """Represents a Python code snippet from documentation."""

    def __init__(self, file: Path, line_num: int, code: str):
        self.file = file
        self.line_num = line_num
        self.code = code
        self.imports = self._extract_imports()

    def _extract_imports(self) -> List[Tuple[str, str, List[str]]]:
        """
        Extract import statements from code snippet.

        Returns:
            List of (import_type, module, names)
            - import_type: "import" or "from"
            - module: module name
            - names: list of imported names (empty for simple import)
        """
        imports = []
        try:
            tree = ast.parse(self.code)
            for node in ast.walk(tree):
                if isinstance(node, ast.Import):
                    for alias in node.names:
                        imports.append(("import", alias.name, []))
                elif isinstance(node, ast.ImportFrom):
                    module = node.module or ""
                    names = [alias.name for alias in node.names]
                    imports.append(("from", module, names))
        except SyntaxError:
            # Skip invalid Python (might be pseudocode)
            pass
        return imports


class ImportValidator:
    """Validates imports against canonical registry."""

    def __init__(self, canonical_path: Path):
        if not canonical_path.exists():
            print(f"❌ Error: {canonical_path} not found")
            print("Create CANONICAL_IMPORTS.json first")
            sys.exit(1)
        try:
            with open(canonical_path) as f:
                data = json.load(f)
        except json.JSONDecodeError as e:
            print(f"❌ Error: Invalid JSON in {canonical_path}")
            print(f"   {e}")
            sys.exit(1)
        self.canonical = data["canonical_imports"]
        self.internal_whitelist = set(
            data.get("internal_imports", {}).get("whitelist", [])
        )

    def validate_snippet(self, snippet: CodeSnippet) -> List[str]:
        """
        Validate a code snippet's imports.

        Returns:
            List of error messages (empty if valid)
        """
        errors = []
        for import_type, module, names in snippet.imports:
            # Skip non-gaia imports
            if not module.startswith("gaia"):
                continue
            if import_type == "from":
                # Separate whitelisted from non-whitelisted names
                non_whitelisted_names = []
                for name in names:
                    full_import = f"{module}.{name}"
                    if full_import not in self.internal_whitelist:
                        non_whitelisted_names.append(name)

                # If all names are whitelisted, skip validation
                if not non_whitelisted_names:
                    continue

                # Check if module is canonical
                if module not in self.canonical:
                    # Check if it's a known submodule (suggest canonical)
                    parent = ".".join(module.split(".")[:-1])
                    if parent in self.canonical:
                        errors.append(
                            f"{snippet.file.name}:{snippet.line_num}: "
                            f"Import from submodule '{module}'. "
                            f"Use canonical: 'from {parent} import {', '.join(non_whitelisted_names)}'"
                        )
                    else:
                        errors.append(
                            f"{snippet.file.name}:{snippet.line_num}: "
                            f"Import from non-canonical module '{module}'. "
                            f"Available: {list(self.canonical.keys())}"
                        )
                    continue

                # Check if imported names are in __all__ (only non-whitelisted)
                expected_exports = self.canonical[module]["exports"]
                for name in non_whitelisted_names:
                    if name not in expected_exports:
                        errors.append(
                            f"{snippet.file.name}:{snippet.line_num}: "
                            f"'{name}' not exported from {module}. "
                            f"Available: {expected_exports}"
                        )
        return errors


class DocumentationValidator:
    """Main validator for documentation code snippets."""

    def __init__(self, docs_path: Path, canonical_path: Path):
        self.docs_path = docs_path
        self.validator = ImportValidator(canonical_path)
        self.snippets: List[CodeSnippet] = []

    def extract_snippets(self):
        """Extract all Python code snippets from .mdx files."""
        for mdx_file in self.docs_path.rglob("*.mdx"):
            self._extract_from_file(mdx_file)

    def _extract_from_file(self, mdx_file: Path):
        """Extract Python code blocks from a single .mdx file."""
        try:
            with open(mdx_file, "r", encoding="utf-8") as f:
                content = f.read()
        except Exception as e:
            print(f"⚠️ Warning: Could not read {mdx_file}: {e}")
            return
        # Match Python code blocks: ```python ... ```
        pattern = r"```python\s*\n(.*?)```"
        matches = re.finditer(pattern, content, re.DOTALL)
        for match in matches:
            code = match.group(1)
            # Count lines before match for line number
            line_num = content[: match.start()].count("\n") + 1
            snippet = CodeSnippet(mdx_file, line_num, code)
            if snippet.imports:  # Only store if has imports
                self.snippets.append(snippet)

    def validate_all(self) -> Tuple[List[str], int]:
        """
        Validate all extracted snippets.

        Returns:
            (errors, total_snippets_checked)
        """
        errors = []
        for snippet in self.snippets:
            snippet_errors = self.validator.validate_snippet(snippet)
            errors.extend(snippet_errors)
        return errors, len(self.snippets)


def main():
    """Run documentation validation."""
    script_dir = Path(__file__).parent
    project_root = script_dir.parent
    docs_path = project_root / "docs"
    canonical_path = project_root / "src" / "gaia" / "CANONICAL_IMPORTS.json"

    if not canonical_path.exists():
        print(f"❌ Error: {canonical_path} not found")
        print("Create CANONICAL_IMPORTS.json first (see Step 1)")
        sys.exit(1)

    print("=" * 70)
    print("Documentation Code Snippet Validator")
    print("=" * 70)
    print()

    validator = DocumentationValidator(docs_path, canonical_path)
    print(f"📝 Extracting code snippets from {docs_path}...")
    validator.extract_snippets()
    print(f"✓ Found {len(validator.snippets)} Python code snippets with imports")
    print()

    print("🔍 Validating imports against canonical registry...")
    errors, total = validator.validate_all()
    print()

    if errors:
        print("=" * 70)
        print(f"❌ VALIDATION FAILED: {len(errors)} import errors found")
        print("=" * 70)
        print()
        for error in errors:
            print(f"  {error}")
        print()
        print("Fix these imports to match CANONICAL_IMPORTS.json")
        sys.exit(1)
    else:
        print("=" * 70)
        print(f"✅ SUCCESS: All {total} code snippets validated")
        print("=" * 70)
        sys.exit(0)


if __name__ == "__main__":
    main()
```

Test:

```shell
python util/validate_docs.py
# Should pass (all docs were fixed in the prerequisite PR)
```

Step 4: Update lint to Use Registry (1 hour)
- Update lint CI to use registry
File: util/lint.py
Current state: Now tests 31 comprehensive imports (lines 295-369) - hardcoded list
Target state: Generate tests dynamically from CANONICAL_IMPORTS.json instead of hardcoding
Why: Currently both lint.py and lint.ps1 have hardcoded 31-import lists. If we add a new SDK module, we must update 3 places manually (lint.py, lint.ps1, and docs). Using the registry, tests auto-update when we modify CANONICAL_IMPORTS.json.
Replace the current check_imports() function with:
```python
import json


def check_imports() -> CheckResult:
    """Test all canonical imports from registry."""
    print("\n[7/7] Testing canonical SDK imports...")
    print("-" * 40)

    # Load canonical registry
    project_root = Path(__file__).parent.parent
    canonical_path = project_root / "src" / "gaia" / "CANONICAL_IMPORTS.json"

    if not canonical_path.exists():
        print("⚠️ CANONICAL_IMPORTS.json not found - using basic tests")
        # Fallback to basic 4 imports for backward compatibility
        return check_imports_basic()

    try:
        with open(canonical_path) as f:
            data = json.load(f)
        canonical = data["canonical_imports"]
    except Exception as e:
        print(f"⚠️ Could not load registry: {e} - using basic tests")
        return check_imports_basic()

    failed_imports = []
    passed = 0

    # Test each module and its exports
    for module_name, spec in canonical.items():
        # Test 1: Module import
        cmd = [sys.executable, "-c", f"import {module_name}"]
        exit_code, output = run_command(cmd)
        if exit_code != 0:
            failed_imports.append(f"import {module_name}")
            print(f"❌ {module_name}")
        else:
            passed += 1
            print(f"✓ {module_name}")

        # Test 2: Each export
        for export_name in spec["exports"]:
            cmd = [sys.executable, "-c", f"from {module_name} import {export_name}"]
            exit_code, output = run_command(cmd)
            if exit_code != 0:
                failed_imports.append(f"from {module_name} import {export_name}")
                print(f"  ❌ {export_name}")
            else:
                passed += 1
                print(f"  ✓ {export_name}")

    print()
    if failed_imports:
        print(f"❌ {len(failed_imports)} import tests failed:")
        for fail in failed_imports:
            print(f"  {fail}")
        return CheckResult("Import Validation", False, False, len(failed_imports), "")
    else:
        print(f"✅ All {passed} import tests passed")
        return CheckResult("Import Validation", True, False, 0, "")


def check_imports_basic() -> CheckResult:
    """Fallback basic import tests if registry doesn't exist."""
    print("[Using basic 4-import fallback tests]")
    print()
    imports = [
        ("gaia.cli", "CLI module"),
        ("gaia.chat.sdk", "Chat SDK"),
        ("gaia.llm", "LLM client"),
        ("gaia.agents.base.agent", "Base agent"),
    ]
    failed = False
    issues = 0
    for module, desc in imports:
        cmd = [sys.executable, "-c", f"import {module}; print('OK: {desc} imports')"]
        print(f"[CMD] {' '.join(cmd)}")
        exit_code, output = run_command(cmd)
        print(output.strip())
        if exit_code != 0:
            print(f"[!] Failed to import {module}")
            failed = True
            issues += 1
    if failed:
        return CheckResult("Import Validation", False, False, issues, "")
    print("[OK] All imports working!")
    return CheckResult("Import Validation", True, False, 0, "")
```

Note: This maintains backward compatibility - if the registry doesn't exist, lint falls back to the basic 4-import tests.
Step 5: Create CI/CD Workflow (30 minutes)
- Create CI/CD workflow
File: .github/workflows/validate-docs.yml
```yaml
name: Documentation Validation

on:
  pull_request:
    paths:
      - 'docs/**/*.mdx'
      - 'src/gaia/**/__init__.py'
      - 'src/gaia/CANONICAL_IMPORTS.json'
      - 'util/validate_*.py'
  push:
    branches: [main]

jobs:
  validate-docs:
    name: Validate Documentation-Code Sync
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install GAIA
        run: |
          python -m pip install --upgrade pip
          pip install -e ".[dev]"

      - name: Validate canonical imports registry
        run: |
          echo "Checking that __init__.py files match CANONICAL_IMPORTS.json..."
          python util/validate_canonical_imports.py

      - name: Validate documentation code snippets
        run: |
          echo "Checking that all .mdx code examples use canonical imports..."
          python util/validate_docs.py

      - name: Test all canonical imports work
        run: |
          echo "Testing that all registered imports are importable..."
          python util/lint.py --imports
```

Key points:
- Triggers on changes to `.mdx`, `__init__.py`, `CANONICAL_IMPORTS.json`, or the validators
- Runs all 3 checks in sequence
- Uses the `--imports` flag (not `--check-imports`)
Test workflow:
- Push to a test branch
- Verify workflow runs
- Break an import in docs/test.mdx
- Verify workflow catches it
Step 6: Update Documentation (30 minutes)
- Update documentation
File: docs/reference/dev.mdx
Find the "Linting" section and add this new section after it:
## Documentation-Code Synchronization
GAIA uses automated validation to ensure documentation examples always work.
### Canonical Import Registry
**File:** `src/gaia/CANONICAL_IMPORTS.json`
This JSON file is the **single source of truth** for correct import paths. It defines:
- All public SDK modules
- What each module exports in `__all__`
- Canonical import patterns for documentation
**Example:**
\`\`\`json
{
"canonical_imports": {
"gaia.llm": {
"exports": ["LLMClient", "VLMClient", "create_client"],
"description": "LLM client interfaces"
}
}
}
\`\`\`
### Validation Tools
#### 1. Validate Registry Matches Code
Ensures `__init__.py` files match the canonical registry:
\`\`\`bash
python util/validate_canonical_imports.py
\`\`\`
#### 2. Validate Documentation Examples
Ensures all Python code in `.mdx` files uses correct imports:
\`\`\`bash
python util/validate_docs.py
\`\`\`
#### 3. Test All Imports
Lint validation now tests 30+ imports automatically:
\`\`\`bash
python util/lint.py --imports
\`\`\`
### Usage for Developers
**When adding new public APIs:**
1. Update module's `__init__.py`:
\`\`\`python
from .new_module import NewClass
__all__ = [..., "NewClass"]
\`\`\`
2. Update `CANONICAL_IMPORTS.json`:
\`\`\`json
{
"gaia.module": {
"exports": [..., "NewClass"],
"description": "Module description"
}
}
\`\`\`
3. Validate sync:
\`\`\`bash
python util/validate_canonical_imports.py
\`\`\`
4. Document with canonical import:
\`\`\`python
from gaia.module import NewClass
\`\`\`
5. Validate docs:
\`\`\`bash
python util/validate_docs.py
\`\`\`
**When writing documentation:**
✅ **Use canonical imports:**
\`\`\`python
from gaia.llm import LLMClient, VLMClient
from gaia.chat.sdk import ChatSDK
from gaia.agents.base import Agent, tool
\`\`\`
❌ **Don't use submodule imports:**
\`\`\`python
from gaia.llm.llm_client import LLMClient # Wrong
from gaia.llm.vlm_client import VLMClient # Wrong
from gaia.agents.base.agent import Agent # Wrong
\`\`\`
### CI/CD Integration
The `validate-docs` workflow runs automatically on PRs that change:
- Documentation files (`.mdx`)
- Module exports (`__init__.py`)
- Canonical registry (`CANONICAL_IMPORTS.json`)
**What it checks:**
1. Registry matches code exports
2. Documentation uses canonical imports
3. All imports are importable
**If validation fails:**
- PR will be blocked
- Error message shows which file and line number
- Fix the import to match the canonical pattern

Step 7: Update CONTRIBUTING.md (15 minutes)
- Update CONTRIBUTING.md
File: CONTRIBUTING.md
Find the "Before Submitting" or "Code Quality" section and add:
### Documentation Changes
If you're modifying documentation (`.mdx` files):
\`\`\`bash
# Validate all code examples use correct imports
python util/validate_docs.py
\`\`\`
If you're changing module exports (`__init__.py`):
\`\`\`bash
# 1. Update CANONICAL_IMPORTS.json to match your changes
# 2. Validate they're in sync:
python util/validate_canonical_imports.py
\`\`\`
Both validators run automatically in CI/CD and will block PRs if imports are incorrect.

Testing Plan
Test 1: Registry Validator
- Registry validator is passing
```shell
# Should pass (all current exports are correct)
python util/validate_canonical_imports.py

# Test error detection:
# 1. Temporarily remove "VLMClient" from src/gaia/llm/__init__.py __all__
python util/validate_canonical_imports.py
# Expected: "Missing exports in __all__: ['VLMClient']"

# 2. Revert change
# 3. Add fake export "FakeClass" to CANONICAL_IMPORTS.json under gaia.llm
python util/validate_canonical_imports.py
# Expected: "Extra exports not in canonical: ['FakeClass']"

# 4. Revert - should pass again
```

Test 2: Documentation Validator
- Documentation validator is passing
```shell
# Should pass (all docs fixed in prerequisite PR)
python util/validate_docs.py

# Test error detection:
# 1. Create temporary test file
cat > docs/test-broken.mdx << 'EOF'
---
title: "Test"
---
from gaia.llm.llm_client import LLMClient
EOF

# 2. Run validator
python util/validate_docs.py
# Expected: "Import from submodule 'gaia.llm.llm_client'. Use canonical: 'from gaia.llm import LLMClient'"

# 3. Remove test file
rm docs/test-broken.mdx
```

Test 3: Updated lint.py
- Verify lint is using registry
```shell
# Test that lint.py now uses registry
python util/lint.py --imports

# Should see:
# ✓ gaia.llm
#   ✓ LLMClient
#   ✓ VLMClient
#   ✓ create_client
# ... (30+ total tests)
# ✅ All X import tests passed
```

Test 4: CI/CD Workflow
- Test CI/CD workflow
```shell
# 1. Create test branch
git checkout -b test/doc-validation

# 2. Create docs/test.mdx with broken import:
cat > docs/test.mdx << 'EOF'
---
title: "Test"
---
from gaia.llm.old_module import Something
EOF

# 3. Commit and push
git add docs/test.mdx
git commit -m "Test: broken import"
git push origin test/doc-validation

# 4. Create PR - verify GitHub Actions fails with clear error
# 5. Fix import to canonical pattern
# 6. Push again - verify GitHub Actions passes
# 7. Close PR and cleanup
```

Test 5: Integration Test
- Integration tests are passing
```shell
# Run full validation suite
python util/validate_canonical_imports.py && \
python util/validate_docs.py && \
python util/lint.py --imports

# All three should pass
```

Acceptance Criteria
Deliverables
- `src/gaia/CANONICAL_IMPORTS.json` created with 13+ modules
- `util/validate_canonical_imports.py` created (120 lines)
- `util/validate_docs.py` created (280 lines)
- `util/lint.py` updated with registry-based import tests
- `.github/workflows/validate-docs.yml` created and working
- `docs/reference/dev.mdx` updated with validation docs
- `CONTRIBUTING.md` updated with validation steps
Validation
- Registry validator passes on current codebase
- Documentation validator passes on current codebase
- lint.py generates 30+ tests from registry
- CI workflow triggers on relevant file changes
- CI workflow correctly fails on broken imports
- CI workflow correctly passes on valid imports
Testing
- All 5 test scenarios pass
- Error messages are clear with file:line references
- Performance < 10 seconds for full validation
- No false positives on current codebase
Success Metrics
Before:
- 0% documentation validation
- 4 import tests (modules only)
- 4 days to detect breaking change
- Manual coordination across 8+ files
After:
- 100% documentation validation
- 30+ import tests (every exported class)
- < 5 minutes to detect breaking change
- Automated enforcement in CI/CD
- Users never see broken examples
Prerequisites
Must be merged first:
- PR that fixes the 6 documentation files with broken imports
- PR that adds VLMClient export to `gaia/llm/__init__.py`
- PR that enhances lint.ps1 with 31 import tests
Check git log for: Commit fixing import inconsistencies (should be on main)
Known Considerations
Internal Helper Imports
Some documentation may legitimately show internal imports for advanced use cases:
```python
from gaia.llm.lemonade_client import DEFAULT_MODEL_NAME  # Advanced usage
```

Solution: The `internal_imports.whitelist` in the registry handles these exceptions.
Platform Compatibility
All validators use pathlib.Path for cross-platform compatibility. Tested on:
- Windows (PowerShell)
- Linux (bash)
- macOS (bash)
Backward Compatibility
If CANONICAL_IMPORTS.json doesn't exist, lint.py falls back to basic 4-import tests. This prevents breaking existing workflows during deployment.
Edge Cases Handled
- Invalid JSON in registry → Clear error message, fails gracefully
- Pseudocode in docs → AST parse errors caught, snippet skipped
- Missing `__init__.py` → Detected and reported
- Missing `__all__` → Detected and reported
- Non-Python code blocks → Ignored (only validates ```python blocks)
- Comments in code → Ignored by AST parser
- Internal helper imports → Whitelisted via registry
Rollout Strategy
Week 1: Local Development
- Create registry and validators
- Test locally on developer machine
- Iterate on error messages for clarity
Week 2: CI/CD Integration
- Create GitHub workflow
- Test on feature branch
- Monitor for false positives
Week 3: Documentation
- Update dev.mdx
- Update CONTRIBUTING.md
- Announce to team
Week 4: Enforcement
- Make mandatory (block PRs)
- Monitor for issues
- Refine as needed
Estimated Effort
| Task | Time | Notes |
|---|---|---|
| Add missing `__all__` declarations | 30 min | 3 files to update |
| Create CANONICAL_IMPORTS.json | 30 min | Use helper script to extract |
| Write validate_canonical_imports.py | 30 min | Straightforward AST parsing |
| Write validate_docs.py | 1 hour | Regex + AST parsing |
| Update lint.py | 1 hour | Replace hardcoded list with registry |
| Create CI/CD workflow | 30 min | Adapt existing workflows |
| Update documentation | 30 min | Add to dev.mdx + CONTRIBUTING.md |
| Testing all scenarios | 30 min | Run 5 test cases |
| TOTAL | 4.5 hours | Can be done in one day |
Files to Create/Modify
New Files (4-5)
- `src/gaia/CANONICAL_IMPORTS.json`
- `util/validate_canonical_imports.py`
- `util/validate_docs.py`
- `.github/workflows/validate-docs.yml`
- Optional: `util/generate_canonical_imports.py` (helper)
Modified Files (6)
- `src/gaia/chat/sdk.py` - Add `__all__` declaration
- `src/gaia/rag/sdk.py` - Add `__all__` declaration
- `src/gaia/agents/base/__init__.py` - Add `__all__` declaration
- `util/lint.py` - Update `check_imports()` function
- `docs/reference/dev.mdx` - Add validation section
- `CONTRIBUTING.md` - Add validation steps
Total: 10-11 files
Questions for Reviewer
- Should we migrate both lint.py and lint.ps1 to registry-based, or keep the hardcoded 31 tests?
- Should internal helper imports be allowed in docs, or strictly canonical only?
- What should happen if validator has false positives - create exceptions list?
- Should we validate that code examples actually execute (beyond just imports)?
Additional Resources
- Breaking change commit: 77df07b (Jan 15, 2026) - LLM Client Factory refactor
- Current comprehensive tests:
  - `util/lint.ps1` lines 267-314 (31 hardcoded import tests)
  - `util/lint.py` lines 295-369 (31 hardcoded import tests, now in sync)
- Related PR: Import inconsistencies fix (adds VLMClient export, fixes 6 docs, enhances both lint scripts)
Checklist for Assignee
Before starting:
- Review this issue fully (all steps, code, and test scenarios)
- Verify prerequisite PR is merged
- Examine current import testing in `util/lint.ps1` lines 267-314
Implementation:
- Add `__all__` declarations to 3 modules (Step 0)
- Create CANONICAL_IMPORTS.json from current exports (Step 1)
- Implement validate_canonical_imports.py (Step 2)
- Test registry validator - all test cases (Test 1)
- Implement validate_docs.py (Step 3)
- Test documentation validator - all test cases (Test 2)
- Update lint.py check_imports() function (Step 4)
- Test updated lint.py (Test 3)
- Create GitHub Actions workflow (Step 5)
- Test CI/CD integration on feature branch (Test 4)
Documentation:
- Update docs/reference/dev.mdx
- Update CONTRIBUTING.md
- Verify all new docs are accurate
Testing:
- Run all validators locally
- Verify they pass on current codebase
- Test error detection works
- Verify CI workflow fails/passes correctly
- Check performance (< 10 sec)
Finalize:
- Create PR with all changes
- Self-review all code
- Verify CI passes on your PR
- Request review from team