
Add automatic validation system for Score results #78

Open
endymion wants to merge 2 commits into main from feature/score_validation

Conversation

@endymion
Contributor

Summary

Implements comprehensive automatic validation for Score results based on YAML configuration rules. This ensures consistent, reliable classification outputs across the entire Plexus system by validating results against predefined constraints before they're returned.

Key Features

🔍 Automatic Validation

  • Zero-code integration: Works automatically with all existing Score implementations
  • Transparent operation: Uses __getattribute__ method interception
  • Backward compatible: No changes required to existing code
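The interception described above can be sketched as follows. This is a minimal illustration of the `__getattribute__` approach, not the actual Plexus implementation; the hard-coded `("Yes", "No")` check stands in for the YAML-configured rules, and the class names are assumptions.

```python
# Sketch: intercept attribute access so every predict() call is wrapped
# with validation, with no changes to subclasses.
class Score:
    def __getattribute__(self, name):
        attr = object.__getattribute__(self, name)
        if name == "predict" and callable(attr):
            def wrapped(*args, **kwargs):
                result = attr(*args, **kwargs)
                self._validate(result)  # runs before the result is returned
                return result
            return wrapped
        return attr

    def _validate(self, result):
        # Stand-in for the YAML-driven rules described in this PR.
        if result not in ("Yes", "No"):
            raise ValueError(f"'{result}' is not a valid class")


class YesNoScore(Score):
    def predict(self, model_input):
        return "Yes"
```

Because the wrapping happens at attribute lookup time, existing subclasses keep their `predict()` bodies unchanged.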

⚙️ Validation Rules

  • valid_classes: Restrict results to specific allowed values (e.g., ["Yes", "No", "Maybe"])
  • patterns: Require results to match regex patterns (e.g., "^NQ - (?!Other$).*")
  • length constraints: Enforce minimum/maximum string lengths
  • mixed validation: All constraints must pass (AND logic)
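The AND semantics across constraint types can be sketched as a single checking function. This is a hypothetical reconstruction: the function name and keyword arguments mirror the rule names above, and it assumes any-of semantics within a `patterns` list (a value need only match one pattern), with all configured constraint types required to pass.

```python
import re

# Sketch of mixed validation: every configured constraint type must pass.
def validate_value(value, valid_classes=None, patterns=None,
                   minimum_length=None, maximum_length=None):
    errors = []
    if valid_classes is not None and value not in valid_classes:
        errors.append(f"'{value}' is not in valid_classes {valid_classes}")
    if patterns is not None and not any(re.match(p, value) for p in patterns):
        errors.append(f"'{value}' matches none of {patterns}")
    if minimum_length is not None and len(value) < minimum_length:
        errors.append(f"length {len(value)} is below minimum {minimum_length}")
    if maximum_length is not None and len(value) > maximum_length:
        errors.append(f"length {len(value)} exceeds maximum {maximum_length}")
    return errors  # empty list means the value passed all constraints
```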

🎯 Field-Specific Validation

  • value: Validate the main classification result
  • explanation: Validate explanation text separately
  • flexible: Different rules for different fields

YAML Configuration

validation:
  value:
    valid_classes: ["Yes", "No", "NQ - Pricing", "NQ - Technical"]
    patterns: ["^(Yes|No)$", "^NQ - (?!Other$).*"]
  explanation:
    minimum_length: 10
    maximum_length: 200
    patterns: [".*evidence.*", ".*found.*"]

Technical Implementation

🏗️ Core Components Added

  • FieldValidation: Validation rules for individual fields
  • ValidationConfig: Groups field validations together
  • ValidationError: Descriptive error messages with field context
  • Score.Result.validate(): Validation logic with comprehensive error handling
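One plausible shape for the first two components is a pair of dataclasses built from the parsed YAML. The field names below follow the YAML keys in this PR, but the actual definitions in `plexus/scores/Score.py` may differ.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative shapes only; real field names live in plexus/scores/Score.py.
@dataclass
class FieldValidation:
    valid_classes: Optional[list] = None
    patterns: Optional[list] = None
    minimum_length: Optional[int] = None
    maximum_length: Optional[int] = None

@dataclass
class ValidationConfig:
    fields: dict  # e.g. {"value": FieldValidation(...), "explanation": ...}

    @classmethod
    def from_yaml_dict(cls, config: dict) -> "ValidationConfig":
        # Each key under `validation:` becomes a per-field rule set.
        return cls(fields={name: FieldValidation(**rules)
                           for name, rules in config.items()})
```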

🔄 Method Interception

  • Uses __getattribute__ to wrap all predict() calls automatically
  • Handles multiple predict() method signatures for compatibility:
    • Standard: predict(context, model_input)
    • Legacy: predict(model_input)
    • Keyword-only: predict(*, context, model_input)
  • Maintains full backward compatibility
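Dispatching across the three signatures above can be done by inspecting the target's parameters. This sketch assumes the parameter names listed in the bullets (`context`, `model_input`); the real compatibility layer may differ.

```python
import inspect

# Sketch: route a call to whichever predict() signature the Score defines.
def call_predict(predict, context, model_input):
    params = inspect.signature(predict).parameters
    if "context" in params:
        # Covers both standard and keyword-only forms.
        return predict(context=context, model_input=model_input)
    # Legacy single-argument form: predict(model_input)
    return predict(model_input)
```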

Performance

  • Validation only runs when configured (zero overhead otherwise)
  • Efficient regex compilation and caching
  • Minimal impact on prediction performance
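One way to get the caching behavior described above (an assumption about the implementation, not a quote from it) is to memoize pattern compilation. Python's `re` module already caches compiled patterns internally, but an explicit cache makes the cost predictable and reuses pattern objects across validations.

```python
import re
from functools import lru_cache

# Compile each distinct pattern string at most once.
@lru_cache(maxsize=None)
def compiled(pattern: str) -> re.Pattern:
    return re.compile(pattern)

def matches_any(value: str, patterns: tuple) -> bool:
    # Anchored match against any of the configured patterns.
    return any(compiled(p).match(value) for p in patterns)
```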

Examples

Basic Usage

# Simple class validation
validation:
  value:
    valid_classes: ["Yes", "No"]

Advanced Pattern Matching

# Exclude specific patterns (NQ- anything except "Other")
validation:
  value:
    patterns: ["^NQ - (?!Other$).*"]

Mixed Constraints

# Must be in valid list AND match pattern
validation:
  value:
    valid_classes: ["Yes", "No", "NQ - Pricing"]
    patterns: ["^(Yes|No)$", "^NQ - (?!Other$).*"]

Error Handling

Validation failures raise a descriptive Score.ValidationError that includes:

  • Field name: Which field failed validation
  • Actual value: What value was provided
  • Failure reason: Why validation failed
  • Expected format: What was expected

Example error:

ValidationError: Validation failed for field 'value': 'Maybe' is not in valid_classes ['Yes', 'No']
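The anatomy of that message can be reconstructed as a small exception class. This is a hypothetical sketch of how the field name, actual value, and failure reason might be carried; the real `Score.ValidationError` may store different attributes.

```python
# Sketch: an exception that keeps field context alongside the message.
class ValidationError(ValueError):
    def __init__(self, field, value, reason):
        self.field = field
        self.value = value
        self.reason = reason
        super().__init__(f"Validation failed for field '{field}': {reason}")

err = ValidationError(
    "value", "Maybe",
    "'Maybe' is not in valid_classes ['Yes', 'No']")
```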

Test Coverage

📋 Comprehensive Testing

  • 21 test cases covering all validation scenarios
  • Pattern validation: Including complex regex with negative lookahead
  • Mixed validation: Multiple constraints working together
  • Error cases: Invalid regex, missing values, edge cases
  • Integration: Full predict() method integration testing
  • Compatibility: Different predict() method signatures

🧪 Test Categories

  • Basic validation (valid_classes, patterns, length)
  • Advanced scenarios (NQ- exclusion, mixed constraints)
  • Error handling (descriptive messages, field context)
  • Integration (predict() method wrapping, list results)
  • Edge cases (None values, invalid regex, empty configs)

Benefits

Quality Assurance

  • Prevents invalid results from propagating through the system
  • Catches configuration errors immediately during development
  • Ensures consistent outputs across different Score implementations

🛠️ Developer Experience

  • Clear error messages for quick debugging
  • Zero code changes required for existing Scores
  • Easy configuration through familiar YAML syntax
  • Comprehensive documentation with examples

🔧 Maintainability

  • Centralized validation logic in base Score class
  • Consistent error handling across all Score types
  • Easy to extend with additional validation rules
  • Well-documented implementation and usage

Files Changed

  • plexus/scores/Score.py: Core validation implementation (+260 lines)
  • tests/test_score_validation.py: Comprehensive test suite (+448 lines)
  • plexus/docs/score-yaml-format.md: Documentation updates (+38 lines)

Compatibility Notes

🔀 Method Signature Handling

The implementation includes sophisticated compatibility handling for different predict() method signatures found in the codebase. This adds some complexity but ensures zero breaking changes.

Note: This compatibility layer can be simplified once predict() signatures are standardized across the codebase (tracked in #77).

🚀 Production Ready

  • Thoroughly tested with comprehensive test suite
  • Performance optimized with minimal overhead
  • Error handling for all edge cases
  • Documentation for configuration and usage

Usage

  1. Add validation to Score YAML:
validation:
  value:
    valid_classes: ["Yes", "No", "Maybe"]
  2. Validation happens automatically:
result = score.predict(context, input_data)
# Validation runs automatically before result is returned
# Raises ValidationError if result doesn't match constraints
  3. No code changes needed - existing Score implementations work unchanged

This validation system provides immediate quality assurance for Score results while maintaining full backward compatibility and requiring zero changes to existing code.

🤖 Generated with Claude Code

Co-Authored-By: Claude noreply@anthropic.com

…figuration

- Added a validation section to the Score class, allowing automatic validation of prediction results against specified rules in YAML.
- Introduced validation constraints including valid classes, regex patterns, and string length limits.
- Enhanced the Score.Result class with validation methods to ensure compliance with defined rules.
- Created comprehensive tests for various validation scenarios, ensuring robust functionality and error handling.
- Updated documentation to reflect new validation features and usage patterns.
@endymion endymion self-assigned this Jul 15, 2025
@endymion endymion requested a review from dereknorrbom July 15, 2025 02:16

…uration

- Added support for case-sensitive comparisons in the validation of `valid_classes` within the Score class.
- Updated the validation logic to handle both case-sensitive and case-insensitive scenarios based on a new `case_sensitive` flag.
- Expanded documentation to clarify the behavior of case sensitivity in validation rules.
- Implemented comprehensive tests to verify the functionality of case-sensitive and case-insensitive validation, ensuring robust error handling.
@endymion
Contributor Author

✨ Enhancement: Case-Insensitive Validation Added

Based on feedback, I've enhanced the validation system with case-insensitive comparison for valid_classes - this addresses the original concern about needing regex just for case variations.

🔧 What Changed

New case_sensitive Field:

validation:
  value:
    valid_classes: ["Yes", "No", "Maybe"]
    case_sensitive: false  # Default: case-insensitive

Default Behavior (Case-Insensitive):

  • "yes", "YES", "Yes" all match ["Yes", "No"]
  • Much more intuitive and user-friendly
  • No regex needed for simple case variations

Optional Case-Sensitive Mode:

validation:
  value:
    valid_classes: ["Yes", "No"]
    case_sensitive: true  # Requires exact case match
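The comparison logic behind the flag can be sketched in a few lines. The function name is assumed for illustration; the default of `case_sensitive=False` matches the behavior described in this comment.

```python
# Sketch of the case_sensitive flag: fold both sides to lowercase
# unless an exact-case match is requested.
def in_valid_classes(value, valid_classes, case_sensitive=False):
    if case_sensitive:
        return value in valid_classes
    return value.lower() in {c.lower() for c in valid_classes}
```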

📈 Benefits

  1. Eliminates regex complexity for simple case handling
  2. More intuitive default - most users expect case-insensitive matching
  3. Backward compatible - existing configs work unchanged
  4. Clear error messages indicate which mode was used
  5. Optional precision when exact case matters

🧪 Test Coverage

Added 4 comprehensive test cases:

  • ✅ Case-insensitive success (default behavior)
  • ✅ Case-insensitive failure with clear error message
  • ✅ Case-sensitive success when explicitly enabled
  • ✅ Case-sensitive failure with exact case requirement

📝 Example Usage

Simple case-insensitive (most common):

validation:
  value:
    valid_classes: ["Yes", "No"]  # Matches any case variation

Strict case-sensitive when needed:

validation:
  value:
    valid_classes: ["YES", "NO"]
    case_sensitive: true  # Only exact matches

This enhancement makes the validation system much more user-friendly while maintaining full flexibility for cases where exact case matching is required.

Total test coverage now: 25 passing tests 🎉
