openedges/xmlschema-rs

xmlschema-rs

A Rust implementation of XML Schema (XSD) validation, ported from Python xmlschema.

Features

| Feature | Description |
|---|---|
| XSD 1.0 & 1.1 | Full support including assertions and type alternatives |
| W3C Validated | 30,082 of 30,086 W3C tests pass (100.0% after rounding) |
| XPath 2.0 | 119+ built-in functions for assertions |
| Partial Validation | Validate specific paths with ancestor tracking |
| Incremental Updates | Structural sharing for efficient document changes |
| Streaming Validation | Memory-efficient processing with limits and progress tracking |
| Validation Hooks | Custom validation logic injection with hook chains |
| Data Binding | Type-safe XML to Rust struct conversion |
| ElementTree API | Python lxml/ElementTree-compatible element access |
| Converters | 8 XML-to-JSON converters |
| Multi-Platform | Rust, WebAssembly, Node.js native |

W3C Test Results

| Test Type | Result (passed/total) | Pass Rate |
|---|---|---|
| SCHEMA | 8248/8249 | 100.0% |
| INSTANCE | 21834/21837 | 100.0% |
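
Both rows show 100.0% although a few tests fail, because the rates are rounded to one decimal place. The arithmetic can be sketched as follows; `pass_rate` and `format_rate` are hypothetical helpers for illustration and are not part of the crate.

```rust
/// Pass rate as a percentage. Hypothetical helper for illustration only.
fn pass_rate(passed: u32, total: u32) -> f64 {
    f64::from(passed) / f64::from(total) * 100.0
}

/// Format a pass rate with the given number of decimal places.
fn format_rate(passed: u32, total: u32, decimals: usize) -> String {
    format!("{:.*}%", decimals, pass_rate(passed, total))
}
```

With one decimal place both suites print as `100.0%`; with two decimal places both print as `99.99%`.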

Performance Benchmarks

| Benchmark | Rust | Node.js | WASM |
|---|---|---|---|
| Schema Parsing | | | |
| simple_schema | 2.18 µs | 3.04 µs | 10.58 µs |
| person_schema | 9.91 µs | 11.92 µs | 40.63 µs |
| catalog_schema | 17.49 µs | 20.63 µs | 66.33 µs |
| Validation | | | |
| is_valid_simple | 4.24 µs | 5.21 µs | 12.67 µs |
| is_valid_with_optional | 4.98 µs | 5.71 µs | 15.42 µs |
| validate_str | 4.21 µs | 4.79 µs | 12.71 µs |
| is_valid_invalid | 14.69 µs | 15.50 µs | 19.21 µs |
| JSON Conversion | | | |
| to_json_str | 2.01 µs | 2.42 µs | 3.71 µs |
| converter_default | 1.71 µs | 2.29 µs | 3.42 µs |
| converter_parker | 1.32 µs | 1.83 µs | 2.63 µs |
| converter_badgerfish | 3.04 µs | 3.58 µs | 4.46 µs |
| Nested Depth | | | |
| depth/10 | 74.89 µs | 76.63 µs | 99.96 µs |
| depth/50 | 361.11 µs | 368.29 µs | 552.17 µs |
| depth/100 | 720.34 µs | 726.17 µs | 1.07 ms |
| Catalog (N items) | | | |
| validation/10 | 42.94 µs | 45.08 µs | 60.46 µs |
| validation/100 | 425.87 µs | 436.00 µs | 587.67 µs |
| validation/1000 | 4.25 ms | 4.36 ms | 5.97 ms |
| conversion/10 | 17.75 µs | 18.13 µs | 30.67 µs |
| conversion/100 | 164.24 µs | 168.46 µs | 283.92 µs |
| conversion/1000 | 1.66 ms | 1.69 ms | 2.69 ms |
| Registry | | | |
| create_and_register | 176 ns | 1.00 µs | 1.63 µs |
| load_schema | 8.90 µs | 9.96 µs | 15.08 µs |
| validate_loaded | 1.54 µs | 1.75 µs | 1.79 µs |

Lower is better. Measured on Apple M1.


Installation

Rust

[dependencies]
xmlschema-rs = "0.5.1"

Node.js (Native)

npm install xmlschema-js

WebAssembly

wasm-pack build --target web --out-dir wasm --out-name xmlschema-wasm --no-default-features --features wasm

Quick Start

Rust

use xmlschema_rs::XmlSchema;

fn main() -> Result<(), xmlschema_rs::XsdError> {
    let schema = XmlSchema::new("schema.xsd")?;

    schema.validate("document.xml")?;           // Validate
    let valid = schema.is_valid("document.xml"); // Check validity
    let json = schema.to_json("document.xml")?;  // Convert to JSON

    Ok(())
}

Node.js

import pkg from 'xmlschema-js';
const { XmlSchema, isValid, toJson } = pkg;

// One-shot validation
if (isValid(xml, xsd)) console.log('Valid!');

// Reusable schema
const schema = new XmlSchema(xsd);
schema.validate(xml);
console.log(schema.toJson(xml));

// File-based (native only)
const fileSchema = XmlSchema.fromFile('./schema.xsd');
fileSchema.validateFile('./document.xml');

WebAssembly

import init, { WasmXmlSchema, isValidXml } from 'xmlschema-wasm';

await init();

if (isValidXml(xml, xsd)) console.log('Valid!');

const schema = new WasmXmlSchema(xsd);
schema.validate(xml);
console.log(schema.toJson(xml));

Core Features

XSD 1.1 Assertions

use xmlschema_rs::{SchemaLoader, SchemaOptions, XsdVersion};

let xsd = r#"<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="range">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="min" type="xs:integer"/>
        <xs:element name="max" type="xs:integer"/>
      </xs:sequence>
      <xs:assert test="min le max"/>
    </xs:complexType>
  </xs:element>
</xs:schema>"#;

let options = SchemaOptions { version: XsdVersion::Xsd11, ..Default::default() };
let schema = SchemaLoader::new(options).load_from_str(xsd, "test.xsd".to_string())?;

assert!(schema.is_valid_str("<range><min>5</min><max>10</max></range>"));
assert!(!schema.is_valid_str("<range><min>10</min><max>5</max></range>"));
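
The assertion `min le max` is an XPath 2.0 comparison evaluated against the children of `range`. As a rough illustration of what it checks, here is a plain-Rust sketch that compares two already-extracted values. The function name is hypothetical, and `i128` is an assumed stand-in for the unbounded `xs:integer` type; this is not how the crate evaluates assertions.

```rust
/// Evaluate the equivalent of the XPath test `min le max` for two lexical
/// xs:integer values. Returns None if either value is not an integer.
/// Assumption: i128 stands in for the unbounded xs:integer type.
fn min_le_max(min: &str, max: &str) -> Option<bool> {
    let min: i128 = min.trim().parse().ok()?;
    let max: i128 = max.trim().parse().ok()?;
    Some(min <= max)
}
```

For the two documents above, `min_le_max("5", "10")` holds and `min_le_max("10", "5")` does not, which matches the two `is_valid_str` results.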

Schema Registry

Manage multiple schemas with import/include resolution:

use xmlschema_rs::SchemaRegistry;

let mut registry = SchemaRegistry::new();
registry.register("types.xsd", types_xsd_content);
registry.register("main.xsd", main_xsd_content);

let schema = registry.load_schema("main.xsd")?;         // XSD 1.0
let schema11 = registry.load_schema_xsd11("main.xsd")?; // XSD 1.1
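
To make the idea of import/include resolution concrete, the sketch below keeps schemas in a map and computes a load order with dependencies first. Everything in it (`MiniRegistry`, `load_order`, `schema_locations`) is hypothetical and only illustrates the concept; it scans for `schemaLocation="..."` with a naive string search and is not the crate's resolver.

```rust
use std::collections::{HashMap, HashSet};

/// In-memory map from schema URI to schema text (hypothetical sketch type).
#[derive(Default)]
struct MiniRegistry {
    schemas: HashMap<String, String>,
}

impl MiniRegistry {
    fn register(&mut self, uri: &str, content: &str) {
        self.schemas.insert(uri.to_string(), content.to_string());
    }

    /// Return the URIs needed to load `uri`, dependencies first.
    fn load_order(&self, uri: &str) -> Result<Vec<String>, String> {
        let mut order = Vec::new();
        let mut seen = HashSet::new();
        self.visit(uri, &mut seen, &mut order)?;
        Ok(order)
    }

    fn visit(
        &self,
        uri: &str,
        seen: &mut HashSet<String>,
        order: &mut Vec<String>,
    ) -> Result<(), String> {
        // Skip schemas already visited; this also stops circular imports.
        if !seen.insert(uri.to_string()) {
            return Ok(());
        }
        let content = self
            .schemas
            .get(uri)
            .ok_or_else(|| format!("schema not registered: {uri}"))?;
        for dep in schema_locations(content) {
            self.visit(&dep, seen, order)?;
        }
        order.push(uri.to_string());
        Ok(())
    }
}

/// Naive scan for schemaLocation="..." attribute values.
/// Assumption: double-quoted values and no entity references.
fn schema_locations(xsd: &str) -> Vec<String> {
    const KEY: &str = "schemaLocation=\"";
    let mut found = Vec::new();
    let mut rest = xsd;
    while let Some(start) = rest.find(KEY) {
        rest = &rest[start + KEY.len()..];
        match rest.find('"') {
            Some(end) => {
                found.push(rest[..end].to_string());
                rest = &rest[end + 1..];
            }
            None => break,
        }
    }
    found
}
```

Registering `types.xsd` and a `main.xsd` that includes it yields the order `types.xsd`, `main.xsd`; asking for an unregistered URI returns an error.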

Converters

| Converter | Description |
|---|---|
| DefaultConverter | @ prefix for attributes |
| ParkerConverter | Simple, discards attributes |
| BadgerFishConverter | Preserves all XML info |
| GDataConverter | Google Data protocol |
| AbderaConverter | Apache Abdera |
| JsonMLConverter | Array-based |
| ColumnarConverter | Flat/tabular |
| UnorderedConverter | Groups same-named elements |

use xmlschema_rs::converters::{ParkerConverter, Converter, ConversionOptions};

let json = ParkerConverter::new().convert_str(xml, &ConversionOptions::default())?;
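
The difference between the first two conventions in the table can be shown on a single leaf element. The sketch below is a hand-rolled illustration using only the standard library; `Leaf`, `parker` and `default_style` are hypothetical names, values stay strings because no schema types are involved, and the `"$"` text key is an assumption taken from the Python xmlschema default converter.

```rust
/// A leaf element: name, attributes and text (hypothetical sketch type).
struct Leaf<'a> {
    name: &'a str,
    attrs: &'a [(&'a str, &'a str)],
    text: &'a str,
}

/// Quote a string for JSON, escaping characters that cannot appear raw.
fn json_string(s: &str) -> String {
    let mut out = String::from("\"");
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Parker-style: attributes are dropped, the text becomes the value.
fn parker(e: &Leaf) -> String {
    format!("{{{}:{}}}", json_string(e.name), json_string(e.text))
}

/// Default-style: attributes are kept under an `@` prefix.
/// Assumption: text is stored under "$", as in Python xmlschema's default.
fn default_style(e: &Leaf) -> String {
    let mut fields: Vec<String> = e
        .attrs
        .iter()
        .map(|(k, v)| format!("{}:{}", json_string(&format!("@{k}")), json_string(v)))
        .collect();
    fields.push(format!("\"$\":{}", json_string(e.text)));
    format!("{{{}:{{{}}}}}", json_string(e.name), fields.join(","))
}
```

For `<price currency="USD">9.99</price>`, the Parker-style output keeps only the text, while the default-style output also carries `@currency`.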

Advanced Features

Partial Validation

Validate only the elements selected by an XPath expression, together with their ancestors:

use xmlschema_rs::{XmlSchema, PathResolver, PartialValidationOptions};
use xmlschema_rs::validators::Validator;

let schema = XmlSchema::new("schema.xsd")?;
let doc = roxmltree::Document::parse(xml)?;
let mut validator = Validator::new(&schema);

// Validate single path (ancestors + subtree)
validator.validate_path(&doc, "/catalog/item[@id='123']")?;

// Validate multiple paths
validator.validate_paths(&doc, &["/catalog/item[1]", "/catalog/item[3]"])?;

// With options
let opts = PartialValidationOptions::new()
    .with_max_depth(2)
    .with_track_constraints(true);
validator.validate_path_with_options(&doc, "/catalog", opts)?;
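
Validating a path "with ancestors" means the chain of elements above the target is checked as well as the target's subtree. A minimal sketch of how that chain could be derived from a simple absolute path is shown below; `ancestor_paths` is a hypothetical helper and assumes predicates contain no `/` characters.

```rust
/// Split a simple absolute path such as `/catalog/item[1]/price` into the
/// chain of ancestor paths, outermost first.
/// Assumption: predicates contain no `/` characters.
fn ancestor_paths(path: &str) -> Vec<String> {
    let steps: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    let mut out = Vec::new();
    let mut current = String::new();
    // Every step except the last one names an ancestor of the target.
    for step in steps.iter().take(steps.len().saturating_sub(1)) {
        current.push('/');
        current.push_str(step);
        out.push(current.clone());
    }
    out
}
```

For `/catalog/item[1]/price` this yields `/catalog` and `/catalog/item[1]`; a top-level path such as `/catalog` has no ancestors.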

Incremental Updates

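Apply changes and re-validate only the affected parts, using structural sharing:

use xmlschema_rs::{XmlSchema, DocumentChange, IncrementalValidator};

// Load the schema first; `xml` holds the document text to track
let schema = XmlSchema::new("schema.xsd")?;
let mut validator = IncrementalValidator::new(&schema, xml)?;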

// Apply changes (only affected parts re-validated)
validator.apply_change(DocumentChange::UpdateText {
    path: "/catalog/item/price".into(),
    text: "29.99".into(),
})?;

validator.apply_change(DocumentChange::Insert {
    parent_path: "/catalog".into(),
    index: 2,
    xml: "<item id='3'><name>New</name></item>".into(),
})?;

if validator.is_valid() {
    println!("{}", validator.to_string());
}

DocumentChange Types:

| Change | Description |
|---|---|
| Replace { path, xml } | Replace element |
| Insert { parent_path, index, xml } | Insert element |
| Remove { path } | Remove element |
| UpdateText { path, text } | Update text content |
| UpdateAttribute { path, name, value } | Update/remove attribute |

Structural Sharing - Only modified paths allocate new memory:

Before:              After UpdateText("/root/a/b"):
    root                 root' (new)
   /    \               /    \
  a      c             a'     c (reused)
 / \                  / \
b   d                b'  d (reused)
                    (new)
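
The diagram corresponds to the classic path-copying technique for persistent trees. The sketch below shows it with `std::rc::Rc`: only the nodes on the path to the change are rebuilt, and every other subtree is shared by pointer. The `Node` type and `update_text` function are hypothetical and are not the crate's internal representation.

```rust
use std::rc::Rc;

/// Immutable tree node; children are shared through reference counting.
struct Node {
    name: String,
    text: String,
    children: Vec<Rc<Node>>,
}

fn leaf(name: &str, text: &str) -> Rc<Node> {
    Rc::new(Node { name: name.into(), text: text.into(), children: Vec::new() })
}

fn branch(name: &str, children: Vec<Rc<Node>>) -> Rc<Node> {
    Rc::new(Node { name: name.into(), text: String::new(), children })
}

/// Return a new tree in which the node reached by `path` (child names
/// below `node`) has its text replaced. Only nodes on the path are rebuilt.
fn update_text(node: &Rc<Node>, path: &[&str], text: &str) -> Option<Rc<Node>> {
    match path.split_first() {
        // End of the path: copy this node with the new text.
        None => Some(Rc::new(Node {
            name: node.name.clone(),
            text: text.to_string(),
            children: node.children.clone(), // clones pointers, not subtrees
        })),
        Some((first, rest)) => {
            let idx = node.children.iter().position(|c| c.name == *first)?;
            let new_child = update_text(&node.children[idx], rest, text)?;
            let mut children = node.children.clone();
            children[idx] = new_child;
            Some(Rc::new(Node {
                name: node.name.clone(),
                text: node.text.clone(),
                children,
            }))
        }
    }
}
```

After updating `a/b` under `root`, the old tree is unchanged, `c` and `d` are the same allocations in both trees, and `root`, `a` and `b` are new.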

Streaming Validation

Memory-efficient validation for large documents with progress tracking:

use xmlschema_rs::validators::{
    StreamingConfig, StreamingDecoder, MemoryAwareOptions, validate_with_limits,
};
use roxmltree::Document;

// Configure streaming
let config = StreamingConfig::new()
    .with_memory_limit_mb(100)   // 100MB limit
    .with_chunk_size(1000)       // Process 1000 elements per chunk
    .with_progress_tracking();

let xml = "<root><item>value</item></root>";
let doc = Document::parse(xml)?;

let decoder = StreamingDecoder::new(&doc)
    .with_config(config)
    .with_progress_callback(|progress| {
        println!("Progress: {:.1}%", progress * 100.0);
    });

// Memory-limited validation
let options = MemoryAwareOptions::new()
    .with_max_document_size(10 * 1024 * 1024)  // 10MB max
    .with_max_depth(50);                        // 50 levels max

let result = validate_with_limits(xml, &options);
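
The size and depth limits can be pictured as a cheap pre-check that runs before full parsing. The sketch below is a standard-library illustration of that idea; `check_limits` and `LimitError` are hypothetical names, and the scanner assumes no `<` or `>` appears inside attribute values, comments or CDATA sections.

```rust
/// Errors reported by the pre-check (hypothetical sketch type).
#[derive(Debug, PartialEq)]
enum LimitError {
    TooLarge { size: usize, max: usize },
    TooDeep { max: usize },
}

/// Reject a document that exceeds a byte-size or nesting-depth limit.
/// On success, returns the deepest nesting level that was seen.
/// Assumption: no `<` or `>` inside attribute values, comments or CDATA.
fn check_limits(xml: &str, max_size: usize, max_depth: usize) -> Result<usize, LimitError> {
    if xml.len() > max_size {
        return Err(LimitError::TooLarge { size: xml.len(), max: max_size });
    }
    let mut depth = 0usize;
    let mut deepest = 0usize;
    let mut rest = xml;
    while let Some(open) = rest.find('<') {
        let after_open = &rest[open + 1..];
        let close = match after_open.find('>') {
            Some(i) => i,
            None => break, // unterminated tag: leave it to the real parser
        };
        let tag = &after_open[..close];
        rest = &after_open[close + 1..];
        if tag.starts_with('?') || tag.starts_with('!') {
            continue; // declaration, processing instruction, comment or DOCTYPE
        }
        if tag.starts_with('/') {
            depth = depth.saturating_sub(1); // end tag
            continue;
        }
        depth += 1; // start tag
        deepest = deepest.max(depth);
        if deepest > max_depth {
            return Err(LimitError::TooDeep { max: max_depth });
        }
        if tag.ends_with('/') {
            depth -= 1; // self-closing element
        }
    }
    Ok(deepest)
}
```

For the sample document `<root><item>value</item></root>` the deepest level is 2, so a depth limit of 50 passes and a depth limit of 1 is rejected.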

Validation Hooks

Inject custom validation logic with hook chains:

use xmlschema_rs::validators::{
    ValidationHook, HookResult, HookChain, LoggingHook, MaxErrorsHook,
    StreamingDecoder,
};
use roxmltree::Document;

// Create a hook chain
let hooks = HookChain::new()
    .with(LoggingHook::with_level(0))  // Log errors only
    .with(MaxErrorsHook::new(10));      // Stop after 10 errors

let xml = "<root><a>1</a><b>2</b></root>";
let doc = Document::parse(xml)?;

// Decode with hooks
let decoder = StreamingDecoder::new(&doc).with_hooks(hooks);
let results: Vec<_> = decoder.decode_iter().collect();

Built-in Hooks:

| Hook | Description |
|---|---|
| LoggingHook | Log validation events (configurable verbosity) |
| MaxErrorsHook | Stop validation after N errors |
| HookChain | Combine multiple hooks |
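
The hook-chain pattern itself is small enough to sketch with a trait and a vector of boxed hooks. The types below (`Flow`, `Hook`, `Log`, `MaxErrors`, `Chain`) are hypothetical stand-ins written for this illustration; the crate's real `ValidationHook` trait and `HookResult` type may look different.

```rust
/// What a hook tells the validator to do next (hypothetical sketch types).
#[derive(Debug, PartialEq, Clone, Copy)]
enum Flow {
    Continue,
    Stop,
}

trait Hook {
    fn on_error(&mut self, message: &str) -> Flow;
}

/// Writes every error to stderr and never stops validation.
struct Log;

impl Hook for Log {
    fn on_error(&mut self, message: &str) -> Flow {
        eprintln!("validation error: {message}");
        Flow::Continue
    }
}

/// Requests a stop once `max` errors have been reported.
struct MaxErrors {
    max: usize,
    count: usize,
}

impl Hook for MaxErrors {
    fn on_error(&mut self, _message: &str) -> Flow {
        self.count += 1;
        if self.count >= self.max { Flow::Stop } else { Flow::Continue }
    }
}

/// Runs hooks in registration order; the first `Stop` wins.
#[derive(Default)]
struct Chain {
    hooks: Vec<Box<dyn Hook>>,
}

impl Chain {
    fn with(mut self, hook: impl Hook + 'static) -> Self {
        self.hooks.push(Box::new(hook));
        self
    }

    fn on_error(&mut self, message: &str) -> Flow {
        for hook in &mut self.hooks {
            if hook.on_error(message) == Flow::Stop {
                return Flow::Stop;
            }
        }
        Flow::Continue
    }
}

/// Feed errors to the chain and return how many were processed.
fn run(chain: &mut Chain, errors: &[&str]) -> usize {
    let mut processed = 0;
    for e in errors {
        processed += 1;
        if chain.on_error(e) == Flow::Stop {
            break;
        }
    }
    processed
}
```

A chain built as `Chain::default().with(Log).with(MaxErrors { max: 2, count: 0 })` processes two of three errors and then stops.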

Data Binding

Convert XML to typed Rust structures with automatic type conversion:

use xmlschema_rs::converters::{DataBindingConverter, DataBindingConfig, TypeHint};
use xmlschema_rs::ConversionOptions;
use xmlschema_rs::converters::Converter;

// Configure type hints
let config = DataBindingConfig::new()
    .add_type_hint("age", TypeHint::Integer)
    .add_type_hint("active", TypeHint::Boolean)
    .with_camel_case();  // Convert field names to camelCase

let converter = DataBindingConverter::new(config);

let xml = r#"<user><first-name>John</first-name><age>30</age><active>true</active></user>"#;
let result = converter.convert_str(xml, &ConversionOptions::default())?;

// Result has typed values: age=30 (integer), active=true (boolean)
// Field names are converted: first-name → firstName

TypeHint Options: String, Integer, Float, Boolean, Array
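
The two transformations described above, type coercion by hint and camelCase field names, can be sketched with the standard library alone. `Hint`, `Value`, `coerce` and `to_camel_case` are hypothetical names that mirror a subset of the listed options; they are not the crate's types.

```rust
/// Target types for a field (a subset of the TypeHint options listed above).
#[derive(Clone, Copy)]
enum Hint {
    String,
    Integer,
    Float,
    Boolean,
}

/// A converted value (hypothetical sketch type).
#[derive(Debug, PartialEq)]
enum Value {
    String(String),
    Integer(i64),
    Float(f64),
    Boolean(bool),
}

/// Convert element text according to a hint; None if the text does not fit.
fn coerce(text: &str, hint: Hint) -> Option<Value> {
    let t = text.trim();
    match hint {
        Hint::String => Some(Value::String(text.to_string())),
        Hint::Integer => t.parse::<i64>().ok().map(Value::Integer),
        Hint::Float => t.parse::<f64>().ok().map(Value::Float),
        // xs:boolean accepts the lexical forms true, false, 1 and 0.
        Hint::Boolean => match t {
            "true" | "1" => Some(Value::Boolean(true)),
            "false" | "0" => Some(Value::Boolean(false)),
            _ => None,
        },
    }
}

/// Turn `first-name` or `first_name` into `firstName`.
fn to_camel_case(name: &str) -> String {
    let mut out = String::with_capacity(name.len());
    let mut upper_next = false;
    for c in name.chars() {
        if c == '-' || c == '_' {
            // A leading separator does not capitalize the first letter.
            upper_next = !out.is_empty();
        } else if upper_next {
            out.extend(c.to_uppercase());
            upper_next = false;
        } else {
            out.push(c);
        }
    }
    out
}
```

Applied to the `user` example, `age` coerces to the integer 30, `active` to the boolean true, and `first-name` becomes `firstName`.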

ElementTree API

Python lxml/ElementTree-compatible API for element access:

use xmlschema_rs::converters::{Element, fromstring, tostring};

// Parse XML
let xml = r#"<person id="1"><name>John</name><age>30</age></person>"#;
let elem = fromstring(xml)?;

// Navigate elements
assert_eq!(elem.tag, "person");
assert_eq!(elem.get("id"), Some("1"));
assert_eq!(elem.findtext("name"), Some("John"));

// Find children
let name = elem.find("name").unwrap();
assert_eq!(name.text, Some("John".to_string()));

// Find all matching elements
let items = elem.findall("item");

// Iterate children
for child in elem.iter() {
    println!("{}: {:?}", child.tag, child.text);
}

// Serialize back to XML
let xml_out = tostring(&elem);

API Reference

Rust API

| Category | API | Description |
|---|---|---|
| Schema | XmlSchema::new(path) | Load from file |
| | XmlSchema::from_str(xsd) | Load from string |
| | XmlSchema::from_url(url) | Load from URL |
| | validate() / is_valid() | Validate XML |
| | to_json() / from_json() | Convert |
| Top-Level | validate(xml, schema) | Validate |
| | is_valid(xml, schema) | Check validity |
| | to_dict() / to_json() | Convert |
| Registry | SchemaRegistry::new() | Create registry |
| | register(uri, content) | Add schema |
| | load_schema(uri) | Load with imports |
| Partial | Validator::validate_path() | Validate path |
| | PathResolver::find(xpath) | Find elements |
| Incremental | IncrementalDocument::parse() | Parse with sharing |
| | IncrementalValidator::apply_change() | Update + validate |
| Streaming | StreamingDecoder::new() | Create decoder |
| | StreamingConfig::new() | Configure limits |
| | validate_with_limits() | Memory-limited validation |
| Hooks | HookChain::new().with() | Chain hooks |
| | LoggingHook / MaxErrorsHook | Built-in hooks |
| Data Binding | DataBindingConverter::new() | Create converter |
| | DataBindingConfig::add_type_hint() | Configure types |
| ElementTree | fromstring() / tostring() | Parse/serialize |
| | Element::find() / findall() | Navigate elements |

Node.js API

| Class | Method | Description |
|---|---|---|
| XmlSchema | new(xsd) | Create from string |
| | fromFile(path) | Load from file |
| | withOptions(xsd, version) | Create with XSD version |
| | validate(xml) / validateFile(path) | Validate (throws) |
| | isValid(xml) / isValidFile(path) | Check validity |
| | toJson(xml) / toJsonWithConverter(xml, name) | Convert to JSON |
| | fromJson(json) | Convert from JSON |
| | getErrors(xml) | Get errors as JSON |
| SchemaRegistry | new() | Create registry |
| | register(uri, content) / registerFile(uri, path) | Register schema |
| | loadSchema(uri) / loadSchemaXsd11(uri) | Load schema |
| IncrementalDocument | new(xml) | Parse XML |
| | updateText(path, text) | Update → new doc |
| | insertElement(parent, idx, xml) | Insert → new doc |
| | removeElement(path) | Remove → new doc |
| IncrementalValidator | new(schema, xml) | Create validator |
| | updateText(path, text) | Update + validate |
| | isValid() / getErrors() | Check validity |

Top-Level Functions: validate, isValid, toJson, fromJson, convertXmlToJson, getConverters

WebAssembly API

| Class | Method | Description |
|---|---|---|
| WasmXmlSchema | new(xsd) | Create schema |
| | validate(xml) / isValid(xml) | Validate |
| | toJson(xml) / fromJson(json) | Convert |
| | getErrors(xml) | Get errors as JSON |
| WasmSchemaRegistry | register(uri, content) | Register schema |
| | loadSchema(uri) / loadSchemaXsd11(uri) | Load schema |
| WasmIncrementalDocument | new(xml) | Parse XML |
| | updateText(path, text) | Update → new doc |
| WasmIncrementalValidator | new(schema, xml) | Create validator |
| | updateText(path, text) | Update + validate |

CLI

cargo build --release

# Main CLI
xmlschema check schema.xsd
xmlschema validate schema.xsd document.xml
xmlschema to-json schema.xsd document.xml --converter=parker
xmlschema codegen schema.xsd --derive=Debug,Clone -o bindings.rs

# Python-compatible CLIs
xmlschema-validate --schema schema.xsd document.xml
xmlschema-xml2json --schema schema.xsd document.xml
xmlschema-json2xml --schema schema.xsd document.json

Development

# Rust
cargo test                # Run tests
cargo bench               # Run benchmarks
cargo clippy              # Lint

# WebAssembly (in wasm/ directory)
cd wasm
npm run build             # Build WASM package
npm test                  # Run tests

# Node.js Native (in npm/ directory)
cd npm
npm install
npm run build             # Release build
npm run build:debug       # Debug build
npm test

# W3C tests (requires Python 3.8+)
python3 -m venv .venv
source .venv/bin/activate   # Linux/macOS
pip install xmlschema lxml
./tests/w3c_setup.sh
.venv/bin/python tests/w3c_tests.py

Project Structure

xmlschema-rs/
├── src/
│   ├── lib.rs              # Library entry point
│   ├── wasm.rs             # WebAssembly bindings
│   ├── napi.rs             # Node.js native bindings
│   ├── validators/         # XSD validation core
│   │   ├── streaming.rs    # Streaming decoder
│   │   ├── streaming_ext.rs # Extended streaming with limits
│   │   ├── hooks.rs        # Validation hooks
│   │   ├── incremental.rs  # Incremental updates
│   │   └── path_resolver.rs # XPath resolution
│   ├── converters/         # XML-to-JSON converters
│   │   ├── data_binding.rs # Type-safe data binding
│   │   └── element_tree.rs # ElementTree API
│   └── xpath/              # XPath 2.0 engine
├── npm/                    # Node.js native package (xmlschema-js)
│   ├── package.json
│   ├── index.js
│   └── test.mjs
├── wasm/                   # WebAssembly package (xmlschema-wasm)
│   ├── package.json
│   ├── build.sh
│   ├── test.mjs
│   └── xmlschema-wasm.js
├── tests/                  # Integration tests
└── benches/                # Benchmarks
    ├── validation_rust.rs  # Rust (cargo bench)
    ├── validation_napi.mjs # Node.js native
    └── validation_wasm.mjs # WebAssembly

Python Compatibility

| Python xmlschema | Rust xmlschema-rs |
|---|---|
| XMLSchema(source) | XmlSchema::new(source) |
| schema.validate(xml) | schema.validate(xml) |
| schema.is_valid(xml) | schema.is_valid(xml) |
| schema.to_dict(xml) | schema.to_dict(xml) |
| validate(xml, schema) | validate(xml, schema) |
| to_dict(xml, schema) | to_dict(xml, schema) |

License

MIT
