Resolve RO-Crate FAIR Digital Object inputs for scientific workflows.
Given a Workflow RO-Crate that declares its expected inputs as FormalParameter entities, fdo-resolver scans a directory of data RO-Crates and matches them to the workflow's input slots by encodingFormat, additionalType, name, and variableMeasured.
pip install fdo-resolver

from fdo_resolver import FDOResolver
# Load input slots from a Workflow RO-Crate
resolver = FDOResolver.from_workflow_crate("path/to/workflow-rocrate/")
# Resolve data crates against the workflow profile
result = resolver.resolve("path/to/input-data/")
print(result.is_complete) # True if all required inputs matched
print(result.paths) # {param_name: Path(...), ...}
print(result.summary()) # Human-readable summary

When input datasets describe their columns via variableMeasured (using schema.org PropertyValue with propertyID pointing to semantic identifiers like I-ADOPT nanopublications), the resolver can match columns between workflow expectations and data offerings.
Workflow RO-Crate declares expected variables:
{
"@id": "#param-buildings",
"@type": "FormalParameter",
"name": "buildings",
"encodingFormat": "application/flatgeobuf",
"variableMeasured": [
{"@id": "#expected-elderly-singles"}
]
}

{
"@id": "#expected-elderly-singles",
"@type": "PropertyValue",
"name": "elderly_singles",
"propertyID": "https://w3id.org/np/RA...",
"additionalType": "sensitivity_indicator"
}

Data RO-Crate describes its columns:
{
"@id": "buildings.fgb",
"@type": "File",
"encodingFormat": "application/flatgeobuf",
"variableMeasured": [
{"@id": "#var-es"}
]
}

{
"@id": "#var-es",
"@type": "PropertyValue",
"name": "ES",
"propertyID": "https://w3id.org/np/RA..."
}

Resolve and get column mappings:
result = resolver.resolve("path/to/input-data/")
# Get the column mapping for the buildings input
binding = result.bindings["buildings"]
print(binding.column_mapping)
# {"elderly_singles": "ES"}
# workflow expects "elderly_singles" → data has it as column "ES"

The propertyID (e.g. an I-ADOPT nanopublication URI) is the semantic key that connects the workflow's expected variable to the data's actual column name. This allows different cities to use different column names while the workflow remains generic.
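To illustrate how a shared propertyID can link differently named columns, here is a minimal sketch in plain Python. It is not fdo-resolver's actual implementation: the helper `column_mapping`, the placeholder URI, and the sample row are all hypothetical and serve only to show the idea.

```python
def column_mapping(expected: list[dict], offered: list[dict]) -> dict[str, str]:
    """Map workflow variable names to data column names via a shared propertyID."""
    # Index the data crate's variables by their semantic identifier.
    by_property = {v["propertyID"]: v["name"] for v in offered if "propertyID" in v}
    return {
        v["name"]: by_property[v["propertyID"]]
        for v in expected
        if v.get("propertyID") in by_property
    }


# Placeholder URI (assumption); the real identifiers in the examples above are elided.
ES_URI = "https://example.org/var/elderly-singles"
expected = [{"name": "elderly_singles", "propertyID": ES_URI}]
offered = [{"name": "ES", "propertyID": ES_URI}, {"name": "geom_id"}]

mapping = column_mapping(expected, offered)
print(mapping)  # {'elderly_singles': 'ES'}

# Rename a data row to the generic names the workflow expects.
rename = {col: var for var, col in mapping.items()}
row = {"ES": 12, "geom_id": 7}
generic_row = {rename.get(key, key): value for key, value in row.items()}
print(generic_row)  # {'elderly_singles': 12, 'geom_id': 7}
```

Columns without a matching propertyID are left untouched, so a workflow step reading `generic_row` only needs to know the generic variable names.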
# Define the input slots directly in Python instead of loading a Workflow RO-Crate
resolver = FDOResolver.from_parameters([
{
"name": "buildings",
"encoding_format": "application/flatgeobuf",
"variables_measured": [
{
"name": "elderly_singles",
"property_id": "https://w3id.org/np/RA...",
"role": "sensitivity_indicator",
},
],
},
{
"name": "flood_levels",
"encoding_format": "application/geo+json",
"additional_type": "https://example.org/FloodLevelCollection",
},
])
result = resolver.resolve("path/to/input-data/")

from pathlib import Path

# Write a run crate that records the outputs and the resolved inputs
resolver.create_run_crate(
"path/to/output/",
name="Hamburg Flood Risk Results",
description="PFRMA and PFRWB risk indices",
bindings=result, # records input provenance
output_files={"Risk layer": Path("output/risk.fgb")},
)

The resolver scores each data entity against each workflow parameter:
- encodingFormat: file format match (strongest signal)
- additionalType: semantic type match (e.g. FloodLevelCollection)
- variableMeasured: overlap of propertyID URIs between expected and actual variables
- name: name-based match (weakest signal)
Scores are combined and the best matches are assigned greedily.
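The scoring and greedy assignment described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's code: the numeric weights, the dictionary layout, and the function names `score` and `assign_greedily` are hypothetical, and only the ordering of the signals (format strongest, name weakest) is taken from the list above.

```python
from itertools import product

# Hypothetical weights; only their relative order reflects the documented signals.
WEIGHTS = {"encoding_format": 4.0, "additional_type": 3.0, "variables": 2.0, "name": 1.0}


def score(param: dict, entity: dict) -> float:
    """Combine the four signals into one score for a (parameter, entity) pair."""
    total = 0.0
    if param.get("encoding_format") and param["encoding_format"] == entity.get("encoding_format"):
        total += WEIGHTS["encoding_format"]
    if param.get("additional_type") and param["additional_type"] == entity.get("additional_type"):
        total += WEIGHTS["additional_type"]
    expected = set(param.get("property_ids", []))
    if expected:
        # Fraction of expected propertyID URIs that the data entity offers.
        offered = set(entity.get("property_ids", []))
        total += WEIGHTS["variables"] * len(expected & offered) / len(expected)
    if param.get("name") and param["name"].lower() in entity.get("name", "").lower():
        total += WEIGHTS["name"]
    return total


def assign_greedily(params: list[dict], entities: list[dict]) -> dict[str, str]:
    """Take the highest-scoring pairs first; each parameter and entity is used once."""
    pairs = sorted(
        ((score(p, e), p["name"], e["id"]) for p, e in product(params, entities)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    assigned: dict[str, str] = {}
    used: set[str] = set()
    for pair_score, param_name, entity_id in pairs:
        if pair_score > 0 and param_name not in assigned and entity_id not in used:
            assigned[param_name] = entity_id
            used.add(entity_id)
    return assigned


params = [
    {"name": "buildings", "encoding_format": "application/flatgeobuf",
     "property_ids": ["https://example.org/var/elderly-singles"]},
    {"name": "flood_levels", "encoding_format": "application/geo+json"},
]
entities = [
    {"id": "levels.geojson", "name": "levels", "encoding_format": "application/geo+json"},
    {"id": "buildings.fgb", "name": "buildings", "encoding_format": "application/flatgeobuf",
     "property_ids": ["https://example.org/var/elderly-singles"]},
]
assignment = assign_greedily(params, entities)
print(assignment)  # {'buildings': 'buildings.fgb', 'flood_levels': 'levels.geojson'}
```

Greedy assignment is simple and predictable, but it does not guarantee the globally best combination when several entities score similarly for several parameters.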
License: MIT