Skip to content

FAIR2Adapt/fdo-resolver

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

fdo-resolver

Resolve RO-Crate FAIR Digital Object inputs for scientific workflows.

Given a Workflow RO-Crate that declares its expected inputs as FormalParameter entities, fdo-resolver scans a directory of data RO-Crates and matches them to the workflow's input slots by encodingFormat, additionalType, name, and variableMeasured.

Installation

pip install fdo-resolver

Usage

Basic: resolve data crates to workflow inputs

from fdo_resolver import FDOResolver

# Load input slots from a Workflow RO-Crate
resolver = FDOResolver.from_workflow_crate("path/to/workflow-rocrate/")

# Resolve data crates against the workflow profile
result = resolver.resolve("path/to/input-data/")

print(result.is_complete)  # True if all required inputs matched
print(result.paths)        # {param_name: Path(...), ...}
print(result.summary())    # Human-readable summary

Variable-level matching with I-ADOPT

When input datasets describe their columns via variableMeasured (using schema.org PropertyValue with propertyID pointing to semantic identifiers like I-ADOPT nanopublications), the resolver can match columns between workflow expectations and data offerings.

Workflow RO-Crate declares expected variables:

{
  "@id": "#param-buildings",
  "@type": "FormalParameter",
  "name": "buildings",
  "encodingFormat": "application/flatgeobuf",
  "variableMeasured": [
    {"@id": "#expected-elderly-singles"}
  ]
}
{
  "@id": "#expected-elderly-singles",
  "@type": "PropertyValue",
  "name": "elderly_singles",
  "propertyID": "https://w3id.org/np/RA...",
  "additionalType": "sensitivity_indicator"
}

Data RO-Crate describes its columns:

{
  "@id": "buildings.fgb",
  "@type": "File",
  "encodingFormat": "application/flatgeobuf",
  "variableMeasured": [
    {"@id": "#var-es"}
  ]
}
{
  "@id": "#var-es",
  "@type": "PropertyValue",
  "name": "ES",
  "propertyID": "https://w3id.org/np/RA..."
}

Resolve and get column mappings:

result = resolver.resolve("path/to/input-data/")

# Get the column mapping for the buildings input
binding = result.bindings["buildings"]
print(binding.column_mapping)
# {"elderly_singles": "ES"}
# workflow expects "elderly_singles" → data has it as column "ES"

The propertyID (e.g. an I-ADOPT nanopublication URI) is the semantic key that connects the workflow's expected variable to the data's actual column name. This allows different cities to use different column names while the workflow remains generic.

Programmatic parameter definitions

resolver = FDOResolver.from_parameters([
    {
        "name": "buildings",
        "encoding_format": "application/flatgeobuf",
        "variables_measured": [
            {
                "name": "elderly_singles",
                "property_id": "https://w3id.org/np/RA...",
                "role": "sensitivity_indicator",
            },
        ],
    },
    {
        "name": "flood_levels",
        "encoding_format": "application/geo+json",
        "additional_type": "https://example.org/FloodLevelCollection",
    },
])

result = resolver.resolve("path/to/input-data/")

Creating output Workflow Run Crates

resolver.create_run_crate(
    "path/to/output/",
    name="Hamburg Flood Risk Results",
    description="PFRMA and PFRWB risk indices",
    bindings=result,  # records input provenance
    output_files={"Risk layer": Path("output/risk.fgb")},
)

How matching works

The resolver scores each data entity against each workflow parameter:

  1. encodingFormat — file format match (strongest signal)
  2. additionalType — semantic type match (e.g. FloodLevelCollection)
  3. variableMeasured — overlap of propertyID URIs between expected and actual variables
  4. name — name-based match (weakest signal)

Scores are combined and the best matches are assigned greedily.

License

MIT

About

Resolve FDO to enable implementation of FDO-in and FDO-out paradigm.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages