FAIR2Adapt/fair-data-access

fair-data-access

ODRL-based access control for FAIR data with nanopublication policies and encrypted data packages.

Overview

This tool provides automated access control for research data that is private but not sensitive: data that can be shared under specific conditions (e.g., academic research only) but should not be openly published.

It combines:

  • ODRL (Open Digital Rights Language) for machine-readable access policies
  • Nanopublications for immutable, verifiable policy records and audit trails
  • AES-256-GCM encryption for data-at-rest protection
  • DIDs (Decentralized Identifiers) for requester identity
  • GitHub Actions for automated policy evaluation and key distribution

Architecture

Nanopub Network          GitHub Pages/Actions      Zenodo / S3 Pangeo@EOSC
 ├─ ODRL policies         ├─ Wrapped keys           ├─ Encrypted data files
 ├─ Access grants         ├─ Policy evaluation      └─ RO-Crate metadata
 └─ I-ADOPT variables     └─ Key wrapping              (unencrypted)

Access request flow

  1. Researcher finds a dataset via its RO-Crate metadata (on RO-Hub, Dataverse, etc.)
  2. Metadata references the ODRL policy (nanopub URI) and the key server (GitHub Pages)
  3. Researcher opens a GitHub Issue using the access request template
  4. GitHub Actions automatically:
    • Resolves the requester's DID to get their public key
    • Fetches the ODRL policy nanopub
    • Evaluates the policy against the request
    • If approved: wraps the dataset key, publishes it to GitHub Pages, records the grant as a nanopub
  5. Researcher downloads the wrapped key, decrypts the data, runs their analysis

Installation

pip install -e .

Usage

Data provider workflow

# 1. Generate a keypair for your DID
fair-data-access keygen -d ~/.fair-data-access/

# 2. Create a DID document (served via GitHub Pages)
fair-data-access did-doc did:web:fair2adapt.github.io:fair-data-access public_key.pem

# 3. Create an ODRL policy
fair-data-access policy \
  --uid "https://fair2adapt.eu/policy/hamburg-buildings" \
  --target "https://fair2adapt.eu/data/hamburg-buildings" \
  --permit-actions use reproduce \
  --prohibit-actions distribute commercialize \
  --purpose https://w3id.org/dpv#AcademicResearch \
  --require-attribution \
  -o policies/hamburg-buildings.jsonld

# 4. Encrypt your data
fair-data-access encrypt buildings.gpkg --save-key dataset_key.txt

# 5. Upload encrypted data to Zenodo/S3
# 6. Store dataset_key.txt content as a GitHub Secret (KEY_HAMBURG_BUILDINGS)
# 7. Publish ODRL policy as nanopub via Nanodash template (see below)
# 8. Update policies/registry.json with the nanopub URI
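
For orientation, the policy written in step 3 might look roughly like the JSON-LD built below. This is a sketch under assumptions: the `build_policy` helper is hypothetical, and the exact document that `fair-data-access policy` emits may differ. Only the terms `permission`, `prohibition`, `constraint`, `duty`, and `attribute` are taken from the ODRL vocabulary.

```python
import json

ODRL_CONTEXT = "http://www.w3.org/ns/odrl.jsonld"


def build_policy(uid, target, permit, prohibit=(), purpose=None, attribution=False):
    """Hypothetical helper mirroring the CLI flags shown above."""
    permission = {"target": target, "action": list(permit)}
    if purpose:
        # Restrict the permission to a single declared purpose.
        permission["constraint"] = [{
            "leftOperand": "purpose",
            "operator": "eq",
            "rightOperand": {"@id": purpose},
        }]
    if attribution:
        # A duty the requester must fulfil to exercise the permission.
        permission["duty"] = [{"action": "attribute"}]

    policy = {
        "@context": ODRL_CONTEXT,
        "@type": "Set",
        "uid": uid,
        "permission": [permission],
    }
    if prohibit:
        policy["prohibition"] = [{"target": target, "action": list(prohibit)}]
    return policy


policy = build_policy(
    uid="https://fair2adapt.eu/policy/hamburg-buildings",
    target="https://fair2adapt.eu/data/hamburg-buildings",
    permit=["use", "reproduce"],
    prohibit=["distribute", "commercialize"],
    purpose="https://w3id.org/dpv#AcademicResearch",
    attribution=True,
)
print(json.dumps(policy, indent=2))
```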

Publishing ODRL policies via Nanodash

Policies should be created via the Nanodash template forms to ensure compatibility.

The nanopubs/ directory also contains notebooks for programmatic template creation and retraction.

Data consumer workflow

# 1. Generate your DID keypair
fair-data-access keygen -d ~/.fair-data-access/

# 2. Set up did:web (serve did.json at your domain)
fair-data-access did-doc did:web:myuni.edu:me public_key.pem

# 3. Open an access request issue on GitHub
#    (use the issue template at the data-access repo)

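# 4. After approval, download the wrapped key, unwrap it with your private key
#    (e.g. with fair_data_access.keys.unwrap_key), then decrypt the data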
curl -o wrapped_key.json https://fair2adapt.github.io/fair-data-access/keys/<did-hash>/hamburg-buildings.key
fair-data-access decrypt buildings.gpkg.enc -k <unwrapped-key>
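
Step 2 relies on the did:web method, which maps a DID to an HTTPS URL where `did.json` must be served. The mapping can be sketched as below; the function name `did_web_to_url` is hypothetical and this is not the code in did.py, only an illustration of the did:web resolution rule.

```python
from urllib.parse import unquote


def did_web_to_url(did: str) -> str:
    """Map a did:web identifier to the URL of its DID document."""
    prefix = "did:web:"
    if not did.startswith(prefix):
        raise ValueError(f"not a did:web identifier: {did!r}")

    parts = did[len(prefix):].split(":")
    host = unquote(parts[0])  # a port is written as %3A inside the DID
    if not host:
        raise ValueError(f"missing domain in {did!r}")

    path = "/".join(parts[1:])
    if path:
        # Colon-separated segments after the domain become URL path segments.
        return f"https://{host}/{path}/did.json"
    # A bare domain resolves under /.well-known/.
    return f"https://{host}/.well-known/did.json"


print(did_web_to_url("did:web:myuni.edu:me"))
print(did_web_to_url("did:web:fair2adapt.github.io:fair-data-access"))
```

For the two DIDs used in this README, the function returns `https://myuni.edu/me/did.json` and `https://fair2adapt.github.io/fair-data-access/did.json`, which is where the generated DID document needs to be hosted.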

Integration with urban_pfr pipeline

In FDO mode, the pipeline can automatically handle encrypted inputs:

from fair_data_access import decrypt_file, evaluate_policy, fetch_policy
from fair_data_access.keys import unwrap_key
from fair_data_access.rocrate import load_encrypted_input

Project structure

fair-data-access/
  fair_data_access/
    encrypt.py          # AES-256-GCM encryption/decryption
    keys.py             # ECDH key wrapping/unwrapping
    did.py              # DID resolution and document creation
    policy.py           # ODRL policy creation and evaluation
    nanopub_utils.py    # Nanopub publishing (policies + access grants)
    rocrate.py          # RO-Crate integration for encrypted data
    cli.py              # Command-line interface
  scripts/              # GitHub Actions helper scripts
  nanopubs/
    config/             # JSON configs for policy nanopubs
    create_odrl_template_v2.ipynb  # GroupedStatement assertion templates
    create_odrl_policy_nanopub.ipynb  # Programmatic policy generation
    disapprove_nanopub.ipynb  # Retract/disapprove nanopubs
  policies/             # ODRL policy files and registry
  .github/
    workflows/          # Automated access request processing
    ISSUE_TEMPLATE/     # Access request form
  docs/                 # GitHub Pages (served key files)

Migrating to other platforms

The GitHub Pages/Actions setup is a lightweight starting point. The same components work on:

  • LifeWatch/EOSC: Replace GitHub Actions with a FastAPI service
  • University server: Same FastAPI service behind institutional auth
  • Hamburg municipality: Fork this repo, add their own dataset keys

The ODRL policies (as nanopubs), encrypted data packages, and RO-Crate metadata remain unchanged across platforms.
