We test here the idea that a single method, independent of the data at hand, can be used to create the sets of validation rules needed for the various datasets and stages of the GSBPM (Generic Statistical Business Process Model). See our slides for the SDMX 2025 conference.
In addition to rules based on expert knowledge, which still play a role, this involves exploiting:
- the codelists and DSD (Data Structure Definition) files from the SDMX registry, using the R package validate and, in Python, sdmxthon
- data properties (type, range, distribution, correlation, ...), using the R package validate_suggest
- machine learning algorithms (e.g. the apriori and eclat algorithms from the R package arules) trained on the whole datasets to discover association rules; the resulting rule sets are then evaluated against the traditional rule repositories
- machine/deep learning algorithms for unsupervised anomaly detection (e.g. isolation forest and deep isolation forest, as in the R packages solitude and HRTnomaly), which can single out rare or unique patterns in the data that might signal errors
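To illustrate the first source above, a minimal sketch of how an SDMX codelist becomes a membership validation rule. The codelist here is a hypothetical stand-in; in practice it would be retrieved from the SDMX registry (e.g. via sdmxthon) together with the DSD, and the check would run per DSD dimension.

```python
# Hypothetical frequency codelist, standing in for one fetched from an
# SDMX registry. In a real pipeline it would come from the DSD/codelist
# artefacts, not be hard-coded.
CL_FREQ = {"A", "Q", "M", "W", "D"}

def check_codelist(records, field, codelist):
    """Return the records whose value for `field` is not in the codelist."""
    return [r for r in records if r.get(field) not in codelist]

data = [{"FREQ": "A"}, {"FREQ": "M"}, {"FREQ": "X"}]
violations = check_codelist(data, "FREQ", CL_FREQ)
print(violations)  # [{'FREQ': 'X'}]
```

The same pattern generalizes to one membership rule per coded dimension of the DSD.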
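For the second source, a sketch of deriving rules from observed data properties, loosely mirroring what validate_suggest does in R for ranges. The function name and the rule representation are illustrative, not part of any library.

```python
# Suggest simple range rules (value_min <= x <= value_max) from the
# observed values of each numeric column. A real rule-suggestion tool
# would also look at types, distributions, and correlations.

def suggest_range_rules(columns):
    """Map each column name to a suggested (min, max) range rule."""
    return {name: (min(values), max(values)) for name, values in columns.items()}

obs = {"age": [23, 45, 31, 67], "income": [1800, 2500, 3100]}
print(suggest_range_rules(obs))  # {'age': (23, 67), 'income': (1800, 3100)}
```

Suggested rules of this kind are candidates only; they still need review before joining a production rule set.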
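For the association-rule source, a didactic miniature of the frequent-itemset step underlying apriori, showing how candidate rules can be mined from data and later compared with hand-written rule repositories. A real application would use arules (apriori/eclat) in R rather than this toy.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(transactions, min_support, max_len=2):
    """Return itemsets (up to max_len items) whose support >= min_support."""
    n = len(transactions)
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_len + 1):
            for combo in combinations(items, k):
                counts[combo] += 1
    return {iset: c / n for iset, c in counts.items() if c / n >= min_support}

# Toy transactions built from record attributes (illustrative names).
tx = [
    {"employed", "has_income"},
    {"employed", "has_income"},
    {"unemployed"},
    {"employed", "has_income"},
]
print(frequent_itemsets(tx, min_support=0.7))
```

From the frequent pair ("employed", "has_income") one would derive the candidate rule "employed implies has_income", which can then be checked against the existing expert rule repository.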
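Finally, for the anomaly-detection source, a toy one-dimensional isolation forest conveying the intuition behind packages such as solitude: an anomalous value is isolated by random splits in fewer steps, so a short average path length flags a likely error. This is a didactic sketch, not a substitute for the real algorithms.

```python
import random

def isolation_depth(x, values, max_depth=10, rng=random):
    """Depth at which random interval splits isolate x among `values`."""
    lo, hi = min(values), max(values)
    pts = list(values)
    depth = 0
    while len(pts) > 1 and depth < max_depth and lo < hi:
        split = rng.uniform(lo, hi)
        if x < split:
            hi = split
            pts = [p for p in pts if p < split]
        else:
            lo = split
            pts = [p for p in pts if p >= split]
        depth += 1
    return depth

def avg_depth(x, values, n_trees=100, seed=0):
    """Average isolation depth over many random trees; low = anomalous."""
    rng = random.Random(seed)
    return sum(isolation_depth(x, values, rng=rng) for _ in range(n_trees)) / n_trees

data = [10, 11, 9, 10, 12, 11, 300]  # 300 looks like a data-entry error
print(avg_depth(300, data), avg_depth(10, data))  # outlier isolates faster
```

Values with unusually low average depth are the rare patterns worth routing to manual review, since they might be signaling errors.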