violetacln/sdmx_ML_validation

sdmx_ML_validation

We test here the idea that a single, data-independent method can create the sets of validation rules needed for the various datasets and stages of the GSBPM (Generic Statistical Business Process Model). See our slides for the sdmx-2025 conference.

In addition to rules based on expert knowledge, which still play a role, this involves exploiting:

  • the SDMX registry codelists and DSD (data structure definition) files, using R-validate and the Python package sdmxthon

  • data properties (type, range, distribution, correlation, ...) using R-validate_suggest

  • machine learning algorithms (e.g. the apriori and eclat algorithms in R-arules) trained on whole datasets to discover association rules; the new rule sets are subsequently evaluated against the traditional rule repositories

  • unsupervised machine/deep learning algorithms for anomaly detection (e.g. deep or isolation forests, as in R-solitude or R-HRTnomaly), which can single out rare or unique patterns in the data that might signal errors
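
To make the association-rule idea above concrete, here is a minimal sketch in plain Python. It is not the arules implementation and not necessarily how this repository does it: it is a simplified stand-in for apriori that only considers rules between pairs of column=value items. The toy records, the column names GEO and CURRENCY, the thresholds, and the helper names mine_pair_rules and violations are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy dataset: column names and codes are illustrative only.
records = (
    [{"GEO": "IS", "CURRENCY": "ISK"}] * 4
    + [{"GEO": "IS", "CURRENCY": "EUR"}]  # index 4: the unusual combination
    + [{"GEO": "DE", "CURRENCY": "EUR"}] * 5
)


def mine_pair_rules(records, min_support=0.3, min_confidence=0.75):
    """Mine rules 'lhs item -> rhs item' between single column=value items.

    support    = share of records containing both items
    confidence = share of records with the lhs item that also have the rhs item
    """
    n = len(records)
    item_counts = Counter()
    pair_counts = Counter()
    for rec in records:
        items = sorted(rec.items())  # sorted so each pair has one canonical order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # pair is too rare to base a rule on
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


def violations(records, rules):
    """List (record index, lhs, rhs) where a record matches lhs but not rhs."""
    found = []
    for i, rec in enumerate(records):
        for (lcol, lval), (rcol, rval), _, _ in rules:
            if rec.get(lcol) == lval and rec.get(rcol) != rval:
                found.append((i, (lcol, lval), (rcol, rval)))
    return found


rules = mine_pair_rules(records)
found = violations(records, rules)
for i, lhs, rhs in found:
    print(f"record {i} breaks mined rule {lhs} -> {rhs}")
```

On this toy data only the record at index 4 is flagged, because it combines GEO=IS with CURRENCY=EUR while the other records pair IS with ISK and DE with EUR. Flagged records are candidates for review rather than confirmed errors, and the thresholds would need tuning on real data before mined rules are compared with an expert rule repository.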
