Skip to content

GabrielFranciscoSM/IBM-DataScienceTools-final-work

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 

Repository files navigation

Open Source Tools for Data Science — IBM

Welcome to a compact exploration of IBM's open-source tools for data science. This repository contains a hands-on notebook that surveys popular IBM open-source projects, shows short demos, and gives pointers so you can try them locally.

🚀 What you'll find here

  • A curated notebook: DataSienceEcosystem.ipynb — a guided tour of IBM open-source tools for model development, fairness, robustness, and deployment.
  • Short descriptions and links (where helpful) to notable IBM projects you can explore further.

Why this repository

IBM maintains a rich set of open-source projects that help with fairness, robustness, model packaging, and production-ready examples. This repo gathers a small, practical selection so you can learn by running examples and inspecting code in the included notebook.

Notable IBM Open Source Tools (high level)

  • AI Fairness 360 (AIF360)

    • A toolkit to help detect and mitigate bias in machine learning models. It contains metrics, datasets, and mitigation algorithms across pre-, in-, and post-processing stages.
  • Adversarial Robustness Toolbox (ART)

    • Tools to evaluate and harden models against adversarial examples and attacks. Useful for security-minded ML workflows.
  • Lale

    • A higher-level Python library for automated pipeline generation and safe model composition. It integrates with scikit-learn and AutoML tooling.
  • Model Asset eXchange (MAX)

    • A collection of ready-to-deploy model demos (Docker + Flask/REST examples) that show how to package and serve ML models.
  • Egeria (metadata and governance)

    • Useful when you want lineage, metadata, and governance across data and ML assets (optional for advanced users).

Note: The notebook contains small runnable examples and pointers; follow the Quickstart section below to get them running locally.

Quickstart — run the notebook

These commands work on Linux/macOS with Bash. They create an isolated environment and start Jupyter Lab/Notebook so you can run DataSienceEcosystem.ipynb:

# create & activate a virtual environment (optional but recommended)
python3 -m venv .venv
source .venv/bin/activate

# install minimal tooling
pip install --upgrade pip
pip install jupyterlab notebook

# if you have a requirements.txt, install it
if [ -f requirements.txt ]; then
	pip install -r requirements.txt
fi

# start Jupyter Lab (or use `jupyter notebook` if you prefer)
jupyter lab

Open the notebook DataSienceEcosystem.ipynb in the browser, run the cells in order, and follow the markdown notes.

How to explore the notebook

  • Start with the introduction cells (setup & environment). They explain which packages are used.
  • Try the small demos for fairness (AIF360) and robustness (ART) — they are lightweight examples, not full experiments.
  • Look for %%bash or !pip install cells: they help install missing packages when needed.

License & credits

This repository (the notebook and any added files) is provided for learning and demonstration. Check individual IBM projects for their specific licenses (most are Apache-2.0 or permissive open-source licenses).

About

Final work of the course of Data Science tool of IBM

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors