Digital4Better Open Data

Open datasets maintained by Digital4Better to describe the environmental footprint of digital services, cloud infrastructure, electricity systems, and AI models.

This repository is meant to be used as a data source, not as developer documentation. The main audience is analysts, sustainability teams, researchers, product teams, and anyone who needs reusable reference data in JSON or CSV.

These reference datasets are used, among other things, by fruggr, Digital4Better's platform for measuring and managing the environmental footprint of digital services.

What You Can Find Here

The repository is organized as a set of reusable data collections:

Collection	What it covers	Main files
`data/ai`	AI model catalog across vendors and cloud providers	`models.json`
`data/cloud`	Cloud regions, virtual machines, CPUs, accelerators	`*-regions.*`, `*-vms.*`, `cpus.*`, `accelerators.*`
`data/country`	Countries, regions, continents, and distance referentials	`regions.*`, `countries.*`, `continents.*`, `*-distances.*`
`data/energy`	Environmental impacts of electricity production technologies	`energy-impacts.*`
`data/mix`	Electricity mix by geography and time period	`world-*`, `continent-*`, `country-*`, `subdivision-*`
`data/factor`	Electricity impact factors derived from energy mix data	`world-*`, `continent-*`, `country-*`, `subdivision-*`
`data/equipment`	Equipment energy and embodied impact reference data	`energy.*`, `embodied.*`

Why This Repository Exists

These datasets are used to:

estimate the environmental footprint of digital services
compare cloud infrastructure options across providers and regions
model electricity-related impacts by country, continent, or subdivision
enrich internal or public sustainability dashboards
document AI models and their characteristics in a structured way

Highlights

AI Models

The AI catalog in data/ai/models.json documents model families from providers such as OpenAI, Anthropic, Google, Mistral, Meta, Qwen, DeepSeek, Amazon, Cohere, and others.

This makes it useful for market mapping, observatories, governance, and cloud/AI portfolio analysis.

Main source families:

cloud provider catalogs such as AWS Bedrock, Azure AI Foundry, Google Cloud Vertex AI, OVHcloud AI Endpoints, and Scaleway Generative APIs
official model vendor documentation such as OpenAI, Anthropic, Mistral, Qwen, and DeepSeek
model cards and open model hubs such as Hugging Face
technical reports and synthesis sources such as LifeArchitect and ApXML Models

Cloud Infrastructure

The cloud referentials in data/cloud provide structured information for major providers including AWS, Azure, GCP, OVHcloud, and Scaleway.

Typical use cases:

mapping regions and datacenter footprints
comparing VM families and hardware characteristics
linking compute infrastructure to sustainability calculations

Main source families:

provider region and infrastructure documentation from AWS, Microsoft Azure, Google Cloud, OVHcloud, and Scaleway
manufacturer and hardware reference sources for CPUs and accelerators
curated cross-checking from provider instance catalogs and infrastructure specification pages

Electricity Mix And Impact Factors

The datasets in data/mix and data/factor help translate electricity consumption into environmental impacts.

They are available at several levels:

world
continent
country
subdivision

And across different time granularities:

yearly
monthly

Green-only variants are also available through files ending with `-green`.
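
To illustrate how a mix and per-technology impacts can be turned into a factor and then applied to a consumption figure, here is a minimal Python sketch. It assumes a simple share-weighted average, which is one common approach and not necessarily the method used to build the files in data/factor. The dictionary keys, the numbers, and the function names `mix_factor` and `footprint` are hypothetical placeholders, not values or field names from the repository.

```python
# Hypothetical inputs: keys and numbers are placeholders for illustration,
# not values or field names taken from the repository.
mix_shares = {"wind": 0.30, "solar": 0.10, "gas": 0.40, "nuclear": 0.20}
impact_g_per_kwh = {"wind": 12.0, "solar": 40.0, "gas": 450.0, "nuclear": 6.0}


def mix_factor(shares: dict[str, float], impacts: dict[str, float]) -> float:
    """Return the share-weighted average impact of one kWh."""
    total_share = sum(shares.values())
    if total_share <= 0:
        raise ValueError("mix shares must sum to a positive value")
    weighted = sum(share * impacts[tech] for tech, share in shares.items())
    # Dividing by the total tolerates shares that do not sum exactly to 1.
    return weighted / total_share


def footprint(energy_kwh: float, factor_per_kwh: float) -> float:
    """Translate an electricity consumption into an impact."""
    return energy_kwh * factor_per_kwh


factor = mix_factor(mix_shares, impact_g_per_kwh)
emissions_g = footprint(1200.0, factor)
print(f"factor: {factor:.1f} g/kWh, footprint: {emissions_g / 1000:.1f} kg")
```

In practice the shares would come from a file in data/mix and the factor could be read directly from data/factor instead of being recomputed; check the actual field names and units in those files before reusing this sketch.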

Main source families:

electricity generation data from Ember monthly electricity data and Ember yearly electricity data
impact factors built from lifecycle assessment literature, including UNECE 2021 - Life cycle assessment of electricity generation options and related academic work such as this Energy paper

Geography And Distances

The datasets in data/country provide geographic referentials used to map countries, continents, subdivisions, and estimated network distances.

Typical use cases:

geographic normalization
country and subdivision mapping
rough estimation of distances between users, countries, regions, and datacenters
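
As an illustration of what a rough distance estimate can look like, the sketch below computes a great-circle distance with the haversine formula. This is an assumption for illustration only: the repository does not state how its distance approximations are derived, and the function name `haversine_km` and the sample coordinates are not taken from the data files.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


# Approximate city-centre coordinates, used only as sample input.
paris = (48.8566, 2.3522)
frankfurt = (50.1109, 8.6821)
distance_km = haversine_km(*paris, *frankfurt)
print(f"{distance_km:.0f} km")
```

A great-circle distance is a lower bound on the real network path, so it suits order-of-magnitude estimates rather than precise routing.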

Main source families:

ISO country and subdivision standards
internally maintained geographic referentials used to derive administrative mappings and distance approximations

Equipment Reference Data

The datasets in data/equipment provide reference values for embodied impacts and operational energy of common digital equipment categories.

Typical use cases:

footprint modeling at equipment level
simplified lifecycle modeling for digital services
comparative analysis of device or infrastructure categories

Main source families:

Digital4Better internal modeling inputs
lifecycle assessment literature and equipment reference datasets used for sustainability calculations

Formats

Most collections are published in both formats:

JSON for structured or nested data
CSV for tabular exploration, spreadsheets, and BI tools

If a collection is only available in one format, it is usually because that format is the most natural one for the data structure.
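
For readers who want to load either format programmatically, here is a minimal sketch using only the Python standard library. The sample content and the field names (`id`, `tags`, `value`) are hypothetical and do not reflect the actual schema of any file in the repository; in-memory strings are used so the example runs without a local checkout.

```python
import csv
import io
import json

# Hypothetical sample content: field names are illustrative only.
json_text = '[{"id": "example-a", "tags": ["x", "y"]}, {"id": "example-b", "tags": []}]'
csv_text = "id,value\nexample-a,1.5\nexample-b,2.0\n"

# JSON keeps nested structures such as lists and objects.
records = json.loads(json_text)

# CSV is flat, and every value is read as a string.
rows = list(csv.DictReader(io.StringIO(csv_text)))
values = {row["id"]: float(row["value"]) for row in rows}

print(records[0]["tags"], values)
```

With a local copy of the repository, the in-memory strings can be replaced by files opened with `open(path, encoding="utf-8", newline="")`.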

Quick Navigation

AI models: data/ai/models.json
Cloud regions: data/cloud
Country and region referentials: data/country
Energy impacts: data/energy/energy-impacts.json
Electricity mix: data/mix
Electricity factors: data/factor
Equipment data: data/equipment

Notes On Data Quality

This repository aims to provide transparent and reusable reference data, but some values should be interpreted with care.

Some fields are derived from public documentation, model cards, technical reports, or literature rather than official disclosures.
Some collections include explicit uncertainty markers such as `estimated`.
AI and cloud catalogs evolve quickly, so historical and legacy entries may coexist with current ones.
Environmental factors are based on a mix of primary data, literature, and modeling assumptions.

When available, source URLs are kept directly in the data files themselves.

License

This repository is published under the ODC Open Database License (ODbL).
