A Bag is an intermediate representation (IR) for hierarchical data in Python.
In compiler design, an IR is a data structure that sits between source code and machine code. It captures the essential structure while abstracting away format-specific details, making it easier to analyze, transform, and generate output.
The same principle applies to data: configuration files, API responses, documents, and UI structures all share a common shape—named things containing values, organized hierarchically, with metadata attached—but we typically scatter this across dictionaries, classes, JSON, XML, and database rows.
A Bag provides a canonical representation for this common pattern:
```mermaid
flowchart LR
    subgraph Sources
        JSON[JSON file]
        XML[XML file]
        API[API response]
        DB[Database]
        Input[User input]
    end
    subgraph BAG[" "]
        Tree["Unified tree<br/>of named nodes"]
    end
    subgraph Outputs
        HTML[HTML]
        XMLout[XML]
        DBout[Database]
        JSONout[JSON]
        UI[UI]
    end
    JSON --> Tree
    XML --> Tree
    API --> Tree
    DB --> Tree
    Input --> Tree
    Tree --> HTML
    Tree --> XMLout
    Tree --> DBout
    Tree --> JSONout
    Tree --> UI
```
- **Decoupling**: Your application logic works with one structure, regardless of input and output formats. Change your data source from XML to JSON? Your code doesn't change.
- **Uniformity**: One access pattern (`bag['path.to.value']`), one way to attach metadata, and one subscription model, instead of a different API for each library.
- **Transformation**: Operate on the structure itself (walk the tree, filter nodes, transform values, validate structure) without knowing whether it came from a file, an API, or a database.
- **Round-tripping**: Serialize to XML, JSON, or MessagePack and back, preserving types, attributes, and structure, including lazy-loaded values (resolvers).
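
The decoupling and transformation ideas can be illustrated without the library itself. The sketch below is a minimal stand-in that uses plain nested dictionaries and the standard library only; it is not genro-bag's API, and the helper names (`tree_from_json`, `tree_from_xml`, `walk`) are hypothetical. It shows two sources converging on one tree shape, so a single traversal works for both.

```python
import json
import xml.etree.ElementTree as ET


def tree_from_json(text):
    """Parse JSON text into a nested dict of named nodes."""
    return json.loads(text)


def tree_from_xml(text):
    """Parse XML text into the same nested-dict shape (leaf text stays a str)."""
    def convert(element):
        children = list(element)
        if not children:
            return element.text
        return {child.tag: convert(child) for child in children}

    root = ET.fromstring(text)
    return {root.tag: convert(root)}


def walk(tree, prefix=""):
    """Yield (dotted_path, value) for every leaf, whatever the source format was."""
    for label, value in tree.items():
        path = f"{prefix}.{label}" if prefix else label
        if isinstance(value, dict):
            yield from walk(value, path)
        else:
            yield path, value


json_text = '{"config": {"database": {"host": "localhost", "port": "5432"}}}'
xml_text = (
    "<config><database><host>localhost</host>"
    "<port>5432</port></database></config>"
)

# The same traversal code runs on both trees.
leaves_from_json = dict(walk(tree_from_json(json_text)))
leaves_from_xml = dict(walk(tree_from_xml(xml_text)))
```

Both dictionaries end up with the same dotted paths, which is the property the application code relies on. A real Bag additionally carries attributes and type information that plain dictionaries do not.
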
Every node in a Bag has:
| Component | Purpose |
|---|---|
| Label | The node's name (key in the hierarchy) |
| Value | The data it holds (any Python value, or another Bag) |
| Attributes | Metadata attached to the node |
| Tag | Optional semantic type (like XML elements) |
Access is path-based: bag['config.database.host'] navigates the hierarchy using dot notation.
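
To make the node anatomy concrete, here is a toy model of the four components and of dot-path navigation. It is a sketch under stated assumptions: `Node`, `MiniBag`, and the `set` and `node` methods are hypothetical names invented for illustration, not the classes or methods that genro-bag exposes. Only the `bag['a.b.c']` access form comes from the text above.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Node:
    label: str                                       # the node's name in the hierarchy
    value: Any = None                                # any Python value, or a nested MiniBag
    attributes: dict = field(default_factory=dict)   # metadata attached to the node
    tag: Optional[str] = None                        # optional semantic type


class MiniBag:
    """Toy container for illustration only; not genro-bag's Bag."""

    def __init__(self):
        self._nodes = {}  # label -> Node, in insertion order

    def node(self, path):
        """Return the Node at a dotted path."""
        label, _, rest = path.partition(".")
        node = self._nodes[label]
        return node.value.node(rest) if rest else node

    def __getitem__(self, path):
        return self.node(path).value

    def set(self, path, value, tag=None, **attributes):
        """Store a value at a dotted path, creating intermediate containers."""
        label, _, rest = path.partition(".")
        if rest:
            child = self._nodes.get(label)
            if child is None or not isinstance(child.value, MiniBag):
                child = Node(label, MiniBag())
                self._nodes[label] = child
            child.value.set(rest, value, tag=tag, **attributes)
        else:
            self._nodes[label] = Node(label, value, dict(attributes), tag)


bag = MiniBag()
bag.set("config.database.host", "localhost", source="env")
bag.set("config.database.port", 5432)

host = bag["config.database.host"]
host_source = bag.node("config.database.host").attributes["source"]
```

The value and its metadata live on the same node, so reading `host` and reading `host_source` follow the same path.
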
Bag provides four layers—use only what you need:
| Layer | Purpose | Use When |
|---|---|---|
| Core Bag | Paths, values, attributes, serialization | Always |
| Resolvers | Lazy-loaded, computed values | API calls, DB queries, expensive computations |
| Subscriptions | React to changes | Validation, logging, sync, computed properties |
| Builders | Domain-specific languages | HTML, Markdown, XML with structure validation |
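
The resolver and subscription layers can be pictured with a small stand-alone sketch. It assumes nothing about genro-bag's actual classes: `LazyValue` and `ObservableStore` are hypothetical names, and the callback signature `(path, old, new)` is an illustrative choice. The first class defers an expensive computation until the value is read and then caches it; the second notifies callbacks whenever a path is written.

```python
class LazyValue:
    """Computes its value on first access, then caches the result."""

    def __init__(self, loader):
        self._loader = loader
        self._loaded = False
        self._value = None

    def get(self):
        if not self._loaded:
            self._value = self._loader()
            self._loaded = True
        return self._value


class ObservableStore:
    """Flat path -> value store that notifies subscribers on every write."""

    def __init__(self):
        self._data = {}
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def __setitem__(self, path, value):
        old = self._data.get(path)
        self._data[path] = value
        for callback in self._subscribers:
            callback(path, old, value)

    def __getitem__(self, path):
        value = self._data[path]
        # Lazy values are resolved transparently when read.
        return value.get() if isinstance(value, LazyValue) else value


calls = []


def expensive_lookup():
    calls.append("called")  # record each real computation
    return 42


changes = []
store = ObservableStore()
store.subscribe(lambda path, old, new: changes.append((path, old, new)))

store["stats.answer"] = LazyValue(expensive_lookup)
store["config.debug"] = True
store["config.debug"] = False

first = store["stats.answer"]   # triggers the computation
second = store["stats.answer"]  # served from the cache
```

Reading `stats.answer` twice runs `expensive_lookup` only once, and each of the three writes reaches the subscriber with the old and new values.
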
Try genro-bag directly in your browser with Google Colab, or install it locally with pip:

```bash
pip install genro-bag
```

| Resource | Description |
|---|---|
| Full Documentation | Complete guide with examples |
| Why Bag? | Detailed comparison with alternatives |
| Getting Started | Learn the core concepts |
| Directory | Description |
|---|---|
| `src/genro_bag/` | Core implementation |
| `src/genro_bag/resolvers/` | Built-in resolvers (URL, Directory, OpenAPI) |
| `src/genro_bag/builders/` | Built-in builders (HTML, Markdown, XSD) |
| `examples/` | Usage examples |
| `tests/` | Test suite (1500+ tests) |
| `docs/` | Sphinx documentation source |
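
For development, install the package in editable mode with the `dev` extras and run the test suite:

```bash
pip install -e ".[dev]"
pytest
```

Apache License 2.0. See LICENSE for details.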
Copyright 2025 Softwell S.r.l. — Genropy Team