[WIP] Extensible tag classification model discovery through Entry Points #463

Draft
Roel Bollens (RoelBollens-TomTom) wants to merge 4 commits into dev from discovery-rework-with-tags

@RoelBollens-TomTom Roel Bollens (RoelBollens-TomTom) commented Mar 10, 2026

Extensible tag classification model discovery through Entry Points

This replaces the hardcoded model classification system with tag-based classification model discovery through Entry Points. It is based on #440 by Seth and on several ad-hoc schema coding sessions in which Seth, Vic, Dana, Tristan, and Roel participated.

Model discovery moved into system, eliminating assumptions about Overture in the process. The hardcoded namespace concept ("overture", "annex") and the ModelKind classifier are replaced with tags -- string labels derived by tag providers. Tags become the filtering, grouping, and classification mechanism for model discovery, driven by introspection and package metadata rather than central coordination.

system provides generic tag-based grouping without understanding what any particular tag means. Any package can register tag providers that classify models without special support in the discovery layer.

Purpose

Tags serve three roles:

  • CLI filtering: select subsets of models for output and codegen
    (--tag system:feature, --tag draft)
  • Classification and endorsement: distinguish features from extensions,
    mark models as vetted or approved by an authority
  • Marketplace taxonomy: browse and classify models and extensions in a
    future extension catalog

These roles overlap -- a tag like overture:theme=buildings serves both filtering and taxonomy. The design accommodates this overlap through structured tags that encode both ownership and dimension.

Tag Format

Tags are strings following the pattern [prefix:]key[=value]:

  • Plain: overture, draft, feature
  • Prefixed: system:extension -- : separates ownership
  • Prefixed k/v: overture:theme=buildings

: signals ownership and enables prefix reservation (see Privileged Packages and Tag Reservation). = signals a dimension with a value (groupable via --group-by). One level of each -- no nested colons or multiple = signs.
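
To make the `[prefix:]key[=value]` pattern concrete, here is a minimal parsing sketch. The `Tag` tuple and `parse_tag` function are hypothetical names introduced for illustration and are not part of the PR; the sketch only enforces the "one level of each" rule stated above and does not check the allowed character set.

```python
from typing import NamedTuple, Optional


class Tag(NamedTuple):
    prefix: Optional[str]
    key: str
    value: Optional[str]


def parse_tag(tag: str) -> Tag:
    """Split a tag of the form [prefix:]key[=value] into its parts."""
    head, eq, value = tag.partition("=")
    if "=" in value or ":" in value:
        raise ValueError(f"more than one '=' or a ':' after '=' in {tag!r}")
    prefix, colon, key = head.partition(":")
    if not colon:
        # No ':' present: partition put the whole head in the first slot.
        prefix, key = None, head
    elif ":" in key:
        raise ValueError(f"nested ':' in {tag!r}")
    if not key or prefix == "" or (eq and not value):
        raise ValueError(f"empty component in {tag!r}")
    return Tag(prefix, key, value if eq else None)


plain = parse_tag("draft")
prefixed = parse_tag("system:extension")
prefixed_kv = parse_tag("overture:theme=buildings")
```

Splitting on `=` before `:` keeps the value intact and lets a grouping feature such as `--group-by` look only at `prefix` and `key`.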

Minimal launch set

| Tag | Meaning |
| --- | --- |
| system:feature | This model is a feature type (has geometry, inherits from Feature) |
| overture:theme=<theme> | Which Overture theme this belongs to (e.g., buildings, transportation) |
| overture:official | Placeholder for a lifecycle/endorsement tag -- exact name deferred pending Dana and Tristan's work on extension lifecycle |
| overture:feature (vs. system:feature) | The distinction between system:feature (geometry-aware base) and overture:feature (adds theme, type, sources, version) surfaced during the use-case exercise. Both tags exist implicitly in the inheritance hierarchy, and both should be present when applicable (an Overture feature gets both). This was deferred and will likely still be dropped from the initial set. |

Extensions

Additional extensions and their accompanying tags will be introduced in a future PR. Extensions augment existing types with new fields (columns).

| Tag | Meaning |
| --- | --- |
| system:extension | This model is an extension (adds columns/fields to an existing type) |

CLI

The list-types command now supports filtering and grouping by tags. For now, it no longer displays the description or fully qualified class name. The json-schema and validate commands keep their existing interface; their internals were adapted to the new discovery mechanism so that they remain backward compatible. Further changes can be introduced in a future update.

Examples

% overture-schema list-types
address             overture:feature  overture:official  overture:theme=addresses  system:feature
bathymetry          overture:feature  overture:official  overture:theme=base  system:feature
building            overture:feature  overture:official  overture:theme=buildings  system:feature
building_part       overture:feature  overture:official  overture:theme=buildings  system:feature
connector           overture:feature  overture:official  overture:theme=transportation  system:feature
division            overture:feature  overture:official  overture:theme=divisions  system:feature
division_area       overture:feature  overture:official  overture:theme=divisions  system:feature
division_boundary   overture:feature  overture:official  overture:theme=divisions  system:feature
infrastructure      overture:feature  overture:official  overture:theme=base  system:feature
land                overture:feature  overture:official  overture:theme=base  system:feature
land_cover          overture:feature  overture:official  overture:theme=base  system:feature
land_use            overture:feature  overture:official  overture:theme=base  system:feature
place               overture:feature  overture:official  overture:theme=places  system:feature
segment             overture:feature  overture:official  overture:theme=transportation  system:feature
sources            
water               overture:feature  overture:official  overture:theme=base  system:feature
% overture-schema list-types --group-by overture:theme
overture:theme=addresses (1)
→ address            overture:feature  overture:official  overture:theme=addresses  system:feature

overture:theme=base (6)
→ bathymetry         overture:feature  overture:official  overture:theme=base  system:feature
→ infrastructure     overture:feature  overture:official  overture:theme=base  system:feature
→ land               overture:feature  overture:official  overture:theme=base  system:feature
→ land_cover         overture:feature  overture:official  overture:theme=base  system:feature
→ land_use           overture:feature  overture:official  overture:theme=base  system:feature
→ water              overture:feature  overture:official  overture:theme=base  system:feature

overture:theme=buildings (2)
→ building           overture:feature  overture:official  overture:theme=buildings  system:feature
→ building_part      overture:feature  overture:official  overture:theme=buildings  system:feature

overture:theme=divisions (3)
→ division           overture:feature  overture:official  overture:theme=divisions  system:feature
→ division_area      overture:feature  overture:official  overture:theme=divisions  system:feature
→ division_boundary  overture:feature  overture:official  overture:theme=divisions  system:feature

overture:theme=places (1)
→ place              overture:feature  overture:official  overture:theme=places  system:feature

overture:theme=transportation (2)
→ connector          overture:feature  overture:official  overture:theme=transportation  system:feature
→ segment            overture:feature  overture:official  overture:theme=transportation  system:feature
% overture-schema list-types --tag overture:official --exclude-tag overture:theme=base
address            overture:feature  overture:official  overture:theme=addresses  system:feature
building           overture:feature  overture:official  overture:theme=buildings  system:feature
building_part      overture:feature  overture:official  overture:theme=buildings  system:feature
connector          overture:feature  overture:official  overture:theme=transportation  system:feature
division           overture:feature  overture:official  overture:theme=divisions  system:feature
division_area      overture:feature  overture:official  overture:theme=divisions  system:feature
division_boundary  overture:feature  overture:official  overture:theme=divisions  system:feature
place              overture:feature  overture:official  overture:theme=places  system:feature
segment            overture:feature  overture:official  overture:theme=transportation  system:feature
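
The grouped output above can be approximated with a few lines of Python. This is a sketch under assumptions: `group_by` and the in-memory `models` mapping are hypothetical stand-ins, not the CLI's implementation. Models that carry no tag for the requested key are left out of the result, which appears to match the output above, where `sources` is absent from the grouped listing.

```python
from collections import defaultdict


def group_by(models: dict[str, set[str]], key: str) -> dict[str, list[str]]:
    """Group model names under each 'key=value' tag they carry."""
    prefix = key + "="
    groups: defaultdict[str, list[str]] = defaultdict(list)
    for name in sorted(models):
        for tag in sorted(models[name]):
            if tag.startswith(prefix):
                groups[tag].append(name)
    return dict(sorted(groups.items()))


# Hypothetical subset of the models listed above.
models = {
    "building": {"overture:theme=buildings", "system:feature"},
    "building_part": {"overture:theme=buildings", "system:feature"},
    "place": {"overture:theme=places", "system:feature"},
    "sources": set(),
}

groups = group_by(models, "overture:theme")
for tag, names in groups.items():
    print(f"{tag} ({len(names)})")
    for name in names:
        print(f"→ {name}")
```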

Deviations

  • Tag providers are additive only and can't remove existing tags.
  • The execution order of tag providers is non-deterministic.
  • There is currently no warning when the number of tags on a model exceeds a limit.
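
One way to see why the first two points fit together: if providers can only add tags and each provider depends only on the model, then the final tag set is a plain union and the execution order cannot change it. The sketch below checks this for every ordering of three hypothetical providers; `apply_providers` and the lambdas are illustrative names, and the independence of providers from each other's output is an assumption, not something the PR states.

```python
from itertools import permutations


def apply_providers(model_name, providers):
    """Run providers in the given order and union their tags."""
    tags = set()
    for provider in providers:
        tags |= provider(model_name)
    return frozenset(tags)


# Hypothetical providers that look only at the model, never at existing tags.
providers = [
    lambda name: {"system:feature"},
    lambda name: {"overture:feature", "overture:official"},
    lambda name: {f"overture:theme={name}s"},
]

# Every ordering of the providers yields the same tag set.
results = {apply_providers("building", order) for order in permutations(providers)}
```

If a provider were ever allowed to remove tags or to react to tags added by others, this guarantee would no longer hold and a deterministic order would be needed.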

Comment on lines +215 to +221
if namespace and namespace != "overture":
    filtered_models = {
        key: model_class
        for key, model_class in all_models.items()
        if namespace in key.tags
    }

Collaborator

I doubt that anyone's relying on --namespace right now, so I'd drop this in favor of the --tag <namespace> support that you've added.

Collaborator Author

I left the json-schema and validate commands with their existing interface and modified them to maintain backward compatibility. I still plan to update them (and the codegen CLI's commands), but I left it open whether that would happen as part of this PR or another one.

Collaborator

Makes sense. My point was to not worry about backward-compatibility.

Collaborator Author

To be fair, I didn't care about making it backward compatible; it was just the path of least pain. But I'll refactor the CLI commands in this PR now that the codegen CLI is also in, which will also resolve some of the other issues.

- models = discover_models(namespace=namespace)
- actual_themes = {key.theme for key in models.keys()}
+ models = discover_models()
+ actual_themes = {next(iter(tags_by_key(key.tags, "overture:theme")),None) for key in models.keys()}
Collaborator

Extract next(iter(tags_by_key(key.tags, "overture:theme")), None) into a helper?

Ah, but where (since theme isn't a system concept)? Maybe core, but the intent is for codegen to drop its dependency on core in favor of system after this PR.
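
For reference, the suggested extraction could look roughly like this. The real signature and return type of `tags_by_key` are not shown in this thread, so the version below is a hypothetical stand-in that returns the values of all `key=value` tags; `first_tag_value` is likewise an invented name, and where such a helper should live remains the open question raised above.

```python
from typing import Iterable, Optional


def tags_by_key(tags: Iterable[str], key: str) -> set[str]:
    """Hypothetical stand-in: values of every 'key=value' tag for the given key."""
    prefix = key + "="
    return {tag[len(prefix):] for tag in tags if tag.startswith(prefix)}


def first_tag_value(tags: Iterable[str], key: str, default: Optional[str] = None) -> Optional[str]:
    """Return one value for the key, or the default when the key is absent."""
    # sorted() makes the choice deterministic if a model carries several values.
    return next(iter(sorted(tags_by_key(tags, key))), default)


theme = first_tag_value({"overture:theme=buildings", "system:feature"}, "overture:theme")
missing = first_tag_value({"system:feature"}, "overture:theme")
```

Because the helper takes the key as a parameter, it does not need to know about themes, which may make it acceptable in system rather than core.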

  ) -> None:
      """Test that resolve_types returns models from expected themes."""
-     from overture.schema.core.discovery import discover_models
+     from overture.schema.system.discovery import discover_models
Collaborator

My bad from before; we should aim to eliminate non-top-level imports unless they're absolutely necessary. I'd like to add a linting rule to catch this at some point (my local linter config adds this now, so codegen should be clean).

return {
    tag
    for tag in tags
    if TAG_RE.match(tag) and not tag.startswith(reserved_namespaces)
Collaborator

_filter_tags builds prefix checks from RESERVED_TAGS, so "overture" here implicitly reserves both the plain tag and the overture: prefix. What happens when you need to reserve a plain tag that shouldn't also claim a prefix namespace?

Collaborator Author

As above.
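
To illustrate the concern, the sketch below contrasts a single prefix check with a variant that tracks reserved plain tags and reserved prefixes separately. The contents of `RESERVED_TAGS`, the assumption that the prefix tuple is built directly from those names, and the names `RESERVED_PLAIN`, `RESERVED_PREFIXES`, `is_reserved_by_prefix`, and `is_reserved` are all hypothetical; the actual `_filter_tags` may construct its prefixes differently.

```python
# Assumed contents; the real RESERVED_TAGS is not shown in this thread.
RESERVED_TAGS = ("overture", "system")


def is_reserved_by_prefix(tag: str) -> bool:
    """Single check: str.startswith accepts a tuple of prefixes."""
    return tag.startswith(RESERVED_TAGS)


# Alternative: reserve plain tags and prefix namespaces independently.
RESERVED_PLAIN = frozenset({"overture"})
RESERVED_PREFIXES = frozenset({"overture", "system"})


def is_reserved(tag: str) -> bool:
    prefix, colon, _ = tag.partition(":")
    if colon:
        return prefix in RESERVED_PREFIXES
    return tag in RESERVED_PLAIN


examples = ["overture", "overture:theme=base", "overtures", "system", "system:feature"]
comparison = {tag: (is_reserved_by_prefix(tag), is_reserved(tag)) for tag in examples}
```

Under these assumptions the prefix check also catches unrelated plain tags such as `overtures`, while the split variant lets `system` stay unreserved as a plain tag even though `system:` is a reserved namespace.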

filters = []

if tags:
    filters.append(lambda key: all(tag in key.tags for tag in tags))
Collaborator

all() here means --tag foo --tag bar requires BOTH tags (AND). Is that the intended semantics, or should repeated --tag flags use OR?

Collaborator Author

This was intentional, but I guess this should still be formalized so I'll leave this open for discussion.

from the coding session minutes:

--tag semantics are AND
The mockups assume --tag overture --tag system:feature means "must have both tags" (AND). This contradicts the design doc's stated OR semantics. The group implicitly operated with AND throughout the exercise and no one objected. This should be formalized.

Collaborator

Ah, great. I'm good with that; it didn't track with how I'd been thinking about it earlier is all.

Collaborator Author

I'll still leave this open. Maybe after refactoring the other CLI commands opinions might change.
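
For anyone weighing the two options, the difference between AND and OR semantics for repeated `--tag` flags can be sketched as follows. The `select` function, its `match_all` switch, and the sample `models` mapping are hypothetical and exist only to compare the behaviours; the PR itself implements the AND form with `all()`.

```python
def select(models, tags=(), exclude_tags=(), match_all=True):
    """Return model names matching the tag filters.

    match_all=True mirrors AND semantics (every --tag must be present);
    match_all=False mirrors OR semantics (any --tag is enough).
    """
    combine = all if match_all else any
    return sorted(
        name
        for name, model_tags in models.items()
        if (not tags or combine(tag in model_tags for tag in tags))
        and not any(tag in model_tags for tag in exclude_tags)
    )


# Hypothetical models and tags for illustration.
models = {
    "building": {"overture", "system:feature"},
    "sources": {"overture"},
    "custom": {"system:feature"},
}

and_result = select(models, tags=("overture", "system:feature"))
or_result = select(models, tags=("overture", "system:feature"), match_all=False)
```

With AND, each additional `--tag` narrows the result; with OR, each one widens it, so `--exclude-tag` would become the only way to narrow.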

}
TAG = r"[a-z0-9][a-z0-9_-]*"
NAMESPACE_TAG = r"[a-z0-9]+:[a-z0-9]+(?:=[a-z0-9_.-]+)?"
TAG_RE = re.compile(rf"^(?:{TAG}|{NAMESPACE_TAG})$")
Collaborator

TAG allows hyphens and underscores ([a-z0-9_-]), but NAMESPACE_TAG's key segment doesn't — acme:my-ext would fail validation even though my-ext is a valid plain tag. Intentional?

Collaborator Author

@RoelBollens-TomTom Roel Bollens (RoelBollens-TomTom) Mar 12, 2026

The tag definition/validation needs some more love. I added a regex to have some validation, but very little thought went into it. Similarly, TAG doesn't allow decimal points, whereas a value in a namespace tag does (3.14 would fail, but acme:pi=3.14 would pass).

Collaborator

That handling of numbers makes sense to me. Raw tags with decimals feel somewhat confusing, so while it's restrictive, I could justify it.
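
The cases discussed in this thread can be checked directly against the patterns quoted in the diff. The regex definitions below are copied from the snippet above; the sample strings and the `results` mapping are added for illustration only.

```python
import re

# Patterns as they appear in the diff under review.
TAG = r"[a-z0-9][a-z0-9_-]*"
NAMESPACE_TAG = r"[a-z0-9]+:[a-z0-9]+(?:=[a-z0-9_.-]+)?"
TAG_RE = re.compile(rf"^(?:{TAG}|{NAMESPACE_TAG})$")

samples = [
    "my-ext",                    # plain tag with a hyphen
    "acme:my-ext",               # same key behind a prefix
    "3.14",                      # plain tag with a decimal point
    "acme:pi=3.14",              # decimal point inside a value
    "overture:theme=buildings",  # tag from the launch set
]
results = {sample: bool(TAG_RE.match(sample)) for sample in samples}

for sample, ok in results.items():
    print(f"{sample:28} {'valid' if ok else 'invalid'}")
```

Only `acme:my-ext` and `3.14` are rejected, which confirms both asymmetries raised here: hyphens and underscores are accepted in plain tags but not in a prefixed key, and periods are accepted only in values.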
