Architecture: move codegen backends toward SemanticSchema as the common backend contract

Follow-up to #96.

## Summary

#96 introduced the compiler-oriented foundations: stable symbol IDs, normalization, and `SemanticSchema`.

Today, backend consumption is still split:

- Python consumes a mix of `SemanticSchema` and raw `Schema`
- Rust mostly renders from raw `Schema` after backend-local preprocessing
- TypeScript mostly renders from raw `Schema` after backend-local preprocessing
- OpenAPI walks raw `Schema` directly

That means we still have multiple backend contracts and multiple places where semantic fixes can fail to propagate consistently.

This issue proposes moving the architecture toward a model where all backends consume `SemanticSchema` directly, or at minimum a single backend-facing IR derived from it.

Recent work on the Python branch established a clearer boundary between shared compiler concerns and backend-local concerns:
- stable IDs are now assigned in the compiler path instead of constructors
- Python type-to-language mappings (e.g. `i32` → `int`, `chrono::DateTime` → `datetime`) were evaluated for inclusion in the schema layer but deliberately kept as a backend-local static table: they are static codegen knowledge, not per-type annotations (#128 review)

That second point is instructive for this issue: the boundary between "shared meaning" and "backend-specific rendering" needs to be drawn carefully. Type-to-language mappings are backend-local. Ordering, dependency analysis, and symbol identity are shared concerns.
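
To make the backend-local side of that split concrete, here is a minimal sketch of what a static mapping table could look like. The function name `python_type_for` is hypothetical and only the two mappings mentioned above are included; the real table in the Python backend may be shaped differently.

```rust
/// Hypothetical backend-local lookup from a Rust type name to a Python type name.
/// It is static codegen knowledge, so it lives in the Python backend and
/// nothing about it is stored on the schema.
pub fn python_type_for(rust_type: &str) -> Option<&'static str> {
    match rust_type {
        "i32" => Some("int"),
        "chrono::DateTime" => Some("datetime"),
        // Not a built-in mapping: the caller treats it as a schema-defined type.
        _ => None,
    }
}
```

Under this sketch, `python_type_for("i32")` yields `Some("int")`, while a user-defined type name falls through to `None` and is resolved through the shared schema instead.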

## Why

The main benefits are architectural and compiler-facing:

- One canonical backend contract instead of a mix of raw-schema and semantic-schema paths
- Less backend drift in ordering, naming, dependency handling, and type resolution
- Fewer backend-local schema mutations and ad hoc preprocessing steps
- Stronger guarantees that symbol identity, dependency analysis, and normalization semantics are applied consistently across languages
- Better foundation for future transforms like monomorphization, richer validation, and backend-specific lowering passes

This is also the natural continuation of the direction described in #96: shared frontend/compiler stages with thinner, more predictable backends.

## Current pain points

- Python still needs to synchronize semantic ordering with raw schema lookups and some raw-schema mutation
- Rust and TypeScript still rely on raw schema traversal after local consolidation
- OpenAPI still bypasses the semantic layer entirely
- `Schema` phase boundaries are implicit because important transforms are performed in-place and repeated in backend code
- Compiler concerns like stable symbol identity are still stored on raw schema structs, even though they are primarily needed by normalization/codegen
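
As a contrast to the in-place transforms described above, the following is a rough sketch of an explicit phase boundary. `RawSchema`, `SymbolId`, `SemanticPhase`, and `normalize` are hypothetical stand-ins, not the real `Schema` / `SemanticSchema` types, and the ID assignment (sorted, deduplicated names) is only an illustrative assumption about how deterministic IDs might be produced.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the raw, wire-focused schema: it carries no compiler-owned IDs.
#[derive(Debug, Clone)]
pub struct RawSchema {
    pub type_names: Vec<String>,
}

/// Hypothetical compiler-owned symbol identity.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct SymbolId(pub u32);

/// Hypothetical stand-in for the semantic phase that backends would read.
#[derive(Debug)]
pub struct SemanticPhase {
    pub symbols: BTreeMap<SymbolId, String>,
}

/// Takes the raw schema by value, so the caller gives up the raw value once
/// normalization has run. IDs are assigned here, in the compiler path,
/// and do not depend on the order in which types were declared.
pub fn normalize(raw: RawSchema) -> SemanticPhase {
    let mut names = raw.type_names;
    names.sort();
    names.dedup();
    let symbols = names
        .into_iter()
        .enumerate()
        .map(|(i, name)| (SymbolId(i as u32), name))
        .collect();
    SemanticPhase { symbols }
}
```

Because the phase change is a function from one type to another, a backend that receives a `SemanticPhase` cannot accidentally repeat or skip the transform, which is the property the in-place approach lacks.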

## Proposed direction

1. Define the desired common backend contract explicitly.
Either:
- all backends consume `SemanticSchema`, or
- all backends consume a single codegen IR lowered from `SemanticSchema`

2. Make the raw schema vs compiler-schema boundary explicit.
At minimum:
- raw/interchange `Schema` remains wire-focused
- stable IDs are assigned by the compiler path, not treated as constructor-level business data
- semantic/codegen stages consume compiler-owned identity and dependency information

3. Move backend-independent meaning into the shared frontend/compiler layers.
Examples:
- resolved references
- stable ordering
- dependency information
- naming/consolidation decisions
- symbol identity
- normalized container / fallback semantics that every backend would otherwise rediscover separately

4. Keep backend-specific rendering local to the backend.
Examples:
- Python type mappings, runtime-provided types, and imports
- TypeScript type mappings and intersection-type strategies
- Rust derives / ownership choices
- OpenAPI-specific schema projection

Language-specific type mappings are static codegen knowledge and belong in the backend, not the schema. Only information that cannot be inferred at codegen time (e.g. Rust `additional_derives` from source-code annotations) should travel on the schema.
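
Steps 1 and 4 can be sketched together as a single trait that every backend implements against the same input. Everything below is an illustrative assumption: the `Backend` trait, the simplified `SemanticSchema` and `SemanticType` structs, and the default derives chosen by `RustBackend` are hypothetical and far smaller than the real types.

```rust
/// Hypothetical, heavily simplified view of one type in the shared contract.
pub struct SemanticType {
    pub name: String,
    /// Comes from source-code annotations and cannot be inferred at codegen
    /// time, so it travels on the schema.
    pub additional_derives: Vec<String>,
}

/// Hypothetical stand-in for the shared contract.
pub struct SemanticSchema {
    /// Assumed to be in stable order already, decided once by the shared stages.
    pub types: Vec<SemanticType>,
}

/// The single backend contract: every backend reads the same input.
pub trait Backend {
    fn render(&self, schema: &SemanticSchema) -> String;
}

pub struct RustBackend;
pub struct PythonBackend;

impl Backend for RustBackend {
    fn render(&self, schema: &SemanticSchema) -> String {
        let mut out = String::new();
        for ty in &schema.types {
            // Default derives are a backend-local choice (hypothetical here).
            let mut derives = vec!["Debug".to_string(), "Clone".to_string()];
            derives.extend(ty.additional_derives.iter().cloned());
            out.push_str(&format!(
                "#[derive({})]\npub struct {};\n",
                derives.join(", "),
                ty.name
            ));
        }
        out
    }
}

impl Backend for PythonBackend {
    fn render(&self, schema: &SemanticSchema) -> String {
        // Derives are meaningless for Python, so this backend ignores them.
        schema
            .types
            .iter()
            .map(|ty| format!("class {}:\n    pass\n", ty.name))
            .collect()
    }
}
```

Both backends iterate `schema.types` in the order they are given, so ordering is decided once upstream, while what to emit for each type stays a local decision.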

5. Reduce direct raw `Schema` traversal in backends over time.

## Likely subproblems

- Audit what each backend still needs from raw `Schema` that is not represented in `SemanticSchema`
- Enrich `SemanticSchema` where it is missing required information
- Decide how naming and consolidation should appear to backends
- Decide whether some backend-facing concerns belong in a post-semantic lowering stage rather than in `SemanticSchema` itself
- Decide whether the long-term design should keep `id` fields on raw schema structs or move identity fully into a compiler-owned layer
- Migrate one backend at a time, ideally starting with the backend already furthest along

## Non-goals

- This issue is not necessarily about deleting raw `Schema`
- This issue is not necessarily about making every backend identical internally
- This issue is not necessarily about serializing every backend-specific config into `reflectapi.json`
- This issue is not about pretending different languages can share one universal final mapping layer

## Suggested first step

Do a backend-by-backend audit of:
- which raw `Schema` fields are still read directly
- which of those reads are truly backend-specific
- which should instead be represented in `SemanticSchema` or a shared lowering stage

That audit should produce a staged migration plan rather than a single large rewrite.

