Add native_uuid connection option for UNIQUEIDENTIFIER return type control #447

@thegoodwinner

Hi,

I'm migrating a product codebase that uses SQLAlchemy with mssql+pyodbc on Azure Functions to mssql-python, primarily because of SIGSEGV crashes that were traced back to likely runtime/driver issues.

So far so good. However, the refactor has been significant because of differences in UUID handling (UUIDs are used extensively in the ORM): pyodbc returns strings by default (v2 flags via native_uuid).


Is your feature request related to a problem? Please describe.

mssql-python always returns uuid.UUID objects from UNIQUEIDENTIFIER columns. For codebases migrating from pyodbc (which returns strings by default), this causes widespread breakage across every codepath that touches a GUID:

  • AttributeError: existing code calls .strip(), .upper(), .replace() on values that are now uuid.UUID, not str
  • TypeError in JSON serialization: uuid.UUID is not JSON-serializable by default; every API response, queue message, and telemetry payload that includes a GUID fails
  • Silent equality mismatches: "ABC..." == UUID("ABC...") is False, causing lookups, dictionary keys, and set membership checks to silently break
  • SQLAlchemy result processor crash: SQLAlchemy's _python_UUID helper calls value.replace("-", "") assuming a string input, raising AttributeError when it receives a uuid.UUID object (this is the same class of bug as sqlalchemy/sqlalchemy#9414, "pymssql might support native UUID, set attribute", which was filed for pymssql)
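
The first three failure modes above can be reproduced with the standard library alone, with no database involved. This is a minimal sketch that assumes a value which string-based code expects as str now arrives as uuid.UUID; the variable names are illustrative only.

```python
import json
import uuid

guid_text = "110e2700-9d34-44e9-ba0e-bd74401a54a4"
as_str = guid_text               # what string-based code expects
as_uuid = uuid.UUID(guid_text)   # what a native-UUID driver hands back

# 1. String methods do not exist on uuid.UUID.
try:
    as_uuid.upper()
    str_method_error = None
except AttributeError as exc:
    str_method_error = exc

# 2. json.dumps rejects uuid.UUID unless a default= hook is supplied.
try:
    json.dumps({"id": as_uuid})
    json_error = None
except TypeError as exc:
    json_error = exc

# 3. Comparison against str is False without any error, so lookups miss.
equal = (as_str == as_uuid)
cache = {as_str: "row"}
hit = cache.get(as_uuid)

# An explicit str() restores string behaviour at a single call site.
restored = json.dumps({"id": str(as_uuid)})
```

The third case is the hardest to find in an audit because nothing is raised: the comparison and the dictionary lookup simply return False and None.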

This is not a niche concern: pyodbc is the dominant Python MSSQL driver, so most codebases migrating to mssql-python will hit this. In my production codebase (~700 UNIQUEIDENTIFIER columns, ~90 files touching GUIDs), adopting mssql-python required a global monkey-patch on SQLAlchemy internals and an extensive audit of every string operation on GUID values.

Describe the solution you'd like

A native_uuid parameter on connect() that controls whether UNIQUEIDENTIFIER columns are returned as uuid.UUID objects or str:

# Current behavior preserved (default): returns uuid.UUID objects
conn = mssql_python.connect(conn_str, native_uuid=True)

# Migration-friendly: returns str (hyphenated, e.g. "110e2700-9d34-44e9-ba0e-bd74401a54a4")
conn = mssql_python.connect(conn_str, native_uuid=False)

Both modes should continue to accept uuid.UUID objects as bind parameters (input handling is already correct).

The implementation would be minimal: in the result-row processing where UUID(bytes_le=raw_bytes) is already constructed:

val = uuid.UUID(bytes_le=raw_bytes)
return val if self._native_uuid else str(val)

This should apply consistently across fetchone(), fetchmany(), and fetchall() (the inconsistency from #241 was already fixed).
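
To make the proposed switch concrete, here is a standalone sketch of the conversion step. The function name convert_uniqueidentifier and the NULL handling are assumptions for illustration; this is not mssql-python's actual internal code, only the shape of the change being requested.

```python
import uuid


def convert_uniqueidentifier(raw_bytes, native_uuid=True):
    """Hypothetical helper: decode the 16 raw bytes of a UNIQUEIDENTIFIER.

    native_uuid=True  -> uuid.UUID (current behaviour)
    native_uuid=False -> hyphenated str (proposed migration mode)
    """
    if raw_bytes is None:
        return None  # assumption: SQL NULL stays None in both modes
    value = uuid.UUID(bytes_le=raw_bytes)
    return value if native_uuid else str(value)


expected = uuid.UUID("110e2700-9d34-44e9-ba0e-bd74401a54a4")
raw = expected.bytes_le  # little-endian layout, as in the snippet above

native_result = convert_uniqueidentifier(raw, native_uuid=True)
string_result = convert_uniqueidentifier(raw, native_uuid=False)
null_result = convert_uniqueidentifier(None, native_uuid=False)
```

Doing the conversion in one place, rather than per fetch method, is what would keep fetchone(), fetchmany(), and fetchall() consistent.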

Describe alternatives you've considered

  1. Monkey-patching SQLAlchemy internals: This is what we currently do. We patch sqlalchemy.sql.sqltypes._python_UUID to short-circuit when it receives a uuid.UUID object instead of a string. It works but is fragile (depends on an internal API that could be renamed/removed) and doesn't help non-SQLAlchemy consumers.

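# My current workaround in production
from uuid import UUID

from sqlalchemy.sql import sqltypes

# Keep a reference to the original so non-UUID inputs still go through it.
original = sqltypes._python_UUID

def _python_uuid(value):
    if isinstance(value, UUID):
        return value  # short-circuit: already a UUID
    return original(value)

sqltypes._python_UUID = _python_uuid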
  2. Application-level str() coercion everywhere: Wrapping every DB result in str() at the repository layer. This is error-prone at scale (easy to miss a codepath) and defeats the purpose of having typed UUID support.
  3. SQLAlchemy TypeDecorator: A custom column type that overrides process_result_value to stringify. This works for ORM queries but doesn't cover text() queries or raw cursor usage.
  4. Waiting for the official SQLAlchemy dialect: The dialect being developed (referenced in #241, "coercion of UNIQUEIDENTIFIER / uuid values from binary to string only works with fetchone(), not fetchall() or fetchmany()") will presumably set supports_native_uuid = True, which solves the SQLAlchemy layer. But it doesn't help raw DB-API consumers, Pydantic serialization, or non-SQLAlchemy frameworks (Django, SQLModel, etc.).
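
For raw DB-API consumers, the application-level coercion alternative looks roughly like the sketch below. The helper names stringify_uuids and fetchall_as_str, and the FakeCursor stand-in, are hypothetical and exist only so the example runs without a database.

```python
import uuid


def stringify_uuids(row):
    """Return a tuple copy of row with uuid.UUID values replaced by str."""
    return tuple(str(v) if isinstance(v, uuid.UUID) else v for v in row)


def fetchall_as_str(cursor):
    """Hypothetical wrapper: works with any cursor exposing fetchall()."""
    return [stringify_uuids(row) for row in cursor.fetchall()]


class FakeCursor:
    """Stand-in for a real cursor, used only for this demonstration."""

    def __init__(self, rows):
        self._rows = rows

    def fetchall(self):
        return list(self._rows)


cursor = FakeCursor([
    (uuid.UUID("110e2700-9d34-44e9-ba0e-bd74401a54a4"), "alice", None),
])
rows = fetchall_as_str(cursor)
```

Every fetch path needs its own wrapper of this kind, and the rows come back as plain tuples, which is why this approach scales poorly compared with a driver-level flag.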

Additional context

Precedent: pyodbc addressed this exact problem with its native_uuid flag (mkleehammer/pyodbc#177, added in v4.0.9). The default is False (strings), with True opting into uuid.UUID objects. mssql-python could use the inverse default (True) since it's a newer driver, while still offering the escape hatch.

Ecosystem pattern:

| Driver | UNIQUEIDENTIFIER return type | Configurable? |
|---|---|---|
| pyodbc | str (default) / uuid.UUID (opt-in) | Yes (pyodbc.native_uuid) |
| pymssql 2.x | uuid.UUID (always) | No |
| mssql-python | uuid.UUID (always) | No (this request) |

Migration context: mssql-python positions itself as a modern, Microsoft-supported alternative to pyodbc. The primary adoption path is pyodbc → mssql-python. A compatibility flag would significantly lower the migration barrier: teams can switch drivers first, then migrate UUID handling incrementally, rather than requiring a big-bang refactor of every GUID touchpoint.

Related issues:
