Skip to content

Add support for the geometry library?#208

Merged
wojtekmach merged 5 commits intoelixir-ecto:masterfrom
aseigo:feature/add-support-for-gepmetry-lib
Jan 29, 2026
Merged

Add support for the geometry library?#208
wojtekmach merged 5 commits intoelixir-ecto:masterfrom
aseigo:feature/add-support-for-gepmetry-lib

Conversation

@aseigo
Copy link
Copy Markdown
Contributor

@aseigo aseigo commented Jan 8, 2026

This PR adds support for the geometry library for geometry types. Currently, myxql only supports the geo library for geometry columns, which makes using it with libraries such as geo_sql (disclaimer: I am the author of geo_sql) which use geometry infeasible.

With these changes, it is possible to use myxql with the geometry library, and this in turns makes it possible to use geo_sql with MySQL/MariaDB databases.

While this works (and based on my testing, works quite well), I'm less convinced that the implementation is very beautiful. It would be nice if it were possible to register handlers for custom types rather than having to hack them into library itself.

If the maintainers of this library have ideas on what that could look like, I'd be happy to have a conversation about it.

In the meantime, this PR can at least serve as a demonstration of how one might accomplish the goal of allowing geometry rather than geom to handle geometry types in a MySQL database without larger changes to the library.

@greg-rychlewski
Copy link
Copy Markdown
Member

greg-rychlewski commented Jan 9, 2026

Thanks for the PR.

Are the geo/geometry libraries basically providing functions to convert structs to MySQL protocol and vice versa? If so then I think it’s a bit of slippery slope to do it in this way. It kind of sets the precedent that we’ll hard code support for various libraries doing similar things. Which could turn into a maintenance nightmare. The reason it works for JSON is because they are exposing the same function names/arities for decoding/encoding. But these two geometry libraries aren't.

The options I’m seeing are:

  1. Switch to geometry package if it has superseded geo package
  2. If we are second guessing the choice to use an outside library for this use case then add support directly in MyXQL. Another option would be to allow extensions like Postgrex. But both these options are probably a lot of work. With the latter being the most complex.
  3. If there is no reason to use geometry over geo outside of subjective preference then maybe we make the unfortunate decision to say you need to use our chosen library if you want to integrate with us

I’d be interested to hear what the other two maintainers say though because they were here from the beginning and I’m looking at things after the fact

@aseigo
Copy link
Copy Markdown
Contributor Author

aseigo commented Jan 9, 2026

If so then I think it’s a bit of slippery slope to do it in this way.

I agree! :) As I noted in the description of the PR, I don't think this is a very beautiful implementation, and there's a conversation to be had on how to make that better.

The purpose of this PR is to demonstrate that other libraries can be used with geometry data stored in a MySQL database, but that there are limitations in this driver preventing it. It's more of a conversation-starter than a final answer :)

If there is no reason to use geometry over geo outside of subjective preference

The reasons geo_sql switched from geo to geometry is performance (~50% less memory usage, 2-10x faster WKB/T parsing) and correctness (e.g. geometry correctly handles EMPTY while geo does not).

The application I'm working with literally can not use geo because geo can not parse the data we produce. Even if we were OK with the performance penalties of geo, the correctness issues are another matter. (And yes, PRs were submitted to address those issues. It is what it is.)

Another option would be to allow extensions like Postgrex.

While a possibility, I agree that would be a significant amount of work. While it would probably be the cleanest solution, there is a simpler solution in this particular case: as with JSON, allow the application to define in configuration what library to use for geometry fields. It would need to be an MFA for serialization and deserialization since there are no API standards for these libraries.

It could still fall back to geo, but would provide an escape hatch for people who wish to use other libraries to handle geometry data stored in a MySQL/MariaDB database

That would, however, be an even larger change than this PR, as it means moving geometry support from a purely compile-time decision to a runtime one. It would also be a non-general solution: it would only relieve the situation around geometry fields.

I'm happy to implement that in an update to this PR if a runtime configuration approach would be acceptable.

you need to use our chosen library if you want to integrate with us

That's certainly a possibility! It would also preclude geo_sql from being usable with this driver :/

@josevalim
Copy link
Copy Markdown
Member

The reason we don't really have an extension support is because MySQL doesn't have (unless it changed) custom types support anyway, so it is simpler to handle them with a small contracts like JSON and Geo.

@aseigo aseigo marked this pull request as draft January 9, 2026 11:00
@aseigo
Copy link
Copy Markdown
Contributor Author

aseigo commented Jan 9, 2026

So here is another approach based on configuration and a protocol:

  • Introduce MyXQL.Protocol.Encoder: a protocol for encoding structs
  • Providing encoders for Geo structs if the geo library is available at compile-time
  • Fetch the wkb decoder from configuration, defaulting to the geo library, via Application.get_env(:myxql, :wkb_decoder)

This preserves the existing functionality, while allowing other libraries to provide WKB encoding (via the protocol) and decoding (via the wkb decoder configuration).

If this looks like a workable approach to others, then I'll verify this works with geo_sql and move this PR out of draft again.

Thoughts?

@josevalim
Copy link
Copy Markdown
Member

josevalim commented Jan 9, 2026

My thinking is that, if we need a behaviour or a module for decoding, we might as well use it for both? 🤔 Edit: for clarity, I don't see a need for the protocol.

@aseigo aseigo force-pushed the feature/add-support-for-gepmetry-lib branch from 8acb7e5 to 2dca7ae Compare January 9, 2026 11:07
…her than `geo`

This is accomplished by:

  * Add `MyXQL.Protocol.Encoder`, a protocol for encoding structs
  * Providing encoders for Geo structs if the `geo` library is available at compile-time
  * Fetch the wkb decoder from configuration, defaulting to the `geo` library, via
    `Application.get_env(:myxql, :wkb_decoder)`

This preserves the existing functionality, while allowing other libraries to provide
WKB encoding (via the protocol) and decoding (via the wkb decoder configuration).
@aseigo aseigo force-pushed the feature/add-support-for-gepmetry-lib branch from 2dca7ae to 7a2f1cc Compare January 9, 2026 11:09
@aseigo
Copy link
Copy Markdown
Contributor Author

aseigo commented Jan 9, 2026

My thinking is that, if we need a behaviour or a module for decoding, we might as well use it for both? 🤔 Edit: for clarity, I don't see a need for the protocol.

How do you recommend doing encoding?

A struct is passed in for encoding. When it is the geom library, then those will be %Geom.* structs, and when using geometry they will be %Geometry.* structs. So there needs to be some way to map those structs to the matching encoder.

How will Values.encode_binary_value know which function to encode those structs with?

In the other direction, decoding is not possible with a protocol as the binary is not distinguished by any type: it's just a binary that encodes some type of geometry.

@aseigo
Copy link
Copy Markdown
Contributor Author

aseigo commented Jan 9, 2026

Thinking a bit more on this, I suppose one could define a behaviour for geometry modules that must provide a decoding function and an encoding function which gets passed a struct .. if the implementation does not know how to encode that struct it would return .... nil? .. so that the encoder could continue (e.g. JSON-encoding the struct instead).

That seems even more awkward (and less performant) since any time a struct is passed in for decoding, Values.encode_binary_value would need to try the geometry encoder (and/or whatever other encoders that might be added in future?).

... which sounds like a less efficient Protocol with more steps to me?

@aseigo
Copy link
Copy Markdown
Contributor Author

aseigo commented Jan 9, 2026

It is also worth pointing out that the approach of a Protocol also means that an application can insert geometry data no matter which library it comes from, making the choice between geom, geometry, or whatever else completely moot. Applications would be free to mix and match for mutations.

Only decoding remains fixed to one library.

@josevalim
Copy link
Copy Markdown
Member

You can define a module and pass it to :geometry_library. It should have two functions, encode and decode. One receives unknown structs and the other receives the binary? The encode function can also return :unknown if it doesn't know the data type, so we fallback to JSON encoding or something else. We could also start with :geometry_library being nil.

@wojtekmach
Copy link
Copy Markdown
Member

Agree with above. I think a protocol would be cleaner for just encoding but it feels awkward to have encoding protocol and decoding behaviour/MFArgs/whatever. So a single module abstractions seems more compelling to me.

@aseigo
Copy link
Copy Markdown
Contributor Author

aseigo commented Jan 9, 2026

It may be useful to consider encoding structs and decoding geometry columns as related but, actually, separate concerns

With a protocol, any struct can have custom encoding. This is not specific to geometry data.

In the specific case of geometry columns, it would allow storing any data that can be encoded as wkb. This is not just geo or geometry structs, but also structs from e.g. the geospatial or openstreetmap libraries. Currently, we have to create a geo struct from the openstreetmap one, and then store it (this is a real world use case I have in a production application).

With a protocol for encoding, that application could instead provide a defimpl for encoding and be done with it. With a module, it would end up something like:

defmodule MyApp.MySQLGeoEncoder do
  @behaviour MyXQL.Protocol.GeometryEncoder

 def encode(%{"lat" => lat, "lon" => long.}) do
   ... wkb encoding here ...
 end

  def encode(other), do: GeoSQL.MySQL.GeoEncoder.encode(other)

 defdelegate decode, to:  GeoSQL.MySQLGeoEncoder
end

That is not particularly scalable as a solution, and prevents libraries from shipping with support out of the box.

This all applies to non-geometry columns in the database, as well, of course.

And then there is decoding geometry columns. When decoding data that is stored in MySQL's srid+wkid storage format, that is known at retrieval, and there needs to be a way to represent those to the application. That process "needs" to be configurable so applications may control how they view that data by default, but that is all.

It also makes no sense to say "decode this WKB to an openstreetmap REST API response": encoding it TO geometry necessarily results in loss of information

So the two processes of encoding-to-wkb and decoding-from-wkb are not symmetric.

As a developer I am less bothered by having a protocol + a configuration entry than I am being limited because the two things have been tied together.

There also doesn't seem to be a nice way to avoid something being configured. Even with a Behaviour, which module to use for it would need to be set somewhere. The D/X pain point of configuration is not resolved by a Behaviour implementation.

tl;dr -> As a user of this library, providing a single module that implements a behaviour for geometry columns does not make anything easier (I still have to write just as much code and I still have to configure things), while being less useful than a protocol.

IOW: similar D/X, worse results.

@josevalim
Copy link
Copy Markdown
Member

josevalim commented Jan 9, 2026

In the specific case of geometry columns, it would allow storing any data that can be encoded as wkb. This is not just geo or geometry structs, but also structs from e.g. the geospatial or openstreetmap libraries. Currently, we have to create a geo struct from the openstreetmap one, and then store it (this is a real world use case I have in a production application).

This could be solved by having a protocol in Geo, which other data structures can implement, and the Geo.MyXQL behaviour would then call the protocol and provide the functionality needed. This comes with the benefit that Geo becomes then the canonical representation, otherwise folks have to write OpenStreetMap -> MySQL, OpenStreetMap -> PostgreSQL, etc.

Overall, I am not convinced we need an extensible functionality in MyXQL when MySQL itself is not extensible. If anything, relying everyone to know MySQL or PostgreSQL representations rather than just converting to a canonical one sounds like more work to everybody.

Finally, it is worth saying that, if you have a behaviour, you can add a protocol on top. However, if MyXQL imposes a protocol, then that's a stronger requirement. If you like protocols, you can implement them on top of the behaviour, but not vice-versa.

@aseigo
Copy link
Copy Markdown
Contributor Author

aseigo commented Jan 9, 2026

rather than just converting to a canonical one sounds like more work to everybody.

This is an argument in favour of using a protocol since there is no agreement on what the canonical representation is.

I don't want to have to convince every single developer out there to use geometry library I'm using (or vice versa) just to be able to store the same bytes in a database, and the more straight-forward way to get there with the minimal amount of dependency baggage is a protocol for encoding that allows the encoding to be hosted with the implementations of the structs.

Even if the only implementations of the protocol were for %Geo.*{} and %Geometry.*{} structs, it would be a win for interoperability with minimal effort and minimal dependency bloat (apps that use only one or the other library can continue to do so).

But it's fine.

My goal is to find some way to bring GeoSQL to MySQL users, and even an imperfect solution is better than no solution.

At least if there is a behaviour, then I can provide a module in GeoSQL that encodes both Geo and Geometry structs, and decodes to Geometry structs. Not great, but at least something that will work.

@josevalim
Copy link
Copy Markdown
Member

josevalim commented Jan 9, 2026

This is an argument in favour of using a protocol since there is no agreement on what the canonical representation is.

The reason why this doesn't work is because the MySQL representation is not canonical either. Solving it for MySQL, won't solve it for PostgreSQL. So having a third party generic representation, such as Geo or Geometry, means it could behave similarly as a LSP, as an intermediate representation everyone relies on.

In any case, if you feel strongly about a protocol decoupled from your library, you could easily ship a myxql_geo_protocol, that converts the behaviour into a protocol for everyone to use. There is no reason for MyXQL to prescribe the solution.

Can be configured by setting the `:myxql, :geometry_codec` configuration
key to the name of a module that implements the GeometryCodec behaviour

default to geo, if available.
@aseigo
Copy link
Copy Markdown
Contributor Author

aseigo commented Jan 9, 2026

Solving it for MySQL, won't solve it for PostgreSQL.

There are already solutions for pgsql.

So having a third party generic representation,

The issue being that in practice there is no generic representation. We can continue to say it should be A or B all we want, the reality is that it is A and B.

Translating all B to A is not only wasteful in terms of performance, it means that one has to be a perfect superset of the other, and we all have to agree on which one of those it is. The unfortunate situation is that geometry is the superset, but geom is what is in broader use due to having been around far longer.

Unless someone wants to take on the task of moving the ecosystem over to geometry (I don't want, nor have the time needed, to do that), this is simply where we are.

But it's academic at this point. I've committed a solution to this branch which implements a behaviour, MyXQL.Protocol.GeometryCodec, makes it configurable, and provides a default implementation for Geo if it is available.

Co-authored-by: José Valim <jose.valim@gmail.com>
@aseigo aseigo requested a review from josevalim January 13, 2026 09:51
@josevalim
Copy link
Copy Markdown
Member

@aseigo I am only slightly familiar with this project, don't feel confident with merging it. :)

Comment on lines +253 to +262
def encode_binary_value(struct) when is_struct(struct) do
# see if it is a geometry struct
case MyXQL.Protocol.GeometryCodec.do_encode(struct) do
:unknown ->
string = json_library().encode!(struct)
{:mysql_type_var_string, encode_string_lenenc(string)}

encoded ->
encoded
end
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In hindsight, another idea is this contract:

defmodule MyXQL.Protocol.GeometryCodec do
  @callback encode(struct()) :: ...

  @callback encode_structs() :: [module()]

  @callback decode(...) :: ...
end
  if codec = Application.compile_env(:myxql, :geometry_codec) do
    structs = codec.encode_structs()
    
    def encode_binary_value(%struct{} = value) when struct in structs do
      unquote(codec).encode(value)
    end
  end

the benefits are:

  1. we do more work at compile-time. We'd need to mix deps.compile --force myxql when swapping codecs though.
  2. we don't duplicate JSON-encoding logic which is in def encode_binary_value(term) when is_list(term) or is_map(term) do below.

I concede it's more in poor-man's protocols uncanny valley but I personally kinda like doing more work at compile-time on principle.

WDYT @aseigo @josevalim?

As an aside, the clause below was arguably accidentally too broad. In hindsight it arguably should have been:

- def encode_binary_value(term) when is_list(term) or is_map(term) do
+ def encode_binary_value(term) when is_list(term) or is_non_struct_map(term) do

that is, if a myxql user needed to encode custom structs, we'd intentionally provide a mechanism to do so (even if it ended up being the same, json encoding)

Copy link
Copy Markdown
Contributor Author

@aseigo aseigo Jan 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having to recompile dependencies is always a bit of an annoyance, and is one more thing to forget to do. It also means that GeoSQL can not set the codec at runtime for its users, and instead needs to point devs of it to docs on how to enable the correct codec in MyXQL.

But if it has a meaningful performance impact it could be worth it, and it certainly wouldn't be the end of the world in terms of developer inconvenience (it's "just" one more manual config entry, after all).

So personally I lean gently towards the runtime approach, but could also work with the compile-time one.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we are using compile_env, would we warn if you change its value but don't recompile?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's a warning that'd be helpful, for sure! :)

Copy link
Copy Markdown
Member

@wojtekmach wojtekmach Jan 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I think we'd get the warning indeed. So let's go with compile-time config and encode_structs callback? Or that doesn't seem appealing?

Regarding GeoSQL automatically setting the codec, on principle I'd avoid library A changing global configuration of library B. In this particular case though that seems harmless and you're of course free to do whatever you want to do in your libraries.

If runtime config is your preference, another option is a similar encode_structs callback idea but at the runtime. Something like:

def encode_binary_value(term) when is_list(term) or is_map(term) do
  codec = Application.get_env(:myxql, :geometry_codec)
  struct_names = codec && codec.encode_structs()

  case term do
    %struct_name{} = struct when struct_name in struct_names ->
      codec.encode(struct)

    _ ->
      string = json_library().encode!(struct)
      {:mysql_type_var_string, encode_string_lenenc(string)}
  end
end

though in that case probably better to expose a geometry_codec/1 function to set it that'd put it in persistent_term.

I'm fine with going with whatever solution as long as we don't paint ourselves into a corner and can encode other types in the future, if needed.

Copy link
Copy Markdown
Contributor Author

@aseigo aseigo Jan 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding GeoSQL automatically setting the codec, on principle I'd avoid library A changing global configuration of library B

It does this in response to the user of the library requesting the library be setup for GIS data and queries. It helps those users avoid is knowing what needs to be done to make that happen, which is especially nice since most people dealing with mapping data are not GIS people and every single backend GeoSQL supports (MySQL will be the 4th) does it completely differently. Oof!

So while I agree with you in the general case, this is not only very helpful for users of GeoSQL, it is only done at the explicit request of the user.

So let's go with compile-time config and encode_structs callback?

That would would be sufficient, sure. If that's what you'd prefer, I'll make adjustments to this branch.

I'm fine with going with whatever solution

Same here. :) So, as the library maintainer, if you could make a decision I'll adapt the code as necessary to it.

Cheers.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took another look at this tonight and since MySQL does not have custom types, it looks rather difficult to make this generic for any kind of struct. So I pushed 9b0af2f to show what it could look like as a compile-time construct. The Value module doesn't really change, just the codec behaviour. WDYT?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wojtekmach Any further thoughts on whether you prefer a compile-time implementation as in 9b0af2f ? I'm fine either way, just want to get this moving towards a mergeable state :)

Co-authored-by: Wojtek Mach <wojtekmach@users.noreply.github.com>
@wojtekmach
Copy link
Copy Markdown
Member

Sorry I’m out for a week or so, I probably won’t be able to work on this until then. Sorry for back n forth. let’s go with the previous approach, runtime config and contract that returns :unknown atom

@wojtekmach
Copy link
Copy Markdown
Member

Please undraft so we can run ci

@aseigo aseigo force-pushed the feature/add-support-for-gepmetry-lib branch from 9b0af2f to 0cf16d0 Compare January 17, 2026 18:35
@aseigo aseigo marked this pull request as ready for review January 17, 2026 18:36
@aseigo
Copy link
Copy Markdown
Contributor Author

aseigo commented Jan 27, 2026

This has been marked as ready for review, and @wojtekmach has made their way back? Any updates on this one? Thanks!

@wojtekmach wojtekmach merged commit c3ff2cd into elixir-ecto:master Jan 29, 2026
10 checks passed
@wojtekmach
Copy link
Copy Markdown
Member

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants