Add support for the geometry library?#208
Conversation
|
Thanks for the PR. Are the geo/geometry libraries basically providing functions to convert structs to MySQL protocol and vice versa? If so then I think it’s a bit of slippery slope to do it in this way. It kind of sets the precedent that we’ll hard code support for various libraries doing similar things. Which could turn into a maintenance nightmare. The reason it works for JSON is because they are exposing the same function names/arities for decoding/encoding. But these two geometry libraries aren't. The options I’m seeing are:
I’d be interested to hear what the other two maintainers say though because they were here from the beginning and I’m looking at things after the fact |
I agree! :) As I noted in the description of the PR, I don't think this is a very beautiful implementation, and there's a conversation to be had on how to make that better. The purpose of this PR is to demonstrate that other libraries can be used with geometry data stored in a MySQL database, but that there are limitations in this driver preventing it. It's more of a conversation-starter than a final answer :)
The reasons The application I'm working with literally can not use
While a possibility, I agree that would be a significant amount of work. While it would probably be the cleanest solution, there is a simpler solution in this particular case: as with JSON, allow the application to define in configuration what library to use for geometry fields. It would need to be an MFA for serialization and deserialization since there are no API standards for these libraries. It could still fall back to That would, however, be an even larger change than this PR, as it means moving geometry support from a purely compile-time decision to a runtime one. It would also be a non-general solution: it would only relieve the situation around geometry fields. I'm happy to implement that in an update to this PR if a runtime configuration approach would be acceptable.
That's certainly a possibility! It would also preclude |
|
The reason we don't really have an extension support is because MySQL doesn't have (unless it changed) custom types support anyway, so it is simpler to handle them with a small contracts like JSON and Geo. |
|
So here is another approach based on configuration and a protocol:
This preserves the existing functionality, while allowing other libraries to provide WKB encoding (via the protocol) and decoding (via the wkb decoder configuration). If this looks like a workable approach to others, then I'll verify this works with Thoughts? |
|
My thinking is that, if we need a behaviour or a module for decoding, we might as well use it for both? 🤔 Edit: for clarity, I don't see a need for the protocol. |
8acb7e5 to
2dca7ae
Compare
…her than `geo`
This is accomplished by:
* Add `MyXQL.Protocol.Encoder`, a protocol for encoding structs
* Providing encoders for Geo structs if the `geo` library is available at compile-time
* Fetch the wkb decoder from configuration, defaulting to the `geo` library, via
`Application.get_env(:myxql, :wkb_decoder)`
This preserves the existing functionality, while allowing other libraries to provide
WKB encoding (via the protocol) and decoding (via the wkb decoder configuration).
2dca7ae to
7a2f1cc
Compare
How do you recommend doing encoding? A struct is passed in for encoding. When it is the How will In the other direction, decoding is not possible with a protocol as the binary is not distinguished by any type: it's just a binary that encodes some type of geometry. |
|
Thinking a bit more on this, I suppose one could define a behaviour for geometry modules that must provide a decoding function and an encoding function which gets passed a struct .. if the implementation does not know how to encode that struct it would return .... That seems even more awkward (and less performant) since any time a struct is passed in for decoding, ... which sounds like a less efficient Protocol with more steps to me? |
|
It is also worth pointing out that the approach of a Protocol also means that an application can insert geometry data no matter which library it comes from, making the choice between Only decoding remains fixed to one library. |
|
You can define a module and pass it to |
|
Agree with above. I think a protocol would be cleaner for just encoding but it feels awkward to have encoding protocol and decoding behaviour/MFArgs/whatever. So a single module abstractions seems more compelling to me. |
|
It may be useful to consider encoding structs and decoding geometry columns as related but, actually, separate concerns With a protocol, any struct can have custom encoding. This is not specific to geometry data. In the specific case of geometry columns, it would allow storing any data that can be encoded as wkb. This is not just geo or geometry structs, but also structs from e.g. the With a protocol for encoding, that application could instead provide a defmodule MyApp.MySQLGeoEncoder do
@behaviour MyXQL.Protocol.GeometryEncoder
def encode(%{"lat" => lat, "lon" => long.}) do
... wkb encoding here ...
end
def encode(other), do: GeoSQL.MySQL.GeoEncoder.encode(other)
defdelegate decode, to: GeoSQL.MySQLGeoEncoder
endThat is not particularly scalable as a solution, and prevents libraries from shipping with support out of the box. This all applies to non-geometry columns in the database, as well, of course. And then there is decoding geometry columns. When decoding data that is stored in MySQL's srid+wkid storage format, that is known at retrieval, and there needs to be a way to represent those to the application. That process "needs" to be configurable so applications may control how they view that data by default, but that is all. It also makes no sense to say "decode this WKB to an openstreetmap REST API response": encoding it TO geometry necessarily results in loss of information So the two processes of encoding-to-wkb and decoding-from-wkb are not symmetric. As a developer I am less bothered by having a protocol + a configuration entry than I am being limited because the two things have been tied together. There also doesn't seem to be a nice way to avoid something being configured. Even with a Behaviour, which module to use for it would need to be set somewhere. The D/X pain point of configuration is not resolved by a Behaviour implementation. tl;dr -> As a user of this library, providing a single module that implements a behaviour for geometry columns does not make anything easier (I still have to write just as much code and I still have to configure things), while being less useful than a protocol. IOW: similar D/X, worse results. |
This could be solved by having a protocol in Geo, which other data structures can implement, and the Geo.MyXQL behaviour would then call the protocol and provide the functionality needed. This comes with the benefit that Geo becomes then the canonical representation, otherwise folks have to write OpenStreetMap -> MySQL, OpenStreetMap -> PostgreSQL, etc. Overall, I am not convinced we need an extensible functionality in MyXQL when MySQL itself is not extensible. If anything, relying everyone to know MySQL or PostgreSQL representations rather than just converting to a canonical one sounds like more work to everybody. Finally, it is worth saying that, if you have a behaviour, you can add a protocol on top. However, if MyXQL imposes a protocol, then that's a stronger requirement. If you like protocols, you can implement them on top of the behaviour, but not vice-versa. |
This is an argument in favour of using a protocol since there is no agreement on what the canonical representation is. I don't want to have to convince every single developer out there to use geometry library I'm using (or vice versa) just to be able to store the same bytes in a database, and the more straight-forward way to get there with the minimal amount of dependency baggage is a protocol for encoding that allows the encoding to be hosted with the implementations of the structs. Even if the only implementations of the protocol were for But it's fine. My goal is to find some way to bring GeoSQL to MySQL users, and even an imperfect solution is better than no solution. At least if there is a behaviour, then I can provide a module in GeoSQL that encodes both Geo and Geometry structs, and decodes to Geometry structs. Not great, but at least something that will work. |
The reason why this doesn't work is because the MySQL representation is not canonical either. Solving it for MySQL, won't solve it for PostgreSQL. So having a third party generic representation, such as Geo or Geometry, means it could behave similarly as a LSP, as an intermediate representation everyone relies on. In any case, if you feel strongly about a protocol decoupled from your library, you could easily ship a |
Can be configured by setting the `:myxql, :geometry_codec` configuration key to the name of a module that implements the GeometryCodec behaviour default to geo, if available.
There are already solutions for pgsql.
The issue being that in practice there is no generic representation. We can continue to say it should be A or B all we want, the reality is that it is A and B. Translating all B to A is not only wasteful in terms of performance, it means that one has to be a perfect superset of the other, and we all have to agree on which one of those it is. The unfortunate situation is that Unless someone wants to take on the task of moving the ecosystem over to But it's academic at this point. I've committed a solution to this branch which implements a behaviour, MyXQL.Protocol.GeometryCodec, makes it configurable, and provides a default implementation for |
Co-authored-by: José Valim <jose.valim@gmail.com>
|
@aseigo I am only slightly familiar with this project, don't feel confident with merging it. :) |
| def encode_binary_value(struct) when is_struct(struct) do | ||
| # see if it is a geometry struct | ||
| case MyXQL.Protocol.GeometryCodec.do_encode(struct) do | ||
| :unknown -> | ||
| string = json_library().encode!(struct) | ||
| {:mysql_type_var_string, encode_string_lenenc(string)} | ||
|
|
||
| encoded -> | ||
| encoded | ||
| end |
There was a problem hiding this comment.
In hindsight, another idea is this contract:
defmodule MyXQL.Protocol.GeometryCodec do
@callback encode(struct()) :: ...
@callback encode_structs() :: [module()]
@callback decode(...) :: ...
end if codec = Application.compile_env(:myxql, :geometry_codec) do
structs = codec.encode_structs()
def encode_binary_value(%struct{} = value) when struct in structs do
unquote(codec).encode(value)
end
endthe benefits are:
- we do more work at compile-time. We'd need to
mix deps.compile --force myxqlwhen swapping codecs though. - we don't duplicate JSON-encoding logic which is in
def encode_binary_value(term) when is_list(term) or is_map(term) dobelow.
I concede it's more in poor-man's protocols uncanny valley but I personally kinda like doing more work at compile-time on principle.
WDYT @aseigo @josevalim?
As an aside, the clause below was arguably accidentally too broad. In hindsight it arguably should have been:
- def encode_binary_value(term) when is_list(term) or is_map(term) do
+ def encode_binary_value(term) when is_list(term) or is_non_struct_map(term) dothat is, if a myxql user needed to encode custom structs, we'd intentionally provide a mechanism to do so (even if it ended up being the same, json encoding)
There was a problem hiding this comment.
Having to recompile dependencies is always a bit of an annoyance, and is one more thing to forget to do. It also means that GeoSQL can not set the codec at runtime for its users, and instead needs to point devs of it to docs on how to enable the correct codec in MyXQL.
But if it has a meaningful performance impact it could be worth it, and it certainly wouldn't be the end of the world in terms of developer inconvenience (it's "just" one more manual config entry, after all).
So personally I lean gently towards the runtime approach, but could also work with the compile-time one.
There was a problem hiding this comment.
Because we are using compile_env, would we warn if you change its value but don't recompile?
There was a problem hiding this comment.
If there's a warning that'd be helpful, for sure! :)
There was a problem hiding this comment.
Good point, I think we'd get the warning indeed. So let's go with compile-time config and encode_structs callback? Or that doesn't seem appealing?
Regarding GeoSQL automatically setting the codec, on principle I'd avoid library A changing global configuration of library B. In this particular case though that seems harmless and you're of course free to do whatever you want to do in your libraries.
If runtime config is your preference, another option is a similar encode_structs callback idea but at the runtime. Something like:
def encode_binary_value(term) when is_list(term) or is_map(term) do
codec = Application.get_env(:myxql, :geometry_codec)
struct_names = codec && codec.encode_structs()
case term do
%struct_name{} = struct when struct_name in struct_names ->
codec.encode(struct)
_ ->
string = json_library().encode!(struct)
{:mysql_type_var_string, encode_string_lenenc(string)}
end
endthough in that case probably better to expose a geometry_codec/1 function to set it that'd put it in persistent_term.
I'm fine with going with whatever solution as long as we don't paint ourselves into a corner and can encode other types in the future, if needed.
There was a problem hiding this comment.
Regarding GeoSQL automatically setting the codec, on principle I'd avoid library A changing global configuration of library B
It does this in response to the user of the library requesting the library be setup for GIS data and queries. It helps those users avoid is knowing what needs to be done to make that happen, which is especially nice since most people dealing with mapping data are not GIS people and every single backend GeoSQL supports (MySQL will be the 4th) does it completely differently. Oof!
So while I agree with you in the general case, this is not only very helpful for users of GeoSQL, it is only done at the explicit request of the user.
So let's go with compile-time config and encode_structs callback?
That would would be sufficient, sure. If that's what you'd prefer, I'll make adjustments to this branch.
I'm fine with going with whatever solution
Same here. :) So, as the library maintainer, if you could make a decision I'll adapt the code as necessary to it.
Cheers.
There was a problem hiding this comment.
I took another look at this tonight and since MySQL does not have custom types, it looks rather difficult to make this generic for any kind of struct. So I pushed 9b0af2f to show what it could look like as a compile-time construct. The Value module doesn't really change, just the codec behaviour. WDYT?
There was a problem hiding this comment.
@wojtekmach Any further thoughts on whether you prefer a compile-time implementation as in 9b0af2f ? I'm fine either way, just want to get this moving towards a mergeable state :)
Co-authored-by: Wojtek Mach <wojtekmach@users.noreply.github.com>
|
Sorry I’m out for a week or so, I probably won’t be able to work on this until then. Sorry for back n forth. let’s go with the previous approach, runtime config and contract that returns :unknown atom |
|
Please undraft so we can run ci |
9b0af2f to
0cf16d0
Compare
|
This has been marked as ready for review, and @wojtekmach has made their way back? Any updates on this one? Thanks! |
|
Thank you! |
This PR adds support for the
geometrylibrary for geometry types. Currently,myxqlonly supports thegeolibrary for geometry columns, which makes using it with libraries such asgeo_sql(disclaimer: I am the author ofgeo_sql) which usegeometryinfeasible.With these changes, it is possible to use
myxqlwith thegeometrylibrary, and this in turns makes it possible to usegeo_sqlwith MySQL/MariaDB databases.While this works (and based on my testing, works quite well), I'm less convinced that the implementation is very beautiful. It would be nice if it were possible to register handlers for custom types rather than having to hack them into library itself.
If the maintainers of this library have ideas on what that could look like, I'd be happy to have a conversation about it.
In the meantime, this PR can at least serve as a demonstration of how one might accomplish the goal of allowing
geometryrather thangeomto handle geometry types in a MySQL database without larger changes to the library.