[DRAFT][Geo][SQL] Enhance SRID/CRS support based on PROJ data #54543

Draft
uros-db wants to merge 4 commits into apache:master from uros-db:geo-srids

uros-db commented Feb 27, 2026

What changes were proposed in this pull request?

Extends Spark's Spatial Reference System (SRS) support to 10000+ entries sourced from the PROJ library's EPSG
and ESRI databases, substantially improving the breadth of GeometryType and GeographyType support.

Why are the changes needed?

Currently, Geometry and Geography types offer only limited SRID support (a few hardcoded values).

Does this PR introduce any user-facing change?

Yes, 10000+ additional SRID/CRS values are supported for geospatial types.

How was this patch tested?

Updated the corresponding unit tests for SRS mapping.

Was this patch authored or co-authored using generative AI tooling?

Yes, Claude 4.6 Opus.

*/
private void addOgcOverride(int srid, String ogcStringId) {
  SpatialReferenceSystemInformation existing = sridToSrs.get(srid);
  if (existing != null) {

Given that the intent here is to override an existing value, I think it would be best to throw an error if existing is null, rather than silently skipping the override. Something like:

if (existing == null) {
  throw new RuntimeException("SRID " + srid + " should have already been registered");
}
SpatialReferenceSystemInformation ogcEntry =
    new SpatialReferenceSystemInformation(srid, ogcStringId, existing.isGeographic());
sridToSrs.put(srid, ogcEntry);
stringIdToSrs.put(ogcStringId, ogcEntry);
stringIdToSrs.put(existing.stringId(), ogcEntry);
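
For readers who want to try the fail-fast behavior in isolation, the suggestion above can be mirrored as a small Python sketch. This is not the code in the PR: `SrsInfo`, `SrsRegistry`, and `register` are hypothetical stand-ins for the Java cache and its two maps, and the SRID/string ID values in the usage note are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SrsInfo:
    """Hypothetical stand-in for SpatialReferenceSystemInformation."""
    srid: int
    string_id: str
    is_geographic: bool


class SrsRegistry:
    """Hypothetical stand-in for the cache's sridToSrs / stringIdToSrs maps."""

    def __init__(self):
        self.srid_to_srs = {}
        self.string_id_to_srs = {}

    def register(self, srid, string_id, is_geographic):
        # Plain registration, as the generated data would be loaded.
        entry = SrsInfo(srid, string_id, is_geographic)
        self.srid_to_srs[srid] = entry
        self.string_id_to_srs[string_id] = entry

    def add_ogc_override(self, srid, ogc_string_id):
        existing = self.srid_to_srs.get(srid)
        if existing is None:
            # Fail fast: an override only makes sense for a known SRID.
            raise RuntimeError(f"SRID {srid} should have already been registered")
        ogc_entry = SrsInfo(srid, ogc_string_id, existing.is_geographic)
        self.srid_to_srs[srid] = ogc_entry
        # Both the new and the original string ID resolve to the override.
        self.string_id_to_srs[ogc_string_id] = ogc_entry
        self.string_id_to_srs[existing.string_id] = ogc_entry
```

With this sketch, calling `register(4326, "EPSG:4326", True)` followed by `add_ogc_override(4326, "OGC:CRS84")` leaves both string IDs pointing at the same overriding entry, while overriding an unregistered SRID raises immediately instead of being ignored.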

Comment on lines +347 to +348
# Add Spark-specific entry: SRID 0 (Cartesian, no defined SRS).
all_entries.append((0, "SRID:0", False))

I think it makes more sense to do this inside SpatialReferenceSystemCache.java and geo_utils.py instead of here. I believe it is cleaner if the data generated by this script maps exactly to what we get from PROJ.
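
One way to read this suggestion is sketched below, under stated assumptions: the generated rows stay a pure mirror of PROJ, and the consumer appends the Spark-specific SRID 0 entry at load time. `SrsEntry`, `GENERATED_ENTRIES`, `SPARK_SPECIFIC_ENTRIES`, and `build_registry` are hypothetical names, not the actual contents of geo_utils.py, and the two sample rows stand in for the real generated data.

```python
from typing import NamedTuple


class SrsEntry(NamedTuple):
    srid: int
    string_id: str
    is_geographic: bool


# Stand-in for rows emitted by the generator script (hypothetical sample).
# Nothing Spark-specific lives here, so it maps one-to-one to PROJ.
GENERATED_ENTRIES = [
    SrsEntry(4326, "EPSG:4326", True),
    SrsEntry(3857, "EPSG:3857", False),
]

# Spark-specific additions, owned by the consumer rather than the generator.
# SRID 0: Cartesian, no defined SRS.
SPARK_SPECIFIC_ENTRIES = [SrsEntry(0, "SRID:0", False)]


def build_registry(generated, extras=SPARK_SPECIFIC_ENTRIES):
    """Merge generated rows with consumer-side extras, rejecting duplicates."""
    registry = {}
    for entry in list(generated) + list(extras):
        if entry.srid in registry:
            raise ValueError(f"Duplicate SRID {entry.srid}")
        registry[entry.srid] = entry
    return registry
```

Keeping the extras in a separate list means the generated file can be diffed directly against PROJ output, and a collision between a future PROJ entry and a Spark-specific one surfaces as an error at load time.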
