Is your feature request related to a problem or challenge? Please describe what you are trying to do.
PyLogicalPlan can currently serialize and deserialize only built-in functions and table providers. It uses DefaultLogicalExtensionCodec in this function.
Users would like to be able to serialize and deserialize plans that include custom functions and table providers. One such example is the Iceberg python integration.
This topic was mentioned in the #datafusion-python channel in Discord.
Describe the solution you'd like
Create a logical extension codec that can process user-defined functions and table providers. This likely means some work in the upstream repository.
One approach could consist of:
- Add encode and decode methods to FFI_TableProvider, FFI_ScalarUDF, and so on.
- Implement LogicalExtensionCodec on ForeignTableProvider, ForeignScalarUDF, and so on, where the only methods implemented are those appropriate to the type (table provider, UDF, etc.). That is, ForeignTableProvider supports only try_encode_table_provider and try_decode_table_provider.
- Add a LogicalExtensionCodec to datafusion-python. For table providers, decoding iterates through all registered table providers and checks whether any of them successfully decodes the payload. The encode method is more straightforward.
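The dispatch idea in the last step can be sketched in plain Python. This is only an illustration of "try each registered codec until one decodes", not the datafusion-python or upstream API: ProviderCodec, TaggedCodec, CompositeCodec, IcebergLikeTable, CsvLikeTable, and the tag-prefix wire format are all hypothetical names and assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Protocol, Type


class ProviderCodec(Protocol):
    """Hypothetical per-provider codec; returns None when it does not apply."""

    def try_encode(self, provider: object) -> Optional[bytes]: ...

    def try_decode(self, payload: bytes) -> Optional[object]: ...


@dataclass
class IcebergLikeTable:
    """Stand-in for a custom table provider (hypothetical)."""
    location: str


@dataclass
class CsvLikeTable:
    """Second stand-in provider, so the dispatch has something to skip."""
    location: str


class TaggedCodec:
    """Encodes one provider type as b"<tag>:<location>" (assumed wire format)."""

    def __init__(self, tag: bytes, provider_type: Type) -> None:
        self._prefix = tag + b":"
        self._provider_type = provider_type

    def try_encode(self, provider: object) -> Optional[bytes]:
        if not isinstance(provider, self._provider_type):
            return None
        return self._prefix + provider.location.encode("utf-8")

    def try_decode(self, payload: bytes) -> Optional[object]:
        if not payload.startswith(self._prefix):
            return None
        location = payload[len(self._prefix):].decode("utf-8")
        return self._provider_type(location)


class CompositeCodec:
    """Tries every registered codec in order and returns the first success."""

    def __init__(self) -> None:
        self._codecs: List[ProviderCodec] = []

    def register(self, codec: ProviderCodec) -> None:
        self._codecs.append(codec)

    def encode_table_provider(self, provider: object) -> bytes:
        for codec in self._codecs:
            payload = codec.try_encode(provider)
            if payload is not None:
                return payload
        raise ValueError(f"no registered codec can encode {type(provider).__name__}")

    def decode_table_provider(self, payload: bytes) -> object:
        for codec in self._codecs:
            provider = codec.try_decode(payload)
            if provider is not None:
                return provider
        raise ValueError("no registered codec can decode this payload")


composite = CompositeCodec()
composite.register(TaggedCodec(b"csv", CsvLikeTable))
composite.register(TaggedCodec(b"iceberg", IcebergLikeTable))

payload = composite.encode_table_provider(IcebergLikeTable("s3://bucket/table"))
restored = composite.decode_table_provider(payload)
```

In this sketch each codec signals "not mine" by returning None, so registration order only matters if two codecs accept the same payload. A real implementation would presumably surface decode errors from the FFI layer rather than None.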
Describe alternatives you've considered
The user could output the logical plan as a SQL command and then pass that along as a string.
Additional context