Skip to content

Optionally support LZO#51

Closed
milesgranger wants to merge 5 commits intomasterfrom
maybe-add-lzo
Closed

Optionally support LZO#51
milesgranger wants to merge 5 commits intomasterfrom
maybe-add-lzo

Conversation

@milesgranger
Copy link
Owner

@milesgranger milesgranger commented Mar 31, 2021

I've recently got a decent pure rust implementation of minilzo done here: https://github.com/milesgranger/minilzo3

Toying with the idea of closing #41 while also keeping the MIT license by making it an optional feature.

From the rust build this is pretty straight forward, can build using the feature lzo and poof, you're in GPL land, but for PyPI wheels, I think we'd need to have a cramjam-lzo, cramjam-features-which-use-licenses-that-suck or something to that effect so a user can explicitly opt into a cramjam distribution which includes LZO.

Not sure how to accomplish this. Maybe add a CI matrix for features like this to build and upload into a different package namespace on PyPI on a per optional feature basis?

@martindurant
Copy link

I suspect that the users of fastparquet - aside from individual hobbyists - anyone would be happier to not even have to consider the inclusion or not of GPL.

You can, of course, make multiple package names and wheels for them all, and you can have conda-forge make multiple outputs from one feedstock. I don't see how you can depend on an OR of packages, with conda or pip, so that fastparquet can't say "I need cramjam, but cramjam-lzo will do". That is, unless cramjam-lzo depends on cramjam, which would mean duplicated binary code.

@milesgranger
Copy link
Owner Author

This was a concern and something I view as a nonstarter, a package per and duplicate code. I am not sure how to proceed really. Happy that LZO is just as fast as python-lzo, but unhappy dealing with restrictive licenses.

One alternative is having two versions of cramjam, one as it is now and another "full", so developers can do as they potentially are now, try and import the lzo submodule and deal the same way as trying to import python-lzo for that feature. Then dependent on the end user if they have the full cramjam installed (and thus, they deal with implications of GPL/etc on their own).

Or maybe it's just not worth it. 👀

@martindurant
Copy link

You can't do that with static libraries, though, right? Either it's there or not, so two different versions of the package or two packages. And then it's not obvious how to offer the options to the user.

@milesgranger
Copy link
Owner Author

We can control what ends up in the wheel using feature flags, #[cfg(feature = "lzo")] as it's in the PR now, but this could rather be #[cfg(feature = "full")] to conditionally include lzo or other libs which have none MIT or otherwise restrictive licenses and distribute cramjam and cramjam-full wheels. For lib devs like yourself, I suppose raising an informative error could direct a user to install the latter if they need LZO when attempting from cramjam import lzo fails?

My thinking with the initial post was installing a subpackage directly so no duplicate code would exist such as cramjam.lzo delta code, but I really don't think that's possible. 🤔

@martindurant
Copy link

For lib devs like yourself, I suppose raising an informative error could direct a user to install the latter if they need LZO when attempting from cramjam import lzo fails?

but fastparquet would need to depend on the minimal version, so you potentially end up with both :|. Having fastparquet depend on neither is not a good solution, the basic install needs to be functional.

@milesgranger
Copy link
Owner Author

Good point.. so the only viable option is if it can be distributed as a submodule only pip install cramjam.lzo and only include the delta code. Is that a fair assessment?

@martindurant
Copy link

It would be a fine thing to do, but this should be carefully documented, that whoever wants to depend on cramjam needs to make that decision when defining their requirements. In practice, I imagine that means the non-lzo will get used, exclusively.

@milesgranger
Copy link
Owner Author

In practice, I imagine that means the non-lzo will get used, exclusively.

Well that's an unfortunate idea.. you don't think some people would want to avoid using python-lzo, because it too depends on system packages when they could just use cramjam.lzo for the same effect? I hope so, as that's sorta the whole point of this project. 😅

@martindurant
Copy link

Only if they know that they specifically need LZO!

I have a feeling it may be possible to build as a single package but with different version-build strings, like cramjam-v1.0.0 Vs cramjam-v1.0.0-lzo, since you can have OR versions (not package names), at least in conda.

@milesgranger
Copy link
Owner Author

Closing in favor of #52

@milesgranger milesgranger deleted the maybe-add-lzo branch April 2, 2021 16:47
@milesgranger milesgranger mentioned this pull request Apr 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants