Description
Any good ideas for how we can properly save the sample to JSON?
Claude did not have any ideas that would work in all cases. Here is a summary of what it came up with.
Option 1: Represent the function as a lookup key (recommended)
Rather than serializing arbitrary callables, define a registry of known/supported scattering functions inside tof, and serialize only the name (key) of the function:
```python
# A registry of known inelastic scattering functions
INELASTIC_FUNC_REGISTRY = {
    "linear": lambda wav_i: wav_i * 1.1,
    "debye_waller": debye_waller_func,
    # ...
}


class InelasticSample:
    def as_json(self) -> dict:
        func_name = self._func_name  # stored at construction time
        if func_name not in INELASTIC_FUNC_REGISTRY:
            raise ValueError(f"Cannot serialize unknown function: {func_name}")
        return {
            "type": "inelastic_sample",
            "distance": var_to_dict(self.distance),
            "name": self.name,
            "func": func_name,
        }
```

This is safe, readable, and consistent with how the rest of the codebase works (e.g. "clockwise" / "anti-clockwise" for chopper direction). The downside is that users can only use pre-registered functions, but that's a feature, not a bug, since it makes JSON files portable and reproducible.
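The snippet above only covers the writing side. A minimal sketch of the full round trip could look like the following; `func_from_name` and the cut-down `SampleSketch` class are hypothetical stand-ins for illustration, not part of tof, and the `distance` field is left out to keep the example self-contained.

```python
import json

# Hypothetical registry; the single entry mirrors the "linear" example above.
INELASTIC_FUNC_REGISTRY = {
    "linear": lambda wav_i: wav_i * 1.1,
}


def func_from_name(name):
    """Look up a registered function, failing loudly on unknown keys."""
    try:
        return INELASTIC_FUNC_REGISTRY[name]
    except KeyError:
        known = ", ".join(sorted(INELASTIC_FUNC_REGISTRY))
        raise ValueError(f"Unknown function {name!r}; known: {known}") from None


class SampleSketch:
    """Stand-in for InelasticSample holding only a name and a function key."""

    def __init__(self, name, func_name):
        self.name = name
        self.func_name = func_name
        # Resolving the key here means an unknown name fails at construction.
        self.func = func_from_name(func_name)

    def as_json(self):
        return {"type": "inelastic_sample", "name": self.name, "func": self.func_name}

    @classmethod
    def from_json(cls, d):
        return cls(name=d["name"], func_name=d["func"])


original = SampleSketch(name="sample", func_name="linear")
restored = SampleSketch.from_json(json.loads(json.dumps(original.as_json())))
```

Because only the key travels through the JSON, the restored object ends up with the same registered function as the original, and a file that names an unregistered function is rejected when it is loaded.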
Option 2: Serialize via module:qualname reference
If you want to support user-defined functions but still avoid eval, you can store the function's fully qualified importable name using Python's __module__ and __qualname__.
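```python
import importlib


def func_to_json(func) -> dict:
    return {
        "module": func.__module__,
        "qualname": func.__qualname__,
    }


def func_from_json(d: dict):
    mod = importlib.import_module(d["module"])
    # Walk the qualname so methods of (nested) classes resolve too.
    # Functions defined inside other functions contain "<locals>" in
    # their qualname and cannot be reached this way.
    obj = mod
    for part in d["qualname"].split("."):
        obj = getattr(obj, part)
    return obj
```

This serializes as e.g. {"module": "mypackage.scattering", "qualname": "debye_waller"}. On deserialization, you import the module and look up the attribute, with no eval involved. The caveat: it only works for functions that can be reached by attribute lookup from an importable module (module-level functions and methods of module-level classes), not lambdas or closures, and the module has to be importable in the environment that loads the JSON.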
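One way to surface that caveat early is to check, at serialization time, that the reference actually resolves back to the same object. The sketch below is a possible variant of `func_to_json`, not something taken from tof; it uses `json.JSONEncoder.encode` from the standard library purely as a convenient importable example with a dotted qualname.

```python
import importlib
import json


def func_from_json(d):
    """Import the module and walk the dotted qualname to the target object."""
    obj = importlib.import_module(d["module"])
    for part in d["qualname"].split("."):
        obj = getattr(obj, part)
    return obj


def func_to_json(func):
    """Serialize a function by import path, refusing anything not importable."""
    ref = {"module": func.__module__, "qualname": func.__qualname__}
    try:
        resolved = func_from_json(ref)
    except (ImportError, AttributeError, TypeError) as exc:
        # Lambdas ("<lambda>") and closures ("<locals>") end up here.
        raise ValueError(f"{func!r} is not importable by name") from exc
    if resolved is not func:
        raise ValueError(f"{ref} resolves to a different object than {func!r}")
    return ref


ref = func_to_json(json.JSONEncoder.encode)
restored = func_from_json(json.loads(json.dumps(ref)))
```

With this check, passing a lambda raises a `ValueError` when the file is written rather than an `AttributeError` when someone later tries to read it.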
Option 3: dill / cloudpickle + base64
If you truly need to serialize arbitrary callables (including lambdas and closures):
```python
import base64

import cloudpickle  # or dill


def func_to_json(func) -> str:
    return base64.b64encode(cloudpickle.dumps(func)).decode()


def func_from_json(s: str):
    return cloudpickle.loads(base64.b64decode(s))
```

This works for almost any Python callable. However:
- It's not human-readable in the JSON
- It's Python-version and library-version sensitive (pickles can break across versions)
- It still carries security concerns when loading from untrusted sources (pickle deserialisation can execute arbitrary code too, just like eval)
So this is really only suitable if the JSON files are treated as internal/trusted artifacts.
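To make the security point concrete, here is a small and deliberately harmless sketch using the standard-library `pickle` module as a stand-in (as far as I know, cloudpickle payloads go through the same unpickling machinery). The `Payload` class is hypothetical and exists only for this illustration.

```python
import base64
import pickle


class Payload:
    """Object whose pickle instructs the loader to call int("42")."""

    def __reduce__(self):
        # (callable, args): the unpickler calls callable(*args) on load.
        # A malicious file could name any importable callable here instead.
        return (int, ("42",))


# Same base64 wrapping as in the functions above.
encoded = base64.b64encode(pickle.dumps(Payload())).decode()

# Loading does not rebuild a Payload; it runs the call chosen by the payload.
result = pickle.loads(base64.b64decode(encoded))
```

Here `result` is the integer 42 rather than a `Payload` instance, which shows that the file contents, not the loading code, decide what gets called during deserialization.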
Originally posted by @nvaytet in #124 (comment)