
Major lljson rework to make round-trippable serialization easier #61

Merged
HaroldCindy merged 18 commits into main from harold/add_json_replacer
Mar 21, 2026
@HaroldCindy HaroldCindy commented Mar 15, 2026

Of note, there's now a notion of replacer and reviver callbacks as in JS' JSON APIs. An example of their use is in the tests folder as lljson_typedjson.lua.

We went with this form since it lets you construct a different representation of the object before serializing, without having to build an entire serializable copy before calling lljson.encode(). That saves memory, since the serializable version of each object only needs to be alive for as long as we're still traversing that object.
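To make the memory argument concrete, here is a minimal sketch in plain Lua. It is not lljson and does not use lljson's API: `encode`, the `replacer(key, value)` signature, and the `Point` type are all hypothetical stand-ins chosen for illustration. The toy encoder asks the replacer for a serializable form of each value only when the traversal reaches it, so each replacement table can be collected as soon as its subtree has been written.

```lua
-- Minimal illustrative encoder, NOT lljson. It supports strings, numbers,
-- booleans, dense arrays, and string-keyed objects only.
local function encode(value, replacer, key)
    -- Hypothetical replacer signature: replacer(key, value) -> replacement.
    if replacer then
        value = replacer(key, value)
    end

    local kind = type(value)
    if kind == "string" then
        -- Escape backslashes and double quotes only (enough for this sketch).
        return '"' .. value:gsub('[\\"]', '\\%0') .. '"'
    elseif kind == "number" then
        return string.format("%.14g", value)
    elseif kind == "boolean" then
        return tostring(value)
    elseif kind ~= "table" then
        error("unsupported type: " .. kind)
    end

    local parts = {}
    if #value > 0 or next(value) == nil then
        -- Dense array; an empty table is written as [] here as well.
        for i = 1, #value do
            parts[i] = encode(value[i], replacer, i)
        end
        return "[" .. table.concat(parts, ",") .. "]"
    end

    -- Object: sort the keys so the output is deterministic.
    local keys = {}
    for k in pairs(value) do
        keys[#keys + 1] = k
    end
    table.sort(keys)
    for i, k in ipairs(keys) do
        parts[i] = encode(k) .. ":" .. encode(value[k], replacer, k)
    end
    return "{" .. table.concat(parts, ",") .. "}"
end

-- A hypothetical application type that is not directly serializable as-is.
local Point = {}
Point.__index = Point
local function new_point(x, y)
    return setmetatable({ x = x, y = y }, Point)
end

-- The replacer builds the serializable form lazily, one object at a time.
local function replacer(key, value)
    if getmetatable(value) == Point then
        return { type = "point", xy = { value.x, value.y } }
    end
    return value
end

local out = encode({ origin = new_point(0, 0), tip = new_point(3, 4) }, replacer)
print(out)
```

At no point does a full serializable copy of the root table exist; only the replacement for the point currently being written is alive.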

Additionally, an empty table is now encoded as [] by default. This is probably the most common meaning for an empty table, but you can also apply object_mt as a metatable or add __jsonhint="object" to your own metatable to force serialization as an object. Similarly, array_mt or __jsonhint="array" will act as a hint to treat your object as an array.

__len should no longer be used as a hint that the object should be treated as an array; that's what __jsonhint is for.

Also added a new options table format to lljson.encode() and friends. The table now allows you to specify that __tojson hooks should be skipped, so you can manually invoke them at your leisure in your replacer hooks.

Of note, there's now a notion of replacer and reviver callbacks as in
JS' `JSON` APIs. An example of their use is in the tests folder as
`lljson_typedjson.lua`.

We went with this form since it allows constructing a different representation
of the object before serializing without requiring you to construct an entire,
serializable copy before calling `lljson.encode()`. That allows you to save memory,
since the serializable version of each object only needs to be alive as long
as we're still traversing the object.

Additionally, an empty table is now encoded as `[]` by default. This is
probably the most common meaning for an empty table, but you can also
apply `object_mt` as a metatable or add `__jsontype="object"` to your
own metatable to force serialization as an object. Similarly, `array_mt`
or `__jsontype="array"` will act as a hint to treat your object as an array.

`__len` should no longer be used as a hint that the object should be treated
as an array; that's what `__jsontype` is for.

Also added a new options table format to `lljson.encode()` and friends. The
table now allows you to specify that `__tojson` hooks should be skipped, so
you can manually invoke them at your leisure in your replacer hooks.
@HaroldCindy HaroldCindy marked this pull request as ready for review March 16, 2026 23:08
HaroldCindy and others added 4 commits March 19, 2026 09:26
Co-authored-by: Tapple Gao <tapplek@gmail.com>
This helps improve round-trippability of JSON payloads from outside SL,
preserving the `array`-ness or `object`-ness of empty tables especially.
@HaroldCindy
Contributor Author

lljson.(sl)decode() now accepts an options table that you can specify track_path and replacer on. The reviver now receives a 4th arg, which is a ctx table. If you choose to set track_path to true, ctx.path will be a table filled with the current path within the document, otherwise it will be nil. Generally you shouldn't need this, but it's there if you want it.

Additionally, we automatically set object_mt and array_mt on things as they're decoded so that they can round-trip correctly through lljson.encode(lljson.decode(input)) without mis-handling empty objects / arrays. You can still replace the metatables on these tables, those are just the default ones they'll be deserialized with.

Thank y'all very much for your input! I think we've arrived at something semi-reasonable here, but I'd be interested to hear any thoughts on those recent changes.


@Suzanna-Linn Suzanna-Linn left a comment


  • About track_path:
    I like it.

  • About calling __tojson only once:
    Yes, I think that it's the best way.

  • About getting __jsontype before calling __tojson (without replacer):
    I think that it's good for advanced scripters, but that it will be difficult to understand for intermediate scripters.
    I would prefer to have:

    • __tojson (always)
    • replacer
    • __jsontype
    • serialization (no __tojson)
  • About automatically setting object_mt and array_mt:
    I think this would be better as an optional parameter.

    • It could be a problem in some cases. For instance: we receive an array from a server, add some keys to it, and send it to another server that expects an object.
    • And it's not always useful, only when there is an empty object, or an object that could become empty before the next encoding.
    • It's also useful to know in the reviver if a table comes from array or object. I don't think there is any other way to know this. Could we have a ctx.jsontype: string in the reviver?

@HaroldCindy
Contributor Author

@Suzanna-Linn

Thanks for the feedback!

> I would prefer to have:
>
>   • __tojson (always)
>   • replacer
>   • __jsontype
>   • serialization (no __tojson)

Interested to hear others' thoughts on this. Is there any non-hypothetical use case for a metatable that sometimes serializes with object semantics and sometimes with array semantics? Would it be acceptable to check for __jsontype on the __tojson return value and give that priority over the __jsontype from the main object if it's present? Should the main object's metatable be consulted at all if there's a __tojson?

> It could be a problem in some cases. For instance: we receive an array from a server, add some keys to it, and send it to another server that expects an object.

That's probably my fault: __jsontype is really more like __jsonhint, and I should rename it. If something cannot be reasonably encoded as an array, it will always be encoded as an object. Basically, if what we're left with after __tojson has non-integer indices, we will encode it as an object even if __jsontype="array", so in practice the situation you mention will be fine. The mutated table will be unambiguously an object due to the keys on it, so it will be encoded as an object.

> And it's not always useful, only when there is an empty object, or an object that could become empty before the next encoding.

It's also useful in, say, cases where the server returns you an empty object specifically and you mutate it by adding some integer keys. The serializer can then tell that, since this must be an object, those integer keys are meant to be stringified rather than treated as array indices.
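The hint behaviour described in this reply can be modelled in a few lines of plain Lua. This is a sketch of the decision rule as stated in this thread, not lljson's actual code: `json_kind` is a hypothetical helper, the metatable field is spelled `__jsonhint` (the name used in the PR description after the rename from `__jsontype`), and treating sparse integer keys as an object is a simplification made only to keep the example short.

```lua
-- Stand-in decision rule, NOT lljson's implementation.
-- Returns "array" or "object" for a table, consulting an optional
-- __jsonhint field on the table's metatable.
local function json_kind(t)
    local mt = getmetatable(t)
    local hint = mt and rawget(mt, "__jsonhint")

    local count, max_index = 0, 0
    for k in pairs(t) do
        -- Any key that is not a positive integer rules out an array,
        -- no matter what the hint says.
        if type(k) ~= "number" or k < 1 or k % 1 ~= 0 then
            return "object"
        end
        count = count + 1
        if k > max_index then
            max_index = k
        end
    end

    -- Only integer keys (or no keys) remain, so the hint can decide.
    if hint == "object" then
        return "object"
    end
    if count == 0 then
        return "array" -- default for an unhinted empty table
    end
    -- Simplification for this sketch: dense 1..n keys form an array,
    -- anything sparse falls back to an object.
    if count == max_index then
        return "array"
    end
    return "object"
end

local array_hint = { __jsonhint = "array" }
local object_hint = { __jsonhint = "object" }

-- An "array" that gained a string key is unambiguously an object.
print(json_kind(setmetatable({ "a", "b", extra = true }, array_hint)))
-- An "object" that gained integer keys stays an object.
print(json_kind(setmetatable({ [1] = "x" }, object_hint)))
```

The two printed cases correspond to the scenarios discussed above: a hinted array that picked up non-integer keys, and a hinted object that picked up integer keys.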

> It's also useful to know in the reviver if a table comes from array or object. I don't think there is any other way to know this. Could we have a ctx.jsontype: string in the reviver?

As @tapple mentioned, this is achievable with a metatable check now that the reviver receives metatable'd tables.
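A metatable check of that kind might look like the sketch below. The `array_mt` and `object_mt` tables here are local stand-ins so the snippet runs in plain Lua; inside SLua you would compare against the metatables the library itself attaches, and `decoded_kind` is a hypothetical helper name, not part of lljson.

```lua
-- Local stand-ins for the metatables that decoding attaches (assumption:
-- the decoder uses two distinct, comparable metatable objects).
local array_mt = { __jsonhint = "array" }
local object_mt = { __jsonhint = "object" }

-- Hypothetical helper for use inside a reviver: reports whether a decoded
-- value started life as a JSON array or a JSON object.
local function decoded_kind(value)
    if type(value) ~= "table" then
        return nil
    end
    local mt = getmetatable(value)
    if mt == array_mt then
        return "array"
    elseif mt == object_mt then
        return "object"
    end
    return nil -- metatable was replaced or never set
end

-- Simulated decoder output for '{"items":[],"meta":{}}'.
local decoded = setmetatable({
    items = setmetatable({}, array_mt),
    meta = setmetatable({}, object_mt),
}, object_mt)

print(decoded_kind(decoded.items), decoded_kind(decoded.meta))
```

Because the two empty tables carry different metatables, the distinction survives even though neither has any keys.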

You can just use __tojson() for doing these kinds of things now, and
it's very strange to only consult `__index` in the array case. Barely
even works, and was just inherited from cjson. Let's ditch it.