Skip to content

Update msgpack#46

Merged
puredanger merged 1 commit intomasterfrom
msgpack
Feb 26, 2026
Merged

Update msgpack#46
puredanger merged 1 commit intomasterfrom
msgpack

Conversation

@JarrodCTaylor
Copy link
Contributor

@JarrodCTaylor JarrodCTaylor commented Feb 23, 2026

MsgpackEmitter.java

  • The old API had writeArrayEnd() / writeMapEnd() that the emitter called. msgpack-core has no end markers, so emitArrayEnd() and emitMapEnd() are now empty bodies. They're required by the Emitter interface which the JSON emitter still needs.

MsgpackParser.java

  • Calls to readMapEnd(true) and readArrayEnd() were removed entirely since the new API has no equivalent.
  • parseLong() has been simplified so overflow now throws MessageIntegerOverflowException instead of the accidental MessageTypeException. It was accidental because a BigInteger result was created but never returned.

@puredanger
Copy link
Contributor

What is the binary compatibility story between versions? Have you run any tests with deserializing data written by the old version in the new version and vice-versa?

@JarrodCTaylor
Copy link
Contributor Author

For deserializing I tested the following in a repl by writing to a file and reading back going from old -> new and from new -> old. Everything deserialized correctly going both directions between the previous java-transit version and the code form this PR.

    (def test-data
      {:nil-val nil
       :true-val true
       :false-val false
       :zero 0
       :small-int 42
       :neg-int -7
       :large-long 999999999999999
       :float-val 3.14
       :double-val Double/MAX_VALUE
       :nan-val ##NaN
       :inf-val ##Inf
       :neg-inf-val ##-Inf
       :string-short "hello"
       :string-long "this string is longer than thirty one bytes so it needs more text than the other string"
       :keyword :foo/bar
       :symbol 'baz/quux
       :char-val \x
       :uuid-val (UUID/fromString "550e8400-e29b-41d4-a716-446655440000")
       :uri-val (URI. "https://foo.com")
       :date-val (Date. 1396909037000)
       :bigint-val (bigint 99999999999999999999)
       :bigdec-val (bigdec "123.456789012345678901234567890")
       :bytes-val (byte-array [0 1 2 127 -128 -1])
       :vector-val [1 "2" :three]
       :list-val '(4 5 6)
       :set-val #{:a :b :c}
       :nested-map {:outer {:inner [1 2 3]}}
       :ratio-val 1/3})

return cache.cacheRead(mp.readValue().asRawValue().getString(), asMapKey, this);
case STRING:
return cache.cacheRead(mp.unpackString(), asMapKey, this);
case BINARY: {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should do this unless we also change the emitter to emit binary.

@puredanger
Copy link
Contributor

Before merging, I think we should pull the binary parser unless we have some reason to believe something could have made that. And we should expand the round-trip coverage in TransitMPTest to cover more cases. Ideally this would be generative but maybe that's better covered in transit-clj land.

@puredanger
Copy link
Contributor

In a separate commit we should remove those emitInteger long test branches - those are always false.

@JarrodCTaylor
Copy link
Contributor Author

This demonstrates the reason I think we might include the BINARY case. In the previous java-transit 1.1.389 if you do the following you will get a string "from binary". Without the BINARY case using the msgpack-core dep you will get nil. Including the BINARY case maintains the return string behavior.

  (let [out (ByteArrayOutputStream.)
        packer (MessagePack/newDefaultPacker out)
        bytes (.getBytes "from binary" "UTF-8")]
    (.packBinaryHeader packer (alength bytes))
    (.writePayload packer bytes)
    (.flush packer)
    (let [in  (ByteArrayInputStream. (.toByteArray out))
          rdr (t/reader in :msgpack)]
      (prn (t/read rdr))))

@puredanger
Copy link
Contributor

(History on this binary msgpack stuff: cognitect/transit-format#2)

@puredanger
Copy link
Contributor

I'm not concerned about hand-built msgpack streams, only about the parser being able to read what the emitter emits.

My worry about adding support for parsing BINARY as an encoded string is that we are cutting off more efficient options that we may want to use in the future (using binary directly). To really support this, we may need to add a new format, or possibly it can be added incrementally to the existing msgpack format.

So, let's remove that BINARY parser for now.

@JarrodCTaylor
Copy link
Contributor Author

Concerns about the future usage of binary make sense. The BINARY case has been removed.

@puredanger puredanger merged commit 7d7bd09 into master Feb 26, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants