Extending the built-in type set with (a) tagged union(s) isn't supported? #140

@goodboy

My use case: handling an IPC stream of arbitrary object messages, specifically with msgpack. I want to use Structs to serialize custom objects that get passed across memory boundaries.


My presumptions were originally:

  • a top level decoder is used to process msgpack data over an IPC channel
  • by default, I'd expect that decoder to decode into the built-in Python type set, so that it accepts arbitrary msgpack bytes as well as tagged msgspec.Structs
  • if a custom tagged struct is placed inside some std Python type (aka embedded), I'd expect this decoder (if enabled as such) to detect the tag field (say {"type": "CustomStruct", "field0": "blah"}), recognize that the embedded msgpack object is one of our custom tagged structs, and decode it as a CustomStruct.

Conclusions

Based on the thread below:

  • you can't easily combine the full std type set and a custom tagged struct in a single Union
  • Decoder(Any | Struct) won't work, even for Structs at the top level of the msgpack frame

This took me a (little) while to figure out because the docs don't have an example for this use case. If you want a Decoder that handles a Union of tagged structs and still processes the standard built-in types, you need to spell out the subset of the std types that doesn't conflict with Struct, as per @jcrist's comment in the section starting with:

This is not possible, for the same reason as presented above. msgspec forbids ambiguity.

So Decoder(Any | MyStructType) will not work.

I had to dig into the source code to figure this out, so it's probably worth documenting this case for users?


Alternative solutions:

It seems there is no built-in way to decode an arbitrary encoded stream into the default type set while also decoding embedded tagged Struct types.

But you can do a couple of other things inside custom codec routines to try to accomplish this:

  • create a custom boxed Any struct type, as per @jcrist's comment under the section starting with:

    Knowing nothing about what you're actually trying to achieve here, why not just define an extra Struct type in the union that can wrap Any.

  • consider creating a top-level boxing Msg type, then using msgspec.Raw and a custom decoder table to decode the payload msgpack data, as in my example below
