This proposal specifies the IDL union encoding in the Fory xlang serialization format using three union type IDs:
- UNION (31): union value without embedded union schema identity (schema known from context)
- TYPED_UNION (32): union value with embedded registered numeric type id
- NAMED_UNION (33): union value with embedded registered type name / shared typedef
This design moves union schema identity into Type Meta, making union consistent with STRUCT/ENUM/EXT patterns and
natural to carry inside Any.
1. IDL Syntax
1.1 Union definition
union Contact [id=0] {
    string email = 1;
    int32 phone = 2;
}
Rules:
- Each union alternative MUST have a stable tag number (= 1, = 2, ...).
- Tag numbers MUST be unique within the union.
- Tag numbers SHOULD follow protobuf evolution rules: do not reuse removed tag numbers.
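The tag rules above can be checked mechanically. The following is a minimal sketch, not part of any Fory tooling: `validate_union_tags` and its `retired_tags` parameter are hypothetical names, and for illustration it treats the SHOULD rule about removed tags as a hard error.

```python
def validate_union_tags(alternatives, retired_tags=frozenset()):
    """Check the tag rules for one union definition.

    alternatives: list of (name, tag) pairs, e.g. [("email", 1), ("phone", 2)].
    retired_tags: tags of alternatives removed in earlier schema versions.
    Returns a dict mapping tag -> alternative name.
    """
    seen = {}
    for name, tag in alternatives:
        if tag in seen:
            # Tag numbers MUST be unique within the union.
            raise ValueError(f"tag {tag} used by both {seen[tag]!r} and {name!r}")
        if tag in retired_tags:
            # Removed tag numbers should not be reused (treated as an error here).
            raise ValueError(f"tag {tag} was removed earlier and is reused by {name!r}")
        seen[tag] = name
    return seen


contact = validate_union_tags([("email", 1), ("phone", 2)])
```

A schema compiler could run a check like this when loading a union definition, with `retired_tags` taken from whatever record of removed alternatives the project keeps.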
1.2 Union usage
message Person [id=1] {
    Contact contact = 1;
}
2. Mapping from Other IDLs
2.1 Protobuf oneof → Fory union
Protobuf:
message Person {
    oneof contact {
        string email = 1;
        int32 phone = 2;
    }
}
Mapping:
- The oneof group maps to a Fory union.
- Each oneof field number becomes the union alternative tag number (case_id).
2.2 FlatBuffers union → Fory union
FlatBuffers:
union Equipment { Weapon, Monster }
Mapping example:
union Equipment {
    Weapon weapon = 0;
    Monster monster = 1;
}
- FlatBuffers discriminator values map to union alternative tag numbers (case_id).
3. Type IDs
3.1 Internal Type ID Table Updates
| Type ID | Name | Description |
| 31 | UNION | Union value, schema identity is NOT embedded (context required). |
| 32 | TYPED_UNION | Union value with embedded registered numeric union type id. |
| 33 | NAMED_UNION | Union value with embedded union type name / shared typedef. |
3.2 Type Meta Encoding
All type IDs are written as varuint32 as per the xlang Type Meta rules.
- UNION (31): no additional type meta payload
- TYPED_UNION (32): followed by union_type_id (varuint32)
- NAMED_UNION (33): followed by named-type meta payload:
  - meta share disabled: namespace + type_name (meta strings)
  - meta share enabled: shared TypeDef marker + TypeDef body (per xlang meta share)
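As an illustration of the type meta layout for the first two cases, here is a minimal sketch. It assumes varuint32 is the usual 7-bits-per-byte, low-bits-first encoding with a continuation bit; the function names are hypothetical and the NAMED_UNION payload (meta strings or TypeDef) is left out because its encoding is defined elsewhere.

```python
from typing import Optional

UNION, TYPED_UNION, NAMED_UNION = 31, 32, 33


def write_varuint32(value: int) -> bytes:
    """Encode an unsigned 32-bit integer, 7 bits per byte, low bits first (assumed layout)."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value out of range for varuint32")
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)  # set the continuation bit
        value >>= 7
    out.append(value)
    return bytes(out)


def write_union_type_meta(type_id: int, union_type_id: Optional[int] = None) -> bytes:
    """Write the type meta for UNION or TYPED_UNION (hypothetical helper)."""
    if type_id == UNION:
        # No additional payload after the type id.
        return write_varuint32(UNION)
    if type_id == TYPED_UNION:
        if union_type_id is None:
            raise ValueError("TYPED_UNION requires union_type_id")
        return write_varuint32(TYPED_UNION) + write_varuint32(union_type_id)
    raise NotImplementedError("NAMED_UNION payload encoding is not sketched here")


plain = write_union_type_meta(UNION)
typed = write_union_type_meta(TYPED_UNION, 300)
```

Under these assumptions `plain` is a single byte and `typed` is three bytes, which shows the size difference that motivates using UNION when the schema is known from context.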
Notes:
- union_type_id uses the standard Full Type ID rule:
  Full Type ID = (user_type_id << 8) | internal_type_id
- How union schemas are registered/mapped is implementation-defined, but the numeric union_type_id MUST be stable between producer and consumer when TYPED_UNION is used.
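A worked example of the Full Type ID arithmetic, as a sketch. The helper names are hypothetical, and using TYPED_UNION (32) as the internal_type_id in the example is an assumption; the proposal does not state which internal id is combined with the user id.

```python
def full_type_id(user_type_id: int, internal_type_id: int) -> int:
    """Combine a user type id with an 8-bit internal type id."""
    if not 0 <= internal_type_id <= 0xFF:
        raise ValueError("internal_type_id must fit in 8 bits")
    if user_type_id < 0:
        raise ValueError("user_type_id must be non-negative")
    return (user_type_id << 8) | internal_type_id


def split_full_type_id(full_id: int):
    """Recover (user_type_id, internal_type_id) from a full type id."""
    return full_id >> 8, full_id & 0xFF


# Assumed example: user id 5 combined with internal id 32.
# 5 << 8 = 1280, and 1280 | 32 = 1312.
example = full_type_id(5, 32)
```

Because the low 8 bits always hold the internal id, a reader can recover both parts with a shift and a mask, as `split_full_type_id` does.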
4. Union Value Payload Encoding
A union value payload is encoded as:
| case_id (varuint32) | case_value (encoded as Any-style value) |
4.1 case_id
- case_id is the union alternative tag number from the FDL/protobuf/FlatBuffers mapping.
- It is encoded as varuint32.
- case_id MUST be stable for evolution and MUST NOT be reused for a different alternative.
4.2 case_value (MUST be encoded as Any-style value)
To guarantee that unknown alternatives can be skipped, case_value MUST be encoded as a full Fory value,
equivalent to encoding the value as if it were stored in Any/UNKNOWN:
| field_ref_meta | field_value_type_meta | field_value_bytes |
Where:
- field_ref_meta is the standard reference meta (NULL/REF/NOT_NULL/REF_VALUE).
- field_value_type_meta is the standard xlang Type Meta (a varuint32 type_id plus optional meta payload).
- field_value_bytes is the value bytes encoded according to field_value_type_meta.
This is required even for primitives (e.g., INT32, STRING) to ensure skipping is always possible.
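The payload layout can be sketched as follows. This is an illustration under stated assumptions, not the Fory implementation: the reference flag values (-3 for NULL, -1 for NOT_NULL) and the one-byte signed flag encoding are assumed, type id 4 standing for a 4-byte little-endian int32 is a placeholder, and `encode_union_payload` is a hypothetical helper that takes already-encoded value bytes.

```python
import struct
from typing import Optional

NULL_FLAG = -3            # assumed value of the NULL reference flag
NOT_NULL_VALUE_FLAG = -1  # assumed value of the NOT_NULL reference flag


def write_varuint32(value: int) -> bytes:
    """7 bits per byte, low bits first, continuation bit set on all but the last byte."""
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def encode_union_payload(case_id: int,
                         value_type_id: Optional[int] = None,
                         value_bytes: bytes = b"") -> bytes:
    """Build | case_id | field_ref_meta | field_value_type_meta | field_value_bytes |.

    Passing value_type_id=None encodes a null case value (ref meta only).
    """
    out = bytearray(write_varuint32(case_id))
    if value_type_id is None:
        out += struct.pack("b", NULL_FLAG)
        return bytes(out)
    out += struct.pack("b", NOT_NULL_VALUE_FLAG)
    out += write_varuint32(value_type_id)  # type meta, even for primitives
    out += value_bytes
    return bytes(out)


HYPOTHETICAL_INT32 = 4  # placeholder type id, not taken from the proposal
phone = encode_union_payload(2, HYPOTHETICAL_INT32, struct.pack("<i", 7))
empty = encode_union_payload(1)
```

The point of the sketch is that the type meta is written even for a primitive, so a reader that does not recognize case_id 2 still learns how many bytes follow.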
5. Full Wire Layout Examples
5.1 UNION (schema known from context)
Used when the deserializer already knows the union schema (e.g., the field is declared as a specific union type):
| ... outer ref meta ... | type_id=UNION(31) | case_id | case_value(any-style) |
5.2 TYPED_UNION (schema embedded by numeric id)
Used when union schema is not known from context, e.g., union is stored in Any:
| ... outer ref meta ... | type_id=TYPED_UNION(32) | union_type_id | case_id | case_value(any-style) |
5.3 NAMED_UNION (schema embedded by name/typedef)
Used when union schema is resolved by name or via meta share TypeDef:
| ... outer ref meta ... | type_id=NAMED_UNION(33) | (namespace,type_name OR typedef marker) | case_id | case_value(any-style) |
6. Decoding Rules
6.1 High-level decoding algorithm
- Read outer ref meta (per standard rules).
- Read type_id as varuint32.
- If type_id == UNION(31):
  - Union schema MUST be provided by context (declared field type / target type).
- If type_id == TYPED_UNION(32):
  - Read union_type_id (varuint32) and resolve the union schema from the registry.
- If type_id == NAMED_UNION(33):
  - Read named-type meta (name strings or TypeDef marker) and resolve the union schema.
- Read case_id (varuint32).
- Read case_value as Any-style value:
  - Read field_ref_meta
  - If non-null and not a reference:
    - Read field_value_type_meta
    - Read/construct the value using that type meta
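The steps above can be sketched as a small decoder. Everything here is illustrative: the `Reader` class, `decode_union`, the `registry` and `value_readers` dictionaries, the reference flag values, and type id 4 as a 4-byte little-endian int32 are assumptions. NAMED_UNION resolution and reference tracking are omitted.

```python
import struct

UNION, TYPED_UNION, NAMED_UNION = 31, 32, 33
NULL_FLAG, REF_FLAG = -3, -2  # assumed reference flag values


class Reader:
    """Minimal cursor over a bytes object (hypothetical helper)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0

    def read_int8(self) -> int:
        (value,) = struct.unpack_from("b", self.data, self.pos)
        self.pos += 1
        return value

    def read_varuint32(self) -> int:
        result = shift = 0
        while True:
            byte = self.data[self.pos]
            self.pos += 1
            result |= (byte & 0x7F) << shift
            if byte < 0x80:
                return result
            shift += 7

    def read_bytes(self, n: int) -> bytes:
        chunk = self.data[self.pos:self.pos + n]
        if len(chunk) != n:
            raise EOFError("unexpected end of buffer")
        self.pos += n
        return chunk


def decode_union(reader, value_readers, context_schema=None, registry=None):
    """Return (schema, case_id, value), or None for a null union."""
    outer_flag = reader.read_int8()
    if outer_flag == NULL_FLAG:
        return None
    if outer_flag == REF_FLAG:
        raise NotImplementedError("reference resolution is omitted in this sketch")
    type_id = reader.read_varuint32()
    if type_id == UNION:
        if context_schema is None:
            raise ValueError("UNION(31) requires a schema from context")
        schema = context_schema
    elif type_id == TYPED_UNION:
        schema = registry[reader.read_varuint32()]
    elif type_id == NAMED_UNION:
        raise NotImplementedError("named-type meta is omitted in this sketch")
    else:
        raise ValueError(f"not a union type id: {type_id}")
    case_id = reader.read_varuint32()
    flag = reader.read_int8()
    if flag == NULL_FLAG:
        return schema, case_id, None
    if flag == REF_FLAG:
        raise NotImplementedError("reference resolution is omitted in this sketch")
    value_type_id = reader.read_varuint32()
    return schema, case_id, value_readers[value_type_id](reader)


# Placeholder: type id 4 read as a 4-byte little-endian int32.
value_readers = {4: lambda r: struct.unpack("<i", r.read_bytes(4))[0]}
wire = bytes([0xFF, UNION, 2, 0xFF, 4]) + struct.pack("<i", 42)
decoded = decode_union(Reader(wire), value_readers, context_schema="Contact")
```

Here the schema is represented by a plain string for brevity; a real implementation would return a schema object and map case_id to the matching alternative.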
6.2 Unknown case_id handling (forward compatibility)
If the resolved union schema does not contain case_id, the decoder MUST still consume the case value:
- Read field_ref_meta.
- If non-null and not a ref:
  - Read field_value_type_meta
  - Call standard skipValue(field_value_type_meta.type_id) to skip field_value_bytes.
This guarantees that adding new union alternatives is forward compatible.
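A minimal sketch of the skip path for an unknown case_id follows. The `SKIPPERS` table plays the role of skipValue and is hypothetical, as are the flag values and the use of type id 4 for a 4-byte int32; handling of the REF flag is omitted.

```python
import struct

NULL_FLAG, REF_FLAG = -3, -2  # assumed reference flag values
HYPOTHETICAL_INT32 = 4        # placeholder type id for a 4-byte value


def read_varuint32(buf: bytes, pos: int):
    """Return (value, new_pos) for a 7-bits-per-byte varuint32."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte < 0x80:
            return result, pos
        shift += 7


# Stand-in for skipValue: type id -> function(buf, pos) returning the new position.
SKIPPERS = {HYPOTHETICAL_INT32: lambda buf, pos: pos + 4}


def skip_case_value(buf: bytes, pos: int) -> int:
    """Consume one Any-style case value starting at pos and return the next position."""
    (flag,) = struct.unpack_from("b", buf, pos)
    pos += 1
    if flag == NULL_FLAG:
        return pos  # nothing follows a null flag
    if flag == REF_FLAG:
        raise NotImplementedError("skipping references is omitted in this sketch")
    type_id, pos = read_varuint32(buf, pos)
    skipper = SKIPPERS.get(type_id)
    if skipper is None:
        raise ValueError(f"no skipper registered for type id {type_id}")
    return skipper(buf, pos)


# An unknown alternative carrying an int32, followed by one unrelated byte.
buf = bytes([0xFF, HYPOTHETICAL_INT32]) + struct.pack("<i", 9) + b"\xAA"
next_pos = skip_case_value(buf, 0)
```

After the call, `next_pos` points at the byte following the case value, so the reader can continue decoding the enclosing message without knowing the alternative's schema.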
7. When to Use UNION vs TYPED_UNION vs NAMED_UNION
- Use UNION (31) when the union schema is known from context:
  - struct fields declared as a union type
  - collections/maps with declared union element/value type
  - explicit deserialize<TUnion>()
- Use TYPED_UNION (32) when the union schema is not known from context and numeric registration is available:
  - union stored in Any
  - union stored in fully dynamic UNKNOWN fields
- Use NAMED_UNION (33) when numeric registration is not available or name-based resolution is preferred:
  - unregistered union schemas
  - cross-language name mapping
  - meta share environments using shared TypeDef
Implementations MAY choose to always write TYPED_UNION/NAMED_UNION for simplicity, but UNION is recommended where
context exists for smaller payloads.
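The selection guidance can be condensed into a small decision function. This is a sketch of one possible writer policy; the function and parameter names are hypothetical.

```python
UNION, TYPED_UNION, NAMED_UNION = 31, 32, 33


def choose_union_type_id(schema_known_from_context: bool,
                         has_numeric_registration: bool,
                         prefer_name: bool = False) -> int:
    """Pick the type id a writer emits for a union value."""
    if schema_known_from_context:
        # Smallest payload: no schema identity is embedded.
        return UNION
    if has_numeric_registration and not prefer_name:
        return TYPED_UNION
    # Unregistered schemas, name-based mapping, or shared TypeDef.
    return NAMED_UNION


declared_field = choose_union_type_id(True, True)
inside_any = choose_union_type_id(False, True)
unregistered = choose_union_type_id(False, False)
```

A writer that prefers simplicity over payload size could skip the first branch and always return TYPED_UNION or NAMED_UNION, which the proposal permits.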
8. Compatibility and Evolution Notes
- case_id MUST be treated as a stable identifier (like a protobuf field number).
- Adding a new alternative is forward compatible:
  - old readers skip unknown case_id because case values carry standard type meta.
- Removing an alternative is backward compatible if:
  - removed case_id is not reused
  - readers treat unknown alternatives as "present but ignored".
9. Summary
- Introduce three internal union type IDs: UNION(31), TYPED_UNION(32), NAMED_UNION(33).
- Union schema identity is carried in Type Meta for typed/named unions, consistent with other user-defined types.
- Union payload is always: case_id(varuint32) + case_value encoded as Any-style value (ref meta + type meta + value).
- Unknown union alternatives can always be skipped safely.
Related Issues
#3027