
feat(java): enhance java unsigned int/array type system#3190

Merged
chaokunyang merged 11 commits into apache:main from chaokunyang:refine_array_type_system
Jan 22, 2026

@chaokunyang (Collaborator) commented Jan 22, 2026

Why?

Java lacks native unsigned integer types and efficient primitive collections, forcing developers to:

  • Use boxed wrappers (Integer, Long, etc.), which add allocation and pointer-indirection overhead
  • Manually handle unsigned arithmetic with signed types
  • Use ArrayList of boxed elements instead of primitive-backed lists
  • Write boilerplate conversion code between signed and unsigned representations

This leads to reduced performance and increased memory overhead in scenarios involving unsigned data and large collections of primitives.
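To make the "manual unsigned arithmetic" point concrete, here is a minimal sketch of what handling unsigned values with signed types looks like today using only JDK methods. The class name `ManualUnsignedDemo` and its helper methods are illustrative and are not part of this PR.

```java
// Illustrative only: shows the boilerplate needed without unsigned wrapper types.
final class ManualUnsignedDemo {
  // Reinterprets a signed byte as an unsigned value in [0, 255].
  static int u8(byte b) {
    return Byte.toUnsignedInt(b);
  }

  // Reinterprets a signed int as an unsigned value in [0, 2^32 - 1].
  static long u32(int i) {
    return Integer.toUnsignedLong(i);
  }

  public static void main(String[] args) {
    byte raw = (byte) 0xFF;
    System.out.println(raw);     // prints -1 (signed view)
    System.out.println(u8(raw)); // prints 255 (unsigned view)

    int big = 0xFFFFFFFF; // all 32 bits set; -1 when read as signed
    // Comparison and division must go through the *Unsigned helpers.
    System.out.println(Integer.compareUnsigned(big, 1) > 0); // prints true
    System.out.println(Integer.divideUnsigned(big, 2));      // prints 2147483647
    System.out.println(Integer.toUnsignedString(big));       // prints 4294967295
  }
}
```

Every call site has to remember which helper to use, which is the kind of boilerplate the wrapper types are meant to remove.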

What does this PR do?

This PR enhances Fory's Java type system with comprehensive unsigned integer support and primitive-backed collections:

Unsigned Integer Types

  • Adds wrapper classes Uint8, Uint16, Uint32, Uint64 with proper unsigned arithmetic
  • Provides specialized serializers for unsigned types with optimized dispatch
  • Implements unsigned-aware field serialization/deserialization in object codecs
  • Adds dedicated dispatch IDs (EXT_UINT8, EXT_UINT16, EXT_UINT32, EXT_VAR_UINT32, EXT_UINT64, EXT_VAR_UINT64)
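The `VAR` dispatch IDs above suggest a variable-length encoding for unsigned 32-bit and 64-bit values. The PR description does not show the wire format, so the following is only a sketch of a common LEB128-style scheme (7 payload bits per byte, high bit as a continuation flag), assumed here for illustration. The class `VarUint32Sketch` and its methods are hypothetical and are not Fory's API.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch of an unsigned varint codec; not Fory's actual implementation.
final class VarUint32Sketch {
  // Encodes the 32 bits of `value` as an unsigned varint (1 to 5 bytes).
  static byte[] encode(int value) {
    ByteArrayOutputStream out = new ByteArrayOutputStream(5);
    while ((value & ~0x7F) != 0) {
      out.write((value & 0x7F) | 0x80); // low 7 bits plus continuation flag
      value >>>= 7;                     // unsigned shift, so negative ints terminate
    }
    out.write(value); // final byte, continuation flag clear
    return out.toByteArray();
  }

  // Decodes a varint produced by encode back into its 32-bit pattern.
  static int decode(byte[] bytes) {
    int result = 0;
    int shift = 0;
    for (byte b : bytes) {
      result |= (b & 0x7F) << shift;
      if ((b & 0x80) == 0) {
        break; // last byte of this value
      }
      shift += 7;
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(encode(300).length);                          // prints 2
    System.out.println(Integer.toUnsignedString(decode(encode(-1)))); // prints 4294967295
  }
}
```

Small values take one byte, while values with the top bits set take five, which is why a fixed-width ID (such as EXT_UINT32) and a variable-width ID can both be useful.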

Primitive-Backed Collections

  • Implements primitive list types: BoolList, Int8List, Int16List, Int32List, Int64List, Float32List, Float64List
  • Implements unsigned list types: Uint8List, Uint16List, Uint32List, Uint64List
  • All lists support zero-copy array access, auto-resizing, and primitive overloads to avoid boxing
  • Registers specialized serializers for all primitive and unsigned list types

Type System Enhancements

  • Registers unsigned types with proper type IDs (Types.UINT8 through Types.UINT64)
  • Maps primitive list types to array type IDs (Types.BOOL_ARRAY, Types.INT8_ARRAY, etc.)
  • Updates FieldTypes to correctly handle unsigned type metadata during schema evolution
  • Extends field skipping logic to handle new unsigned dispatch IDs

Code Generation Support

  • Updates BaseObjectCodecBuilder to generate optimized serialization code for unsigned fields
  • Adds GraalVM native-image support for all new serializers and collection types

Related issues

#1017

Does this PR introduce any user-facing change?

Yes - This PR introduces new public APIs:

  • New Types: Uint8, Uint16, Uint32, Uint64 wrapper classes

  • New Collections: BoolList, Int8List, Int16List, Int32List, Int64List, Uint8List, Uint16List, Uint32List, Uint64List, Float32List, Float64List

  • New Serializers: UnsignedSerializers and PrimitiveListSerializers

  • Does this PR introduce any public API change?

  • Does this PR introduce any binary protocol compatibility change?

    • New dispatch IDs are added, but the existing protocol remains compatible

Benchmark

Performance benefits include:

  • Zero-copy access: Primitive lists provide direct array access without boxing overhead
  • Reduced memory footprint: Primitive collections use ~4-8x less memory than boxed equivalents
  • Optimized serialization: Dedicated dispatch IDs and specialized serializers for unsigned types
  • JIT-friendly code: Generated codecs for unsigned fields eliminate virtual dispatch

Specific benchmarks can be added using the test classes:

  • PrimitiveListsTest - validates all primitive list operations
  • UnsignedTest - validates unsigned arithmetic and conversions

@chaokunyang force-pushed the refine_array_type_system branch from be85542 to 6e77799 on January 22, 2026 07:13
@chaokunyang merged commit 95692b5 into apache:main on Jan 22, 2026
58 checks passed
@chaokunyang mentioned this pull request on Jan 22, 2026 (16 tasks)