Skip to content
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 8 additions & 7 deletions arrow-array/src/array/fixed_size_binary_array.rs
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,7 @@ pub struct FixedSizeBinaryArray {
value_data: Buffer,
nulls: Option<NullBuffer>,
len: usize,
value_length: i32,
value_length: i32, // validated to be a valid usize by `try_new`
}

impl FixedSizeBinaryArray {
Expand Down Expand Up @@ -169,12 +169,11 @@ impl FixedSizeBinaryArray {
self.len()
);
let offset = i + self.offset();
let value_length = self.value_length as usize; // validated in Self::try_new
let value_offset = offset.checked_mul(value_length).expect("offset overflow");
// SAFETY: value offset/length computed correctly using checked arithmetic
unsafe {
let pos = self.value_offset_at(offset);
std::slice::from_raw_parts(
self.value_data.as_ptr().offset(pos as isize),
(self.value_offset_at(offset + 1) - pos) as usize,
)
std::slice::from_raw_parts(self.value_data.as_ptr().add(value_offset), value_length)
}
}

Expand All @@ -186,7 +185,7 @@ impl FixedSizeBinaryArray {
/// # Safety
///
/// Caller is responsible for ensuring that the index is within the bounds
/// of the array
/// of the array and the resulting byte offset fits in `i32`
pub unsafe fn value_unchecked(&self, i: usize) -> &[u8] {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to add new methods that are alternatives to value_offset and value_offset_at that return usize, so we don't need this limitation? Or at least update this method so it doesn't use them and doesn't suffer from i32 overflow. Because with this change, value now handles when value_offset is greater than max i32, but value_unchecked still doesn't.

I would expect value_unchecked to work correctly for all cases where value doesn't panic.

And existing code that uses value_unchecked might validate the index but not be aware of this hidden safety requirement.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Short answer is yes. I also spent some more time reviewing the code in FixedSizeBinaryArray and I am now convinced there are several other miuses of i32 <-> usize . I am working on an improvement, though I worry it will be a larger PR

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let offset = i + self.offset();
let pos = self.value_offset_at(offset);
Expand Down Expand Up @@ -486,6 +485,8 @@ impl FixedSizeBinaryArray {
})
}

/// Returns the byte offset for the element at index `i` without
/// checking for overflow.
#[inline]
fn value_offset_at(&self, i: usize) -> i32 {
self.value_length * i as i32
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this arithmetic can overflow if the result is larger than i32::MAX

Expand Down
Loading