-
Notifications
You must be signed in to change notification settings - Fork 60
Description
Sorry if this is a duplicate, but I like the "keywords" I've showcased in this issue. Other related issues:
- Validity of pointers and references to memory not allocated by the compiler #285
- Could we have Rust without provenance? #287
- Storing an object as &Header, but reading the data past the end of the header #256 <- this is a pervasive instance of that.
- What about: "container_of"-style pointer arithmetic? #243
- Stacked Borrows: raw pointer usable only for
Ttoo strict? #134 <- canonical issue? - Provenance #52
Note that the quite loaded term "provenance" is being used here as described mainly in #134.
Unleaking
The stdlib libs docs currently state, regarding Box::leak:
Dropping the returned reference [return value of
Box::leak()] will cause a memory leak. If this is not acceptable, the reference should first be wrapped with theBox::from_rawfunction producing aBox. ThisBoxcan then be dropped which will properly destroyTand release the allocated memory.
So, even if there is no code snippet, such statement is stating that:
fn drop_box<T> (boxed: Box<T>)
{
drop(unsafe { Box::from_raw(Box::leak(boxed)) });
}is sound, no matter the alloc::Global backing it.
-
A far-fetched / contrived generalization to any
impl Allocatorfn drop_box_in<T, A : Allocator> (boxed: Box<T, A>) { let (mut ptr, alloc) = Box::into_raw_with_allocator(x); unsafe { ptr = &mut *ptr; drop(Box::from_raw_in(ptr, alloc)); } }
Which, given Box's implementation, is assuming that if somebody asks an impl GlobalAlloc —or an impl Alloc if generalizing— memory for a Layout::new::<T>() (through alloc or realloc), and gets back a non-null pointer ptr, then it is then legal to give back ptr to that impl Alloc's dealloc (or realloc), but with ptr's provenance having been "shrunk" down to that T's layout (e.g., through ptr = <*mut _>::cast(&mut *ptr.cast::<MaybeUninit<T>>());).
This, in practice, can be quite problematic for many (most?) GlobalAlloc implementations out there, since they do often perform some bookkeeping and whatnot laid out contiguously to the yielded allocation, and such metadata would thus not be accessible from such a returned pointer alone: the allocator would thus need to keep some extra data / state to be able to get back provenance over the user-facing allocation and the contiguous metadata.
A simplified example
use ::std::{
alloc::{self, GlobalAlloc as _, Layout},
mem::drop as stuff,
ptr,
};
unsafe trait AllocI32 {
fn alloc_i32() -> Option<ptr::NonNull<i32>>;
unsafe
fn dealloc(_: ptr::NonNull<i32>);
}
struct MyAllocUsingGlobalAllocUnderTheHood;
#[repr(C)]
struct I32AndMeta {
alloc: i32,
meta: u8,
}
static SYSTEM_ALLOC: alloc::System = alloc::System;
unsafe impl AllocI32 for MyAllocUsingGlobalAllocUnderTheHood {
fn alloc_i32() -> Option<ptr::NonNull<i32>> {
let layout = Layout::new::<I32AndMeta>();
ptr::NonNull::new(unsafe { SYSTEM_ALLOC.alloc_zeroed(layout) })
.map(ptr::NonNull::cast)
}
unsafe fn dealloc(ptr: ptr::NonNull<i32>) {
let layout = Layout::new::<I32AndMeta>();
let meta = ptr.as_ptr().cast::<u8>().add(4).read();
stuff(meta);
SYSTEM_ALLOC.dealloc(ptr.as_ptr().cast(), layout);
}
}The interesting lines here are:
unsafe fn dealloc(ptr: ptr::NonNull<i32>) {
let layout = Layout::new::<I32AndMeta>();
let meta = ptr.as_ptr().cast::<u8>().add(4).read();if ptr were to stem from &i32 (e.g., let r: &i32 = …; dealloc(r.into());), even if that &i32 had originated from an alloc()-yielded ptr (let r: &i32 = alloc().unwrap().as_ref();), then the operation reading meta would not be well-defined: r.add(4) would yield an off-by-one pointer, which would not be usable to perform a read-dereference with.
The two possible workarounds
In a world without any abstraction whatsoever, the answer to this problem is easy: keep a pointer with provenance over that allocated I32AndMeta around (such as the ptr returned by alloc itself), and use it to "launder" the received ptr. But since there is this Alloc / GlobalAlloc boundary, the question remains: who should be responsible for doing this?
-
Would it be the
Allocator, as in:unsafe fn dealloc(ptr: ptr::NonNull<i32>) { let layout = Layout::new::<I32AndMeta>(); let ptr = ptr.as_ptr().cast::<u8>(); + let ptr_with_provenance: *mut u8 = Self::get_ptr_with_provenance(…); + // amend `ptr` so that it points to the same place, but with a fixed provenance + let ptr = <*mut u8>::wrapping_offset( + ptr_with_provenance, + isize::wrapping_sub( + ptr.as_ptr() as _, + ptr_with_provenance as _, + ), + ); let meta = ptr.add(4).read(); // <- this read is now well-defined! stuff(meta); SYSTEM_ALLOC.dealloc(ptr.as_ptr().cast(), layout); } -
or would it be the user of the
Allocator, by declaring "unleaking" to be a contract violation / by requiring that a pointer with the originally-obtained provenance be the only valid input for a{de,re}alloccall?pub struct MutRefWithOriginalProvenance<'lt, T> { ptr: ptr::NonNull<T>, _phantom: PhantomData<&'lt mut T>, } impl<T> Deref{,Mut} for … { type Target = T; … } impl… MutRefWithOriginalProvenance… { pub fn into_raw… }
let mut r: MutRefWithOriginalProvenance<'static, i32> = Box::reversible_leak(Box::new(42)); *r += 27; unsafe { drop(Box::from_raw(r.into_raw())); }
This point ought to be clarified, and if going for the latter —or until confirming the former—, then the stdlib docs should be updated to actually disincentivize unleaking.
My potentially-obvious two cents
It feels like the "legalized unleaking" approach has the drawback of requiring that extra get_ptr_with_provenance(…) operation, which could come with a non-negligible cost for all GlobalAlloc implementations, only to allow a potentially deemed niche "unleaking" operation.
But it also feels like "forbidden unleaking" approach is quite a footgun, if, for instance, even the stdlib docs have gotten it wrong for such a long time.
So this seems like the classic "let's gauge/measure the performance benefits of 'forbidden unleaking' / the performance cost of 'legalized unleaking'" to compare them against the footguns that forbidding it introduces.
Finally, and this is technically beyond Rust's reach, there is also the question of non-Rust pervasive implementations of GlobalAlloc, such as that of libc (malloc, calloc, realloc, free) and whatnot. Such implementations do use metadata, and according to @chorman0773, the cost of a get_ptr_with_provenance(…) operation would be very much non-negligible (and, technically, even more so since Rust cannot go and tweak such an implementation, and would thus have to wrap it in a black-box API kind of fashion).
-
So, from the looks of it / IIUC, the only practical approach w.r.t. a legalized unleaking would be to ban
malloc& friends from being used forGlobalAllocimplementations! But I may very well be wrong; I'll let @chorman0773 (and others) chime in and clarify this hypothetical point (although if this were to be true, then I guess there is really no other choice than forbidding unleaking).- This, in turn, could be deemed inconvenient for FFI users which would have chosen a
malloc-powered#[global_allocator]to allowfree-ing to occur from the C side, but this is yet another topic…
- This, in turn, could be deemed inconvenient for FFI users which would have chosen a