Implement placement-in protocol for HashMap #40390
bors merged 1 commit into rust-lang:master from F001:placementHashMap
|
While this works technically, the implementation is not correct. The point of the placement-in protocol is to put value directly into some place, in this case into the To implement this you will likely need to do some internal changes to the Entry(-ies), so it would be possible to obtain a pointer for both |
|
Thank you for the review comment! cc @arthurprs Please correct me if anything is wrong. I used a temporary field to store the value because of panic safety. AFAIK, if the I'm looking forward to your suggestions. |
|
Your suggestion sounds ok to me. It will avoid unnecessary V copies for Entry::Vacant. To avoid any unnecessary V copies for Entry::Occupied you probably need a variant of robin_hood that will make space without copying the uninitialized V into the bucket. For rollback you can implement Drop for EntryPlace (drop still runs in case of panics) and use pop_internal to fix the table if it comes to that (forget what it returns). BinaryHeap uses a similar strategy to avoid corrupting the structure if a T comparison panics. |
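The rollback idea above can be sketched outside the real HashMap internals. This is only an illustration under stated assumptions: `Rollback` and `insert_with` are hypothetical names, and two plain `Vec`s stand in for the table. The key is reserved first, the value is produced by code that may panic, and the guard's `Drop` undoes the reservation if the value never arrives.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical guard: while it is alive, the last key in `keys` has no
/// matching value yet.
struct Rollback<'a> {
    keys: &'a mut Vec<u32>,
    committed: bool,
}

impl<'a> Drop for Rollback<'a> {
    fn drop(&mut self) {
        // Runs on normal scope exit and during unwinding alike.
        if !self.committed {
            // Undo the reservation; the popped key is simply discarded.
            self.keys.pop();
        }
    }
}

/// Reserves `key`, then builds the value with `make`, which may panic.
fn insert_with(
    keys: &mut Vec<u32>,
    values: &mut Vec<String>,
    key: u32,
    make: impl FnOnce() -> String,
) {
    keys.push(key);
    let mut guard = Rollback { keys, committed: false };
    let value = make(); // a panic here unwinds through `guard`
    values.push(value);
    guard.committed = true;
}

/// Inserts one entry normally and one whose value expression panics,
/// then reports how many keys and values remain.
fn demo() -> (usize, usize) {
    let mut keys = Vec::new();
    let mut values = Vec::new();
    insert_with(&mut keys, &mut values, 1, || "one".to_string());
    let result = panic::catch_unwind(AssertUnwindSafe(|| {
        insert_with(&mut keys, &mut values, 2, || panic!("boom"));
    }));
    assert!(result.is_err());
    (keys.len(), values.len())
}
```

After the panicking insert, `keys` and `values` have the same length again, so the structure is not left with a key that points at a missing value.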
|
Thanks for your suggestion. I have updated the implementation. For now, it avoids the unnecessary V copy for Entry::Vacant. I'll continue to investigate further optimizations. |
src/libstd/collections/hash/map.rs
Outdated
I suggest using a finalized flag instead. Also, the flag should probably be the last field as it may save 7 bytes of stack.
There was a problem hiding this comment.
You are unlikely to have more than one of these per hashmap alive at a time, so this is not very concerning. Also we’re getting field reordering soon, which will do this for everybody automatically.
There was a problem hiding this comment.
As suggested below, using forget can avoid the flag.
src/libstd/collections/hash/map.rs
Outdated
I'm probably missing something obvious but do we really need to wrap the bucket with Option?
There was a problem hiding this comment.
I was just being lazy: I wanted to reuse the existing FullBucket::take to remove the entry. It takes self by value, but the drop method only has &mut self, so the bucket field can't be moved out.
It is fixed by adding another FullBucket::remove method, which takes a &mut self parameter. In drop method, I can call this remove now.
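The two options discussed here can be compared in a small sketch. The types below are hypothetical stand-ins, not the real `FullBucket` or `EntryPlace`: one place wraps its bucket in `Option` so that a by-value `take` can be called from `drop`, the other relies on a `remove` method that only needs `&mut self`.

```rust
/// Hypothetical stand-in for a full bucket.
struct Bucket {
    slot: Option<String>,
}

impl Bucket {
    /// By-value removal: consumes the bucket, so it cannot be called on a
    /// field that is only reachable through `&mut self`.
    fn take(self) -> Option<String> {
        self.slot
    }

    /// By-reference removal: usable from inside `Drop::drop`.
    fn remove(&mut self) -> Option<String> {
        self.slot.take()
    }
}

/// Option 1: wrap the field in `Option` so it can be moved out in `drop`.
struct PlaceWithOption {
    bucket: Option<Bucket>,
}

impl Drop for PlaceWithOption {
    fn drop(&mut self) {
        if let Some(bucket) = self.bucket.take() {
            let _ = bucket.take();
        }
    }
}

/// Option 2: no wrapper; `drop` calls the `&mut self` method directly.
struct PlaceWithRemove {
    bucket: Bucket,
}

impl Drop for PlaceWithRemove {
    fn drop(&mut self) {
        let _ = self.bucket.remove();
    }
}
```

The second form avoids the extra `Option` discriminant and the `Some`/`None` check, at the cost of one more method on the bucket type.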
src/libstd/collections/hash/map.rs
Outdated
Thinking a bit more about this you can forget(self) here, avoiding the flag altogether.
There was a problem hiding this comment.
Good suggestion. Fixed.
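A minimal sketch of the `forget(self)` idea, using a hypothetical `Place` type and a counter in place of real table repair: the success path consumes the guard and skips its destructor, so `Drop` only ever runs for an abandoned place and no "finalized" flag is needed.

```rust
use std::cell::Cell;
use std::mem;

/// Hypothetical place guard; `rollbacks` counts how often cleanup ran.
struct Place<'a> {
    rollbacks: &'a Cell<u32>,
}

impl<'a> Drop for Place<'a> {
    fn drop(&mut self) {
        // Only reached when the place was abandoned (early exit or panic).
        self.rollbacks.set(self.rollbacks.get() + 1);
    }
}

impl<'a> Place<'a> {
    /// Success path: consume the guard without running `Drop`.
    fn finalize(self) {
        mem::forget(self);
    }
}

/// Finalizes one place and abandons another, then returns the number of
/// rollbacks that were recorded.
fn demo() -> u32 {
    let rollbacks = Cell::new(0);
    Place { rollbacks: &rollbacks }.finalize(); // no rollback recorded
    drop(Place { rollbacks: &rollbacks }); // rollback recorded
    rollbacks.get()
}
```

`mem::forget` also skips the destructors of the guard's fields, which is harmless here because the guard holds only a reference.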
nagisa
left a comment
Nice improvement over the previous version! @arthurprs’ notes seem very relevant (and they are also much more familiar with the HashMap code), so these should be fixed.
|
I realised there’s one possible alternative behaviour. The current implementation tries to recover the previous value if the placement expression fails; however, it is not obvious to me whether this is a better approach than, say, simply making the key vacant in case of panic. Here are some points in favour of leaving the entry vacant instead of restoring the value if a panic happens:
|
|
Very good points, leaving a previously filled bucket empty on panic sounds reasonable. |
|
cc @rust-lang/libs |
|
cc @rust-lang/libs, anyone have feedback on @nagisa's last comment? |
|
I agree that the precise state of the value being modified doesn't matter too much. |
src/libstd/collections/hash/table.rs
Outdated
This is possibly incorrect. I think you’ll notice why if you add a test that looks like this (you should probably add one similar to it):
struct Banana<'a>(&'a mut bool);
impl<'a> Drop for Banana<'a> {
    fn drop(&mut self) {
        if !*self.0 { panic!("double drop!"); }
        *self.0 = false;
    }
}
let mut hm = HashMap::new();
let mut can_drop = true;
hm.insert(0, Banana(&mut can_drop));
hm.entry(0) <- panic!("boom");
// The first drop happens in `make_place`, where the `Banana(true)` gets dropped and `can_drop` is set to false.
// Then `*place.pointer() = panic!("boom")` is executed, which unwinds, thus dropping the place.
// The place destructor drops the `Banana(false)`, so a double panic occurs and the process aborts.
//
// In other words, the current implementation of Drop reads uninitialized memory.
There was a problem hiding this comment.
Note that the code might not reproduce exactly the way I described it, but it is still reading uninitialized memory.
There was a problem hiding this comment.
Yeah! Good point. Fixed.
src/libstd/collections/hash/map.rs
Outdated
You can avoid doing this mem::uninitialized dance by simply doing a
std::ptr::drop_in_place(o.elem.bucket.read_mut().1)
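As a rough illustration of that suggestion, the sketch below destroys the old value where it sits and then writes the replacement over it, so no placeholder value has to be swapped in first. `Noisy` and `overwrite_in_place` are hypothetical names; only `ptr::drop_in_place` and `ptr::write` are real standard-library functions.

```rust
use std::cell::Cell;
use std::ptr;

/// Counts its own drops so the effect is observable.
struct Noisy<'a>(&'a Cell<u32>);

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Replaces `*slot` with `new`, dropping the old value in place.
///
/// # Safety
/// If `T`'s destructor panics, `*slot` is left logically uninitialized,
/// so the caller must make sure it is not dropped or read again.
unsafe fn overwrite_in_place<T>(slot: &mut T, new: T) {
    let p: *mut T = slot;
    unsafe {
        ptr::drop_in_place(p); // old value destroyed; `*p` is now stale bytes
        ptr::write(p, new); // re-initialize without dropping the stale bytes
    }
}

/// Returns the drop count right after the overwrite and after the final drop.
fn demo() -> (u32, u32) {
    let drops = Cell::new(0);
    let mut slot = Noisy(&drops);
    unsafe { overwrite_in_place(&mut slot, Noisy(&drops)) };
    let after_overwrite = drops.get();
    drop(slot);
    (after_overwrite, drops.get())
}
```

The old value is dropped exactly once by the overwrite and the new value exactly once at the end, which is the property the `Banana` test in this thread is meant to check.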
src/libstd/collections/hash/map.rs
Outdated
I think this will drop an uninitialized V as you only inserted the key?
There was a problem hiding this comment.
Yes, nagisa has mentioned this. I'm fixing it.
nagisa
left a comment
I’ve only got nits left. Marking the internal functions as unsafe makes sense, as they leave around uninitialized data which the caller should handle appropriately.
r=me once nits are fixed
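To illustrate why such helpers are better declared `unsafe fn`, here is a hedged sketch with a hypothetical one-entry `Slot` type. Its `put_key` only borrows the name from this review and is not the real table code: it stores the key and hands back a pointer to an uninitialized value, so the safe `get` is only sound if the caller keeps the documented promise.

```rust
use std::mem::MaybeUninit;
use std::ptr;

/// Hypothetical one-entry table. Invariant: `key.is_some()` implies that
/// `value` is initialized.
struct Slot<K, V> {
    key: Option<K>,
    value: MaybeUninit<V>,
}

impl<K, V> Slot<K, V> {
    fn new() -> Self {
        Slot { key: None, value: MaybeUninit::uninit() }
    }

    /// Stores `key` and returns a pointer to the still uninitialized value.
    ///
    /// # Safety
    /// The slot must be empty, and the caller must write a `V` through the
    /// returned pointer before the slot is read or dropped.
    unsafe fn put_key(&mut self, key: K) -> *mut V {
        self.key = Some(key);
        self.value.as_mut_ptr()
    }

    /// Safe to call only because `put_key` is unsafe and documents the invariant.
    fn get(&self) -> Option<(&K, &V)> {
        self.key.as_ref().map(|k| (k, unsafe { &*self.value.as_ptr() }))
    }
}

impl<K, V> Drop for Slot<K, V> {
    fn drop(&mut self) {
        if self.key.is_some() {
            // Relies on the invariant above: a key means a live value.
            unsafe { ptr::drop_in_place(self.value.as_mut_ptr()) };
        }
    }
}

/// Fills the slot in two steps and reads the entry back.
fn demo() -> Option<(u32, String)> {
    let mut slot: Slot<u32, String> = Slot::new();
    unsafe {
        let p = slot.put_key(7);
        p.write(String::from("seven"));
    }
    slot.get().map(|(k, v)| (*k, v.clone()))
}
```

If `put_key` were a safe function, safe code could leave the key set without a value and then trigger undefined behaviour through `get` or `Drop`; the `unsafe` marker moves that obligation to the caller explicitly.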
src/libstd/collections/hash/map.rs
Outdated
This can probably be factored out into a separate test. (i.e. a different #[test] function)
src/libstd/collections/hash/table.rs
Outdated
Similarly here, whole function unsafe.
src/libstd/collections/hash/table.rs
Outdated
I’d probably make this whole function unsafe. (i.e. pub unsafe fn put key)
src/libstd/collections/hash/map.rs
Outdated
This should probably be unsafe fn too.
|
@bors r+ |
|
Oh, bors didn’t notice the delegation above :/ |
|
@bors delegate=nagisa |
|
✌️ @nagisa can now approve this pull request |
|
@bors r+ |
|
📌 Commit 584c798 has been approved by |
Implement placement-in protocol for `HashMap` CC rust-lang#30172 r? @nagisa