Skip to content

Commit f846389

Browse files
committed
Auto merge of #150681 - meithecatte:always-discriminate, r=JonathanBrouwer,Nadrieril
Make operational semantics of pattern matching independent of crate and module The question of "when does matching an enum against a pattern of one of its variants read its discriminant" is currently an underspecified part of the language, causing weird behavior around borrowck, drop order, and UB. Of course, in the common cases, the discriminant must be read to distinguish the variant of the enum, but currently the following exceptions are implemented: 1. If the enum has only one variant, we currently skip the discriminant read. - This has the advantage that single-variant enums behave the same way as structs in this regard. - However, it means that if the discriminant exists in the layout, we can't say that this discriminant being invalid is UB. This makes me particularly uneasy in its interactions with niches – consider the following example ([playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2024&gist=5904a6155cbdd39af4a2e7b1d32a9b1a)), where miri currently doesn't detect any UB (because the semantics don't specify any): <details><summary>Example 1</summary> ```rust #![allow(dead_code)] use core::mem::{size_of, transmute}; #[repr(u8)] enum Inner { X(u8), } enum Outer { A(Inner), B(u8), } fn f(x: &Inner) { match x { Inner::X(v) => { println!("{v}"); } } } fn main() { assert_eq!(size_of::<Inner>(), 2); assert_eq!(size_of::<Outer>(), 2); let x = Outer::B(42); let y = &x; f(unsafe { transmute(y) }); } ``` </details> 2. For the purpose of the above, enums with marked with `#[non_exhaustive]` are always considered to have multiple variants when observed from foreign crates, but the actual number of variants is considered in the current crate. - This means that whether code has UB can depend on which crate it is in: #147722 - In another case of `#[non_exhaustive]` affecting the runtime semantics, its presence or absence can change what gets captured by a closure, and by extension, the drop order: #147722 (comment) - Also at the above link, there is an example where removing `#[non_exhaustive]` can cause borrowck to suddenly start failing in another crate. 3. Moreover, we currently make a more specific check: we only read the discriminant if there is more than one *inhabited* variant in the enum. - This means that the semantics can differ between `foo<!>`, and a copy of `foo` where `T` was manually replaced with `!`: #146803 - Moreover, due to the privacy rules for inhabitedness, it means that the semantics of code can depend on the *module* in which it is located. - Additionally, this inhabitedness rule is even uglier due to the fact that closure capture analysis needs to happen before we can determine whether types are uninhabited, which means that whether the discriminant read happens has a different answer specifically for capture analysis. - For the two above points, see the following example ([playground](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2024&gist=a07d8a3ec0b31953942e96e2130476d9)): <details><summary>Example 2</summary> ```rust #![allow(unused)] mod foo { enum Never {} struct PrivatelyUninhabited(Never); pub enum A { V(String, String), Y(PrivatelyUninhabited), } fn works(mut x: A) { let a = match x { A::V(ref mut a, _) => a, _ => unreachable!(), }; let b = match x { A::V(_, ref mut b) => b, _ => unreachable!(), }; a.len(); b.len(); } fn fails(mut x: A) { let mut f = || match x { A::V(ref mut a, _) => (), _ => unreachable!(), }; let mut g = || match x { A::V(_, ref mut b) => (), _ => unreachable!(), }; f(); g(); } } use foo::A; fn fails(mut x: A) { let a = match x { A::V(ref mut a, _) => a, _ => unreachable!(), }; let b = match x { A::V(_, ref mut b) => b, _ => unreachable!(), }; a.len(); b.len(); } fn fails2(mut x: A) { let mut f = || match x { A::V(ref mut a, _) => (), _ => unreachable!(), }; let mut g = || match x { A::V(_, ref mut b) => (), _ => unreachable!(), }; f(); g(); } ``` </details> In light of the above, and following the discussion at #138961 and #147722, this PR ~~makes it so that, operationally, matching on an enum *always* reads its discriminant.~~ introduces the following changes to this behavior: - matching on a `#[non_exhaustive]` enum will always introduce a discriminant read, regardless of whether the enum is from an external crate - uninhabited variants now count just like normal ones, and don't get skipped in the checks As per the discussion below, the resolution for point (1) above is that it should land as part of a separate PR, so that the subtler decision can be more carefully considered. Note that this is a breaking change, due to the aforementioned changes in borrow checking behavior, new UB (or at least UB newly detected by miri), as well as drop order around closure captures. However, it seems to me that the combination of this PR with #138961 should have smaller real-world impact than #138961 by itself. Fixes #142394 Fixes #146590 Fixes #146803 (though already marked as duplicate) Fixes parts of #147722 Fixes rust-lang/miri#4778 r? @Nadrieril @RalfJung @rustbot label +A-closures +A-patterns +T-opsem +T-lang
2 parents 5d04477 + af302a6 commit f846389

File tree

46 files changed

+925
-140
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

46 files changed

+925
-140
lines changed

‎compiler/rustc_hir_typeck/src/expr_use_visitor.rs‎

Lines changed: 7 additions & 22 deletions
Original file line numberDiff line numberDiff line change
@@ -819,14 +819,12 @@ impl<'tcx, Cx: TypeInformationCtxt<'tcx>, D: Delegate<'tcx>> ExprUseVisitor<'tcx
819819
/// The core driver for walking a pattern
820820
///
821821
/// This should mirror how pattern-matching gets lowered to MIR, as
822-
/// otherwise lowering will ICE when trying to resolve the upvars.
822+
/// otherwise said lowering will ICE when trying to resolve the upvars.
823823
///
824824
/// However, it is okay to approximate it here by doing *more* accesses than
825825
/// the actual MIR builder will, which is useful when some checks are too
826-
/// cumbersome to perform here. For example, if after typeck it becomes
827-
/// clear that only one variant of an enum is inhabited, and therefore a
828-
/// read of the discriminant is not necessary, `walk_pat` will have
829-
/// over-approximated the necessary upvar capture granularity.
826+
/// cumbersome to perform here, because e.g. they require more typeck results
827+
/// than available.
830828
///
831829
/// Do note that discrepancies like these do still create obscure corners
832830
/// in the semantics of the language, and should be avoided if possible.
@@ -1853,26 +1851,13 @@ impl<'tcx, Cx: TypeInformationCtxt<'tcx>, D: Delegate<'tcx>> ExprUseVisitor<'tcx
18531851
}
18541852

18551853
/// Checks whether a type has multiple variants, and therefore, whether a
1856-
/// read of the discriminant might be necessary. Note that the actual MIR
1857-
/// builder code does a more specific check, filtering out variants that
1858-
/// happen to be uninhabited.
1859-
///
1860-
/// Here, it is not practical to perform such a check, because inhabitedness
1861-
/// queries require typeck results, and typeck requires closure capture analysis.
1862-
///
1863-
/// Moreover, the language is moving towards uninhabited variants still semantically
1864-
/// causing a discriminant read, so we *shouldn't* perform any such check.
1865-
///
1866-
/// FIXME(never_patterns): update this comment once the aforementioned MIR builder
1867-
/// code is changed to be insensitive to inhhabitedness.
1854+
/// read of the discriminant might be necessary.
18681855
#[instrument(skip(self, span), level = "debug")]
18691856
fn is_multivariant_adt(&self, ty: Ty<'tcx>, span: Span) -> bool {
18701857
if let ty::Adt(def, _) = self.cx.structurally_resolve_type(span, ty).kind() {
1871-
// Note that if a non-exhaustive SingleVariant is defined in another crate, we need
1872-
// to assume that more cases will be added to the variant in the future. This mean
1873-
// that we should handle non-exhaustive SingleVariant the same way we would handle
1874-
// a MultiVariant.
1875-
def.variants().len() > 1 || def.variant_list_has_applicable_non_exhaustive()
1858+
// We treat non-exhaustive enums the same independent of the crate they are
1859+
// defined in, to avoid differences in the operational semantics between crates.
1860+
def.variants().len() > 1 || def.is_variant_list_non_exhaustive()
18761861
} else {
18771862
false
18781863
}

‎compiler/rustc_mir_build/src/builder/matches/match_pair.rs‎

Lines changed: 8 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -291,22 +291,18 @@ impl<'tcx> MatchPairTree<'tcx> {
291291
}
292292
}
293293

294-
PatKind::Variant { adt_def, variant_index, args, ref subpatterns } => {
294+
PatKind::Variant { adt_def, variant_index, args: _, ref subpatterns } => {
295295
let downcast_place = place_builder.downcast(adt_def, variant_index); // `(x as Variant)`
296296
cx.field_match_pairs(&mut subpairs, extra_data, downcast_place, subpatterns);
297297

298-
let irrefutable = adt_def.variants().iter_enumerated().all(|(i, v)| {
299-
i == variant_index
300-
|| !v.inhabited_predicate(cx.tcx, adt_def).instantiate(cx.tcx, args).apply(
301-
cx.tcx,
302-
cx.infcx.typing_env(cx.param_env),
303-
cx.def_id.into(),
304-
)
305-
}) && !adt_def.variant_list_has_applicable_non_exhaustive();
306-
if irrefutable {
307-
None
308-
} else {
298+
// We treat non-exhaustive enums the same independent of the crate they are
299+
// defined in, to avoid differences in the operational semantics between crates.
300+
let refutable =
301+
adt_def.variants().len() > 1 || adt_def.is_variant_list_non_exhaustive();
302+
if refutable {
309303
Some(TestableCase::Variant { adt_def, variant_index })
304+
} else {
305+
None
310306
}
311307
}
312308

Lines changed: 25 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,25 @@
1+
#![allow(deref_nullptr)]
2+
3+
enum Never {}
4+
5+
fn main() {
6+
unsafe {
7+
match *std::ptr::null::<Result<Never, Never>>() {
8+
//~^ ERROR: read discriminant of an uninhabited enum variant
9+
Ok(_) => {
10+
lol();
11+
}
12+
Err(_) => {
13+
wut();
14+
}
15+
}
16+
}
17+
}
18+
19+
fn lol() {
20+
println!("lol");
21+
}
22+
23+
fn wut() {
24+
println!("wut");
25+
}
Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
error: Undefined Behavior: read discriminant of an uninhabited enum variant
2+
--> tests/fail/match/all_variants_uninhabited.rs:LL:CC
3+
|
4+
LL | match *std::ptr::null::<Result<Never, Never>>() {
5+
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Undefined Behavior occurred here
6+
|
7+
= help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
8+
= help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
9+
10+
note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace
11+
12+
error: aborting due to 1 previous error
13+

src/tools/miri/tests/fail/closures/deref-in-pattern.rs renamed to src/tools/miri/tests/fail/match/closures/deref-in-pattern.rs

File renamed without changes.

src/tools/miri/tests/fail/closures/deref-in-pattern.stderr renamed to src/tools/miri/tests/fail/match/closures/deref-in-pattern.stderr

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
error: Undefined Behavior: constructing invalid value: encountered a dangling reference (use-after-free)
2-
--> tests/fail/closures/deref-in-pattern.rs:LL:CC
2+
--> tests/fail/match/closures/deref-in-pattern.rs:LL:CC
33
|
44
LL | let _ = || {
55
| _____________^

src/tools/miri/tests/fail/closures/partial-pattern.rs renamed to src/tools/miri/tests/fail/match/closures/partial-pattern.rs

File renamed without changes.

src/tools/miri/tests/fail/closures/partial-pattern.stderr renamed to src/tools/miri/tests/fail/match/closures/partial-pattern.stderr

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
error: Undefined Behavior: constructing invalid value: encountered a dangling reference (use-after-free)
2-
--> tests/fail/closures/partial-pattern.rs:LL:CC
2+
--> tests/fail/match/closures/partial-pattern.rs:LL:CC
33
|
44
LL | let _ = || {
55
| _____________^

src/tools/miri/tests/fail/closures/uninhabited-variant.rs renamed to src/tools/miri/tests/fail/match/closures/uninhabited-variant1.rs

File renamed without changes.

src/tools/miri/tests/fail/closures/uninhabited-variant.stderr renamed to src/tools/miri/tests/fail/match/closures/uninhabited-variant1.stderr

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
error: Undefined Behavior: read discriminant of an uninhabited enum variant
2-
--> tests/fail/closures/uninhabited-variant.rs:LL:CC
2+
--> tests/fail/match/closures/uninhabited-variant1.rs:LL:CC
33
|
44
LL | match r {
55
| ^ Undefined Behavior occurred here
@@ -8,9 +8,9 @@ LL | match r {
88
= help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
99
= note: stack backtrace:
1010
0: main::{closure#0}
11-
at tests/fail/closures/uninhabited-variant.rs:LL:CC
11+
at tests/fail/match/closures/uninhabited-variant1.rs:LL:CC
1212
1: main
13-
at tests/fail/closures/uninhabited-variant.rs:LL:CC
13+
at tests/fail/match/closures/uninhabited-variant1.rs:LL:CC
1414

1515
note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace
1616

0 commit comments

Comments
 (0)