No, the opposite. The hardware instruction has the IEEE-754-2008 behavior, and we insert canonicalizes to get the signaling-nan-as-quiet behavior (i.e., it is identical to the minimumnum expansion). The problem with the old intrinsic definition is that there’s no way to get at the underlying instruction behavior. So this is the same situation AArch64 is in; it’s just that AArch64 wasn’t (and isn’t) using the expansion to insert the canonicalizes, and so violated the old specification.
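Roughly, that expansion looks like this (a sketch; the function name is made up, and llvm.minnum here is taken to have the 2008-style instruction behavior):

```llvm
; Quieting the inputs first makes the 2008-style minnum return the same
; value minimumnum would for a signaling nan input.
define float @expand_minimumnum(float %x, float %y) {
  %qx = call float @llvm.canonicalize.f32(float %x) ; quiets an snan
  %qy = call float @llvm.canonicalize.f32(float %y)
  %r = call float @llvm.minnum.f32(float %qx, float %qy)
  ret float %r
}

declare float @llvm.canonicalize.f32(float)
declare float @llvm.minnum.f32(float, float)
```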
There aren’t any free bits for fast math flags; there’s an older RFC out scrounging for bits to add new ones. In this particular case, since these are intrinsics, you could maybe get away with adding nofpclass(snan) on the call site’s arguments. We don’t have a way to emit that from a source language, though. And if we’re asserting no signaling nans, that means a signaling nan input is poison, which isn’t what we want either.
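For example (a sketch, assuming a frontend could emit the attribute):

```llvm
; nofpclass(snan) on the call site's arguments asserts neither input is
; a signaling nan; an snan input would be poison.
define float @min_no_snan(float %x, float %y) {
  %r = call float @llvm.minnum.f32(float nofpclass(snan) %x, float nofpclass(snan) %y)
  ret float %r
}

declare float @llvm.minnum.f32(float, float)
```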
By constrained versions, do you mean an additional pair of intrinsics, not marked strictfp, with these semantics? Just punting this to “use strictfp” is part of the problematic status quo we’ve been in for years: it doesn’t address the production non-strictfp case, and strictfp is a largely stalled project. But this is more or less how I envisioned minnum/maxnum landing: they are there if you know the target supports them, but otherwise should be avoided.
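For reference, the constrained form we have today looks something like this (a sketch; if I have the details right, both the function and the call site need strictfp):

```llvm
; The existing strictfp path, which doesn't help the non-strictfp case.
define float @strict_min(float %x, float %y) strictfp {
  %r = call float @llvm.experimental.constrained.minnum.f32(float %x, float %y, metadata !"fpexcept.strict") strictfp
  ret float %r
}

declare float @llvm.experimental.constrained.minnum.f32(float, float, metadata)
```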
Yes
Not really. Turning signaling nan into poison is too strong. Users who just want the hardware instruction shouldn’t need to introduce UB on signaling nan.
If we wanted to insert nsnan or nofpclass(snan)s automatically, we’re still limited by the IR’s lack of guarantees about when quieting occurs, so there’s little to actually infer that from. An nsnan flag would be more useful in codegen, particularly if we strengthened the rules in SDAG/gMIR to mandate that canonicalizing operations actually canonicalize.
The main one would be proving the sign bit is 0, which can enable downstream no-signed-zeros assumptions. E.g., maxnum(x, 0.0) with fuzzy zero handling can’t be assumed to produce a 0 sign bit.
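Concretely (a sketch; the function name is made up):

```llvm
; If %x is -0.0 and the zeros compare as equal, the max may return
; either operand, so the result here can still be -0.0.
define float @clamp_nonneg(float %x) {
  %r = call float @llvm.maxnum.f32(float %x, float 0.0)
  ret float %r
}

declare float @llvm.maxnum.f32(float, float)
```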
It’s not required, but ordering the sign bit is better quality of implementation (and I’m not aware of any hardware implementation which doesn’t respect this). Since the possible codegen benefit is equally expressible with the nsz flag, strengthening the signed-zero handling is strictly more expressive.
nsz is orthogonal to any particular operation. You can equally attach nsz to either; it doesn’t make sense to distinguish these. Forcibly embedding nsz into the underlying instruction definition doesn’t buy any benefit.
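i.e., a sketch of attaching it at the call site:

```llvm
; nsz is a call-site fast-math flag, so it composes with either
; intrinsic without baking the relaxation into the operation itself.
define float @max_nsz(float %x, float %y) {
  %r = call nsz float @llvm.maxnum.f32(float %x, float %y)
  ret float %r
}

declare float @llvm.maxnum.f32(float, float)
```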
They’re not important, but that doesn’t mean we can just choose to miscompile a signaling nan this way. We’re allowed not to quiet, but returning an entirely different result is a big problem. All of the real-world users who do not care about signaling nans pay the cost of additional instructions to quiet them. We’ve gone a very long time with AArch64 and other targets not handling signaling nans as written, without any complaints that I’ve seen.
Signed zero is much, much more important than signaling nans.
Getting the correct value and getting the right floating-point exceptions are different concerns. minnum/maxnum shouldn’t require full exception support.