Hi,
Fortran 2018 introduced intrinsic functions for all the IEEE-754 comparison operations, compareQuiet* and compareSignaling*. I want to introduce those into the Fortran front-end, and make them emit the right code. But I cannot find the correspondence between IEEE-754 nomenclature and GCC internal representation.
I understand that the middle-end representation was mostly created with C in mind, so I assume that the correspondence is that used by the C standard. That helps me to some extent, as I can find draft documents that seem to list the following table (page 8 of https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1615.pdf):
compareQuietEqual                 ==
compareQuietNotEqual              !=
compareSignalingEqual             iseqsig
compareSignalingGreater           >
compareSignalingGreaterEqual      >=
compareSignalingLess              <
compareSignalingLessEqual         <=
compareSignalingNotEqual          !iseqsig
compareSignalingNotGreater        !(x>y)
compareSignalingLessUnordered     !(x>=y)
compareSignalingNotLess           !(x<y)
compareSignalingGreaterUnordered  !(x<=y)
compareQuietGreater               isgreater
compareQuietGreaterEqual          isgreaterequal
compareQuietLess                  isless
compareQuietLessEqual             islessequal
compareQuietUnordered             isunordered
compareQuietNotGreater            !isgreater
compareQuietLessUnordered         !isgreaterequal
compareQuietNotLess               !isless
compareQuietGreaterUnordered      !islessequal
compareQuietOrdered               !isunordered
I have two questions:
1. Is this list normative, and was it modified later (I have only found a 2012 draft)?
2. All the functions are available as GCC type-generic built-ins (yeah!), except there is no __builtin_iseqsig (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77928). Is there a fundamental problem with creating one, and could someone help there?
Thanks,
FX
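As a rough illustration of how the table reads in C, here is a minimal sketch of a few of the operations written as wrappers for double. The function names are hypothetical (they are not GCC or libc identifiers); the bodies use only the <math.h> comparison macros and the relational operators listed in the table. Whether a given compiler actually preserves the signaling behaviour of the operator forms depends on its options and optimizations.

```c
#include <math.h>

/* Hypothetical wrapper names, chosen to mirror the IEEE-754 operation names. */

/* Quiet comparisons: == and the <math.h> comparison macros. */
static int compare_quiet_equal(double x, double y)     { return x == y; }
static int compare_quiet_less(double x, double y)      { return isless(x, y) != 0; }
static int compare_quiet_not_less(double x, double y)  { return !isless(x, y); }
static int compare_quiet_unordered(double x, double y) { return isunordered(x, y) != 0; }

/* Signaling comparisons: the plain relational operators. */
static int compare_signaling_less(double x, double y)     { return x < y; }
static int compare_signaling_not_less(double x, double y) { return !(x < y); }
```

Note that the "Not" forms are true when an operand is a NaN, which is why they are not the same as swapping the operator.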
On Thu, Sep 01, 2022 at 10:04:58AM +0200, FX wrote:
> Fortran 2018 introduced intrinsic functions for all the IEEE-754 comparison operations, compareQuiet* and compareSignaling*. I want to introduce those into the Fortran front-end, and make them emit the right code. But I cannot find the correspondence between IEEE-754 nomenclature and GCC internal representation.
>
> I understand that the middle-end representation was mostly created with C in mind, so I assume that the correspondence is that used by the C standard. That helps me to some extent, as I can find draft documents that seem to list the following table (page 8 of https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1615.pdf):
>
> compareQuietEqual ==
> compareQuietNotEqual !=
> compareSignalingEqual iseqsig
> compareSignalingGreater >
> compareSignalingGreaterEqual >=
> compareSignalingLess <
> compareSignalingLessEqual <=
> compareSignalingNotEqual !iseqsig
> compareSignalingNotGreater !(x>y)
> compareSignalingLessUnordered !(x>=y)
> compareSignalingNotLess !(x<y)
> compareSignalingGreaterUnordered !(x<=y)
> compareQuietGreater isgreater
> compareQuietGreaterEqual isgreaterequal
> compareQuietLess isless
> compareQuietLessEqual islessequal
> compareQuietUnordered isunordered
> compareQuietNotGreater !isgreater
> compareQuietLessUnordered !isgreaterequal
> compareQuietNotLess !isless
> compareQuietGreaterUnordered !islessequal
> compareQuietOrdered !isunordered
>
>
> I have two questions:
>
> 1. Is this list normative, and was it modified later (I have only found a 2012 draft)?
>
> 2. All the functions are available as GCC type-generic built-ins (yeah!),
> except there is no __builtin_iseqsig
> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77928). Is there a
> fundamental problem with creating one, and could someone help there?
IMHO until that one is implemented you can just use
tx = x, ty = y, tx>=ty && tx<=ty
(in GENERIC just SAVE_EXPR<x> >= SAVE_EXPR<y> && SAVE_EXPR<x> <= SAVE_EXPR<y>).
The PowerPC backend is still broken, not just for that but for most of the other cases
above; it violates not just Fortran requirements, but C too.
Jakub
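The comma-expression fallback above can be sketched as a macro that evaluates each operand exactly once and then combines the two signaling comparisons. This is only an illustration: the macro name is hypothetical, and it relies on GNU C extensions (__typeof__ and statement expressions). The temporaries take the type of (x) + (y), i.e. the usual arithmetic conversions pick the common type.

```c
/* Hypothetical fallback for a missing iseqsig: x >= y && x <= y, with each
   operand evaluated once (GNU C extensions, not ISO C). */
#define ISEQSIG_FALLBACK(x, y)                      \
  __extension__ ({                                  \
    __typeof__ ((x) + (y)) tx_ = (x), ty_ = (y);    \
    tx_ >= ty_ && tx_ <= ty_;                       \
  })
```

The result is 1 only when both operands are ordered and equal; with a NaN operand both comparisons are false and the result is 0.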
Hi Jakub,
>> 2. All the functions are available as GCC type-generic built-ins (yeah!),
>> except there is no __builtin_iseqsig
>> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77928). Is there a
>> fundamental problem with creating one, and could someone help there?
>
> IMHO until that one is implemented you can just use
> tx = x, ty = y, tx>=ty && tx<=ty
> (in GENERIC just SAVE_EXPR<x> >= SAVE_EXPR<y> && SAVE_EXPR<x> <= SAVE_EXPR<y>
If it’s just that (optimization aside), I probably can create a C built-in. It would need to be:
1. defined in builtins.def
2. lowered in builtins.cc
3. type-checked in c-family/c-common.cc
4. documented in doc/extend.texi
5. tested in fp-test.cc
6. covered in the testsuite
Is that right?
Thanks,
FX
PS: I see that reclassify is not covered in fp-test.cc, is that file obsolete?
On Thu, Sep 01, 2022 at 11:04:03AM +0200, FX wrote:
> Hi Jakub,
>
> >> 2. All the functions are available as GCC type-generic built-ins (yeah!),
> >> except there is no __builtin_iseqsig
> >> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77928). Is there a
> >> fundamental problem with creating one, and could someone help there?
> >
> > IMHO until that one is implemented you can just use
> > tx = x, ty = y, tx>=ty && tx<=ty
> > (in GENERIC just SAVE_EXPR<x> >= SAVE_EXPR<y> && SAVE_EXPR<x> <= SAVE_EXPR<y>
>
> If it’s just that (optimization aside), I probably can create a C built-in. It would need to be:
>
> 1. defined in builtins.def
> 2. lowered in builtins.cc
> 3. type-checked in c-family/c-common.cc
> 4. documented in doc/extend.texi
> 5. tested in fp-test.cc
> 6. covered in the testsuite
>
> Is that right?
Dunno if we really need a builtin for this, especially if it is lowered
to that x >= y && x <= y early, will defer to Joseph.
Because if it is for better code generation only, IMNSHO we want to optimize
even when users write it that way by hand and so want to pattern recognize
that during instruction selection before expansion (isel pass) or during
expansion if target can do that.
E.g. x86 with AVX can do that. In the table of comparison predicates below, the 4 booleans are
the results for A>B, A<B, A=B and Unordered, and Yes/No is whether a signal is raised when one
or both of the operands are QNaN (it is always raised if at least one is SNaN):
Predicate         imm8  Description                                           A>B    A<B    A=B    Unord  Signals on QNaN
EQ_OQ (EQ)        0H    Equal (ordered, non-signaling)                        False  False  True   False  No
LT_OS (LT)        1H    Less-than (ordered, signaling)                        False  True   False  False  Yes
LE_OS (LE)        2H    Less-than-or-equal (ordered, signaling)               False  True   True   False  Yes
UNORD_Q (UNORD)   3H    Unordered (non-signaling)                             False  False  False  True   No
NEQ_UQ (NEQ)      4H    Not-equal (unordered, non-signaling)                  True   True   False  True   No
NLT_US (NLT)      5H    Not-less-than (unordered, signaling)                  True   False  True   True   Yes
NLE_US (NLE)      6H    Not-less-than-or-equal (unordered, signaling)         True   False  False  True   Yes
ORD_Q (ORD)       7H    Ordered (non-signaling)                               True   True   True   False  No
EQ_UQ             8H    Equal (unordered, non-signaling)                      False  False  True   True   No
NGE_US (NGE)      9H    Not-greater-than-or-equal (unordered, signaling)      False  True   False  True   Yes
NGT_US (NGT)      AH    Not-greater-than (unordered, signaling)               False  True   True   True   Yes
FALSE_OQ (FALSE)  BH    False (ordered, non-signaling)                        False  False  False  False  No
NEQ_OQ            CH    Not-equal (ordered, non-signaling)                    True   True   False  False  No
GE_OS (GE)        DH    Greater-than-or-equal (ordered, signaling)            True   False  True   False  Yes
GT_OS (GT)        EH    Greater-than (ordered, signaling)                     True   False  False  False  Yes
TRUE_UQ (TRUE)    FH    True (unordered, non-signaling)                       True   True   True   True   No
EQ_OS             10H   Equal (ordered, signaling)                            False  False  True   False  Yes
LT_OQ             11H   Less-than (ordered, non-signaling)                    False  True   False  False  No
LE_OQ             12H   Less-than-or-equal (ordered, non-signaling)           False  True   True   False  No
UNORD_S           13H   Unordered (signaling)                                 False  False  False  True   Yes
NEQ_US            14H   Not-equal (unordered, signaling)                      True   True   False  True   Yes
NLT_UQ            15H   Not-less-than (unordered, non-signaling)              True   False  True   True   No
NLE_UQ            16H   Not-less-than-or-equal (unordered, non-signaling)     True   False  False  True   No
ORD_S             17H   Ordered (signaling)                                   True   True   True   False  Yes
EQ_US             18H   Equal (unordered, signaling)                          False  False  True   True   Yes
NGE_UQ            19H   Not-greater-than-or-equal (unordered, non-signaling)  False  True   False  True   No
NGT_UQ            1AH   Not-greater-than (unordered, non-signaling)           False  True   True   True   No
FALSE_OS          1BH   False (ordered, signaling)                            False  False  False  False  Yes
NEQ_OS            1CH   Not-equal (ordered, signaling)                        True   True   False  False  Yes
GE_OQ             1DH   Greater-than-or-equal (ordered, non-signaling)        True   False  True   False  No
GT_OQ             1EH   Greater-than (ordered, non-signaling)                 True   False  False  False  No
TRUE_US           1FH   True (unordered, signaling)                           True   True   True   True   Yes
So x >= y && x <= y can be handled using vcmpeq_ossd or similar instructions.
Jakub
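One pattern that can be read off the predicate table: for the values 0H to FH, the predicates that signal on QNaN are exactly the less/greater-style ordering tests, and adding 10H to a predicate flips that behaviour. The following plain-C sketch encodes just this observation; the function name is hypothetical, and the code models the table only (it does not execute any AVX instruction).

```c
/* Returns 1 if the AVX compare predicate IMM (0x00..0x1F) is listed in the
   table as signaling on QNaN operands, 0 otherwise. */
static int avx_pred_signals_on_qnan(unsigned imm)
{
    unsigned low2 = imm & 3u;               /* 1 and 2 select the LT/LE-style ordering tests */
    int base = (low2 == 1u || low2 == 2u);  /* behaviour of predicates 0x00..0x0F */
    int flip = (int) ((imm >> 4) & 1u);     /* bit 4 (0x10) inverts it */
    return base ^ flip;
}
```

For instance, EQ_OQ (0x00) is quiet while EQ_OS (0x10) signals, and the latter is the predicate that matches x >= y && x <= y.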
Hi,
> Dunno if we really need a builtin for this, especially if it is lowered
> to that x >= y && x <= y early, will defer to Joseph.
I think it’d be nice to have one for consistency, as the other standard floating-point functions are there. It would also make things slightly easier for our Fortran implementation, although admittedly we can do without.
A tentative patch is attached, it seems to work well on simple examples, but for test coverage the hard part is going to be that the comparisons seem to be optimised away very easily into their non-signaling versions. Basically, if I do:
float x = __builtin_nanf("");
printf("%d\n", __builtin_iseqsig(__builtin_nanf(""), __builtin_inff()));
printf("%d\n", __builtin_iseqsig(x, __builtin_inff()));
With -O0 -fno-unsafe-math-optimizations -frounding-math -fsignaling-nans: first one does not raise invalid, second one does.
With -O2 -fno-unsafe-math-optimizations -frounding-math -fsignaling-nans: no invalid raised at all.
FX
[-- Attachment #2: iseqsig.diff --]
diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index f1f7c0ce337..bf6bf2809d8 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -171,6 +171,7 @@ static tree fold_builtin_fabs (location_t, tree, tree);
 static tree fold_builtin_abs (location_t, tree, tree);
 static tree fold_builtin_unordered_cmp (location_t, tree, tree, tree,
                                         enum tree_code, enum tree_code);
+static tree fold_builtin_iseqsig (location_t, tree, tree);
 static tree fold_builtin_varargs (location_t, tree, tree*, int);
 
 static tree fold_builtin_strpbrk (location_t, tree, tree, tree, tree);
@@ -9404,6 +9405,42 @@ fold_builtin_unordered_cmp (location_t loc, tree fndecl, tree arg0, tree arg1,
                       fold_build2_loc (loc, code, type, arg0, arg1));
 }
 
+/* Fold a call to __builtin_iseqsig(). ARG0 and ARG1 are the arguments.
+   After choosing the wider floating-point type for the comparison,
+   the code is folded to:
+     SAVE_EXPR<ARG0> >= SAVE_EXPR<ARG1> && SAVE_EXPR<ARG0> <= SAVE_EXPR<ARG1> */
+
+static tree
+fold_builtin_iseqsig (location_t loc, tree arg0, tree arg1)
+{
+  tree type0, type1;
+  enum tree_code code0, code1;
+  tree cmp1, cmp2, cmp_type = NULL_TREE;
+
+  type0 = TREE_TYPE (arg0);
+  type1 = TREE_TYPE (arg1);
+
+  code0 = TREE_CODE (type0);
+  code1 = TREE_CODE (type1);
+
+  if (code0 == REAL_TYPE && code1 == REAL_TYPE)
+    /* Choose the wider of two real types. */
+    cmp_type = TYPE_PRECISION (type0) >= TYPE_PRECISION (type1)
+      ? type0 : type1;
+  else if (code0 == REAL_TYPE && code1 == INTEGER_TYPE)
+    cmp_type = type0;
+  else if (code0 == INTEGER_TYPE && code1 == REAL_TYPE)
+    cmp_type = type1;
+
+  arg0 = builtin_save_expr (fold_convert_loc (loc, cmp_type, arg0));
+  arg1 = builtin_save_expr (fold_convert_loc (loc, cmp_type, arg1));
+
+  cmp1 = fold_build2_loc (loc, GE_EXPR, integer_type_node, arg0, arg1);
+  cmp2 = fold_build2_loc (loc, LE_EXPR, integer_type_node, arg0, arg1);
+
+  return fold_build2_loc (loc, TRUTH_AND_EXPR, integer_type_node, cmp1, cmp2);
+}
+
 /* Fold __builtin_{,s,u}{add,sub,mul}{,l,ll}_overflow, either into normal
    arithmetics if it can never overflow, or into internal functions that
    return both result of arithmetics and overflowed boolean flag in
@@ -9791,6 +9828,9 @@ fold_builtin_2 (location_t loc, tree expr, tree fndecl, tree arg0, tree arg1)
                                          arg0, arg1, UNORDERED_EXPR,
                                          NOP_EXPR);
 
+    case BUILT_IN_ISEQSIG:
+      return fold_builtin_iseqsig (loc, arg0, arg1);
+
       /* We do the folding for va_start in the expander. */
     case BUILT_IN_VA_START:
       break;
@@ -11303,6 +11343,7 @@ is_inexpensive_builtin (tree decl)
       case BUILT_IN_ISLESSEQUAL:
       case BUILT_IN_ISLESSGREATER:
       case BUILT_IN_ISUNORDERED:
+      case BUILT_IN_ISEQSIG:
       case BUILT_IN_VA_ARG_PACK:
       case BUILT_IN_VA_ARG_PACK_LEN:
       case BUILT_IN_VA_COPY:
diff --git a/gcc/builtins.def b/gcc/builtins.def
index f0236316850..8fab9dc3f1b 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -908,6 +908,7 @@ DEF_GCC_BUILTIN (BUILT_IN_ISLESS, "isless", BT_FN_INT_VAR, ATTR_CONST_NOT
 DEF_GCC_BUILTIN (BUILT_IN_ISLESSEQUAL, "islessequal", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
 DEF_GCC_BUILTIN (BUILT_IN_ISLESSGREATER, "islessgreater", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
 DEF_GCC_BUILTIN (BUILT_IN_ISUNORDERED, "isunordered", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
+DEF_GCC_BUILTIN (BUILT_IN_ISEQSIG, "iseqsig", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
 DEF_GCC_BUILTIN (BUILT_IN_ISSIGNALING, "issignaling", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
 DEF_LIB_BUILTIN (BUILT_IN_LABS, "labs", BT_FN_LONG_LONG, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN (BUILT_IN_LLABS, "llabs", BT_FN_LONGLONG_LONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 1eb842e1c7b..44d30436e47 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -6330,6 +6330,7 @@ check_builtin_function_arguments (location_t loc, vec<location_t> arg_loc,
     case BUILT_IN_ISLESSEQUAL:
     case BUILT_IN_ISLESSGREATER:
     case BUILT_IN_ISUNORDERED:
+    case BUILT_IN_ISEQSIG:
       if (builtin_function_validate_nargs (loc, fndecl, nargs, 2))
         {
           enum tree_code code0, code1;
On Thu, Sep 01, 2022 at 10:19:59AM +0200, Jakub Jelinek via Gcc wrote:
> On Thu, Sep 01, 2022 at 10:04:58AM +0200, FX wrote:
> > 2. All the functions are available as GCC type-generic built-ins (yeah!),
> > except there is no __builtin_iseqsig
> > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77928). Is there a
> > fundamental problem with creating one, and could someone help there?
>
> IMHO until that one is implemented you can just use
> tx = x, ty = y, tx>=ty && tx<=ty
> (in GENERIC just SAVE_EXPR<x> >= SAVE_EXPR<y> && SAVE_EXPR<x> <= SAVE_EXPR<y>
> PowerPC backend is still broken, not just for that but for most other cases
> above, it doesn't violate just Fortran requirements, but C too.
See my talk at the GCC cauldron 2019 (Montréal) for what needs to be
done in the generic handling for us to be able to fix what the rs6000
backend does here.
Segher
On Thu, 1 Sep 2022, FX via Gcc wrote:
> 1. Is this list normative, and was it modified later (I have only found
> a 2012 draft)?
See N3047 Annex F for the current bindings (there have been a lot of
changes to the C2x working draft after N3047 in the course of editorial
review, but I don't think any of them affect the IEEE bindings for
comparisons).
--
Joseph S. Myers
joseph@codesourcery.com
On Thu, 1 Sep 2022, FX via Gcc wrote:
> A tentative patch is attached, it seems to work well on simple examples,
> but for test coverage the hard part is going to be that the comparisons
> seem to be optimised away very easily into their non-signaling versions.
> Basically, if I do:
Presumably that can be reproduced without depending on the new built-in
function? In which case it's an existing bug somewhere in the optimizers.
--
Joseph S. Myers
joseph@codesourcery.com
On Thu, 1 Sep 2022, Joseph Myers wrote:
> On Thu, 1 Sep 2022, FX via Gcc wrote:
>
>> A tentative patch is attached, it seems to work well on simple examples,
>> but for test coverage the hard part is going to be that the comparisons
>> seem to be optimised away very easily into their non-signaling versions.
>> Basically, if I do:
>
> Presumably that can be reproduced without depending on the new built-in
> function? In which case it's an existing bug somewhere in the optimizers.
(simplify
(cmp @0 REAL_CST@1)
[...]
(if (REAL_VALUE_ISNAN (TREE_REAL_CST (@1))
&& !tree_expr_signaling_nan_p (@1)
&& !tree_expr_maybe_signaling_nan_p (@0))
{ constant_boolean_node (cmp == NE_EXPR, type); })
only tries to preserve a comparison with sNaN, but not with qNaN. There
are probably other issues since various gcc devs used to have a different
opinion on the meaning of -ftrapping-math.
--
Marc Glisse
On Thu, 1 Sep 2022, Marc Glisse via Gcc wrote:
> On Thu, 1 Sep 2022, Joseph Myers wrote:
>
> > On Thu, 1 Sep 2022, FX via Gcc wrote:
> >
> > > A tentative patch is attached, it seems to work well on simple examples,
> > > but for test coverage the hard part is going to be that the comparisons
> > > seem to be optimised away very easily into their non-signaling versions.
> > > Basically, if I do:
> >
> > Presumably that can be reproduced without depending on the new built-in
> > function? In which case it's an existing bug somewhere in the optimizers.
>
> (simplify
> (cmp @0 REAL_CST@1)
> [...]
> (if (REAL_VALUE_ISNAN (TREE_REAL_CST (@1))
> && !tree_expr_signaling_nan_p (@1)
> && !tree_expr_maybe_signaling_nan_p (@0))
> { constant_boolean_node (cmp == NE_EXPR, type); })
>
> only tries to preserve a comparison with sNaN, but not with qNaN. There are
So that needs to take more care about which comparison operations are
involved: such an optimization is fine for quiet comparisons such
as ==, != or isless, but not for signaling comparisons such as < <= > >=
(subject to any question of splitting up -ftrapping-math into more
fine-grained options allowing different transformations).
--
Joseph S. Myers
joseph@codesourcery.com
> Presumably that can be reproduced without depending on the new built-in
> function? In which case it's an existing bug somewhere in the optimizers.
Yes:
$ cat a.c
#include <math.h>
#include <stdio.h>
#include <fenv.h>
void foo (void) {
  if (fetestexcept (FE_INVALID) & FE_INVALID)
    printf("Invalid raised\n");
  feclearexcept (FE_ALL_EXCEPT);
}
static inline int iseqsig(float x, float y) { return (x >= y && x <= y); }
int main (void) {
  float x = __builtin_nanf("");
  float y;
  printf("%d\n", iseqsig(__builtin_nanf(""), 1.));
  foo();
  printf("%d\n", iseqsig(x, __builtin_inff()));
  foo();
}
$ ./bin/gcc a.c -lm -fno-unsafe-math-optimizations -frounding-math -fsignaling-nans -O0 && ./a.out
0
Invalid raised
0
Invalid raised
$ ./bin/gcc a.c -lm -fno-unsafe-math-optimizations -frounding-math -fsignaling-nans -O1 && ./a.out
0
0
Do you want me to file a bug report?
FX
On Thu, 1 Sep 2022, FX via Gcc wrote:
> Do you want me to file a bug report?
Yes.
--
Joseph S. Myers
joseph@codesourcery.com
For the record, this is now PR 106805
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106805
FX
Hi Joseph,
I have a Fortran patch ready to submit, but before I do so I’d like to know: do you support or oppose a __builtin_iseqsig()? Jakub said he was against, but deferred to you on that.
For me, it makes the Fortran front-end part slightly simpler, so if it is decided to go that route I’ll propose a middle-end + C patch first. But I do not need it absolutely.
Thanks,
FX
> See N3047 Annex F for the current bindings (there have been a lot of
> changes to the C2x working draft after N3047 in the course of editorial
> review, but I don't think any of them affect the IEEE bindings for
> comparisons).
Thanks for the pointer, it is very helpful.
The next thing I need to tackle for Fortran is the implementation of functions that perform maxNum, maxNumMag, minNum, and minNumMag.
Am I correct in assuming that maxNum and minNum correspond to fmax and fmin? Are there builtins for maxNumMag and minNumMag? Or does anyone know what the “canonical” way to perform it is? I do not want to mess up corner cases, which is so easy to do…
Thanks again,
FX
On Thu, 1 Sep 2022, FX via Gcc wrote:
> I have a Fortran patch ready to submit, but before I do so I’d like to
> know: do you support or oppose a __builtin_iseqsig()?
I support having such a built-in function.
--
Joseph S. Myers
joseph@codesourcery.com
On Thu, 1 Sep 2022, FX via Gcc wrote:
> The next thing I need to tackle for Fortran is the implementation of
> functions that perform maxNum, maxNumMag, minNum, and minNumMag. Am I
> correct in assuming that maxNum and minNum correspond to fmax and fmin?
Yes (note that maxNum and minNum were removed in IEEE 754-2019, but
they're still what fmax and fmin correspond to; the new minimum / maximum
operations in IEEE 754-2019 are provided by new functions in C2x).
> Are there builtins for maxNumMag and minNumMag? Or does anyone know what
> the “canonical” way to perform it is? I do not want to mess up corner
> cases, which is so easy to do…
TS 18661-1 defined functions fmaxmag and fminmag for those; we don't have
built-in functions for them, and C2x does not include those functions
given that those operations were also removed in IEEE 754-2019.
--
Joseph S. Myers
joseph@codesourcery.com
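Since there are no built-ins for the magnitude variants, here is a minimal sketch of what maxNumMag and minNumMag could look like for double, following the IEEE 754-2008 description: pick the operand of larger (or smaller) magnitude, and fall back to maxNum (or minNum) otherwise. All function names are hypothetical. The max_num and min_num helpers are written out by hand and stand in for fmax and fmin. Signaling NaNs, exception flags and the sign of zero results are deliberately not addressed here.

```c
#include <math.h>

/* Stand-ins for fmax/fmin: if exactly one operand is a NaN, return the other. */
static double max_num(double x, double y)
{
    if (isnan(x)) return y;
    if (isnan(y)) return x;
    return isgreater(x, y) ? x : y;
}

static double min_num(double x, double y)
{
    if (isnan(x)) return y;
    if (isnan(y)) return x;
    return isless(x, y) ? x : y;
}

/* maxNumMag: operand with the larger magnitude, else maxNum. */
static double max_num_mag(double x, double y)
{
    double ax = fabs(x), ay = fabs(y);
    if (isgreater(ax, ay)) return x;
    if (isgreater(ay, ax)) return y;
    return max_num(x, y);   /* equal magnitudes, or a NaN operand */
}

/* minNumMag: operand with the smaller magnitude, else minNum. */
static double min_num_mag(double x, double y)
{
    double ax = fabs(x), ay = fabs(y);
    if (isless(ax, ay)) return x;
    if (isless(ay, ax)) return y;
    return min_num(x, y);   /* equal magnitudes, or a NaN operand */
}
```

The quiet macros isgreater and isless are used for the magnitude tests so that a quiet NaN operand simply falls through to the maxNum/minNum case.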