Re: [PATCH] middle-end, c++, i386, libgcc, v2: std::bfloat16_t and __bf16 arithmetic support

public inbox for gcc-patches@gcc.gnu.org

From: Uros Bizjak <ubizjak@gmail.com>
To: Jason Merrill <jason@redhat.com>
Cc: Jakub Jelinek <jakub@redhat.com>,
	"Joseph S. Myers" <joseph@codesourcery.com>,
	 Richard Biener <rguenther@suse.de>,
	Jeff Law <jeffreyalaw@gmail.com>,
	gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] middle-end, c++, i386, libgcc, v2: std::bfloat16_t and __bf16 arithmetic support
Date: Thu, 13 Oct 2022 23:11:53 +0200
Message-ID: <CAFULd4bEc3v64wXJcYL0-NMED0P48K98j_mQjSuiVYp+PrKK2Q@mail.gmail.com>
In-Reply-To: <5598547f-ce63-6b4d-31e4-a15f57b8f224@redhat.com>

On Thu, Oct 13, 2022 at 9:38 PM Jason Merrill <jason@redhat.com> wrote:
>
> On 10/13/22 12:50, Jakub Jelinek wrote:
> > Hi!
> >
> > On Wed, Oct 05, 2022 at 04:02:25PM -0400, Jason Merrill wrote:
> >>> As I wrote earlier, I think we need at least one, __builtin_nans variant
> >>> which would be used in libstdc++
> >>> std::numeric_limits<std::bfloat16_t>::signaling_NaN() implementation.
> >>> I think
> >>> std::numeric_limits<std::bfloat16_t>::infinity() can be implemented as
> >>> return (__bf16) __builtin_huge_valf ();
> >>> and similarly
> >>> std::numeric_limits<std::bfloat16_t>::quiet_NaN() as
> >>> return (__bf16) __builtin_nanf ("");
> >>> but
> >>> return (__bf16) __builtin_nansf ("");
> >>> would lose the signaling NaN on the conversion and raise an exception,
> >>> and as the method is constexpr,
> >>> union { unsigned short a; __bf16 b; } u = { 0x7f81 };
> >>> return u.b;
> >>> wouldn't work.  I can certainly restrict the builtins to the single
> >>> one, but wonder whether the suffix for that builtin shouldn't be chosen
> >>> such that eventually we could add more builtins if we need to
> >>> and don't run into the log with bf16 suffix vs. logb with f16 suffix
> >>> ambiguity.
> >>> As you said, most of the libstdc++ overloads for std::bfloat16_t then
> >>> can use float builtins or library calls under the hood, but std::nextafter
> >>> is another case where I think we'll need to have something bfloat16_t
> >>> specific, because float ulp isn't bfloat16_t ulp, the latter is much larger.
> >>
> >> Makes sense.
> >
> > So, this updated version of the patch adds just a single __builtin_nansf16b
> > builtin (or do you want __builtin_nansbf16?).
>
> 16b sounds fine.
>
> >>> Based on what Joseph wrote, I'll add bf16/BF16 suffix support for C too
> >>> in the next iteration (always with pedwarn in that case).
> >
> > And implements bf16/BF16 suffixes for C too.
> >
> >>> I'm afraid too many places rely on all modes of a certain class to be
> >>> visible when walking from "narrowest" to "widest" mode, say
> >>> FOR_EACH_MODE_IN_CLASS/FOR_EACH_MODE/FOR_EACH_MODE_UNTIL/FOR_EACH_WIDER_MODE
> >>> etc. wouldn't work at all if GET_MODE_WIDER_MODE (BFmode) == SFmode
> >>> && GET_MODE_WIDER_MODE (HFmode) == SFmode.
> >>
> >> Yes, it seems they need to change now that their assumptions have been
> >> violated.  I suppose FOR_EACH_MODE_IN_CLASS would need to change to not use
> >> get_wider, and users of FOR_EACH_MODE/FOR_EACH_MODE_UNTIL need to decide
> >> whether they want an iteration that uses get_wider (likely with a new name)
> >> or not.
> >
> > And now that the GET_MODE_WIDER_MODE vs. GET_MODE_NEXT_MODE patch is in,
> > this is updated on top of those changes.
> >
> > So far lightly tested on x86_64-linux, ok for trunk if it passes full
> > bootstrap/regtest on both x86_64-linux and i686-linux?
>
> LGTM, but an i386 maintainer should review it as well.

OK with two changes to the cbranch and cstore expanders, as explained inline.

Thanks,
Uros.

> > 2022-10-13  Jakub Jelinek  <jakub@redhat.com>
> >
> > gcc/
> >       * tree-core.h (enum tree_index): Add TI_BFLOAT16_TYPE.
> >       * tree.h (bfloat16_type_node): Define.
> >       * tree.cc (excess_precision_type): Promote bfloat16_type_mode
> >       like float16_type_mode.
> >       (build_common_tree_nodes): Initialize bfloat16_type_node if
> >       BFmode is supported.
> >       * expmed.h (maybe_expand_shift): Declare.
> >       * expmed.cc (maybe_expand_shift): No longer static.
> >       * expr.cc (convert_mode_scalar): Don't ICE on BF -> HF or HF -> BF
> >       conversions.  If there is no optab, handle BF -> {DF,XF,TF,HF}
> >       conversions as separate BF -> SF -> {DF,XF,TF,HF} conversions, add
> >       -ffast-math generic implementation for BF -> SF and SF -> BF
> >       conversions.
> >       * builtin-types.def (BT_BFLOAT16, BT_FN_BFLOAT16_CONST_STRING): New.
> >       * builtins.def (BUILT_IN_NANSF16B): New builtin.
> >       * fold-const-call.cc (fold_const_call): Handle CFN_BUILT_IN_NANSF16B.
> >       * config/i386/i386.cc (classify_argument): Handle E_BCmode.
> >       (ix86_libgcc_floating_mode_supported_p): Also return true for BFmode
> >       for -msse2.
> >       (ix86_mangle_type): Mangle BFmode as DF16b.
> >       (ix86_invalid_conversion, ix86_invalid_unary_op,
> >       ix86_invalid_binary_op): Remove.
> >       (TARGET_INVALID_CONVERSION, TARGET_INVALID_UNARY_OP,
> >       TARGET_INVALID_BINARY_OP): Don't redefine.
> >       * config/i386/i386-builtins.cc (ix86_bf16_type_node): Remove.
> >       (ix86_register_bf16_builtin_type): Use bfloat16_type_node rather than
> >       ix86_bf16_type_node, only create it if still NULL.
> >       * config/i386/i386-builtin-types.def (BFLOAT16): Likewise.
> >       * config/i386/i386.md (cbranchbf4, cstorebf4): New expanders.
> > gcc/c-family/
> >       * c-cppbuiltin.cc (c_cpp_builtins): If bfloat16_type_node,
> >       predefine __BFLT16_*__ macros and for C++23 also
> >       __STDCPP_BFLOAT16_T__.  Predefine bfloat16_type_node related
> >       macros for -fbuilding-libgcc.
> >       * c-lex.cc (interpret_float): Handle CPP_N_BFLOAT16.
> > gcc/c/
> >       * c-typeck.cc (convert_arguments): Don't promote __bf16 to
> >       double.
> > gcc/cp/
> >       * cp-tree.h (extended_float_type_p): Return true for
> >       bfloat16_type_node.
> >       * typeck.cc (cp_compare_floating_point_conversion_ranks): Set
> >       extended{1,2} if mv{1,2} is bfloat16_type_node.  Adjust comment.
> > gcc/testsuite/
> >       * lib/target-supports.exp (check_effective_target_bfloat16,
> >       check_effective_target_bfloat16_runtime, add_options_for_bfloat16):
> >       New.
> >       * gcc.dg/torture/bfloat16-basic.c: New test.
> >       * gcc.dg/torture/bfloat16-builtin.c: New test.
> >       * gcc.dg/torture/bfloat16-builtin-issignaling-1.c: New test.
> >       * gcc.dg/torture/bfloat16-complex.c: New test.
> >       * gcc.dg/torture/builtin-issignaling-1.c: Allow to be includable
> >       from bfloat16-builtin-issignaling-1.c.
> >       * gcc.dg/torture/floatn-basic.h: Allow to be includable from
> >       bfloat16-basic.c.
> >       * gcc.target/i386/vect-bfloat16-typecheck_2.c: Adjust expected
> >       diagnostics.
> >       * gcc.target/i386/sse2-bfloat16-scalar-typecheck.c: Likewise.
> >       * gcc.target/i386/vect-bfloat16-typecheck_1.c: Likewise.
> >       * g++.target/i386/bfloat_cpp_typecheck.C: Likewise.
> > libcpp/
> >       * include/cpplib.h (CPP_N_BFLOAT16): Define.
> >       * expr.cc (interpret_float_suffix): Handle bf16 and BF16 suffixes for
> >       C++.
> > libgcc/
> >       * config/i386/t-softfp (softfp_extensions): Add bfsf.
> >       (softfp_truncations): Add tfbf xfbf dfbf sfbf hfbf.
> >       (CFLAGS-extendbfsf2.c, CFLAGS-truncsfbf2.c, CFLAGS-truncdfbf2.c,
> >       CFLAGS-truncxfbf2.c, CFLAGS-trunctfbf2.c, CFLAGS-trunchfbf2.c): Add
> >       -msse2.
> >       * config/i386/libgcc-glibc.ver (GCC_13.0.0): Export
> >       __extendbfsf2 and __trunc{s,d,x,t,h}fbf2.
> >       * config/i386/sfp-machine.h (_FP_NANSIGN_B): Define.
> >       * config/i386/64/sfp-machine.h (_FP_NANFRAC_B): Define.
> >       * config/i386/32/sfp-machine.h (_FP_NANFRAC_B): Define.
> >       * soft-fp/brain.h: New file.
> >       * soft-fp/truncsfbf2.c: New file.
> >       * soft-fp/truncdfbf2.c: New file.
> >       * soft-fp/truncxfbf2.c: New file.
> >       * soft-fp/trunctfbf2.c: New file.
> >       * soft-fp/trunchfbf2.c: New file.
> >       * soft-fp/truncbfhf2.c: New file.
> >       * soft-fp/extendbfsf2.c: New file.
> > libiberty/
> >       * cp-demangle.h (D_BUILTIN_TYPE_COUNT): Increment.
> >       * cp-demangle.c (cplus_demangle_builtin_types): Add std::bfloat16_t
> >       entry.
> >       (cplus_demangle_type): Demangle DF16b.
> >       * testsuite/demangle-expected (_Z3xxxDF16b): New test.
> >
> > --- gcc/tree-core.h.jj        2022-10-10 09:31:57.683981308 +0200
> > +++ gcc/tree-core.h   2022-10-13 16:57:08.953775013 +0200
> > @@ -665,6 +665,9 @@ enum tree_index {
> >     TI_DOUBLE_TYPE,
> >     TI_LONG_DOUBLE_TYPE,
> >
> > +  /* __bf16 type if supported (used in C++ as std::bfloat16_t).  */
> > +  TI_BFLOAT16_TYPE,
> > +
> >     /* The _FloatN and _FloatNx types must be consecutive, and in the
> >        same sequence as the corresponding complex types, which must also
> >        be consecutive; _FloatN must come before _FloatNx; the order must
> > --- gcc/tree.h.jj     2022-10-10 09:31:57.766980149 +0200
> > +++ gcc/tree.h        2022-10-13 17:22:14.728207071 +0200
> > @@ -4291,6 +4291,7 @@ tree_strip_any_location_wrapper (tree ex
> >   #define float_type_node                     global_trees[TI_FLOAT_TYPE]
> >   #define double_type_node            global_trees[TI_DOUBLE_TYPE]
> >   #define long_double_type_node               global_trees[TI_LONG_DOUBLE_TYPE]
> > +#define bfloat16_type_node           global_trees[TI_BFLOAT16_TYPE]
> >
> >   /* Nodes for particular _FloatN and _FloatNx types in sequence.  */
> >   #define FLOATN_TYPE_NODE(IDX)               global_trees[TI_FLOATN_TYPE_FIRST + (IDX)]
> > --- gcc/tree.cc.jj    2022-10-10 09:31:57.743980470 +0200
> > +++ gcc/tree.cc       2022-10-13 16:57:08.956774972 +0200
> > @@ -7711,7 +7711,7 @@ excess_precision_type (tree type)
> >       = (flag_excess_precision == EXCESS_PRECISION_FAST
> >          ? EXCESS_PRECISION_TYPE_FAST
> >          : (flag_excess_precision == EXCESS_PRECISION_FLOAT16
> > -       ? EXCESS_PRECISION_TYPE_FLOAT16 :EXCESS_PRECISION_TYPE_STANDARD));
> > +       ? EXCESS_PRECISION_TYPE_FLOAT16 : EXCESS_PRECISION_TYPE_STANDARD));
> >
> >     enum flt_eval_method target_flt_eval_method
> >       = targetm.c.excess_precision (requested_type);
> > @@ -7736,6 +7736,9 @@ excess_precision_type (tree type)
> >     machine_mode float16_type_mode = (float16_type_node
> >                                   ? TYPE_MODE (float16_type_node)
> >                                   : VOIDmode);
> > +  machine_mode bfloat16_type_mode = (bfloat16_type_node
> > +                                  ? TYPE_MODE (bfloat16_type_node)
> > +                                  : VOIDmode);
> >     machine_mode float_type_mode = TYPE_MODE (float_type_node);
> >     machine_mode double_type_mode = TYPE_MODE (double_type_node);
> >
> > @@ -7747,16 +7750,19 @@ excess_precision_type (tree type)
> >       switch (target_flt_eval_method)
> >         {
> >         case FLT_EVAL_METHOD_PROMOTE_TO_FLOAT:
> > -         if (type_mode == float16_type_mode)
> > +         if (type_mode == float16_type_mode
> > +             || type_mode == bfloat16_type_mode)
> >             return float_type_node;
> >           break;
> >         case FLT_EVAL_METHOD_PROMOTE_TO_DOUBLE:
> >           if (type_mode == float16_type_mode
> > +             || type_mode == bfloat16_type_mode
> >               || type_mode == float_type_mode)
> >             return double_type_node;
> >           break;
> >         case FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE:
> >           if (type_mode == float16_type_mode
> > +             || type_mode == bfloat16_type_mode
> >               || type_mode == float_type_mode
> >               || type_mode == double_type_mode)
> >             return long_double_type_node;
> > @@ -7774,16 +7780,19 @@ excess_precision_type (tree type)
> >       switch (target_flt_eval_method)
> >         {
> >         case FLT_EVAL_METHOD_PROMOTE_TO_FLOAT:
> > -         if (type_mode == float16_type_mode)
> > +         if (type_mode == float16_type_mode
> > +             || type_mode == bfloat16_type_mode)
> >             return complex_float_type_node;
> >           break;
> >         case FLT_EVAL_METHOD_PROMOTE_TO_DOUBLE:
> >           if (type_mode == float16_type_mode
> > +             || type_mode == bfloat16_type_mode
> >               || type_mode == float_type_mode)
> >             return complex_double_type_node;
> >           break;
> >         case FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE:
> >           if (type_mode == float16_type_mode
> > +             || type_mode == bfloat16_type_mode
> >               || type_mode == float_type_mode
> >               || type_mode == double_type_mode)
> >             return complex_long_double_type_node;
> > @@ -9462,6 +9471,17 @@ build_common_tree_nodes (bool signed_cha
> >         SET_TYPE_MODE (FLOATN_NX_TYPE_NODE (i), mode);
> >       }
> >     float128t_type_node = float128_type_node;
> > +#ifdef HAVE_BFmode
> > +  if (REAL_MODE_FORMAT (BFmode) == &arm_bfloat_half_format
> > +      && targetm.scalar_mode_supported_p (BFmode)
> > +      && targetm.libgcc_floating_mode_supported_p (BFmode))
> > +    {
> > +      bfloat16_type_node = make_node (REAL_TYPE);
> > +      TYPE_PRECISION (bfloat16_type_node) = GET_MODE_PRECISION (BFmode);
> > +      layout_type (bfloat16_type_node);
> > +      SET_TYPE_MODE (bfloat16_type_node, BFmode);
> > +    }
> > +#endif
> >
> >     float_ptr_type_node = build_pointer_type (float_type_node);
> >     double_ptr_type_node = build_pointer_type (double_type_node);
> > --- gcc/expmed.h.jj   2022-10-03 18:00:53.046735271 +0200
> > +++ gcc/expmed.h      2022-10-13 16:57:08.957774958 +0200
> > @@ -707,6 +707,8 @@ extern rtx expand_variable_shift (enum t
> >                                 rtx, tree, rtx, int);
> >   extern rtx expand_shift (enum tree_code, machine_mode, rtx, poly_int64, rtx,
> >                        int);
> > +extern rtx maybe_expand_shift (enum tree_code, machine_mode, rtx, int, rtx,
> > +                            int);
> >   #ifdef GCC_OPTABS_H
> >   extern rtx expand_divmod (int, enum tree_code, machine_mode, rtx, rtx,
> >                         rtx, int, enum optab_methods = OPTAB_LIB_WIDEN);
> > --- gcc/expmed.cc.jj  2022-10-13 16:22:17.755496384 +0200
> > +++ gcc/expmed.cc     2022-10-13 16:57:08.957774958 +0200
> > @@ -2705,7 +2705,7 @@ expand_shift (enum tree_code code, machi
> >
> >   /* Likewise, but return 0 if that cannot be done.  */
> >
> > -static rtx
> > +rtx
> >   maybe_expand_shift (enum tree_code code, machine_mode mode, rtx shifted,
> >                   int amount, rtx target, int unsignedp)
> >   {
> > --- gcc/expr.cc.jj    2022-10-06 17:43:47.941502119 +0200
> > +++ gcc/expr.cc       2022-10-13 16:57:09.022774066 +0200
> > @@ -344,7 +344,11 @@ convert_mode_scalar (rtx to, rtx from, i
> >         gcc_assert ((GET_MODE_PRECISION (from_mode)
> >                  != GET_MODE_PRECISION (to_mode))
> >                 || (DECIMAL_FLOAT_MODE_P (from_mode)
> > -                   != DECIMAL_FLOAT_MODE_P (to_mode)));
> > +                   != DECIMAL_FLOAT_MODE_P (to_mode))
> > +               || (REAL_MODE_FORMAT (from_mode) == &arm_bfloat_half_format
> > +                   && REAL_MODE_FORMAT (to_mode) == &ieee_half_format)
> > +               || (REAL_MODE_FORMAT (to_mode) == &arm_bfloat_half_format
> > +                   && REAL_MODE_FORMAT (from_mode) == &ieee_half_format));
> >
> >         if (GET_MODE_PRECISION (from_mode) == GET_MODE_PRECISION (to_mode))
> >       /* Conversion between decimal float and binary float, same size.  */
> > @@ -364,6 +368,150 @@ convert_mode_scalar (rtx to, rtx from, i
> >         return;
> >       }
> >
> > +#ifdef HAVE_SFmode
> > +      if (REAL_MODE_FORMAT (from_mode) == &arm_bfloat_half_format
> > +       && REAL_MODE_FORMAT (SFmode) == &ieee_single_format)
> > +     {
> > +       if (GET_MODE_PRECISION (to_mode) > GET_MODE_PRECISION (SFmode))
> > +         {
> > +           /* To cut down on libgcc size, implement
> > +              BFmode -> {DF,XF,TF}mode conversions by
> > +              BFmode -> SFmode -> {DF,XF,TF}mode conversions.  */
> > +           rtx temp = gen_reg_rtx (SFmode);
> > +           convert_mode_scalar (temp, from, unsignedp);
> > +           convert_mode_scalar (to, temp, unsignedp);
> > +           return;
> > +         }
> > +       if (REAL_MODE_FORMAT (to_mode) == &ieee_half_format)
> > +         {
> > +           /* Similarly, implement BFmode -> HFmode as
> > +              BFmode -> SFmode -> HFmode conversion where SFmode
> > +              has superset of BFmode values.  We don't need
> > +              to handle sNaNs by raising exception and turning
> > +              into qNaN though, as that can be done in the
> > +              SFmode -> HFmode conversion too.  */
> > +           rtx temp = gen_reg_rtx (SFmode);
> > +           int save_flag_finite_math_only = flag_finite_math_only;
> > +           flag_finite_math_only = true;
> > +           convert_mode_scalar (temp, from, unsignedp);
> > +           flag_finite_math_only = save_flag_finite_math_only;
> > +           convert_mode_scalar (to, temp, unsignedp);
> > +           return;
> > +         }
> > +       if (to_mode == SFmode
> > +           && !HONOR_NANS (from_mode)
> > +           && !HONOR_NANS (to_mode)
> > +           && optimize_insn_for_speed_p ())
> > +         {
> > +           /* If we don't expect sNaNs, for BFmode -> SFmode we can just
> > +              shift the bits up.  */
> > +           machine_mode fromi_mode, toi_mode;
> > +           if (int_mode_for_size (GET_MODE_BITSIZE (from_mode),
> > +                                  0).exists (&fromi_mode)
> > +               && int_mode_for_size (GET_MODE_BITSIZE (to_mode),
> > +                                     0).exists (&toi_mode))
> > +             {
> > +               start_sequence ();
> > +               rtx fromi = lowpart_subreg (fromi_mode, from, from_mode);
> > +               rtx tof = NULL_RTX;
> > +               if (fromi)
> > +                 {
> > +                   rtx toi = gen_reg_rtx (toi_mode);
> > +                   convert_mode_scalar (toi, fromi, 1);
> > +                   toi
> > +                     = maybe_expand_shift (LSHIFT_EXPR, toi_mode, toi,
> > +                                           GET_MODE_PRECISION (to_mode)
> > +                                           - GET_MODE_PRECISION (from_mode),
> > +                                           NULL_RTX, 1);
> > +                   if (toi)
> > +                     {
> > +                       tof = lowpart_subreg (to_mode, toi, toi_mode);
> > +                       if (tof)
> > +                         emit_move_insn (to, tof);
> > +                     }
> > +                 }
> > +               insns = get_insns ();
> > +               end_sequence ();
> > +               if (tof)
> > +                 {
> > +                   emit_insn (insns);
> > +                   return;
> > +                 }
> > +             }
> > +         }
> > +     }
> > +      if (REAL_MODE_FORMAT (from_mode) == &ieee_single_format
> > +       && REAL_MODE_FORMAT (to_mode) == &arm_bfloat_half_format
> > +       && !HONOR_NANS (from_mode)
> > +       && !HONOR_NANS (to_mode)
> > +       && !flag_rounding_math
> > +       && optimize_insn_for_speed_p ())
> > +     {
> > +       /* If we don't expect qNaNs nor sNaNs and can assume rounding
> > +          to nearest, we can expand the conversion inline as
> > +          (fromi + 0x7fff + ((fromi >> 16) & 1)) >> 16.  */
> > +       machine_mode fromi_mode, toi_mode;
> > +       if (int_mode_for_size (GET_MODE_BITSIZE (from_mode),
> > +                              0).exists (&fromi_mode)
> > +           && int_mode_for_size (GET_MODE_BITSIZE (to_mode),
> > +                                 0).exists (&toi_mode))
> > +         {
> > +           start_sequence ();
> > +           rtx fromi = lowpart_subreg (fromi_mode, from, from_mode);
> > +           rtx tof = NULL_RTX;
> > +           do
> > +             {
> > +               if (!fromi)
> > +                 break;
> > +               int shift = (GET_MODE_PRECISION (from_mode)
> > +                            - GET_MODE_PRECISION (to_mode));
> > +               rtx temp1
> > +                 = maybe_expand_shift (RSHIFT_EXPR, fromi_mode, fromi,
> > +                                       shift, NULL_RTX, 1);
> > +               if (!temp1)
> > +                 break;
> > +               rtx temp2
> > +                 = expand_binop (fromi_mode, and_optab, temp1, const1_rtx,
> > +                                 NULL_RTX, 1, OPTAB_DIRECT);
> > +               if (!temp2)
> > +                 break;
> > +               rtx temp3
> > +                 = expand_binop (fromi_mode, add_optab, fromi,
> > +                                 gen_int_mode ((HOST_WIDE_INT_1U
> > +                                                << (shift - 1)) - 1,
> > +                                               fromi_mode), NULL_RTX,
> > +                                 1, OPTAB_DIRECT);
> > +               if (!temp3)
> > +                 break;
> > +               rtx temp4
> > +                 = expand_binop (fromi_mode, add_optab, temp3, temp2,
> > +                                 NULL_RTX, 1, OPTAB_DIRECT);
> > +               if (!temp4)
> > +                 break;
> > +               rtx temp5 = maybe_expand_shift (RSHIFT_EXPR, fromi_mode,
> > +                                               temp4, shift, NULL_RTX, 1);
> > +               if (!temp5)
> > +                 break;
> > +               rtx temp6 = lowpart_subreg (toi_mode, temp5, fromi_mode);
> > +               if (!temp6)
> > +                 break;
> > +               tof = lowpart_subreg (to_mode, force_reg (toi_mode, temp6),
> > +                                     toi_mode);
> > +               if (tof)
> > +                 emit_move_insn (to, tof);
> > +             }
> > +           while (0);
> > +           insns = get_insns ();
> > +           end_sequence ();
> > +           if (tof)
> > +             {
> > +               emit_insn (insns);
> > +               return;
> > +             }
> > +         }
> > +     }
> > +#endif
> > +
> >         /* Otherwise use a libcall.  */
> >         libcall = convert_optab_libfunc (tab, to_mode, from_mode);
> >
> > --- gcc/builtin-types.def.jj  2022-10-03 18:00:52.658740505 +0200
> > +++ gcc/builtin-types.def     2022-10-13 17:09:52.930317869 +0200
> > @@ -82,6 +82,9 @@ DEF_PRIMITIVE_TYPE (BT_UNWINDWORD, (*lan
> >   DEF_PRIMITIVE_TYPE (BT_FLOAT, float_type_node)
> >   DEF_PRIMITIVE_TYPE (BT_DOUBLE, double_type_node)
> >   DEF_PRIMITIVE_TYPE (BT_LONGDOUBLE, long_double_type_node)
> > +DEF_PRIMITIVE_TYPE (BT_BFLOAT16, (bfloat16_type_node
> > +                               ? bfloat16_type_node
> > +                               : error_mark_node))
> >   DEF_PRIMITIVE_TYPE (BT_FLOAT16, (float16_type_node
> >                                ? float16_type_node
> >                                : error_mark_node))
> > @@ -264,6 +267,7 @@ DEF_FUNCTION_TYPE_1 (BT_FN_FLOAT_CONST_S
> >   DEF_FUNCTION_TYPE_1 (BT_FN_DOUBLE_CONST_STRING, BT_DOUBLE, BT_CONST_STRING)
> >   DEF_FUNCTION_TYPE_1 (BT_FN_LONGDOUBLE_CONST_STRING,
> >                    BT_LONGDOUBLE, BT_CONST_STRING)
> > +DEF_FUNCTION_TYPE_1 (BT_FN_BFLOAT16_CONST_STRING, BT_BFLOAT16, BT_CONST_STRING)
> >   DEF_FUNCTION_TYPE_1 (BT_FN_FLOAT16_CONST_STRING, BT_FLOAT16, BT_CONST_STRING)
> >   DEF_FUNCTION_TYPE_1 (BT_FN_FLOAT32_CONST_STRING, BT_FLOAT32, BT_CONST_STRING)
> >   DEF_FUNCTION_TYPE_1 (BT_FN_FLOAT64_CONST_STRING, BT_FLOAT64, BT_CONST_STRING)
> > --- gcc/builtins.def.jj       2022-10-03 18:00:52.679740221 +0200
> > +++ gcc/builtins.def  2022-10-13 17:09:05.633962625 +0200
> > @@ -514,6 +514,7 @@ DEF_GCC_BUILTIN        (BUILT_IN_NANSF,
> >   DEF_GCC_BUILTIN        (BUILT_IN_NANSL, "nansl", BT_FN_LONGDOUBLE_CONST_STRING, ATTR_CONST_NOTHROW_NONNULL)
> >   DEF_GCC_FLOATN_NX_BUILTINS (BUILT_IN_NANS, "nans", NAN_TYPE, ATTR_CONST_NOTHROW_NONNULL)
> >   #undef NAN_TYPE
> > +DEF_GCC_BUILTIN        (BUILT_IN_NANSF16B, "nansf16b", BT_FN_BFLOAT16_CONST_STRING, ATTR_CONST_NOTHROW_NONNULL)
> >   DEF_GCC_BUILTIN        (BUILT_IN_NANSD32, "nansd32", BT_FN_DFLOAT32_CONST_STRING, ATTR_CONST_NOTHROW_NONNULL)
> >   DEF_GCC_BUILTIN        (BUILT_IN_NANSD64, "nansd64", BT_FN_DFLOAT64_CONST_STRING, ATTR_CONST_NOTHROW_NONNULL)
> >   DEF_GCC_BUILTIN        (BUILT_IN_NANSD128, "nansd128", BT_FN_DFLOAT128_CONST_STRING, ATTR_CONST_NOTHROW_NONNULL)
> > --- gcc/fold-const-call.cc.jj 2022-09-03 09:35:41.107989686 +0200
> > +++ gcc/fold-const-call.cc    2022-10-13 17:20:59.579229947 +0200
> > @@ -1301,6 +1301,7 @@ fold_const_call (combined_fn fn, tree ty
> >
> >       CASE_CFN_NANS:
> >       CASE_FLT_FN_FLOATN_NX (CFN_BUILT_IN_NANS):
> > +    case CFN_BUILT_IN_NANSF16B:
> >       case CFN_BUILT_IN_NANSD32:
> >       case CFN_BUILT_IN_NANSD64:
> >       case CFN_BUILT_IN_NANSD128:
> > --- gcc/config/i386/i386.cc.jj        2022-10-03 18:00:52.942736674 +0200
> > +++ gcc/config/i386/i386.cc   2022-10-13 16:57:09.092773105 +0200
> > @@ -2423,6 +2423,7 @@ classify_argument (machine_mode mode, co
> >         classes[1] = X86_64_SSEUP_CLASS;
> >         return 2;
> >       case E_HCmode:
> > +    case E_BCmode:
> >         classes[0] = X86_64_SSE_CLASS;
> >         if (!(bit_offset % 64))
> >       return 1;
> > @@ -22428,7 +22429,7 @@ ix86_libgcc_floating_mode_supported_p (s
> >        be defined by the C front-end for AVX512FP16 intrinsics.  We will
> >        issue an error in ix86_expand_move for HFmode if AVX512FP16 isn't
> >        enabled.  */
> > -  return ((mode == HFmode && TARGET_SSE2)
> > +  return (((mode == HFmode || mode == BFmode) && TARGET_SSE2)
> >         ? true
> >         : default_libgcc_floating_mode_supported_p (mode));
> >   }
> > @@ -22731,7 +22732,7 @@ ix86_mangle_type (const_tree type)
> >     switch (TYPE_MODE (type))
> >       {
> >       case E_BFmode:
> > -      return "u6__bf16";
> > +      return "DF16b";
> >       case E_HFmode:
> >         /* _Float16 is "DF16_".
> >        Align with clang's decision in https://reviews.llvm.org/D33719. */
> > @@ -22747,55 +22748,6 @@ ix86_mangle_type (const_tree type)
> >       }
> >   }
> >
> > -/* Return the diagnostic message string if conversion from FROMTYPE to
> > -   TOTYPE is not allowed, NULL otherwise.  */
> > -
> > -static const char *
> > -ix86_invalid_conversion (const_tree fromtype, const_tree totype)
> > -{
> > -  if (element_mode (fromtype) != element_mode (totype))
> > -    {
> > -      /* Do no allow conversions to/from BFmode scalar types.  */
> > -      if (TYPE_MODE (fromtype) == BFmode)
> > -     return N_("invalid conversion from type %<__bf16%>");
> > -      if (TYPE_MODE (totype) == BFmode)
> > -     return N_("invalid conversion to type %<__bf16%>");
> > -    }
> > -
> > -  /* Conversion allowed.  */
> > -  return NULL;
> > -}
> > -
> > -/* Return the diagnostic message string if the unary operation OP is
> > -   not permitted on TYPE, NULL otherwise.  */
> > -
> > -static const char *
> > -ix86_invalid_unary_op (int op, const_tree type)
> > -{
> > -  /* Reject all single-operand operations on BFmode except for &.  */
> > -  if (element_mode (type) == BFmode && op != ADDR_EXPR)
> > -    return N_("operation not permitted on type %<__bf16%>");
> > -
> > -  /* Operation allowed.  */
> > -  return NULL;
> > -}
> > -
> > -/* Return the diagnostic message string if the binary operation OP is
> > -   not permitted on TYPE1 and TYPE2, NULL otherwise.  */
> > -
> > -static const char *
> > -ix86_invalid_binary_op (int op ATTRIBUTE_UNUSED, const_tree type1,
> > -                        const_tree type2)
> > -{
> > -  /* Reject all 2-operand operations on BFmode.  */
> > -  if (element_mode (type1) == BFmode
> > -      || element_mode (type2) == BFmode)
> > -    return N_("operation not permitted on type %<__bf16%>");
> > -
> > -  /* Operation allowed.  */
> > -  return NULL;
> > -}
> > -
> >   static GTY(()) tree ix86_tls_stack_chk_guard_decl;
> >
> >   static tree
> > @@ -24853,15 +24805,6 @@ ix86_libgcc_floating_mode_supported_p
> >   #undef TARGET_MANGLE_TYPE
> >   #define TARGET_MANGLE_TYPE ix86_mangle_type
> >
> > -#undef TARGET_INVALID_CONVERSION
> > -#define TARGET_INVALID_CONVERSION ix86_invalid_conversion
> > -
> > -#undef TARGET_INVALID_UNARY_OP
> > -#define TARGET_INVALID_UNARY_OP ix86_invalid_unary_op
> > -
> > -#undef TARGET_INVALID_BINARY_OP
> > -#define TARGET_INVALID_BINARY_OP ix86_invalid_binary_op
> > -
> >   #undef TARGET_STACK_PROTECT_GUARD
> >   #define TARGET_STACK_PROTECT_GUARD ix86_stack_protect_guard
> >
> > --- gcc/config/i386/i386-builtins.cc.jj       2022-10-03 18:00:52.918736997 +0200
> > +++ gcc/config/i386/i386-builtins.cc  2022-10-13 16:57:09.119772735 +0200
> > @@ -126,7 +126,6 @@ BDESC_VERIFYS (IX86_BUILTIN_MAX,
> >   static GTY(()) tree ix86_builtin_type_tab[(int) IX86_BT_LAST_CPTR + 1];
> >
> >   tree ix86_float16_type_node = NULL_TREE;
> > -tree ix86_bf16_type_node = NULL_TREE;
> >   tree ix86_bf16_ptr_type_node = NULL_TREE;
> >
> >   /* Retrieve an element from the above table, building some of
> > @@ -1372,16 +1371,18 @@ ix86_register_float16_builtin_type (void
> >   static void
> >   ix86_register_bf16_builtin_type (void)
> >   {
> > -  ix86_bf16_type_node = make_node (REAL_TYPE);
> > -  TYPE_PRECISION (ix86_bf16_type_node) = 16;
> > -  SET_TYPE_MODE (ix86_bf16_type_node, BFmode);
> > -  layout_type (ix86_bf16_type_node);
> > +  if (bfloat16_type_node == NULL_TREE)
> > +    {
> > +      bfloat16_type_node = make_node (REAL_TYPE);
> > +      TYPE_PRECISION (bfloat16_type_node) = 16;
> > +      SET_TYPE_MODE (bfloat16_type_node, BFmode);
> > +      layout_type (bfloat16_type_node);
> > +    }
> >
> >     if (!maybe_get_identifier ("__bf16") && TARGET_SSE2)
> >       {
> > -      lang_hooks.types.register_builtin_type (ix86_bf16_type_node,
> > -                                         "__bf16");
> > -      ix86_bf16_ptr_type_node = build_pointer_type (ix86_bf16_type_node);
> > +      lang_hooks.types.register_builtin_type (bfloat16_type_node, "__bf16");
> > +      ix86_bf16_ptr_type_node = build_pointer_type (bfloat16_type_node);
> >       }
> >   }
> >
> > --- gcc/config/i386/i386-builtin-types.def.jj 2022-10-03 18:00:52.894737321 +0200
> > +++ gcc/config/i386/i386-builtin-types.def    2022-10-13 16:57:09.139772460 +0200
> > @@ -69,7 +69,7 @@ DEF_PRIMITIVE_TYPE (UINT16, short_unsign
> >   DEF_PRIMITIVE_TYPE (INT64, long_long_integer_type_node)
> >   DEF_PRIMITIVE_TYPE (UINT64, long_long_unsigned_type_node)
> >   DEF_PRIMITIVE_TYPE (FLOAT16, ix86_float16_type_node)
> > -DEF_PRIMITIVE_TYPE (BFLOAT16, ix86_bf16_type_node)
> > +DEF_PRIMITIVE_TYPE (BFLOAT16, bfloat16_type_node)
> >   DEF_PRIMITIVE_TYPE (FLOAT, float_type_node)
> >   DEF_PRIMITIVE_TYPE (DOUBLE, double_type_node)
> >   DEF_PRIMITIVE_TYPE (FLOAT80, float80_type_node)
> > --- gcc/config/i386/i386.md.jj        2022-10-11 15:57:05.005762022 +0200
> > +++ gcc/config/i386/i386.md   2022-10-13 16:57:09.187771801 +0200
> > @@ -1644,6 +1644,48 @@ (define_expand "cbranch<mode>4"
> >     DONE;
> >   })
> >
> > +(define_expand "cbranchbf4"
> > +  [(set (reg:CC FLAGS_REG)
> > +     (compare:CC (match_operand:BF 1 "cmp_fp_expander_operand")
> > +                 (match_operand:BF 2 "cmp_fp_expander_operand")))
> > +   (set (pc) (if_then_else
> > +           (match_operator 0 "comparison_operator"
> > +            [(reg:CC FLAGS_REG)
> > +             (const_int 0)])
> > +           (label_ref (match_operand 3))
> > +           (pc)))]
> > +  ""
> > +{
> > +  rtx op1 = gen_lowpart (HImode, operands[1]);
> > +  if (CONST_INT_P (op1))
> > +    op1 = simplify_const_unary_operation (FLOAT_EXTEND, SFmode,
> > +                                       operands[1], BFmode);
> > +  else
> > +    {
> > +      rtx t1 = gen_reg_rtx (SImode);
> > +      emit_insn (gen_zero_extendhisi2 (t1, op1));
> > +      emit_insn (gen_ashlsi3 (t1, t1, GEN_INT (16)));
> > +      op1 = gen_lowpart (SFmode, t1);
> > +    }
> > +  rtx op2 = gen_lowpart (HImode, operands[2]);
> > +  if (CONST_INT_P (op2))
> > +    op2 = simplify_const_unary_operation (FLOAT_EXTEND, SFmode,
> > +                                       operands[2], BFmode);
> > +  else
> > +    {
> > +      rtx t2 = gen_reg_rtx (SImode);
> > +      emit_insn (gen_zero_extendhisi2 (t2, op2));
> > +      emit_insn (gen_ashlsi3 (t2, t2, GEN_INT (16)));
> > +      op2 = gen_lowpart (SFmode, t2);
> > +    }
> > +  do_compare_rtx_and_jump (op1, op2, GET_CODE (operands[0]), 0,
> > +                        SFmode, NULL_RTX, NULL,
> > +                        as_a <rtx_code_label *> (operands[3]),
> > +                        /* Unfortunately this isn't propagated.  */
> > +                        profile_probability::even ());

You could use ix86_expand_branch instead of do_compare_rtx_and_jump
here. This would expand in SFmode, so the insn condition from
cbranchsf4 should be copied here:

  "TARGET_80387 || (SSE_FLOAT_MODE_P (SFmode) && TARGET_SSE_MATH)"

Additionally, the ix86_fp_comparison_operator predicate should be used
for operand 0. Basically, just copy the predicates from cbranchsf4, as
we are effectively expanding an SFmode compare & branch.
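
For reference, the zero-extend plus shift-by-16 sequence in the quoted
expander amounts to roughly the following in plain C. This is only a
sketch: bf16_bits_to_float and bf16_less are hypothetical names that do
not appear in the patch, and it assumes float is IEEE binary32.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: widen raw bfloat16 bits
   to float the way the expander does (zero-extend to 32 bits, shift
   left by 16, reinterpret the result as a float).  Assumes float is
   IEEE binary32.  */
static float
bf16_bits_to_float (uint16_t bits)
{
  uint32_t wide = (uint32_t) bits << 16;
  float f;
  memcpy (&f, &wide, sizeof f);
  return f;
}

/* Hypothetical helper: a bfloat16 "less than" carried out as a float
   compare, which is why the SFmode condition and predicates apply.  */
static int
bf16_less (uint16_t a, uint16_t b)
{
  return bf16_bits_to_float (a) < bf16_bits_to_float (b);
}
```

For instance, 0x3f80 widens to 1.0f and 0x4000 to 2.0f, so
bf16_less (0x3f80, 0x4000) is true, while a NaN operand such as 0x7fc0
makes the ordered compare false.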

> > +  DONE;
> > +})
> > +
> >   (define_expand "cstorehf4"
> >     [(set (reg:CC FLAGS_REG)
> >       (compare:CC (match_operand:HF 2 "cmp_fp_expander_operand")
> > @@ -1659,6 +1701,45 @@ (define_expand "cstorehf4"
> >     DONE;
> >   })
> >
> > +(define_expand "cstorebf4"
> > +  [(set (reg:CC FLAGS_REG)
> > +     (compare:CC (match_operand:BF 2 "cmp_fp_expander_operand")
> > +                 (match_operand:BF 3 "cmp_fp_expander_operand")))
> > +   (set (match_operand:QI 0 "register_operand")
> > +     (match_operator 1 "comparison_operator"
> > +       [(reg:CC FLAGS_REG)
> > +        (const_int 0)]))]
> > +  ""
> > +{
> > +  rtx op1 = gen_lowpart (HImode, operands[2]);
> > +  if (CONST_INT_P (op1))
> > +    op1 = simplify_const_unary_operation (FLOAT_EXTEND, SFmode,
> > +                                       operands[2], BFmode);
> > +  else
> > +    {
> > +      rtx t1 = gen_reg_rtx (SImode);
> > +      emit_insn (gen_zero_extendhisi2 (t1, op1));
> > +      emit_insn (gen_ashlsi3 (t1, t1, GEN_INT (16)));
> > +      op1 = gen_lowpart (SFmode, t1);
> > +    }
> > +  rtx op2 = gen_lowpart (HImode, operands[3]);
> > +  if (CONST_INT_P (op2))
> > +    op2 = simplify_const_unary_operation (FLOAT_EXTEND, SFmode,
> > +                                       operands[3], BFmode);
> > +  else
> > +    {
> > +      rtx t2 = gen_reg_rtx (SImode);
> > +      emit_insn (gen_zero_extendhisi2 (t2, op2));
> > +      emit_insn (gen_ashlsi3 (t2, t2, GEN_INT (16)));
> > +      op2 = gen_lowpart (SFmode, t2);
> > +    }

Similar to cbranch above, use ix86_expand_setcc and copy predicates
from cstoresf4.

Uros.

> > +  rtx res = emit_store_flag_force (operands[0], GET_CODE (operands[1]),
> > +                                op1, op2, SFmode, 0, 1);
> > +  if (!rtx_equal_p (res, operands[0]))
> > +    emit_move_insn (operands[0], res);
> > +  DONE;
> > +})
> > +
> >   (define_expand "cstore<mode>4"
> >     [(set (reg:CC FLAGS_REG)
> >       (compare:CC (match_operand:MODEF 2 "cmp_fp_expander_operand")
> > --- gcc/c-family/c-cppbuiltin.cc.jj   2022-10-13 08:41:04.718165419 +0200
> > +++ gcc/c-family/c-cppbuiltin.cc      2022-10-13 17:51:07.722665421 +0200
> > @@ -1260,6 +1260,13 @@ c_cpp_builtins (cpp_reader *pfile)
> >         builtin_define_float_constants (prefix, ggc_strdup (csuffix), "%s",
> >                                     csuffix, FLOATN_NX_TYPE_NODE (i));
> >       }
> > +  if (bfloat16_type_node)
> > +    {
> > +      if (c_dialect_cxx () && cxx_dialect > cxx20)
> > +     cpp_define (pfile, "__STDCPP_BFLOAT16_T__=1");
> > +      builtin_define_float_constants ("BFLT16", "BF16", "%s",
> > +                                   "BF16", bfloat16_type_node);
> > +    }
> >
> >     /* For float.h.  */
> >     if (targetm.decimal_float_supported_p ())
> > @@ -1370,6 +1377,12 @@ c_cpp_builtins (cpp_reader *pfile)
> >             suffix[0] = 'l';
> >             memcpy (float_h_prefix, "LDBL", 5);
> >           }
> > +       else if (bfloat16_type_node
> > +                && mode == TYPE_MODE (bfloat16_type_node))
> > +         {
> > +           memcpy (suffix, "bf16", 5);
> > +           memcpy (float_h_prefix, "BFLT16", 7);
> > +         }
> >         else
> >           {
> >             bool found_suffix = false;
> > @@ -1396,22 +1409,28 @@ c_cpp_builtins (cpp_reader *pfile)
> >         machine_mode float16_type_mode = (float16_type_node
> >                                           ? TYPE_MODE (float16_type_node)
> >                                           : VOIDmode);
> > +       machine_mode bfloat16_type_mode = (bfloat16_type_node
> > +                                          ? TYPE_MODE (bfloat16_type_node)
> > +                                          : VOIDmode);
> >         switch (targetm.c.excess_precision
> >                   (EXCESS_PRECISION_TYPE_IMPLICIT))
> >           {
> >           case FLT_EVAL_METHOD_UNPREDICTABLE:
> >           case FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE:
> >             excess_precision = (mode == float16_type_mode
> > +                               || mode == bfloat16_type_mode
> >                                 || mode == TYPE_MODE (float_type_node)
> >                                 || mode == TYPE_MODE (double_type_node));
> >             break;
> >
> >           case FLT_EVAL_METHOD_PROMOTE_TO_DOUBLE:
> >             excess_precision = (mode == float16_type_mode
> > +                               || mode == bfloat16_type_mode
> >                                 || mode == TYPE_MODE (float_type_node));
> >             break;
> >           case FLT_EVAL_METHOD_PROMOTE_TO_FLOAT:
> > -           excess_precision = mode == float16_type_mode;
> > +           excess_precision = (mode == float16_type_mode
> > +                               || mode == bfloat16_type_mode);
> >             break;
> >           case FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16:
> >             excess_precision = false;
> > --- gcc/c-family/c-lex.cc.jj  2022-10-13 16:21:52.548842666 +0200
> > +++ gcc/c-family/c-lex.cc     2022-10-13 16:59:51.778540099 +0200
> > @@ -1000,6 +1000,22 @@ interpret_float (const cpp_token *token,
> >         pedwarn (input_location, OPT_Wpedantic,
> >                  "non-standard suffix on floating constant");
> >         }
> > +    else if ((flags & CPP_N_BFLOAT16) != 0)
> > +      {
> > +     type = bfloat16_type_node;
> > +     if (type == NULL_TREE)
> > +       {
> > +         error ("unsupported non-standard suffix on floating constant");
> > +         return error_mark_node;
> > +       }
> > +     if (!c_dialect_cxx ())
> > +       pedwarn (input_location, OPT_Wpedantic,
> > +                "non-standard suffix on floating constant");
> > +     else if (cxx_dialect < cxx23)
> > +       pedwarn (input_location, OPT_Wpedantic,
> > +                "%<bf16%> or %<BF16%> suffix on floating constant only "
> > +                "available with %<-std=c++2b%> or %<-std=gnu++2b%>");
> > +      }
> >       else if ((flags & CPP_N_WIDTH) == CPP_N_LARGE)
> >         type = long_double_type_node;
> >       else if ((flags & CPP_N_WIDTH) == CPP_N_SMALL
> > --- gcc/c/c-typeck.cc.jj      2022-10-06 17:43:47.900502672 +0200
> > +++ gcc/c/c-typeck.cc 2022-10-13 16:57:09.226771266 +0200
> > @@ -3678,6 +3678,9 @@ convert_arguments (location_t loc, vec<l
> >               promote_float_arg = false;
> >               break;
> >             }
> > +       /* Don't promote __bf16 either.  */
> > +       if (TYPE_MAIN_VARIANT (valtype) == bfloat16_type_node)
> > +         promote_float_arg = false;
> >       }
> >
> >         if (type != NULL_TREE)
> > --- gcc/cp/cp-tree.h.jj       2022-10-13 16:21:52.600841952 +0200
> > +++ gcc/cp/cp-tree.h  2022-10-13 16:57:09.241771060 +0200
> > @@ -8741,6 +8741,8 @@ extended_float_type_p (tree type)
> >     for (int i = 0; i < NUM_FLOATN_NX_TYPES; ++i)
> >       if (type == FLOATN_TYPE_NODE (i))
> >         return true;
> > +  if (type == bfloat16_type_node)
> > +    return true;
> >     return false;
> >   }
> >
> > --- gcc/cp/typeck.cc.jj       2022-10-13 16:21:52.642841375 +0200
> > +++ gcc/cp/typeck.cc  2022-10-13 16:57:09.269770676 +0200
> > @@ -293,6 +293,10 @@ cp_compare_floating_point_conversion_ran
> >         if (mv2 == FLOATN_NX_TYPE_NODE (i))
> >       extended2 = i + 1;
> >       }
> > +  if (mv1 == bfloat16_type_node)
> > +    extended1 = true;
> > +  if (mv2 == bfloat16_type_node)
> > +    extended2 = true;
> >     if (extended2 && !extended1)
> >       {
> >         int ret = cp_compare_floating_point_conversion_ranks (t2, t1);
> > @@ -390,7 +394,9 @@ cp_compare_floating_point_conversion_ran
> >     if (cnt > 1 && mv2 == long_double_type_node)
> >       return -2;
> >     /* Otherwise, they have equal rank, but extended types
> > -     (other than std::bfloat16_t) have higher subrank.  */
> > +     (other than std::bfloat16_t) have higher subrank.
> > +     std::bfloat16_t shouldn't have equal rank to any standard
> > +     floating point type.  */
> >     return 1;
> >   }
> >
> > --- gcc/testsuite/lib/target-supports.exp.jj  2022-10-11 14:50:14.472773574 +0200
> > +++ gcc/testsuite/lib/target-supports.exp     2022-10-13 16:57:09.270770662 +0200
> > @@ -3416,6 +3416,22 @@ proc check_effective_target_base_quadflo
> >       return 1
> >   }
> >
> > +# Return 1 if the target supports the __bf16 type, 0 otherwise.
> > +
> > +proc check_effective_target_bfloat16 {} {
> > +    return [check_no_compiler_messages_nocache bfloat16 object {
> > +     __bf16 foo (__bf16 x) { return x + x; }
> > +    } [add_options_for_bfloat16 ""]]
> > +}
> > +
> > +proc check_effective_target_bfloat16_runtime {} {
> > +    return [check_effective_target_bfloat16]
> > +}
> > +
> > +proc add_options_for_bfloat16 { flags } {
> > +    return "$flags"
> > +}
> > +
> >   # Return 1 if the target supports all four forms of fused multiply-add
> >   # (fma, fms, fnma, and fnms) for both float and double.
> >
> > --- gcc/testsuite/gcc.dg/torture/bfloat16-basic.c.jj  2022-10-13 16:57:09.271770648 +0200
> > +++ gcc/testsuite/gcc.dg/torture/bfloat16-basic.c     2022-10-13 17:32:28.531884882 +0200
> > @@ -0,0 +1,11 @@
> > +/* Test __bf16.  */
> > +/* { dg-do run } */
> > +/* { dg-options "" } */
> > +/* { dg-add-options bfloat16 } */
> > +/* { dg-require-effective-target bfloat16_runtime } */
> > +
> > +#define TYPE __bf16
> > +#define CST(C) CONCAT (C, bf16)
> > +#define CSTU(C) CONCAT (C, BF16)
> > +
> > +#include "floatn-basic.h"
> > --- gcc/testsuite/gcc.dg/torture/bfloat16-builtin.c.jj        2022-10-13 16:57:09.271770648 +0200
> > +++ gcc/testsuite/gcc.dg/torture/bfloat16-builtin.c   2022-10-13 18:09:24.288913634 +0200
> > @@ -0,0 +1,47 @@
> > +/* Test __bf16 built-in functions.  */
> > +/* { dg-do run } */
> > +/* { dg-options "" } */
> > +/* { dg-add-options bfloat16 } */
> > +/* { dg-add-options ieee } */
> > +/* { dg-require-effective-target bfloat16_runtime } */
> > +
> > +extern void exit (int);
> > +extern void abort (void);
> > +
> > +extern __bf16 test_type;
> > +extern __typeof (__builtin_nansf16b ("")) test_type;
> > +
> > +volatile __bf16 inf_cst = (__bf16) __builtin_inff ();
> > +volatile __bf16 huge_val_cst = (__bf16) __builtin_huge_valf ();
> > +volatile __bf16 nan_cst = (__bf16) __builtin_nanf ("");
> > +volatile __bf16 nans_cst = __builtin_nansf16b ("");
> > +volatile __bf16 neg0 = -0.0bf16, neg1 = -1.0bf16, one = 1.0;
> > +
> > +int
> > +main (void)
> > +{
> > +  volatile __bf16 r;
> > +  if (!__builtin_isinf (inf_cst))
> > +    abort ();
> > +  if (!__builtin_isinf (huge_val_cst))
> > +    abort ();
> > +  if (inf_cst != huge_val_cst)
> > +    abort ();
> > +  if (!__builtin_isnan (nan_cst))
> > +    abort ();
> > +  if (!__builtin_isnan (nans_cst))
> > +    abort ();
> > +  r = __builtin_fabsf (neg1);
> > +  if (r != 1.0bf16)
> > +    abort ();
> > +  r = __builtin_copysignf (one, neg0);
> > +  if (r != neg1)
> > +    abort ();
> > +  r = __builtin_copysignf (inf_cst, neg1);
> > +  if (r != -huge_val_cst)
> > +    abort ();
> > +  r = __builtin_copysignf (-inf_cst, one);
> > +  if (r != huge_val_cst)
> > +    abort ();
> > +  exit (0);
> > +}
> > --- gcc/testsuite/gcc.dg/torture/bfloat16-builtin-issignaling-1.c.jj  2022-10-13 16:57:09.271770648 +0200
> > +++ gcc/testsuite/gcc.dg/torture/bfloat16-builtin-issignaling-1.c     2022-10-13 17:40:15.067555349 +0200
> > @@ -0,0 +1,21 @@
> > +/* Test __bf16 __builtin_issignaling.  */
> > +/* { dg-do run } */
> > +/* { dg-options "" } */
> > +/* { dg-add-options bfloat16 } */
> > +/* { dg-add-options ieee } */
> > +/* { dg-require-effective-target bfloat16_runtime } */
> > +/* { dg-additional-options "-fsignaling-nans" } */
> > +/* Workaround for PR57484 on ia32: */
> > +/* { dg-additional-options "-msse2 -mfpmath=sse" { target { ia32 && sse2_runtime } } } */
> > +
> > +#define CONCATX(X, Y) X ## Y
> > +#define CONCAT(X, Y) CONCATX (X, Y)
> > +
> > +#define TYPE __bf16
> > +#define CST(C) CONCAT (C, bf16)
> > +#define FN(F) CONCAT (F, f16b)
> > +#define NAN(x) ((__bf16) __builtin_nanf (x))
> > +#define INF ((__bf16) __builtin_inff ())
> > +#define EXT 0
> > +
> > +#include "builtin-issignaling-1.c"
> > --- gcc/testsuite/gcc.dg/torture/bfloat16-complex.c.jj        2022-10-13 16:57:09.271770648 +0200
> > +++ gcc/testsuite/gcc.dg/torture/bfloat16-complex.c   2022-10-13 17:46:43.259267724 +0200
> > @@ -0,0 +1,61 @@
> > +/* Test __bf16 complex arithmetic.  */
> > +/* { dg-do run } */
> > +/* { dg-options "" } */
> > +/* { dg-add-options bfloat16 } */
> > +/* { dg-require-effective-target bfloat16_runtime } */
> > +
> > +extern void exit (int);
> > +extern void abort (void);
> > +
> > +volatile __bf16 a = 1.0bf16;
> > +typedef _Complex float __cbf16 __attribute__((__mode__(__BC__)));
> > +volatile __cbf16 b = __builtin_complex (2.0bf16, 3.0bf16);
> > +volatile __cbf16 c = __builtin_complex (2.0bf16, 3.0bf16);
> > +volatile __cbf16 d = __builtin_complex (2.0bf16, 3.0bf16);
> > +
> > +__cbf16
> > +fn (__cbf16 arg)
> > +{
> > +  return arg / 4;
> > +}
> > +
> > +int
> > +main (void)
> > +{
> > +  volatile __cbf16 r;
> > +  if (b != c)
> > +    abort ();
> > +  if (b != d)
> > +    abort ();
> > +  r = a + b;
> > +  if (__real__ r != 3.0bf16 || __imag__ r != 3.0bf16)
> > +    abort ();
> > +  r += d;
> > +  if (__real__ r != 5.0bf16 || __imag__ r != 6.0bf16)
> > +    abort ();
> > +  r -= a;
> > +  if (__real__ r != 4.0bf16 || __imag__ r != 6.0bf16)
> > +    abort ();
> > +  r /= (a + a);
> > +  if (__real__ r != 2.0bf16 || __imag__ r != 3.0bf16)
> > +    abort ();
> > +  r *= (a + a);
> > +  if (__real__ r != 4.0bf16 || __imag__ r != 6.0bf16)
> > +    abort ();
> > +  r -= b;
> > +  if (__real__ r != 2.0bf16 || __imag__ r != 3.0bf16)
> > +    abort ();
> > +  r *= r;
> > +  if (__real__ r != -5.0bf16 || __imag__ r != 12.0bf16)
> > +    abort ();
> > +  /* Division may not be exact, so round result before comparing.  */
> > +  r /= b;
> > +  r += __builtin_complex (100.0bf16, 100.0bf16);
> > +  r -= __builtin_complex (100.0bf16, 100.0bf16);
> > +  if (r != b)
> > +    abort ();
> > +  r = fn (r);
> > +  if (__real__ r != 0.5bf16 || __imag__ r != 0.75bf16)
> > +    abort ();
> > +  exit (0);
> > +}
> > --- gcc/testsuite/gcc.dg/torture/builtin-issignaling-1.c.jj   2022-10-03 18:00:53.118734300 +0200
> > +++ gcc/testsuite/gcc.dg/torture/builtin-issignaling-1.c      2022-10-13 17:39:19.387313780 +0200
> > @@ -4,7 +4,7 @@
> >   /* Workaround for PR57484 on ia32: */
> >   /* { dg-additional-options "-msse2 -mfpmath=sse" { target { ia32 && sse2_runtime } } } */
> >
> > -#ifndef EXT
> > +#if !defined(EXT) && !defined(TYPE)
> >   int
> >   f1 (void)
> >   {
> > @@ -41,31 +41,42 @@ f6 (long double x)
> >     return __builtin_issignaling (x);
> >   }
> >   #else
> > -#define CONCATX(X, Y) X ## Y
> > -#define CONCAT(X, Y) CONCATX (X, Y)
> > -#define CONCAT3(X, Y, Z) CONCAT (CONCAT (X, Y), Z)
> > -#define CONCAT4(W, X, Y, Z) CONCAT (CONCAT (CONCAT (W, X), Y), Z)
> > -
> > -#if EXT
> > -# define TYPE CONCAT3 (_Float, WIDTH, x)
> > -# define CST(C) CONCAT4 (C, f, WIDTH, x)
> > -# define FN(F) CONCAT4 (F, f, WIDTH, x)
> > -#else
> > -# define TYPE CONCAT (_Float, WIDTH)
> > -# define CST(C) CONCAT3 (C, f, WIDTH)
> > -# define FN(F) CONCAT3 (F, f, WIDTH)
> > +#ifndef TYPE
> > +# define CONCATX(X, Y) X ## Y
> > +# define CONCAT(X, Y) CONCATX (X, Y)
> > +# define CONCAT3(X, Y, Z) CONCAT (CONCAT (X, Y), Z)
> > +# define CONCAT4(W, X, Y, Z) CONCAT (CONCAT (CONCAT (W, X), Y), Z)
> > +
> > +# if EXT
> > +#  define TYPE CONCAT3 (_Float, WIDTH, x)
> > +#  define CST(C) CONCAT4 (C, f, WIDTH, x)
> > +#  define FN(F) CONCAT4 (F, f, WIDTH, x)
> > +# else
> > +#  define TYPE CONCAT (_Float, WIDTH)
> > +#  define CST(C) CONCAT3 (C, f, WIDTH)
> > +#  define FN(F) CONCAT3 (F, f, WIDTH)
> > +# endif
> > +#endif
> > +#ifndef NANS
> > +# define NANS(x) FN (__builtin_nans) (x)
> > +#endif
> > +#ifndef NAN
> > +# define NAN(x) FN (__builtin_nan) (x)
> > +#endif
> > +#ifndef INF
> > +# define INF FN (__builtin_inf) ()
> >   #endif
> >
> >   int
> >   f1 (void)
> >   {
> > -  return __builtin_issignaling (FN (__builtin_nans) (""));
> > +  return __builtin_issignaling (NANS (""));
> >   }
> >
> >   int
> >   f2 (void)
> >   {
> > -  return __builtin_issignaling (FN (__builtin_nan) (""));
> > +  return __builtin_issignaling (NAN (""));
> >   }
> >
> >   int
> > @@ -118,10 +129,10 @@ main ()
> >     if (!f6 (z))
> >       __builtin_abort ();
> >   #else
> > -  if (f4 (w) || !f4 (FN (__builtin_nans) ("0x123")) || f4 (CST (42.0)) || f4 (FN (__builtin_nan) ("0x234"))
> > -      || f4 (FN (__builtin_inf) ()) || f4 (-FN (__builtin_inf) ()) || f4 (CST (-42.0)) || f4 (CST (-0.0)) || f4 (CST (0.0)))
> > +  if (f4 (w) || !f4 (NANS ("0x123")) || f4 (CST (42.0)) || f4 (NAN ("0x234"))
> > +      || f4 (INF) || f4 (-INF) || f4 (CST (-42.0)) || f4 (CST (-0.0)) || f4 (CST (0.0)))
> >       __builtin_abort ();
> > -  w = FN (__builtin_nans) ("");
> > +  w = NANS ("");
> >     asm volatile ("" : : : "memory");
> >     if (!f4 (w))
> >       __builtin_abort ();
> > --- gcc/testsuite/gcc.dg/torture/floatn-basic.h.jj    2022-10-03 18:00:53.118734300 +0200
> > +++ gcc/testsuite/gcc.dg/torture/floatn-basic.h       2022-10-13 16:57:09.285770456 +0200
> > @@ -9,14 +9,16 @@
> >   #define CONCAT3(X, Y, Z) CONCAT (CONCAT (X, Y), Z)
> >   #define CONCAT4(W, X, Y, Z) CONCAT (CONCAT (CONCAT (W, X), Y), Z)
> >
> > -#if EXT
> > -# define TYPE CONCAT3 (_Float, WIDTH, x)
> > -# define CST(C) CONCAT4 (C, f, WIDTH, x)
> > -# define CSTU(C) CONCAT4 (C, F, WIDTH, x)
> > -#else
> > -# define TYPE CONCAT (_Float, WIDTH)
> > -# define CST(C) CONCAT3 (C, f, WIDTH)
> > -# define CSTU(C) CONCAT3 (C, F, WIDTH)
> > +#ifndef TYPE
> > +# if EXT
> > +#  define TYPE CONCAT3 (_Float, WIDTH, x)
> > +#  define CST(C) CONCAT4 (C, f, WIDTH, x)
> > +#  define CSTU(C) CONCAT4 (C, F, WIDTH, x)
> > +# else
> > +#  define TYPE CONCAT (_Float, WIDTH)
> > +#  define CST(C) CONCAT3 (C, f, WIDTH)
> > +#  define CSTU(C) CONCAT3 (C, F, WIDTH)
> > +# endif
> >   #endif
> >
> >   extern void exit (int);
> > --- gcc/testsuite/gcc.target/i386/vect-bfloat16-typecheck_2.c.jj      2022-10-03 18:00:53.137734043 +0200
> > +++ gcc/testsuite/gcc.target/i386/vect-bfloat16-typecheck_2.c 2022-10-13 16:57:09.306770168 +0200
> > @@ -45,19 +45,19 @@ __m256bf16 footest (__m256bf16 vector0)
> >     __m256bf16 vector2_1 = {};
> >     __m256bf16 vector2_2 = { glob_bfloat };
> >     __m256bf16 vector2_3 = { glob_bfloat, glob_bfloat, glob_bfloat, glob_bfloat };
> > -  __m256bf16 vector2_4 = { 0 }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __m256bf16 vector2_5 = { 0.1 }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __m256bf16 vector2_6 = { is_a_float16 }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __m256bf16 vector2_7 = { is_a_float }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __m256bf16 vector2_8 = { is_an_int }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __m256bf16 vector2_9 = { is_a_short_int }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __m256bf16 vector2_10 = { 0.0, 0, is_a_short_int, is_a_float }; /* { dg-error "invalid conversion to type '__bf16'" } */
> > -
> > -  __v8si initi_2_1 = { glob_bfloat };   /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  __m256 initi_2_2 = { glob_bfloat }; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  __m256h initi_2_3 = { glob_bfloat }; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  __m256i initi_2_5 = { glob_bfloat };   /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  __v16hi initi_2_6 = { glob_bfloat };   /* { dg-error {invalid conversion from type '__bf16'} } */
> > +  __m256bf16 vector2_4 = { 0 };
> > +  __m256bf16 vector2_5 = { 0.1 };
> > +  __m256bf16 vector2_6 = { is_a_float16 };
> > +  __m256bf16 vector2_7 = { is_a_float };
> > +  __m256bf16 vector2_8 = { is_an_int };
> > +  __m256bf16 vector2_9 = { is_a_short_int };
> > +  __m256bf16 vector2_10 = { 0.0, 0, is_a_short_int, is_a_float };
> > +
> > +  __v8si initi_2_1 = { glob_bfloat };
> > +  __m256 initi_2_2 = { glob_bfloat };
> > +  __m256h initi_2_3 = { glob_bfloat };
> > +  __m256i initi_2_5 = { glob_bfloat };
> > +  __v16hi initi_2_6 = { glob_bfloat };
> >
> >     /* Assignments to/from vectors.  */
> >
> > @@ -79,25 +79,25 @@ __m256bf16 footest (__m256bf16 vector0)
> >     /* Assignments to/from elements.  */
> >
> >     vector2_3[0] = glob_bfloat;
> > -  vector2_3[0] = is_an_int; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  vector2_3[0] = is_a_short_int; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  vector2_3[0] = is_a_float; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  vector2_3[0] = is_a_float16; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  vector2_3[0] = 0; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  vector2_3[0] = 0.1; /* { dg-error {invalid conversion to type '__bf16'} } */
> > +  vector2_3[0] = is_an_int;
> > +  vector2_3[0] = is_a_short_int;
> > +  vector2_3[0] = is_a_float;
> > +  vector2_3[0] = is_a_float16;
> > +  vector2_3[0] = 0;
> > +  vector2_3[0] = 0.1;
> >
> >     glob_bfloat = vector2_3[0];
> > -  is_an_int = vector2_3[0]; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  is_a_short_int = vector2_3[0]; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  is_a_float = vector2_3[0]; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  is_a_float16 = vector2_3[0]; /* { dg-error {invalid conversion from type '__bf16'} } */
> > +  is_an_int = vector2_3[0];
> > +  is_a_short_int = vector2_3[0];
> > +  is_a_float = vector2_3[0];
> > +  is_a_float16 = vector2_3[0];
> >
> >     /* Compound literals.  */
> >
> >     (__m256bf16) {};
> >
> > -  (__m256bf16) { 0 }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  (__m256bf16) { 0.1 }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > +  (__m256bf16) { 0 };
> > +  (__m256bf16) { 0.1 };
> >     (__m256bf16) { is_a_float_vec }; /* { dg-error {incompatible types when initializing type '__bf16' using type '__m256'} } */
> >     (__m256bf16) { is_an_int_vec }; /* { dg-error {incompatible types when initializing type '__bf16' using type '__v8si'} } */
> >     (__m256bf16) { is_a_long_int_pair }; /* { dg-error {incompatible types when initializing type '__bf16' using type '__m256i'} } */
> > @@ -176,16 +176,16 @@ __m256bf16 footest (__m256bf16 vector0)
> >     bfloat_ptr = &bfloat_ptr3[1];
> >
> >     /* Simple comparison.  */
> > -  vector0 > glob_bfloat_vec; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  glob_bfloat_vec == vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  vector0 > is_a_float_vec; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  is_a_float_vec == vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  vector0 > 0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  0 == vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  vector0 > 0.1; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  0.1 == vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  vector0 > is_an_int_vec; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  is_an_int_vec == vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > +  vector0 > glob_bfloat_vec;
> > +  glob_bfloat_vec == vector0;
> > +  vector0 > is_a_float_vec; /* { dg-error {comparing vectors with different element types} } */
> > +  is_a_float_vec == vector0; /* { dg-error {comparing vectors with different element types} } */
> > +  vector0 > 0;
> > +  0 == vector0;
> > +  vector0 > 0.1; /* { dg-error {conversion of scalar 'double' to vector '__m256bf16'} } */
> > +  0.1 == vector0; /* { dg-error {conversion of scalar 'double' to vector '__m256bf16'} } */
> > +  vector0 > is_an_int_vec; /* { dg-error {comparing vectors with different element types} } */
> > +  is_an_int_vec == vector0; /* { dg-error {comparing vectors with different element types} } */
> >
> >     /* Pointer comparison.  */
> >
> > @@ -224,24 +224,24 @@ __m256bf16 footest (__m256bf16 vector0)
> >
> >     /* Unary operators.  */
> >
> > -  +vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  -vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  ~vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  !vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > +  +vector0;
> > +  -vector0;
> > +  ~vector0; /* { dg-error {wrong type argument to bit-complement} } */
> > +  !vector0; /* { dg-error {wrong type argument to unary exclamation mark} } */
> >     *vector0; /* { dg-error {invalid type argument of unary '\*'} } */
> > -  __real vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  __imag vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  ++vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  --vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  vector0++; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  vector0--; /* { dg-error {operation not permitted on type '__bf16'} } */
> > +  __real vector0; /* { dg-error {wrong type argument to __real} } */
> > +  __imag vector0; /* { dg-error {wrong type argument to __imag} } */
> > +  ++vector0;
> > +  --vector0;
> > +  vector0++;
> > +  vector0--;
> >
> >     /* Binary arithmetic operations.  */
> >
> > -  vector0 = glob_bfloat_vec + *bfloat_ptr; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  vector0 = glob_bfloat_vec + 0.1; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  vector0 = glob_bfloat_vec + 0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  vector0 = glob_bfloat_vec + is_a_float_vec; /* { dg-error {operation not permitted on type '__bf16'} } */
> > +  vector0 = glob_bfloat_vec + *bfloat_ptr;
> > +  vector0 = glob_bfloat_vec + 0.1; /* { dg-error {conversion of scalar 'double' to vector '__m256bf16'} } */
> > +  vector0 = glob_bfloat_vec + 0;
> > +  vector0 = glob_bfloat_vec + is_a_float_vec; /* { dg-error {invalid operands to binary \+} } */
> >
> >     return vector0;
> >   }
> > --- gcc/testsuite/gcc.target/i386/sse2-bfloat16-scalar-typecheck.c.jj 2022-10-03 18:00:53.136734057 +0200
> > +++ gcc/testsuite/gcc.target/i386/sse2-bfloat16-scalar-typecheck.c    2022-10-13 16:57:09.327769880 +0200
> > @@ -12,8 +12,8 @@ double is_a_double;
> >
> >   float *float_ptr;
> >
> > -__bf16 foo1 (void) { return (__bf16) 0x1234; } /* { dg-error {invalid conversion to type '__bf16'} } */
> > -__bf16 foo2 (void) { return (__bf16) (short) 0x1234; } /* { dg-error {invalid conversion to type '__bf16'} } */
> > +__bf16 foo1 (void) { return (__bf16) 0x1234; }
> > +__bf16 foo2 (void) { return (__bf16) (short) 0x1234; }
> >
> >   __bf16 footest (__bf16 scalar0)
> >   {
> > @@ -22,87 +22,87 @@ __bf16 footest (__bf16 scalar0)
> >
> >     __bf16 scalar1_1;
> >     __bf16 scalar1_2 = glob_bfloat;
> > -  __bf16 scalar1_3 = 0;   /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __bf16 scalar1_4 = 0.1; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __bf16 scalar1_5 = is_a_float; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __bf16 scalar1_6 = is_an_int;  /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __bf16 scalar1_7 = is_a_float16; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __bf16 scalar1_8 = is_a_double; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __bf16 scalar1_9 = is_a_short_int; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -
> > -  int initi_1_1 = glob_bfloat; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  float initi_1_2 = glob_bfloat; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  _Float16 initi_1_3 = glob_bfloat; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  short initi_1_4 = glob_bfloat; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  double initi_1_5 = glob_bfloat; /* { dg-error {invalid conversion from type '__bf16'} } */
> > +  __bf16 scalar1_3 = 0;
> > +  __bf16 scalar1_4 = 0.1;
> > +  __bf16 scalar1_5 = is_a_float;
> > +  __bf16 scalar1_6 = is_an_int;
> > +  __bf16 scalar1_7 = is_a_float16;
> > +  __bf16 scalar1_8 = is_a_double;
> > +  __bf16 scalar1_9 = is_a_short_int;
> > +
> > +  int initi_1_1 = glob_bfloat;
> > +  float initi_1_2 = glob_bfloat;
> > +  _Float16 initi_1_3 = glob_bfloat;
> > +  short initi_1_4 = glob_bfloat;
> > +  double initi_1_5 = glob_bfloat;
> >
> >     __bf16 scalar2_1 = {};
> >     __bf16 scalar2_2 = { glob_bfloat };
> > -  __bf16 scalar2_3 = { 0 }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __bf16 scalar2_4 = { 0.1 }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __bf16 scalar2_5 = { is_a_float }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __bf16 scalar2_6 = { is_an_int }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __bf16 scalar2_7 = { is_a_float16 }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __bf16 scalar2_8 = { is_a_double }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __bf16 scalar2_9 = { is_a_short_int }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -
> > -  int initi_2_1 = { glob_bfloat }; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  float initi_2_2 = { glob_bfloat }; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  _Float16 initi_2_3 = { glob_bfloat }; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  short initi_2_4 = { glob_bfloat }; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  double initi_2_5 = { glob_bfloat }; /* { dg-error {invalid conversion from type '__bf16'} } */
> > +  __bf16 scalar2_3 = { 0 };
> > +  __bf16 scalar2_4 = { 0.1 };
> > +  __bf16 scalar2_5 = { is_a_float };
> > +  __bf16 scalar2_6 = { is_an_int };
> > +  __bf16 scalar2_7 = { is_a_float16 };
> > +  __bf16 scalar2_8 = { is_a_double };
> > +  __bf16 scalar2_9 = { is_a_short_int };
> > +
> > +  int initi_2_1 = { glob_bfloat };
> > +  float initi_2_2 = { glob_bfloat };
> > +  _Float16 initi_2_3 = { glob_bfloat };
> > +  short initi_2_4 = { glob_bfloat };
> > +  double initi_2_5 = { glob_bfloat };
> >
> >     /* Assignments.  */
> >
> >     glob_bfloat = glob_bfloat;
> > -  glob_bfloat = 0;   /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  glob_bfloat = 0.1; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  glob_bfloat = is_a_float; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  glob_bfloat = is_an_int; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  glob_bfloat = is_a_float16; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  glob_bfloat = is_a_double; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  glob_bfloat = is_a_short_int; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -
> > -  is_an_int = glob_bfloat; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  is_a_float = glob_bfloat; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  is_a_float16 = glob_bfloat; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  is_a_double = glob_bfloat; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  is_a_short_int = glob_bfloat; /* { dg-error {invalid conversion from type '__bf16'} } */
> > +  glob_bfloat = 0;
> > +  glob_bfloat = 0.1;
> > +  glob_bfloat = is_a_float;
> > +  glob_bfloat = is_an_int;
> > +  glob_bfloat = is_a_float16;
> > +  glob_bfloat = is_a_double;
> > +  glob_bfloat = is_a_short_int;
> > +
> > +  is_an_int = glob_bfloat;
> > +  is_a_float = glob_bfloat;
> > +  is_a_float16 = glob_bfloat;
> > +  is_a_double = glob_bfloat;
> > +  is_a_short_int = glob_bfloat;
> >
> >     /* Casting.  */
> >
> >     (void) glob_bfloat;
> >     (__bf16) glob_bfloat;
> >
> > -  (int) glob_bfloat; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  (float) glob_bfloat; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  (_Float16) glob_bfloat; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  (double) glob_bfloat; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  (short) glob_bfloat; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -
> > -  (__bf16) is_an_int; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  (__bf16) is_a_float; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  (__bf16) is_a_float16; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  (__bf16) is_a_double; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  (__bf16) is_a_short_int; /* { dg-error {invalid conversion to type '__bf16'} } */
> > +  (int) glob_bfloat;
> > +  (float) glob_bfloat;
> > +  (_Float16) glob_bfloat;
> > +  (double) glob_bfloat;
> > +  (short) glob_bfloat;
> > +
> > +  (__bf16) is_an_int;
> > +  (__bf16) is_a_float;
> > +  (__bf16) is_a_float16;
> > +  (__bf16) is_a_double;
> > +  (__bf16) is_a_short_int;
> >
> >     /* Compound literals.  */
> >
> >     (__bf16) {};
> >     (__bf16) { glob_bfloat };
> > -  (__bf16) { 0 }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  (__bf16) { 0.1 }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  (__bf16) { is_a_float }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  (__bf16) { is_an_int }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  (__bf16) { is_a_float16 }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  (__bf16) { is_a_double }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  (__bf16) { is_a_short_int }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -
> > -  (int) { glob_bfloat }; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  (float) { glob_bfloat }; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  (_Float16) { glob_bfloat }; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  (double) { glob_bfloat }; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  (short) { glob_bfloat }; /* { dg-error {invalid conversion from type '__bf16'} } */
> > +  (__bf16) { 0 };
> > +  (__bf16) { 0.1 };
> > +  (__bf16) { is_a_float };
> > +  (__bf16) { is_an_int };
> > +  (__bf16) { is_a_float16 };
> > +  (__bf16) { is_a_double };
> > +  (__bf16) { is_a_short_int };
> > +
> > +  (int) { glob_bfloat };
> > +  (float) { glob_bfloat };
> > +  (_Float16) { glob_bfloat };
> > +  (double) { glob_bfloat };
> > +  (short) { glob_bfloat };
> >
> >     /* Arrays and Structs.  */
> >
> > @@ -145,16 +145,16 @@ __bf16 footest (__bf16 scalar0)
> >     bfloat_ptr = &bfloat_ptr3[1];
> >
> >     /* Simple comparison.  */
> > -  scalar0 > glob_bfloat; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  glob_bfloat == scalar0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  scalar0 > is_a_float; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  is_a_float == scalar0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  scalar0 > 0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  0 == scalar0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  scalar0 > 0.1; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  0.1 == scalar0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  scalar0 > is_an_int; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  is_an_int == scalar0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > +  scalar0 > glob_bfloat;
> > +  glob_bfloat == scalar0;
> > +  scalar0 > is_a_float;
> > +  is_a_float == scalar0;
> > +  scalar0 > 0;
> > +  0 == scalar0;
> > +  scalar0 > 0.1;
> > +  0.1 == scalar0;
> > +  scalar0 > is_an_int;
> > +  is_an_int == scalar0;
> >
> >     /* Pointer comparison.  */
> >
> > @@ -174,41 +174,41 @@ __bf16 footest (__bf16 scalar0)
> >     /* Conditional expressions.  */
> >
> >     0 ? scalar0 : scalar0;
> > -  0 ? scalar0 : is_a_float; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  0 ? is_a_float : scalar0; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  0 ? scalar0 : 0; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  0 ? 0 : scalar0; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  0 ? 0.1 : scalar0; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  0 ? scalar0 : 0.1; /* { dg-error {invalid conversion from type '__bf16'} } */
> > +  0 ? scalar0 : is_a_float;
> > +  0 ? is_a_float : scalar0;
> > +  0 ? scalar0 : 0;
> > +  0 ? 0 : scalar0;
> > +  0 ? 0.1 : scalar0;
> > +  0 ? scalar0 : 0.1;
> >     0 ? bfloat_ptr : bfloat_ptr2;
> >     0 ? bfloat_ptr : float_ptr; /* { dg-warning {pointer type mismatch in conditional expression} } */
> >     0 ? float_ptr : bfloat_ptr; /* { dg-warning {pointer type mismatch in conditional expression} } */
> >
> > -  scalar0 ? scalar0 : scalar0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  scalar0 ? is_a_float : scalar0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  scalar0 ? scalar0 : is_a_float; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  scalar0 ? is_a_float : is_a_float; /* { dg-error {operation not permitted on type '__bf16'} } */
> > +  scalar0 ? scalar0 : scalar0;
> > +  scalar0 ? is_a_float : scalar0;
> > +  scalar0 ? scalar0 : is_a_float;
> > +  scalar0 ? is_a_float : is_a_float;
> >
> >     /* Unary operators.  */
> >
> > -  +scalar0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  -scalar0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  ~scalar0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  !scalar0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > +  +scalar0;
> > +  -scalar0;
> > +  ~scalar0; /* { dg-error {wrong type argument to bit-complement} } */
> > +  !scalar0;
> >     *scalar0; /* { dg-error {invalid type argument of unary '\*'} } */
> > -  __real scalar0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  __imag scalar0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  ++scalar0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  --scalar0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  scalar0++; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  scalar0--; /* { dg-error {operation not permitted on type '__bf16'} } */
> > +  __real scalar0;
> > +  __imag scalar0;
> > +  ++scalar0;
> > +  --scalar0;
> > +  scalar0++;
> > +  scalar0--;
> >
> >     /* Binary arithmetic operations.  */
> >
> > -  scalar0 = glob_bfloat + *bfloat_ptr; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  scalar0 = glob_bfloat + 0.1; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  scalar0 = glob_bfloat + 0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  scalar0 = glob_bfloat + is_a_float; /* { dg-error {operation not permitted on type '__bf16'} } */
> > +  scalar0 = glob_bfloat + *bfloat_ptr;
> > +  scalar0 = glob_bfloat + 0.1;
> > +  scalar0 = glob_bfloat + 0;
> > +  scalar0 = glob_bfloat + is_a_float;
> >
> >     return scalar0;
> >   }
> > --- gcc/testsuite/gcc.target/i386/vect-bfloat16-typecheck_1.c.jj      2022-10-03 18:00:53.136734057 +0200
> > +++ gcc/testsuite/gcc.target/i386/vect-bfloat16-typecheck_1.c 2022-10-13 16:57:09.344769646 +0200
> > @@ -48,20 +48,20 @@ __m128bf16 footest (__m128bf16 vector0)
> >     __m128bf16 vector2_1 = {};
> >     __m128bf16 vector2_2 = { glob_bfloat };
> >     __m128bf16 vector2_3 = { glob_bfloat, glob_bfloat, glob_bfloat, glob_bfloat };
> > -  __m128bf16 vector2_4 = { 0 }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __m128bf16 vector2_5 = { 0.1 }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __m128bf16 vector2_6 = { is_a_float16 }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __m128bf16 vector2_7 = { is_a_float }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __m128bf16 vector2_8 = { is_an_int }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __m128bf16 vector2_9 = { is_a_short_int }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __m128bf16 vector2_10 = { 0.0, 0, is_a_short_int, is_a_float }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -
> > -  __v8si initi_2_1 = { glob_bfloat };   /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  __m256 initi_2_2 = { glob_bfloat }; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  __m128h initi_2_3 = { glob_bfloat }; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  __m128 initi_2_4 = { glob_bfloat }; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  __v4si initi_2_5 = { glob_bfloat };   /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  __v4hi initi_2_6 = { glob_bfloat };   /* { dg-error {invalid conversion from type '__bf16'} } */
> > +  __m128bf16 vector2_4 = { 0 };
> > +  __m128bf16 vector2_5 = { 0.1 };
> > +  __m128bf16 vector2_6 = { is_a_float16 };
> > +  __m128bf16 vector2_7 = { is_a_float };
> > +  __m128bf16 vector2_8 = { is_an_int };
> > +  __m128bf16 vector2_9 = { is_a_short_int };
> > +  __m128bf16 vector2_10 = { 0.0, 0, is_a_short_int, is_a_float };
> > +
> > +  __v8si initi_2_1 = { glob_bfloat };
> > +  __m256 initi_2_2 = { glob_bfloat };
> > +  __m128h initi_2_3 = { glob_bfloat };
> > +  __m128 initi_2_4 = { glob_bfloat };
> > +  __v4si initi_2_5 = { glob_bfloat };
> > +  __v4hi initi_2_6 = { glob_bfloat };
> >
> >     /* Assignments to/from vectors.  */
> >
> > @@ -85,25 +85,25 @@ __m128bf16 footest (__m128bf16 vector0)
> >     /* Assignments to/from elements.  */
> >
> >     vector2_3[0] = glob_bfloat;
> > -  vector2_3[0] = is_an_int; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  vector2_3[0] = is_a_short_int; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  vector2_3[0] = is_a_float; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  vector2_3[0] = is_a_float16; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  vector2_3[0] = 0; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  vector2_3[0] = 0.1; /* { dg-error {invalid conversion to type '__bf16'} } */
> > +  vector2_3[0] = is_an_int;
> > +  vector2_3[0] = is_a_short_int;
> > +  vector2_3[0] = is_a_float;
> > +  vector2_3[0] = is_a_float16;
> > +  vector2_3[0] = 0;
> > +  vector2_3[0] = 0.1;
> >
> >     glob_bfloat = vector2_3[0];
> > -  is_an_int = vector2_3[0]; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  is_a_short_int = vector2_3[0]; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  is_a_float = vector2_3[0]; /* { dg-error {invalid conversion from type '__bf16'} } */
> > -  is_a_float16 = vector2_3[0]; /* { dg-error {invalid conversion from type '__bf16'} } */
> > +  is_an_int = vector2_3[0];
> > +  is_a_short_int = vector2_3[0];
> > +  is_a_float = vector2_3[0];
> > +  is_a_float16 = vector2_3[0];
> >
> >     /* Compound literals.  */
> >
> >     (__m128bf16) {};
> >
> > -  (__m128bf16) { 0 }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  (__m128bf16) { 0.1 }; /* { dg-error {invalid conversion to type '__bf16'} } */
> > +  (__m128bf16) { 0 };
> > +  (__m128bf16) { 0.1 };
> >     (__m128bf16) { is_a_float_vec }; /* { dg-error {incompatible types when initializing type '__bf16' using type '__m256'} } */
> >     (__m128bf16) { is_an_int_vec }; /* { dg-error {incompatible types when initializing type '__bf16' using type '__v8si'} } */
> >     (__m128bf16) { is_a_float_pair }; /* { dg-error {incompatible types when initializing type '__bf16' using type '__m128'} } */
> > @@ -186,16 +186,16 @@ __m128bf16 footest (__m128bf16 vector0)
> >     bfloat_ptr = &bfloat_ptr3[1];
> >
> >     /* Simple comparison.  */
> > -  vector0 > glob_bfloat_vec; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  glob_bfloat_vec == vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  vector0 > is_a_float_vec; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  is_a_float_vec == vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  vector0 > 0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  0 == vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  vector0 > 0.1; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  0.1 == vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  vector0 > is_an_int_vec; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  is_an_int_vec == vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > +  vector0 > glob_bfloat_vec;
> > +  glob_bfloat_vec == vector0;
> > +  vector0 > is_a_float_vec; /* { dg-error {comparing vectors with different element types} } */
> > +  is_a_float_vec == vector0; /* { dg-error {comparing vectors with different element types} } */
> > +  vector0 > 0;
> > +  0 == vector0;
> > +  vector0 > 0.1; /* { dg-error {conversion of scalar 'double' to vector '__m128bf16'} } */
> > +  0.1 == vector0; /* { dg-error {conversion of scalar 'double' to vector '__m128bf16'} } */
> > +  vector0 > is_an_int_vec; /* { dg-error {comparing vectors with different element types} } */
> > +  is_an_int_vec == vector0; /* { dg-error {comparing vectors with different element types} } */
> >
> >     /* Pointer comparison.  */
> >
> > @@ -234,24 +234,24 @@ __m128bf16 footest (__m128bf16 vector0)
> >
> >     /* Unary operators.  */
> >
> > -  +vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  -vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  ~vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  !vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > +  +vector0;
> > +  -vector0;
> > +  ~vector0; /* { dg-error {wrong type argument to bit-complement} } */
> > +  !vector0; /* { dg-error {wrong type argument to unary exclamation mark} } */
> >     *vector0; /* { dg-error {invalid type argument of unary '\*'} } */
> > -  __real vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  __imag vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  ++vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  --vector0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  vector0++; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  vector0--; /* { dg-error {operation not permitted on type '__bf16'} } */
> > +  __real vector0; /* { dg-error {wrong type argument to __real} } */
> > +  __imag vector0; /* { dg-error {wrong type argument to __imag} } */
> > +  ++vector0;
> > +  --vector0;
> > +  vector0++;
> > +  vector0--;
> >
> >     /* Binary arithmetic operations.  */
> >
> > -  vector0 = glob_bfloat_vec + *bfloat_ptr; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  vector0 = glob_bfloat_vec + 0.1; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  vector0 = glob_bfloat_vec + 0; /* { dg-error {operation not permitted on type '__bf16'} } */
> > -  vector0 = glob_bfloat_vec + is_a_float_vec; /* { dg-error {operation not permitted on type '__bf16'} } */
> > +  vector0 = glob_bfloat_vec + *bfloat_ptr;
> > +  vector0 = glob_bfloat_vec + 0.1; /* { dg-error {conversion of scalar 'double' to vector '__m128bf16'} } */
> > +  vector0 = glob_bfloat_vec + 0;
> > +  vector0 = glob_bfloat_vec + is_a_float_vec; /* { dg-error {invalid operands to binary \+} } */
> >
> >     return vector0;
> >   }
> > --- gcc/testsuite/g++.target/i386/bfloat_cpp_typecheck.C.jj   2022-10-03 18:00:53.109734421 +0200
> > +++ gcc/testsuite/g++.target/i386/bfloat_cpp_typecheck.C      2022-10-13 16:57:09.362769399 +0200
> > @@ -5,6 +5,6 @@ void foo (void)
> >   {
> >     __bf16 (); /* { dg-bogus {invalid conversion to type '__bf16'} } */
> >     __bf16 a = __bf16(); /* { dg-bogus {invalid conversion to type '__bf16'} } */
> > -  __bf16 (0x1234); /* { dg-error {invalid conversion to type '__bf16'} } */
> > -  __bf16 (0.1); /* { dg-error {invalid conversion to type '__bf16'} } */
> > +  __bf16 (0x1234); /* { dg-bogus {invalid conversion to type '__bf16'} } */
> > +  __bf16 (0.1); /* { dg-bogus {invalid conversion to type '__bf16'} } */
> >   }
> > --- libcpp/include/cpplib.h.jj        2022-10-03 18:00:53.251732506 +0200
> > +++ libcpp/include/cpplib.h   2022-10-13 16:57:09.384769097 +0200
> > @@ -1275,6 +1275,7 @@ struct cpp_num
> >   #define CPP_N_USERDEF       0x1000000 /* C++11 user-defined literal.  */
> >
> >   #define CPP_N_SIZE_T        0x2000000 /* C++23 size_t literal.  */
> > +#define CPP_N_BFLOAT16       0x4000000 /* std::bfloat16_t type.  */
> >
> >   #define CPP_N_WIDTH_FLOATN_NX       0xF0000000 /* _FloatN / _FloatNx value
> >                                             of N, divided by 16.  */
> > --- libcpp/expr.cc.jj 2022-10-03 18:00:53.221732910 +0200
> > +++ libcpp/expr.cc    2022-10-13 16:58:01.360055690 +0200
> > @@ -91,10 +91,10 @@ interpret_float_suffix (cpp_reader *pfil
> >     size_t orig_len = len;
> >     const uchar *orig_s = s;
> >     size_t flags;
> > -  size_t f, d, l, w, q, i, fn, fnx, fn_bits;
> > +  size_t f, d, l, w, q, i, fn, fnx, fn_bits, bf16;
> >
> >     flags = 0;
> > -  f = d = l = w = q = i = fn = fnx = fn_bits = 0;
> > +  f = d = l = w = q = i = fn = fnx = fn_bits = bf16 = 0;
> >
> >     /* The following decimal float suffixes, from TR 24732:2009, TS
> >        18661-2:2015 and C2X, are supported:
> > @@ -131,7 +131,8 @@ interpret_float_suffix (cpp_reader *pfil
> >        w, W - machine-specific type such as __float80 (GNU extension).
> >        q, Q - machine-specific type such as __float128 (GNU extension).
> >        fN, FN - _FloatN (TS 18661-3:2015).
> > -     fNx, FNx - _FloatNx (TS 18661-3:2015).  */
> > +     fNx, FNx - _FloatNx (TS 18661-3:2015).
> > +     bf16, BF16 - std::bfloat16_t (ISO C++23).  */
> >
> >     /* Process decimal float suffixes, which are two letters starting
> >        with d or D.  Order and case are significant.  */
> > @@ -239,6 +240,19 @@ interpret_float_suffix (cpp_reader *pfil
> >               fn++;
> >           }
> >         break;
> > +     case 'b': case 'B':
> > +       if (len > 2
> > +           /* Except for bf16 / BF16 where case is significant.  */
> > +           && s[1] == (s[0] == 'b' ? 'f' : 'F')
> > +           && s[2] == '1'
> > +           && s[3] == '6')
> > +         {
> > +           bf16++;
> > +           len -= 3;
> > +           s += 3;
> > +           break;
> > +         }
> > +       return 0;
> >       case 'd': case 'D': d++; break;
> >       case 'l': case 'L': l++; break;
> >       case 'w': case 'W': w++; break;
> > @@ -257,7 +271,7 @@ interpret_float_suffix (cpp_reader *pfil
> >        of N larger than can be represented in the return value.  The
> >        caller is responsible for rejecting _FloatN suffixes where
> >        _FloatN is not supported on the chosen target.  */
> > -  if (f + d + l + w + q + fn + fnx > 1 || i > 1)
> > +  if (f + d + l + w + q + fn + fnx + bf16 > 1 || i > 1)
> >       return 0;
> >     if (fn_bits > CPP_FLOATN_MAX)
> >       return 0;
> > @@ -295,6 +309,7 @@ interpret_float_suffix (cpp_reader *pfil
> >            q ? CPP_N_MD_Q :
> >            fn ? CPP_N_FLOATN | (fn_bits << CPP_FLOATN_SHIFT) :
> >            fnx ? CPP_N_FLOATNX | (fn_bits << CPP_FLOATN_SHIFT) :
> > +          bf16 ? CPP_N_BFLOAT16 :
> >            CPP_N_DEFAULT));
> >   }
> >
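For anyone following the suffix handling in the hunk above, here is a small standalone sketch of the case rule it enforces: the "f" has to match the case of the "b", so only bf16 and BF16 are accepted. The helper name is_bf16_suffix is hypothetical and is not part of libcpp; unlike interpret_float_suffix it looks at a bare four-character suffix only and ignores combinations with other suffix letters.

```c
#include <stddef.h>

/* Hypothetical helper for illustration only, not part of libcpp.
   Returns 1 if the LEN characters at S are exactly "bf16" or "BF16",
   0 otherwise.  */
static int
is_bf16_suffix (const unsigned char *s, size_t len)
{
  if (len != 4)
    return 0;
  if (s[0] != 'b' && s[0] != 'B')
    return 0;
  /* The second letter must have the same case as the first, so the
     mixed forms "bF16" and "Bf16" are rejected.  */
  return (s[1] == (s[0] == 'b' ? 'f' : 'F')
	  && s[2] == '1'
	  && s[3] == '6');
}
```

The check mirrors the condition in the patch, except that the length is tested up front rather than relying on the caller's loop counter.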
> > --- libgcc/config/i386/t-softfp.jj    2022-10-03 18:00:53.314731656 +0200
> > +++ libgcc/config/i386/t-softfp       2022-10-13 16:57:09.426768521 +0200
> > @@ -6,8 +6,9 @@ LIB2FUNCS_EXCLUDE += $(libgcc2-hf-functi
> >   libgcc2-hf-extras = $(addsuffix .c, $(libgcc2-hf-functions))
> >   LIB2ADD += $(addprefix $(srcdir)/config/i386/, $(libgcc2-hf-extras))
> >
> > -softfp_extensions := hfsf hfdf hftf hfxf sfdf sftf dftf xftf
> > -softfp_truncations := tfhf xfhf dfhf sfhf tfsf dfsf tfdf tfxf
> > +softfp_extensions := hfsf hfdf hftf hfxf sfdf sftf dftf xftf bfsf
> > +softfp_truncations := tfhf xfhf dfhf sfhf tfsf dfsf tfdf tfxf \
> > +                   tfbf xfbf dfbf sfbf hfbf
> >
> >   softfp_extras += eqhf2
> >
> > @@ -15,11 +16,17 @@ CFLAGS-extendhfsf2.c += -msse2
> >   CFLAGS-extendhfdf2.c += -msse2
> >   CFLAGS-extendhftf2.c += -msse2
> >   CFLAGS-extendhfxf2.c += -msse2
> > +CFLAGS-extendbfsf2.c += -msse2
> >
> >   CFLAGS-truncsfhf2.c += -msse2
> >   CFLAGS-truncdfhf2.c += -msse2
> >   CFLAGS-truncxfhf2.c += -msse2
> >   CFLAGS-trunctfhf2.c += -msse2
> > +CFLAGS-truncsfbf2.c += -msse2
> > +CFLAGS-truncdfbf2.c += -msse2
> > +CFLAGS-truncxfbf2.c += -msse2
> > +CFLAGS-trunctfbf2.c += -msse2
> > +CFLAGS-trunchfbf2.c += -msse2
> >
> >   CFLAGS-eqhf2.c += -msse2
> >   CFLAGS-_divhc3.c += -msse2
> > --- libgcc/config/i386/libgcc-glibc.ver.jj    2022-10-03 18:00:53.313731670 +0200
> > +++ libgcc/config/i386/libgcc-glibc.ver       2022-10-13 16:57:09.438768356 +0200
> > @@ -214,3 +214,13 @@ GCC_12.0.0 {
> >     __trunctfhf2
> >     __truncxfhf2
> >   }
> > +
> > +%inherit GCC_13.0.0 GCC_12.0.0
> > +GCC_13.0.0 {
> > +  __extendbfsf2
> > +  __truncdfbf2
> > +  __truncsfbf2
> > +  __trunctfbf2
> > +  __truncxfbf2
> > +  __trunchfbf2
> > +}
> > --- libgcc/config/i386/sfp-machine.h.jj       2022-10-03 18:00:53.313731670 +0200
> > +++ libgcc/config/i386/sfp-machine.h  2022-10-13 16:57:09.441768315 +0200
> > @@ -18,6 +18,7 @@ typedef int __gcc_CMPtype __attribute__
> >   #define _FP_QNANNEGATEDP 0
> >
> >   #define _FP_NANSIGN_H               1
> > +#define _FP_NANSIGN_B                1
> >   #define _FP_NANSIGN_S               1
> >   #define _FP_NANSIGN_D               1
> >   #define _FP_NANSIGN_E               1
> > --- libgcc/config/i386/64/sfp-machine.h.jj    2022-10-03 18:00:53.290731980 +0200
> > +++ libgcc/config/i386/64/sfp-machine.h       2022-10-13 16:57:09.451768178 +0200
> > @@ -14,6 +14,7 @@ typedef unsigned int UTItype __attribute
> >   #define _FP_DIV_MEAT_Q(R,X,Y)   _FP_DIV_MEAT_2_udiv(Q,R,X,Y)
> >
> >   #define _FP_NANFRAC_H               _FP_QNANBIT_H
> > +#define _FP_NANFRAC_B                _FP_QNANBIT_B
> >   #define _FP_NANFRAC_S               _FP_QNANBIT_S
> >   #define _FP_NANFRAC_D               _FP_QNANBIT_D
> >   #define _FP_NANFRAC_E               _FP_QNANBIT_E, 0
> > --- libgcc/config/i386/32/sfp-machine.h.jj    2022-10-03 18:00:53.290731980 +0200
> > +++ libgcc/config/i386/32/sfp-machine.h       2022-10-13 16:57:09.459768068 +0200
> > @@ -87,6 +87,7 @@
> >   #define _FP_DIV_MEAT_Q(R,X,Y)   _FP_DIV_MEAT_4_udiv(Q,R,X,Y)
> >
> >   #define _FP_NANFRAC_H               _FP_QNANBIT_H
> > +#define _FP_NANFRAC_B                _FP_QNANBIT_B
> >   #define _FP_NANFRAC_S               _FP_QNANBIT_S
> >   #define _FP_NANFRAC_D               _FP_QNANBIT_D, 0
> >   /* Even if XFmode is 12byte,  we have to pad it to
> > --- libgcc/soft-fp/brain.h.jj 2022-10-13 16:57:09.460768054 +0200
> > +++ libgcc/soft-fp/brain.h    2022-10-13 16:57:09.459768068 +0200
> > @@ -0,0 +1,172 @@
> > +/* Software floating-point emulation.
> > +   Definitions for Brain Floating Point format (bfloat16).
> > +   Copyright (C) 1997-2022 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   In addition to the permissions in the GNU Lesser General Public
> > +   License, the Free Software Foundation gives you unlimited
> > +   permission to link the compiled version of this file into
> > +   combinations with other programs, and to distribute those
> > +   combinations without any restriction coming from the use of this
> > +   file.  (The Lesser General Public License restrictions do apply in
> > +   other respects; for example, they cover modification of the file,
> > +   and distribution when not linked into a combine executable.)
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, see
> > +   <https://www.gnu.org/licenses/>.  */
> > +
> > +#ifndef SOFT_FP_BRAIN_H
> > +#define SOFT_FP_BRAIN_H      1
> > +
> > +#if _FP_W_TYPE_SIZE < 32
> > +# error "Here's a nickel kid.  Go buy yourself a real computer."
> > +#endif
> > +
> > +#define _FP_FRACTBITS_B              (_FP_W_TYPE_SIZE)
> > +
> > +#define _FP_FRACTBITS_DW_B   (_FP_W_TYPE_SIZE)
> > +
> > +#define _FP_FRACBITS_B               8
> > +#define _FP_FRACXBITS_B              (_FP_FRACTBITS_B - _FP_FRACBITS_B)
> > +#define _FP_WFRACBITS_B              (_FP_WORKBITS + _FP_FRACBITS_B)
> > +#define _FP_WFRACXBITS_B     (_FP_FRACTBITS_B - _FP_WFRACBITS_B)
> > +#define _FP_EXPBITS_B                8
> > +#define _FP_EXPBIAS_B                127
> > +#define _FP_EXPMAX_B         255
> > +
> > +#define _FP_QNANBIT_B                ((_FP_W_TYPE) 1 << (_FP_FRACBITS_B-2))
> > +#define _FP_QNANBIT_SH_B     ((_FP_W_TYPE) 1 << (_FP_FRACBITS_B-2+_FP_WORKBITS))
> > +#define _FP_IMPLBIT_B                ((_FP_W_TYPE) 1 << (_FP_FRACBITS_B-1))
> > +#define _FP_IMPLBIT_SH_B     ((_FP_W_TYPE) 1 << (_FP_FRACBITS_B-1+_FP_WORKBITS))
> > +#define _FP_OVERFLOW_B               ((_FP_W_TYPE) 1 << (_FP_WFRACBITS_B))
> > +
> > +#define _FP_WFRACBITS_DW_B   (2 * _FP_WFRACBITS_B)
> > +#define _FP_WFRACXBITS_DW_B  (_FP_FRACTBITS_DW_B - _FP_WFRACBITS_DW_B)
> > +#define _FP_HIGHBIT_DW_B     \
> > +  ((_FP_W_TYPE) 1 << (_FP_WFRACBITS_DW_B - 1) % _FP_W_TYPE_SIZE)
> > +
> > +/* The implementation of _FP_MUL_MEAT_B and _FP_DIV_MEAT_B should be
> > +   chosen by the target machine.  */
> > +
> > +typedef float BFtype __attribute__ ((mode (BF)));
> > +
> > +union _FP_UNION_B
> > +{
> > +  BFtype flt;
> > +  struct _FP_STRUCT_LAYOUT
> > +  {
> > +#if __BYTE_ORDER == __BIG_ENDIAN
> > +    unsigned sign : 1;
> > +    unsigned exp  : _FP_EXPBITS_B;
> > +    unsigned frac : _FP_FRACBITS_B - (_FP_IMPLBIT_B != 0);
> > +#else
> > +    unsigned frac : _FP_FRACBITS_B - (_FP_IMPLBIT_B != 0);
> > +    unsigned exp  : _FP_EXPBITS_B;
> > +    unsigned sign : 1;
> > +#endif
> > +  } bits;
> > +};
> > +
> > +#define FP_DECL_B(X)         _FP_DECL (1, X)
> > +#define FP_UNPACK_RAW_B(X, val)      _FP_UNPACK_RAW_1 (B, X, (val))
> > +#define FP_UNPACK_RAW_BP(X, val)     _FP_UNPACK_RAW_1_P (B, X, (val))
> > +#define FP_PACK_RAW_B(val, X)        _FP_PACK_RAW_1 (B, (val), X)
> > +#define FP_PACK_RAW_BP(val, X)                       \
> > +  do                                         \
> > +    {                                                \
> > +      if (!FP_INHIBIT_RESULTS)                       \
> > +     _FP_PACK_RAW_1_P (B, (val), X);         \
> > +    }                                                \
> > +  while (0)
> > +
> > +#define FP_UNPACK_B(X, val)                  \
> > +  do                                         \
> > +    {                                                \
> > +      _FP_UNPACK_RAW_1 (B, X, (val));                \
> > +      _FP_UNPACK_CANONICAL (B, 1, X);                \
> > +    }                                                \
> > +  while (0)
> > +
> > +#define FP_UNPACK_BP(X, val)                 \
> > +  do                                         \
> > +    {                                                \
> > +      _FP_UNPACK_RAW_1_P (B, X, (val));              \
> > +      _FP_UNPACK_CANONICAL (B, 1, X);                \
> > +    }                                                \
> > +  while (0)
> > +
> > +#define FP_UNPACK_SEMIRAW_B(X, val)          \
> > +  do                                         \
> > +    {                                                \
> > +      _FP_UNPACK_RAW_1 (B, X, (val));                \
> > +      _FP_UNPACK_SEMIRAW (B, 1, X);          \
> > +    }                                                \
> > +  while (0)
> > +
> > +#define FP_UNPACK_SEMIRAW_BP(X, val)         \
> > +  do                                         \
> > +    {                                                \
> > +      _FP_UNPACK_RAW_1_P (B, X, (val));              \
> > +      _FP_UNPACK_SEMIRAW (B, 1, X);          \
> > +    }                                                \
> > +  while (0)
> > +
> > +#define FP_PACK_B(val, X)                    \
> > +  do                                         \
> > +    {                                                \
> > +      _FP_PACK_CANONICAL (B, 1, X);          \
> > +      _FP_PACK_RAW_1 (B, (val), X);          \
> > +    }                                                \
> > +  while (0)
> > +
> > +#define FP_PACK_BP(val, X)                   \
> > +  do                                         \
> > +    {                                                \
> > +      _FP_PACK_CANONICAL (B, 1, X);          \
> > +      if (!FP_INHIBIT_RESULTS)                       \
> > +     _FP_PACK_RAW_1_P (B, (val), X);         \
> > +    }                                                \
> > +  while (0)
> > +
> > +#define FP_PACK_SEMIRAW_B(val, X)            \
> > +  do                                         \
> > +    {                                                \
> > +      _FP_PACK_SEMIRAW (B, 1, X);            \
> > +      _FP_PACK_RAW_1 (B, (val), X);          \
> > +    }                                                \
> > +  while (0)
> > +
> > +#define FP_PACK_SEMIRAW_BP(val, X)           \
> > +  do                                         \
> > +    {                                                \
> > +      _FP_PACK_SEMIRAW (B, 1, X);            \
> > +      if (!FP_INHIBIT_RESULTS)                       \
> > +     _FP_PACK_RAW_1_P (B, (val), X);         \
> > +    }                                                \
> > +  while (0)
> > +
> > +#define FP_TO_INT_B(r, X, rsz, rsg)  _FP_TO_INT (B, 1, (r), X, (rsz), (rsg))
> > +#define FP_TO_INT_ROUND_B(r, X, rsz, rsg)    \
> > +  _FP_TO_INT_ROUND (B, 1, (r), X, (rsz), (rsg))
> > +#define FP_FROM_INT_B(X, r, rs, rt)  _FP_FROM_INT (B, 1, X, (r), (rs), rt)
> > +
> > +/* BFmode arithmetic is not implemented.  */
> > +
> > +#define _FP_FRAC_HIGH_B(X)   _FP_FRAC_HIGH_1 (X)
> > +#define _FP_FRAC_HIGH_RAW_B(X)       _FP_FRAC_HIGH_1 (X)
> > +#define _FP_FRAC_HIGH_DW_B(X)        _FP_FRAC_HIGH_1 (X)
> > +
> > +#define FP_CMP_EQ_B(r, X, Y, ex)       _FP_CMP_EQ (B, 1, (r), X, Y, (ex))
> > +
> > +#endif /* !SOFT_FP_BRAIN_H */
> > --- libgcc/soft-fp/truncsfbf2.c.jj    2022-10-13 16:57:09.460768054 +0200
> > +++ libgcc/soft-fp/truncsfbf2.c       2022-10-13 16:57:09.460768054 +0200
> > @@ -0,0 +1,48 @@
> > +/* Software floating-point emulation.
> > +   Truncate IEEE single into bfloat16.
> > +   Copyright (C) 2022 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   In addition to the permissions in the GNU Lesser General Public
> > +   License, the Free Software Foundation gives you unlimited
> > +   permission to link the compiled version of this file into
> > +   combinations with other programs, and to distribute those
> > +   combinations without any restriction coming from the use of this
> > +   file.  (The Lesser General Public License restrictions do apply in
> > +   other respects; for example, they cover modification of the file,
> > +   and distribution when not linked into a combine executable.)
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, see
> > +   <http://www.gnu.org/licenses/>.  */
> > +
> > +#include "soft-fp.h"
> > +#include "brain.h"
> > +#include "single.h"
> > +
> > +BFtype
> > +__truncsfbf2 (SFtype a)
> > +{
> > +  FP_DECL_EX;
> > +  FP_DECL_S (A);
> > +  FP_DECL_B (R);
> > +  BFtype r;
> > +
> > +  FP_INIT_ROUNDMODE;
> > +  FP_UNPACK_SEMIRAW_S (A, a);
> > +  FP_TRUNC (B, S, 1, 1, R, A);
> > +  FP_PACK_SEMIRAW_B (r, R);
> > +  FP_HANDLE_EXCEPTIONS;
> > +
> > +  return r;
> > +}
> > --- libgcc/soft-fp/truncdfbf2.c.jj    2022-10-13 16:57:09.460768054 +0200
> > +++ libgcc/soft-fp/truncdfbf2.c       2022-10-13 16:57:09.460768054 +0200
> > @@ -0,0 +1,52 @@
> > +/* Software floating-point emulation.
> > +   Truncate IEEE double into bfloat16.
> > +   Copyright (C) 2022 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   In addition to the permissions in the GNU Lesser General Public
> > +   License, the Free Software Foundation gives you unlimited
> > +   permission to link the compiled version of this file into
> > +   combinations with other programs, and to distribute those
> > +   combinations without any restriction coming from the use of this
> > +   file.  (The Lesser General Public License restrictions do apply in
> > +   other respects; for example, they cover modification of the file,
> > +   and distribution when not linked into a combine executable.)
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, see
> > +   <http://www.gnu.org/licenses/>.  */
> > +
> > +#include "soft-fp.h"
> > +#include "brain.h"
> > +#include "double.h"
> > +
> > +BFtype
> > +__truncdfbf2 (DFtype a)
> > +{
> > +  FP_DECL_EX;
> > +  FP_DECL_D (A);
> > +  FP_DECL_B (R);
> > +  BFtype r;
> > +
> > +  FP_INIT_ROUNDMODE;
> > +  FP_UNPACK_SEMIRAW_D (A, a);
> > +#if _FP_W_TYPE_SIZE < _FP_FRACBITS_D
> > +  FP_TRUNC (B, D, 1, 2, R, A);
> > +#else
> > +  FP_TRUNC (B, D, 1, 1, R, A);
> > +#endif
> > +  FP_PACK_SEMIRAW_B (r, R);
> > +  FP_HANDLE_EXCEPTIONS;
> > +
> > +  return r;
> > +}
> > --- libgcc/soft-fp/truncxfbf2.c.jj    2022-10-13 16:57:09.460768054 +0200
> > +++ libgcc/soft-fp/truncxfbf2.c       2022-10-13 16:57:09.460768054 +0200
> > @@ -0,0 +1,52 @@
> > +/* Software floating-point emulation.
> > +   Truncate IEEE extended into bfloat16.
> > +   Copyright (C) 2022 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   In addition to the permissions in the GNU Lesser General Public
> > +   License, the Free Software Foundation gives you unlimited
> > +   permission to link the compiled version of this file into
> > +   combinations with other programs, and to distribute those
> > +   combinations without any restriction coming from the use of this
> > +   file.  (The Lesser General Public License restrictions do apply in
> > +   other respects; for example, they cover modification of the file,
> > +   and distribution when not linked into a combine executable.)
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, see
> > +   <http://www.gnu.org/licenses/>.  */
> > +
> > +#include "soft-fp.h"
> > +#include "brain.h"
> > +#include "extended.h"
> > +
> > +BFtype
> > +__truncxfbf2 (XFtype a)
> > +{
> > +  FP_DECL_EX;
> > +  FP_DECL_E (A);
> > +  FP_DECL_B (R);
> > +  BFtype r;
> > +
> > +  FP_INIT_ROUNDMODE;
> > +  FP_UNPACK_SEMIRAW_E (A, a);
> > +#if _FP_W_TYPE_SIZE < 64
> > +  FP_TRUNC (B, E, 1, 4, R, A);
> > +#else
> > +  FP_TRUNC (B, E, 1, 2, R, A);
> > +#endif
> > +  FP_PACK_SEMIRAW_B (r, R);
> > +  FP_HANDLE_EXCEPTIONS;
> > +
> > +  return r;
> > +}
> > --- libgcc/soft-fp/trunctfbf2.c.jj    2022-10-13 16:57:09.460768054 +0200
> > +++ libgcc/soft-fp/trunctfbf2.c       2022-10-13 16:57:09.460768054 +0200
> > @@ -0,0 +1,52 @@
> > +/* Software floating-point emulation.
> > +   Truncate IEEE quad into bfloat16.
> > +   Copyright (C) 2022 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   In addition to the permissions in the GNU Lesser General Public
> > +   License, the Free Software Foundation gives you unlimited
> > +   permission to link the compiled version of this file into
> > +   combinations with other programs, and to distribute those
> > +   combinations without any restriction coming from the use of this
> > +   file.  (The Lesser General Public License restrictions do apply in
> > +   other respects; for example, they cover modification of the file,
> > +   and distribution when not linked into a combine executable.)
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, see
> > +   <https://www.gnu.org/licenses/>.  */
> > +
> > +#include "soft-fp.h"
> > +#include "brain.h"
> > +#include "quad.h"
> > +
> > +BFtype
> > +__trunctfbf2 (TFtype a)
> > +{
> > +  FP_DECL_EX;
> > +  FP_DECL_Q (A);
> > +  FP_DECL_B (R);
> > +  BFtype r;
> > +
> > +  FP_INIT_ROUNDMODE;
> > +  FP_UNPACK_SEMIRAW_Q (A, a);
> > +#if _FP_W_TYPE_SIZE < 64
> > +  FP_TRUNC (B, Q, 1, 4, R, A);
> > +#else
> > +  FP_TRUNC (B, Q, 1, 2, R, A);
> > +#endif
> > +  FP_PACK_SEMIRAW_B (r, R);
> > +  FP_HANDLE_EXCEPTIONS;
> > +
> > +  return r;
> > +}
> > --- libgcc/soft-fp/trunchfbf2.c.jj    2022-10-13 16:57:09.460768054 +0200
> > +++ libgcc/soft-fp/trunchfbf2.c       2022-10-13 16:57:09.460768054 +0200
> > @@ -0,0 +1,58 @@
> > +/* Software floating-point emulation.
> > +   Truncate IEEE half into bfloat16.
> > +   Copyright (C) 2022 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   In addition to the permissions in the GNU Lesser General Public
> > +   License, the Free Software Foundation gives you unlimited
> > +   permission to link the compiled version of this file into
> > +   combinations with other programs, and to distribute those
> > +   combinations without any restriction coming from the use of this
> > +   file.  (The Lesser General Public License restrictions do apply in
> > +   other respects; for example, they cover modification of the file,
> > +   and distribution when not linked into a combine executable.)
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, see
> > +   <http://www.gnu.org/licenses/>.  */
> > +
> > +#include "soft-fp.h"
> > +#include "brain.h"
> > +#include "half.h"
> > +#include "single.h"
> > +
> > +/* BFtype and HFtype are unordered, neither is a superset or subset
> > +   of each other.  Convert HFtype to SFtype (lossless) and then
> > +   truncate to BFtype.  */
> > +
> > +BFtype
> > +__trunchfbf2 (HFtype a)
> > +{
> > +  FP_DECL_EX;
> > +  FP_DECL_H (A);
> > +  FP_DECL_S (B);
> > +  FP_DECL_B (R);
> > +  SFtype b;
> > +  BFtype r;
> > +
> > +  FP_INIT_ROUNDMODE;
> > +  FP_UNPACK_RAW_H (A, a);
> > +  FP_EXTEND (S, H, 1, 1, B, A);
> > +  FP_PACK_RAW_S (b, B);
> > +  FP_UNPACK_SEMIRAW_S (B, b);
> > +  FP_TRUNC (B, S, 1, 1, R, B);
> > +  FP_PACK_SEMIRAW_B (r, R);
> > +  FP_HANDLE_EXCEPTIONS;
> > +
> > +  return r;
> > +}
> > --- libgcc/soft-fp/truncbfhf2.c.jj    2022-10-13 16:57:09.460768054 +0200
> > +++ libgcc/soft-fp/truncbfhf2.c       2022-10-13 16:57:09.460768054 +0200
> > @@ -0,0 +1,75 @@
> > +/* Software floating-point emulation.
> > +   Truncate bfloat16 into IEEE half.
> > +   Copyright (C) 2022 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   In addition to the permissions in the GNU Lesser General Public
> > +   License, the Free Software Foundation gives you unlimited
> > +   permission to link the compiled version of this file into
> > +   combinations with other programs, and to distribute those
> > +   combinations without any restriction coming from the use of this
> > +   file.  (The Lesser General Public License restrictions do apply in
> > +   other respects; for example, they cover modification of the file,
> > +   and distribution when not linked into a combine executable.)
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, see
> > +   <http://www.gnu.org/licenses/>.  */
> > +
> > +#include "soft-fp.h"
> > +#include "half.h"
> > +#include "brain.h"
> > +#include "single.h"
> > +
> > +/* BFtype and HFtype are unordered, neither is a superset or subset
> > +   of each other.  Convert BFtype to SFtype (lossless) and then
> > +   truncate to HFtype.  */
> > +
> > +HFtype
> > +__truncbfhf2 (BFtype a)
> > +{
> > +  FP_DECL_EX;
> > +  FP_DECL_H (A);
> > +  FP_DECL_S (B);
> > +  FP_DECL_B (R);
> > +  SFtype b;
> > +  HFtype r;
> > +
> > +  FP_INIT_ROUNDMODE;
> > +  /* Optimize BFtype to SFtype conversion to simple left shift
> > +     by 16 if possible, we don't need to raise exceptions on sNaN
> > +     here as the SFtype to HFtype truncation should do that too.  */
> > +  if (sizeof (BFtype) == 2
> > +      && sizeof (unsigned short) == 2
> > +      && sizeof (SFtype) == 4
> > +      && sizeof (unsigned int) == 4)
> > +    {
> > +      union { BFtype a; unsigned short b; } u1;
> > +      union { SFtype a; unsigned int b; } u2;
> > +      u1.a = a;
> > +      u2.b = (u1.b << 8) << 8;
> > +      b = u2.a;
> > +    }
> > +  else
> > +    {
> > +      FP_UNPACK_RAW_B (A, a);
> > +      FP_EXTEND (S, B, 1, 1, B, A);
> > +      FP_PACK_RAW_S (b, B);
> > +    }
> > +  FP_UNPACK_SEMIRAW_S (B, b);
> > +  FP_TRUNC (H, S, 1, 1, R, B);
> > +  FP_PACK_SEMIRAW_H (r, R);
> > +  FP_HANDLE_EXCEPTIONS;
> > +
> > +  return r;
> > +}
> > --- libgcc/soft-fp/extendbfsf2.c.jj   2022-10-13 16:57:09.460768054 +0200
> > +++ libgcc/soft-fp/extendbfsf2.c      2022-10-13 16:57:09.460768054 +0200
> > @@ -0,0 +1,49 @@
> > +/* Software floating-point emulation.
> > +   Return a bfloat16 converted to IEEE single
> > +   Copyright (C) 2022 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   In addition to the permissions in the GNU Lesser General Public
> > +   License, the Free Software Foundation gives you unlimited
> > +   permission to link the compiled version of this file into
> > +   combinations with other programs, and to distribute those
> > +   combinations without any restriction coming from the use of this
> > +   file.  (The Lesser General Public License restrictions do apply in
> > +   other respects; for example, they cover modification of the file,
> > +   and distribution when not linked into a combine executable.)
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, see
> > +   <http://www.gnu.org/licenses/>.  */
> > +
> > +#define FP_NO_EXACT_UNDERFLOW
> > +#include "soft-fp.h"
> > +#include "brain.h"
> > +#include "single.h"
> > +
> > +SFtype
> > +__extendbfsf2 (BFtype a)
> > +{
> > +  FP_DECL_EX;
> > +  FP_DECL_B (A);
> > +  FP_DECL_S (R);
> > +  SFtype r;
> > +
> > +  FP_INIT_EXCEPTIONS;
> > +  FP_UNPACK_RAW_B (A, a);
> > +  FP_EXTEND (S, B, 1, 1, R, A);
> > +  FP_PACK_RAW_S (r, R);
> > +  FP_HANDLE_EXCEPTIONS;
> > +
> > +  return r;
> > +}
> > --- libiberty/cp-demangle.h.jj        2022-10-03 18:00:53.342731278 +0200
> > +++ libiberty/cp-demangle.h   2022-10-13 16:57:09.488767670 +0200
> > @@ -180,7 +180,7 @@ d_advance (struct d_info *di, int i)
> >   extern const struct demangle_operator_info cplus_demangle_operators[];
> >   #endif
> >
> > -#define D_BUILTIN_TYPE_COUNT (35)
> > +#define D_BUILTIN_TYPE_COUNT (36)
> >
> >   CP_STATIC_IF_GLIBCPP_V3
> >   const struct demangle_builtin_type_info
> > --- libiberty/cp-demangle.c.jj        2022-10-11 14:50:14.605771753 +0200
> > +++ libiberty/cp-demangle.c   2022-10-13 16:57:09.538766983 +0200
> > @@ -2487,6 +2487,7 @@ cplus_demangle_builtin_types[D_BUILTIN_T
> >     /* 33 */ { NL ("decltype(nullptr)"),      NL ("decltype(nullptr)"),
> >            D_PRINT_DEFAULT },
> >     /* 34 */ { NL ("_Float"), NL ("_Float"),          D_PRINT_FLOAT },
> > +  /* 35 */ { NL ("std::bfloat16_t"), NL ("std::bfloat16_t"), D_PRINT_FLOAT },
> >   };
> >
> >   CP_STATIC_IF_GLIBCPP_V3
> > @@ -2751,11 +2752,22 @@ cplus_demangle_type (struct d_info *di)
> >
> >       case 'F':
> >         /* DF<number>_ - _Float<number>.
> > -          DF<number>x - _Float<number>x.  */
> > +          DF<number>x - _Float<number>x
> > +          DF16b - std::bfloat16_t.  */
> >         {
> >           int arg = d_number (di);
> >           char buf[12];
> >           char suffix = 0;
> > +         if (d_peek_char (di) == 'b')
> > +           {
> > +             if (arg != 16)
> > +               return NULL;
> > +             d_advance (di, 1);
> > +             ret = d_make_builtin_type (di,
> > +                                        &cplus_demangle_builtin_types[35]);
> > +             di->expansion += ret->u.s_builtin.type->len;
> > +             break;
> > +           }
> >           if (d_peek_char (di) == 'x')
> >             suffix = 'x';
> >           if (!suffix && d_peek_char (di) != '_')
> > --- libiberty/testsuite/demangle-expected.jj  2022-10-11 14:50:14.618771575 +0200
> > +++ libiberty/testsuite/demangle-expected     2022-10-13 16:57:09.553766778 +0200
> > @@ -1249,6 +1249,10 @@ xxx
> >   _Z3xxxDF32xDF64xDF128xCDF32xVb
> >   xxx(_Float32x, _Float64x, _Float128x, _Float32x _Complex, bool volatile)
> >   xxx
> > +--format=auto --no-params
> > +_Z3xxxDF16b
> > +xxx(std::bfloat16_t)
> > +xxx
> >   # https://sourceware.org/bugzilla/show_bug.cgi?id=16817
> >   --format=auto --no-params
> >   _QueueNotification_QueueController__$4PPPPPPPM_A_INotice___Z
> >
> >
> >       Jakub
> >
>
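
For readers following the quoted __truncbfhf2 fast path: it depends on bfloat16 sharing its sign and exponent layout with IEEE single, so widening is a 16-bit left shift of the raw bits. The sketch below illustrates that, plus a plain round-to-nearest-even narrowing. It is not the soft-fp implementation from the patch. The helper names are hypothetical, float is assumed to be IEEE binary32, and no exception flags or rounding-mode handling are modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen raw bfloat16 bits to float.  bfloat16 is the
   upper 16 bits of an IEEE single, so the conversion is exact.  */
static float
bf16_bits_to_float (uint16_t bits)
{
  uint32_t u = (uint32_t) bits << 16;
  float f;
  memcpy (&f, &u, sizeof f);
  return f;
}

/* Hypothetical helper: narrow float to raw bfloat16 bits using
   round-to-nearest-even.  Assumption: no exceptions are raised and the
   dynamic rounding mode is ignored, unlike FP_TRUNC in soft-fp.  */
static uint16_t
float_to_bf16_bits (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  if ((u & 0x7fffffffu) > 0x7f800000u)
    /* NaN: keep the top payload bits and force the quiet bit.  */
    return (uint16_t) ((u >> 16) | 0x0040u);
  /* Add half an ulp of the result, biased towards the even neighbour.  */
  u += 0x7fffu + ((u >> 16) & 1u);
  return (uint16_t) (u >> 16);
}

/* Small self-check of the two helpers above.  */
static void
bf16_self_check (void)
{
  assert (bf16_bits_to_float (0x3f80) == 1.0f);
  assert (float_to_bf16_bits (1.0f) == 0x3f80);
  /* Every non-NaN bfloat16 value survives a round trip through float.  */
  assert (float_to_bf16_bits (bf16_bits_to_float (0x4049)) == 0x4049);
}
```

Calling bf16_self_check () from a test driver exercises both directions. The round trip is exact because the widened value has all 16 low fraction bits clear.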


Thread overview: 22+ messages
2022-09-29 15:55 [RFC PATCH] c++, i386, arm, aarch64, libgcc: " Jakub Jelinek
2022-09-30 13:49 ` Jason Merrill
2022-09-30 14:08   ` Jakub Jelinek
2022-09-30 18:21     ` Joseph Myers
2022-09-30 18:38       ` Jakub Jelinek
2022-09-30 19:27         ` Jonathan Wakely
2022-10-04  9:06     ` [PATCH] middle-end, c++, i386, " Jakub Jelinek
2022-10-04 15:54       ` Joseph Myers
2022-10-04 21:50       ` Jason Merrill
2022-10-05 13:47         ` Jakub Jelinek
2022-10-05 20:02           ` Jason Merrill
2022-10-12  8:23             ` [PATCH] machmode: Introduce GET_MODE_NEXT_MODE with previous GET_MODE_WIDER_MODE meaning, add new GET_MODE_WIDER_MODE Jakub Jelinek
2022-10-12 10:15               ` Richard Sandiford
2022-10-12 11:07                 ` [PATCH] machmode, v2: " Jakub Jelinek
2022-10-12 11:49                   ` Richard Sandiford
2022-10-12 10:37               ` [PATCH] machmode: " Eric Botcazou
2022-10-12 10:57                 ` Jakub Jelinek
2022-10-13 16:50             ` [PATCH] middle-end, c++, i386, libgcc, v2: std::bfloat16_t and __bf16 arithmetic support Jakub Jelinek
2022-10-13 19:37               ` Jason Merrill
2022-10-13 21:11                 ` Uros Bizjak [this message]
2022-10-13 21:35                   ` Jakub Jelinek
2022-10-13 21:46                     ` Uros Bizjak
