Re: [PATCH] middle-end, c++, i386, libgcc: std::bfloat16_t and __bf16 arithmetic support

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Jakub Jelinek <jakub@redhat.com>
To: Jason Merrill <jason@redhat.com>
Cc: "Joseph S. Myers" <joseph@codesourcery.com>,
	Richard Biener <rguenther@suse.de>,
	Jeff Law <jeffreyalaw@gmail.com>, Uros Bizjak <ubizjak@gmail.com>,
	gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] middle-end, c++, i386, libgcc: std::bfloat16_t and __bf16 arithmetic support
Date: Wed, 5 Oct 2022 15:47:36 +0200	[thread overview]
Message-ID: <Yz2K+FDJc8gYUj3E@tucnak> (raw)
In-Reply-To: <55062a15-79a1-f8cf-ed20-25ca8ff42abe@redhat.com>

On Tue, Oct 04, 2022 at 05:50:50PM -0400, Jason Merrill wrote:
> > Another question is the suffixes of the builtins.  For now I have added
> > bf16 suffix and enabled the builtins with !both_p, so one always needs to
> > use __builtin_* form for them.  None of the GCC builtins end with b,
> > so this isn't ambiguous with __builtin_*f16, but some libm functions do end
> > with b, in particular ilogb, logb and f{??,??x}sub.  ilogb and the subs
> > always have it, but is __builtin_logbf16 f16 suffixed logb or bf16 suffixed
> > log?  Shall the builtins use f16b suffixes instead like the mangling does?
> 
> Do we want bf16 builtins at all?  The impression I've gotten is that users
> want computation to happen in SFmode and only later truncate back to BFmode.

As I wrote earlier, I think we need at least one, __builtin_nans variant
which would be used in libstdc++
std::numeric_limits<std::bfloat16_t>::signaling_NaN() implementation.
I think
std::numeric_limits<std::bfloat16_t>::infinity() can be implemented as
return (__bf16) __builtin_huge_valf ();
and similarly
std::numeric_limits<std::bfloat16_t>::quiet_NaN() as
return (__bf16) __builtin_nanf ("");
but
return (__bf16) __builtin_nansf ("");
would loose the signaling NaN on the conversion and raise exception,
and as the method is constexpr,
union { unsigned short a; __bf16 b; } u = { 0x7f81 };
return u.b;
wouldn't work.  I can certainly restrict the builtins to the single
one, but wonder whether the suffix for that builtin shouldn't be chosen
such that eventually we could add more builtins if we need to
and don't run into the log with bf16 suffix vs. logb with f16 suffix
ambiguity.
As you said, most of the libstdc++ overloads for std::bfloat16_t then
can use float builtins or library calls under the hood, but std::nextafter
is another case where I think we'll need to have something bfloat16_t
specific, because float ulp isn't bfloat16_t ulp, the latter is much larger.

Based on what Joseph wrote, I'll add bf16/BF16 suffix support for C too
in the next iteration (always with pedwarn in that case).

> > @@ -5716,7 +5716,13 @@ emit_store_flag_1 (rtx target, enum rtx_
> >       {
> >        machine_mode optab_mode = mclass == MODE_CC ? CCmode : compare_mode;
> >        icode = optab_handler (cstore_optab, optab_mode);
> > -     if (icode != CODE_FOR_nothing)
> > +     if (icode != CODE_FOR_nothing
> > +	 /* Don't consider [BH]Fmode as usable wider mode, as neither is
> > +	    a subset or superset of the other.  */
> > +	 && (compare_mode == mode
> > +	     || !SCALAR_FLOAT_MODE_P (compare_mode)
> > +	     || maybe_ne (GET_MODE_PRECISION (compare_mode),
> > +			  GET_MODE_PRECISION (mode))))
> 
> Why do you need to do this here (and in prepare_cmp_insn, and similarly in
> can_compare_p)?  Shouldn't get_wider skip over modes that are not actually
> wider?

I'm afraid too many places rely on all modes of a certain class to be
visible when walking from "narrowest" to "widest" mode, say
FOR_EACH_MODE_IN_CLASS/FOR_EACH_MODE/FOR_EACH_MODE_UNTIL/FOR_EACH_WIDER_MODE
etc. wouldn't work at all if GET_MODE_WIDER_MODE (BFmode) == SFmode
&& GET_MODE_WIDER_MODE (HFmode) == SFmode.

Note, besides this GET_MODE_PRECISION (HFmode) == GET_MODE_PRECISION (BFmode)
case, another set of modes which have the same size are powerpc*
TFmode/IFmode/KFmode, but in that case it makes ugly hacks where it
artificially lowers the precision of 2 of them:
rs6000-modes.h:#define FLOAT_PRECISION_IFmode	128
rs6000-modes.h:#define FLOAT_PRECISION_TFmode	127
rs6000-modes.h:#define FLOAT_PRECISION_KFmode	126
(and the middle-end then has to work around that mess).  Doing something
similar wouldn't help the BFmode vs. HFmode case though, one of them would
have wider precision and so e.g. C FE would then prefer it, but more
importantly, as they are unordered modes where most of the optabs aren't
implemented it is bad to pick optabs for the "wider" mode to handle the
"narrower" one.  I think powerpc works because they define optabs for
all the 3 modes when those modes are usable.

	Jakub

next prev parent reply	other threads:[~2022-10-05 13:47 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-29 15:55 [RFC PATCH] c++, i386, arm, aarch64, " Jakub Jelinek
2022-09-30 13:49 ` Jason Merrill
2022-09-30 14:08   ` Jakub Jelinek
2022-09-30 18:21     ` Joseph Myers
2022-09-30 18:38       ` Jakub Jelinek
2022-09-30 19:27         ` Jonathan Wakely
2022-10-04  9:06     ` [PATCH] middle-end, c++, i386, " Jakub Jelinek
2022-10-04 15:54       ` Joseph Myers
2022-10-04 21:50       ` Jason Merrill
2022-10-05 13:47         ` Jakub Jelinek [this message]
2022-10-05 20:02           ` Jason Merrill
2022-10-12  8:23             ` [PATCH] machmode: Introduce GET_MODE_NEXT_MODE with previous GET_MODE_WIDER_MODE meaning, add new GET_MODE_WIDER_MODE Jakub Jelinek
2022-10-12 10:15               ` Richard Sandiford
2022-10-12 11:07                 ` [PATCH] machmode, v2: " Jakub Jelinek
2022-10-12 11:49                   ` Richard Sandiford
2022-10-12 10:37               ` [PATCH] machmode: " Eric Botcazou
2022-10-12 10:57                 ` Jakub Jelinek
2022-10-13 16:50             ` [PATCH] middle-end, c++, i386, libgcc, v2: std::bfloat16_t and __bf16 arithmetic support Jakub Jelinek
2022-10-13 19:37               ` Jason Merrill
2022-10-13 21:11                 ` Uros Bizjak
2022-10-13 21:35                   ` Jakub Jelinek
2022-10-13 21:46                     ` Uros Bizjak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yz2K+FDJc8gYUj3E@tucnak \
    --to=jakub@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jason@redhat.com \
    --cc=jeffreyalaw@gmail.com \
    --cc=joseph@codesourcery.com \
    --cc=rguenther@suse.de \
    --cc=ubizjak@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).