From: Carl Love <cel@us.ibm.com>
To: "Kewen.Lin" <linkw@linux.ibm.com>, cel@us.ibm.com
Cc: Peter Bergner <bergner@vnet.ibm.com>,
Segher Boessenkool <segher@kernel.crashing.org>,
gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] rs6000: Add builtins for IEEE 128-bit floating point values
Date: Thu, 08 Jun 2023 08:21:28 -0700 [thread overview]
Message-ID: <0b073824bb2df633af8c43e2baba2d6d5c80e17c.camel@us.ibm.com> (raw)
In-Reply-To: <a69fbe29-6ca8-da7b-a894-180494c80acd@linux.ibm.com>
Kewen:
On Wed, 2023-06-07 at 17:36 +0800, Kewen.Lin wrote:
> Hi,
>
> on 2023/6/7 03:54, Carl Love wrote:
> > On Mon, 2023-06-05 at 16:45 +0800, Kewen.Lin wrote:
> > > Hi Carl,
> > >
> > > on 2023/5/2 23:52, Carl Love via Gcc-patches wrote:
> > > > GCC maintainers:
> > > >
> > > > The following patch adds three buitins for inserting and
> > > > extracting
> > > > the
> > > > exponent and significand for an IEEE 128-bit floating point
> > > > values.
> > > > The builtins are valid for Power 9 and Power 10.
> > >
> > > We already have:
> > >
> > > unsigned long long int scalar_extract_exp (__ieee128 source);
> > > unsigned __int128 scalar_extract_sig (__ieee128 source);
> > > ieee_128 scalar_insert_exp (unsigned __int128 significand,
> > > unsigned long long int exponent);
> > > ieee_128 scalar_insert_exp (ieee_128 significand, unsigned long
> > > long
> > > int exponent);
> > >
> > > you need to say something about the requirements or the
> > > justification
> > > for
> > > adding more, for this patch itself, some comments are inline
> > > below,
> > > thanks!
> >
> > I implemented the patch based on a request for the builtins. It
> > didn't
> > include any justification so I reached out to Steve Monroe who
> > requested the builtins to understand why he wanted them. Here is
> > his
> > reply:
> >
> > Basically there is no clean and performant way to transfer
> > between a
> > vector type and the ieee128 scalar, despite the fact that both
> > reside in vector registers. Also a union transfer does not work
> > correctly on most GCC versions (and will likely break again in
> > the
> > next release). I offer the long sad history of the IBM long
> > double
> > float runtime.
>
> Thanks for clarifying this. As the proposed changes, I think he
> meant
> to say "Basically there is no clean and performant way to transfer
> between
> a vector type and the scalar **types**". :) Because the proposed
> changes
> are:
> scalar_extract_exp: unsigned long long => vector unsigned long long
> scalar_extract_sig: unsigned __int128 => vector unsigned __int128
> scalar_insert_exp: unsigned __int128 => vector unsigned __int128
> unsigned long long => vector unsigned long long.
>
> > Also there are __ieee128 operations that are provided by
> > builtins
> > for POWER9 but are not provided by libgcc (for POWER8).
> >
> > Finally I can prove that a softfloat __ieee128 implementation
> > using
> > VMX integer operations, out-performs the current libgcc
> > implementation using DW GPRs.
> >
> > The details are in the PVECLIB documentation
> > pveclib/vec__f128__ppc.h
> >
> >
> > > > The patch has been tested on both Power 9 and Power 10.
> > > >
> > > > Please let me know if this patch is acceptable for
> > > > mainline. Thanks.
> > > >
> > > > Carl
> > > >
> > > >
> > > > ----------------------
> > > > From a20cc81f98cce1140fc95775a7c25b55d1ca7cba Mon Sep 17
> > > > 00:00:00
> > > > 2001
> > > > From: Carl Love <cel@us.ibm.com>
> > > > Date: Wed, 12 Apr 2023 17:46:37 -0400
> > > > Subject: [PATCH] rs6000: Add builtins for IEEE 128-bit floating
> > > > point values
> > > >
> > > > Add support for the following builtins:
> > > >
> > > > __vector unsigned long long int __builtin_extractf128_exp
> > > > (__ieee128);
> > >
> > > Could you make the name similar to the existing one? The
> > > existing
> > > one
> > >
> > > unsigned long long int scalar_extract_exp (__ieee128 source);
> > >
> > > has nothing like f128 on its name, this variant is just to change
> > > the
> > > return type to vector type, how about scalar_extract_exp_to_vec?
> >
> > I changed the name __builtin_extractf128_exp to
> > __builtin_scalar_extract_exp_to_vec.
> >
> > > > __vector unsigned __int128 __builtin_extractf128_sig
> > > > (__ieee128);
> > >
> > > Ditto.
> >
> > I changed the name __builtin_extractf128_sig to
> > __builtin_scalar_extract_sig_to_vec.
> >
> > > > __ieee128 __builtin_insertf128_exp (__vector unsigned
> > > > __int128,
> > > > __vector unsigned long
> > > > long);
> > >
> > > This one can just overload the existing scalar_insert_exp?
> >
> > I tried making this one an overloaded version of
> > scalar_insert_exp. However, the overload with the vector arguments
> > isn't recognized when I put the overload definition at the end of
> > the
> > list of overloads. When I tried putting the vector version as the
> > first overloaded definition, I get an internal error
> > on __builtin_vsx_scalar_insert_exp_q which is has the same
> > arguments
> > types but as scalars not vectors. Best I can tell, there is an
> > issue
> > with mixing scalar and vector arguments in an overloaded builtin.
>
> No, it's not due to mixing scalar and vector arguments, I looked into
> this and found we have some special handling for this builtin in
> altivec_resolve_overloaded_builtin, see the code:
>
> case RS6000_OVLD_VEC_VSIE:
> {
> machine_mode arg1_mode = TYPE_MODE (types[0]);
>
> /* If supplied first argument is wider than 64 bits, resolve to
> 128-bit variant of built-in function. */
> if (GET_MODE_PRECISION (arg1_mode) > 64)
> {
> /* If first argument is of float variety, choose variant
> that expects __ieee128 argument. Otherwise, expect
> __int128 argument. */
> if (GET_MODE_CLASS (arg1_mode) == MODE_FLOAT)
> instance_code = RS6000_BIF_VSIEQPF;
> else
> instance_code = RS6000_BIF_VSIEQP;
> }
> ....
>
> When the 1st arg's precision is larger than 64, it just picks up
> RS6000_BIF_VSIEQPF or RS6000_BIF_VSIEQP, you need to update this for
> what you newly added. Something like (against your v2):
>
> const _Float128 __builtin_vsx_scalar_insert_exp_vqp (vuq, vull);
> - VSIEDP_VULL xsiexpqpf_f128_kf {}
> + VSIEQPV xsiexpqpf_f128_kf {}
>
>
> @@ -1934,6 +1934,8 @@ altivec_resolve_overloaded_builtin (location_t
> loc, tree fndecl,
> __int128 argument. */
> if (GET_MODE_CLASS (arg1_mode) == MODE_FLOAT)
> instance_code = RS6000_BIF_VSIEQPF;
> + else if (VECTOR_MODE_P (arg1_mode))
> + instance_code = RS6000_BIF_VSIEQPV;
> else
> instance_code = RS6000_BIF_VSIEQP;
> }
>
> diff --git a/gcc/config/rs6000/rs6000-overload.def
> b/gcc/config/rs6000/rs6000-overload.def
> index 102ead9f80b..84743114c8e 100644
> --- a/gcc/config/rs6000/rs6000-overload.def
> +++ b/gcc/config/rs6000/rs6000-overload.def
> @@ -4515,8 +4515,8 @@
> VSIEQP
> _Float128 __builtin_vec_scalar_insert_exp (_Float128, unsigned
> long long);
> VSIEQPF
> - _Float128 __builtin_vsx_scalar_insert_exp_vqp (vuq, vull);
> - VSIEDP_VULL
> + _Float128 __builtin_vec_scalar_insert_exp (vuq, vull);
OK, I missed the special handling of the VSIE builtin in the function.
With the suggested changes the overloading works. Thanks for the help.
>
<snip>
> > > > +
> > > > (define_expand "vreplace_elt_<mode>"
> > > > [(set (match_operand:REPLACE_ELT 0 "register_operand")
> > > > (unspec:REPLACE_ELT [(match_operand:REPLACE_ELT 1
> > > > "register_operand")
> > > > @@ -5016,6 +5051,15 @@
> > > > "xsxexpqp %0,%1"
> > > > [(set_attr "type" "vecmove")])
> > > >
> > > > +;; VSX Scalar to Vector Extract Exponent IEEE 128-bit floating
> > > > point format
> > > > +(define_insn "xsxexpqp_f128_<mode>"
> > > > + [(set (match_operand:V2DI 0 "altivec_register_operand" "=v")
> > > > + (unspec:V2DI [(match_operand:IEEE128 1
> > > > "altivec_register_operand" "v")]
> > > > + UNSPEC_VSX_SXEXPDP))]
> > > > + "TARGET_P9_VECTOR"
> > > > + "xsxexpqp %0,%1"
> > > > + [(set_attr "type" "vecmove")])
> > >
> > > I think you can just merge this into the existing xsxexpqp_<mode>
> > > by
> > > extending
> > > with one mode like xsxexpqp_<IEEE128:mode>_<xxx:mode> for
> > > unspec:xxx.
> > >
> > > xxx can be one mode iterator for DI and V2DI.
> >
> > OK, I played with this, it is rather sophisticated syntax. I
> > didn't
> > seem to be able to get the syntax correct and thus get all this to
> > work. It also seemed to require updating some case statements for
> > the
> > instructions in various files. I opted to not change this and just
> > keep it simple. Not sure if that is OK or not?
> >
>
> I meant something like:
>
> (define_mode_iterator VSEEQP_DI [V2DI DI])
>
> ;; VSX Scalar Extract Exponent Quad-Precision
> (define_insn "xsxexpqp_<IEEE128:mode>_<VSEEQP_DI:mode>"
> [(set (match_operand:VSEEQP_DI 0 "altivec_register_operand" "=v")
> (unspec:VSEEQP_DI
> [(match_operand:IEEE128 1 "altivec_register_operand" "v")]
> UNSPEC_VSX_SXEXPDP))]
> "TARGET_P9_VECTOR"
> "xsxexpqp %0,%1"
> [(set_attr "type" "vecmove")])
OK, I the define_insn part right. I was using an existing iterator
that included both DI and V2DI as well as two additional types. That
results in two unused patterns. Buit it looks like the issue I got
stuck on was not getting the tweaks in file rs6000-builtins.def all
correct. I thought the errors I was getting was due to incorrect
syntax in the define_insn, but were actually errors in rs6000-
builtins.def. Once I sorted that out, it works.
I did go with the define_mode iterator you suggested so we don't
generate additional unused patterns. Again, thanks for the help on
that.
>
> since what you need just one different mode from DI for the operand
> 0,
> using one mode iterator can make you not need to duplicate it with
> a different mode. With above, you can use xsxexpqp_kf_di for the
> previous xsxexpqp_kf, and xsxexpqp_kf_v2di for your proposed
> __builtin_scalar_extract_exp_to_vec.
>
> BR,
> Kewen
Carl
next prev parent reply other threads:[~2023-06-08 15:34 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-02 15:52 Carl Love
2023-06-05 8:45 ` Kewen.Lin
2023-06-06 19:54 ` [PATCH ver 2] " Carl Love
2023-06-06 19:54 ` [PATCH] " Carl Love
2023-06-07 9:36 ` Kewen.Lin
2023-06-08 15:21 ` Carl Love [this message]
2023-06-08 15:21 ` [PATCH ver 3] " Carl Love
2023-06-13 3:10 ` Kewen.Lin
2023-06-14 16:17 ` Carl Love
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=0b073824bb2df633af8c43e2baba2d6d5c80e17c.camel@us.ibm.com \
--to=cel@us.ibm.com \
--cc=bergner@vnet.ibm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=linkw@linux.ibm.com \
--cc=segher@kernel.crashing.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).