From: Hongtao Liu <crazylht@gmail.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: liuhongt <hongtao.liu@intel.com>,
GCC Patches <gcc-patches@gcc.gnu.org>,
Uros Bizjak <ubizjak@gmail.com>,
"H. J. Lu" <hjl.tools@gmail.com>
Subject: Re: [PATCH] [GIMPLE] Simplify (_Float16) ceil ((double) x) to .CEIL (x) when available.
Date: Tue, 28 Sep 2021 10:07:54 +0800
Message-ID: <CAMZc-byPAQN1AtwHNufERwgC0RVXAR03uGs6XuVQJjNOGT5i5A@mail.gmail.com>
In-Reply-To: <CAFiYyc1hqHom267SivKWgfoeGAwhrLRyD+zi6oEdw1okZX20Pg@mail.gmail.com>
On Mon, Sep 27, 2021 at 8:53 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Fri, Sep 24, 2021 at 1:26 PM liuhongt <hongtao.liu@intel.com> wrote:
> >
> > Hi:
> > Related discussion in [1] and PR.
> >
> > Bootstrapped and regtest on x86_64-linux-gnu{-m32,}.
> > Ok for trunk?
> >
> > [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574330.html
> >
> > gcc/ChangeLog:
> >
> > PR target/102464
> > * config/i386/i386.c (ix86_optab_supported_p):
> > Return true for HFmode.
> > * match.pd: Simplify (_Float16) ceil ((double) x) to
> > __builtin_ceilf16 (x) when x is _Float16 type and
> > direct_internal_fn_supported_p.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/i386/pr102464.c: New test.
> > ---
> > gcc/config/i386/i386.c | 20 +++++++-----
> > gcc/match.pd | 28 +++++++++++++++++
> > gcc/testsuite/gcc.target/i386/pr102464.c | 39 ++++++++++++++++++++++++
> > 3 files changed, 79 insertions(+), 8 deletions(-)
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr102464.c
> >
> > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> > index ba89e111d28..3767fe9806d 100644
> > --- a/gcc/config/i386/i386.c
> > +++ b/gcc/config/i386/i386.c
> > @@ -23582,20 +23582,24 @@ ix86_optab_supported_p (int op, machine_mode mode1, machine_mode,
> > return opt_type == OPTIMIZE_FOR_SPEED;
> >
> > case rint_optab:
> > - if (SSE_FLOAT_MODE_P (mode1)
> > - && TARGET_SSE_MATH
> > - && !flag_trapping_math
> > - && !TARGET_SSE4_1)
> > + if (mode1 == HFmode)
> > + return true;
> > + else if (SSE_FLOAT_MODE_P (mode1)
> > + && TARGET_SSE_MATH
> > + && !flag_trapping_math
> > + && !TARGET_SSE4_1)
> > return opt_type == OPTIMIZE_FOR_SPEED;
> > return true;
> >
> > case floor_optab:
> > case ceil_optab:
> > case btrunc_optab:
> > - if (SSE_FLOAT_MODE_P (mode1)
> > - && TARGET_SSE_MATH
> > - && !flag_trapping_math
> > - && TARGET_SSE4_1)
> > + if (mode1 == HFmode)
> > + return true;
> > + else if (SSE_FLOAT_MODE_P (mode1)
> > + && TARGET_SSE_MATH
> > + && !flag_trapping_math
> > + && TARGET_SSE4_1)
> > return true;
> > return opt_type == OPTIMIZE_FOR_SPEED;
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index a9791ceb74a..9ccec8b6ce3 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -6191,6 +6191,34 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > (froms (convert float_value_p@0))
> > (convert (tos @0)))))
> >
> > +#if GIMPLE
> > +(match float16_value_p
> > + @0
> > + (if (TYPE_MAIN_VARIANT (TREE_TYPE (@0)) == float16_type_node)))
> > +(for froms (BUILT_IN_TRUNCL BUILT_IN_TRUNC BUILT_IN_TRUNCF
> > + BUILT_IN_FLOORL BUILT_IN_FLOOR BUILT_IN_FLOORF
> > + BUILT_IN_CEILL BUILT_IN_CEIL BUILT_IN_CEILF
> > + BUILT_IN_ROUNDEVENL BUILT_IN_ROUNDEVEN BUILT_IN_ROUNDEVENF
> > + BUILT_IN_ROUNDL BUILT_IN_ROUND BUILT_IN_ROUNDF
> > + BUILT_IN_NEARBYINTL BUILT_IN_NEARBYINT BUILT_IN_NEARBYINTF
> > + BUILT_IN_RINTL BUILT_IN_RINT BUILT_IN_RINTF)
>
> we do have patterns that convert (truncl (convert floatval)) to
> (float)truncf (val),
> yours does (_Float16)trunc ((double) float16) -> truncF16 (float16), doesn't it
> make sense to have trunc ((double) float16) -> (double)truncF16
> (float16) as well?
>
> Why do you conditionalize on GIMPLE here?
To avoid this build error:
  error: ‘direct_internal_fn_supported_p’ was not declared in this scope
>
> That said, I wonder whether we can somehow address pattern explosion here,
> eliding the outer (convert ...) from the match would help a bit already.
>
> The related patterns use optimize && canonicalize_math_p as well btw., not
> sure whether either is appropriate here since there are no _Float16 math
> functions available.
Yes, that's why I didn't follow the existing pattern. I think we can
add optimize back to the condition, but not canonicalize_math_p (),
since there are no math functions for _Float16.
Also, without the outer (convert ...), transforming ceil ((double) a)
to (double) __builtin_ceilf16 (a) looks like a canonicalization rather
than an optimization.
>
> > + tos (IFN_TRUNC IFN_TRUNC IFN_TRUNC
> > + IFN_FLOOR IFN_FLOOR IFN_FLOOR
> > + IFN_CEIL IFN_CEIL IFN_CEIL
> > + IFN_ROUNDEVEN IFN_ROUNDEVEN IFN_ROUNDEVEN
> > + IFN_ROUND IFN_ROUND IFN_ROUND
> > + IFN_NEARBYINT IFN_NEARBYINT IFN_NEARBYINT
> > + IFN_RINT IFN_RINT IFN_RINT)
> > + /* (_Float16) round ((double) x) -> __builtin_roundf16 (x), etc.,
> > + if x is a _Float16. */
> > + (simplify
> > + (convert (froms (convert float16_value_p@0)))
> > + (if (types_match (type, TREE_TYPE (@0))
> > + && direct_internal_fn_supported_p (as_internal_fn (tos),
> > + type, OPTIMIZE_FOR_BOTH))
> > + (tos @0))))
> > +#endif
> > +
> > (for froms (XFLOORL XCEILL XROUNDL XRINTL)
> > tos (XFLOOR XCEIL XROUND XRINT)
> > /* llfloorl(extend(x)) -> llfloor(x), etc., if x is a double. */
> > diff --git a/gcc/testsuite/gcc.target/i386/pr102464.c b/gcc/testsuite/gcc.target/i386/pr102464.c
> > new file mode 100644
> > index 00000000000..e3e060ee80b
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr102464.c
> > @@ -0,0 +1,39 @@
> > +/* PR target/102464. */
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -mavx512fp16" } */
> > +
> > +#define FOO(FUNC,SUFFIX) \
> > + _Float16 \
> > + foo_##FUNC##_##SUFFIX (_Float16 a) \
> > + { \
> > + return __builtin_##FUNC##SUFFIX (a); \
> > + }
> > +
> > +FOO (roundeven, f16);
> > +FOO (roundeven, f);
> > +FOO (roundeven, );
> > +FOO (roundeven, l);
> > +FOO (trunc, f16);
> > +FOO (trunc, f);
> > +FOO (trunc, );
> > +FOO (trunc, l);
> > +FOO (ceil, f16);
> > +FOO (ceil, f);
> > +FOO (ceil, );
> > +FOO (ceil, l);
> > +FOO (floor, f16);
> > +FOO (floor, f);
> > +FOO (floor, );
> > +FOO (floor, l);
> > +FOO (nearbyint, f16);
> > +FOO (nearbyint, f);
> > +FOO (nearbyint, );
> > +FOO (nearbyint, l);
> > +FOO (rint, f16);
> > +FOO (rint, f);
> > +FOO (rint, );
> > +FOO (rint, l);
> > +
> > +/* { dg-final { scan-assembler-not "vcvtsh2s\[sd\]" } } */
> > +/* { dg-final { scan-assembler-not "extendhfxf" } } */
> > +/* { dg-final { scan-assembler-times "vrndscalesh\[^\n\r\]*xmm\[0-9\]" 24 } } */
> > --
> > 2.27.0
> >
--
BR,
Hongtao