From: Richard Biener
Date: Mon, 27 Sep 2021 14:53:47 +0200
Subject: Re: [PATCH] [GIMPLE] Simplify (_Float16) ceil ((double) x) to .CEIL (x) when available.
To: liuhongt
Cc: GCC Patches, Uros Bizjak, Hongtao Liu, "H. J. Lu"
References: <20210924112552.2524168-1-hongtao.liu@intel.com>
In-Reply-To: <20210924112552.2524168-1-hongtao.liu@intel.com>
List-Id: Gcc-patches mailing list

On Fri, Sep 24, 2021 at 1:26 PM liuhongt wrote:
>
> Hi:
> Related discussion in [1] and PR.
>
> Bootstrapped and regtest on x86_64-linux-gnu{-m32,}.
> Ok for trunk?
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574330.html
>
> gcc/ChangeLog:
>
>         PR target/102464
>         * config/i386/i386.c (ix86_optab_supported_p):
>         Return true for HFmode.
>         * match.pd: Simplify (_Float16) ceil ((double) x) to
>         __builtin_ceilf16 (a) when a is _Float16 type and
>         direct_internal_fn_supported_p.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/i386/pr102464.c: New test.
> ---
>  gcc/config/i386/i386.c                   | 20 +++++++-----
>  gcc/match.pd                             | 28 +++++++++++++++++
>  gcc/testsuite/gcc.target/i386/pr102464.c | 39 ++++++++++++++++++++++++
>  3 files changed, 79 insertions(+), 8 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr102464.c
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index ba89e111d28..3767fe9806d 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -23582,20 +23582,24 @@ ix86_optab_supported_p (int op, machine_mode mode1, machine_mode,
>        return opt_type == OPTIMIZE_FOR_SPEED;
>
>      case rint_optab:
> -      if (SSE_FLOAT_MODE_P (mode1)
> -          && TARGET_SSE_MATH
> -          && !flag_trapping_math
> -          && !TARGET_SSE4_1)
> +      if (mode1 == HFmode)
> +        return true;
> +      else if (SSE_FLOAT_MODE_P (mode1)
> +               && TARGET_SSE_MATH
> +               && !flag_trapping_math
> +               && !TARGET_SSE4_1)
>          return opt_type == OPTIMIZE_FOR_SPEED;
>        return true;
>
>      case floor_optab:
>      case ceil_optab:
>      case btrunc_optab:
> -      if (SSE_FLOAT_MODE_P (mode1)
> -          && TARGET_SSE_MATH
> -          && !flag_trapping_math
> -          && TARGET_SSE4_1)
> +      if (mode1 == HFmode)
> +        return true;
> +      else if (SSE_FLOAT_MODE_P (mode1)
> +               && TARGET_SSE_MATH
> +               && !flag_trapping_math
> +               && TARGET_SSE4_1)
>          return true;
>        return opt_type == OPTIMIZE_FOR_SPEED;
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index a9791ceb74a..9ccec8b6ce3 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -6191,6 +6191,34 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>     (froms (convert float_value_p@0))
>     (convert (tos @0)))))
>
> +#if GIMPLE
> +(match float16_value_p
> + @0
> + (if (TYPE_MAIN_VARIANT (TREE_TYPE (@0)) == float16_type_node)))
> +(for froms (BUILT_IN_TRUNCL BUILT_IN_TRUNC BUILT_IN_TRUNCF
> +            BUILT_IN_FLOORL BUILT_IN_FLOOR BUILT_IN_FLOORF
> +            BUILT_IN_CEILL BUILT_IN_CEIL BUILT_IN_CEILF
> +            BUILT_IN_ROUNDEVENL BUILT_IN_ROUNDEVEN BUILT_IN_ROUNDEVENF
> +            BUILT_IN_ROUNDL BUILT_IN_ROUND BUILT_IN_ROUNDF
> +            BUILT_IN_NEARBYINTL BUILT_IN_NEARBYINT BUILT_IN_NEARBYINTF
> +            BUILT_IN_RINTL BUILT_IN_RINT BUILT_IN_RINTF)

We do have patterns that convert (truncl (convert floatval)) to
(float)truncf (val). Yours does (_Float16)trunc ((double) float16) ->
truncF16 (float16). Doesn't it make sense to have
trunc ((double) float16) -> (double)truncF16 (float16) as well?

Why do you conditionalize on GIMPLE here?

That said, I wonder whether we can somehow address pattern explosion here;
eliding the outer (convert ...) from the match would help a bit already.

The related patterns use optimize && canonicalize_math_p as well, btw. I'm
not sure whether either is appropriate here since there are no _Float16 math
functions available.

> +     tos (IFN_TRUNC IFN_TRUNC IFN_TRUNC
> +          IFN_FLOOR IFN_FLOOR IFN_FLOOR
> +          IFN_CEIL IFN_CEIL IFN_CEIL
> +          IFN_ROUNDEVEN IFN_ROUNDEVEN IFN_ROUNDEVEN
> +          IFN_ROUND IFN_ROUND IFN_ROUND
> +          IFN_NEARBYINT IFN_NEARBYINT IFN_NEARBYINT
> +          IFN_RINT IFN_RINT IFN_RINT)
> + /* (_Float16) round ((doube) x) -> __built_in_roundf16 (x), etc.,
> +    if x is a _Float16.  */
> + (simplify
> +   (convert (froms (convert float16_value_p@0)))
> +     (if (types_match (type, TREE_TYPE (@0))
> +          && direct_internal_fn_supported_p (as_internal_fn (tos),
> +                                             type, OPTIMIZE_FOR_BOTH))
> +       (tos @0))))
> +#endif
> +
>  (for froms (XFLOORL XCEILL XROUNDL XRINTL)
>       tos (XFLOOR XCEIL XROUND XRINT)
>   /* llfloorl(extend(x)) -> llfloor(x), etc., if x is a double.  */
> diff --git a/gcc/testsuite/gcc.target/i386/pr102464.c b/gcc/testsuite/gcc.target/i386/pr102464.c
> new file mode 100644
> index 00000000000..e3e060ee80b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr102464.c
> @@ -0,0 +1,39 @@
> +/* PR target/102464.  */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx512fp16" } */
> +
> +#define FOO(FUNC,SUFFIX)                       \
> +  _Float16                                     \
> +  foo_##FUNC##_##SUFFIX (_Float16 a)           \
> +  {                                            \
> +    return __builtin_##FUNC##SUFFIX (a);       \
> +  }
> +
> +FOO (roundeven, f16);
> +FOO (roundeven, f);
> +FOO (roundeven, );
> +FOO (roundeven, l);
> +FOO (trunc, f16);
> +FOO (trunc, f);
> +FOO (trunc, );
> +FOO (trunc, l);
> +FOO (ceil, f16);
> +FOO (ceil, f);
> +FOO (ceil, );
> +FOO (ceil, l);
> +FOO (floor, f16);
> +FOO (floor, f);
> +FOO (floor, );
> +FOO (floor, l);
> +FOO (nearbyint, f16);
> +FOO (nearbyint, f);
> +FOO (nearbyint, );
> +FOO (nearbyint, l);
> +FOO (rint, f16);
> +FOO (rint, f);
> +FOO (rint, );
> +FOO (rint, l);
> +
> +/* { dg-final { scan-assembler-not "vcvtsh2s\[sd\]" } } */
> +/* { dg-final { scan-assembler-not "extendhfxf" } } */
> +/* { dg-final { scan-assembler-times "vrndscalesh\[^\n\r\]*xmm\[0-9\]" 24 } } */
> --
> 2.27.0
>