From: Martin Jambor
To: Uros Bizjak, Tejas Joshi
Cc: "gcc-patches@gcc.gnu.org", Jan Hubicka
Subject: Re: [PATCH] i386: Roundeven expansion for SSE4.1+
Date: Thu, 22 Aug 2019 14:43:00 -0000

Hi,

On Wed, Jul 31 2019, Uros Bizjak wrote:
> On Wed, Jul 31, 2019 at 7:51 AM Tejas Joshi wrote:
>>
>> Hi.
>>
>> > >         * gcc.target/i386/avx-vround-roundeven-1.c: New test.
>> > >         * gcc.target/i386/avx-vround-roundeven-2.c: New test.
>> >
>> > roundss and roundsd are sse4_1 instructions, also please change tests
>> > to use -O2:
>>
>> I have made the following changes you suggested and changed the file names to:
>>
>>         * gcc.target/i386/sse4_1-round-roundeven-1.c: New test.
>>         * gcc.target/i386/sse4_1-round-roundeven-2.c: New test.
>
> +++ b/gcc/testsuite/gcc.target/i386/sse4_1-round-roundeven-1.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-mavx" } */
> +/* { dg-options "-msse4.1" } */
>
> -O2 -msse4.1
>
> OK with the above change.
>

Given that Jeff approved the middle-end bits, I intend to commit the
following on Tejas's behalf tomorrow after I commit
https://gcc.gnu.org/ml/gcc-patches/2019-08/msg01567.html (assuming there
won't be any final objections to it).

It is the combination of the patches in this thread with the final minor
testcase issue fixed. I have bootstrapped and tested it on x86_64-linux
and, for good measure, also on aarch64-linux.

Thanks,

Martin


gcc/ChangeLog:

2019-08-22  Tejas Joshi
	    Uros Bizjak

	* builtins.c (mathfn_built_in_2): Change CASE_MATHFN to
	CASE_MATHFN_FLOATN for roundeven.
	* config/i386/i386.c (ix86_i387_mode_needed): Add case I387_ROUNDEVEN.
	(ix86_mode_needed): Likewise.
	(ix86_mode_after): Likewise.
	(ix86_mode_entry): Likewise.
	(ix86_mode_exit): Likewise.
	(ix86_emit_mode_set): Likewise.
	(emit_i387_cw_initialization): Add case I387_CW_ROUNDEVEN.
	* config/i386/i386.h (ix86_stack_slot): Add SLOT_CW_ROUNDEVEN.
	(ix86_entry): Add I387_ROUNDEVEN.
	(avx_u128_state): Add I387_CW_ANY.
	* config/i386/i386.md: Define UNSPEC_FRNDINT_ROUNDEVEN.
	(define_int_iterator): Likewise.
	(define_int_attr): Likewise for rounding_insn, rounding and ROUNDING.
	(define_constant): Define ROUND_ROUNDEVEN mode.
	(define_attr): Add roundeven mode for i387_cw.
	(<rounding_insn><mode>2): Add condition for ROUND_ROUNDEVEN.
	* internal-fn.def (ROUNDEVEN): New builtin function.
	* optabs.def (roundeven_optab): New optab.

gcc/testsuite/ChangeLog:

2019-08-22  Tejas Joshi

	* gcc.target/i386/sse4_1-round-roundeven-1.c: New test.
	* gcc.target/i386/sse4_1-round-roundeven-2.c: New test.
---
 gcc/builtins.c                                |  2 +-
 gcc/config/i386/i386.c                        | 16 +++++++++++++
 gcc/config/i386/i386.h                        |  4 +++-
 gcc/config/i386/i386.md                       | 23 ++++++++++++-------
 gcc/internal-fn.def                           |  1 +
 gcc/optabs.def                                |  1 +
 gcc/reg-stack.c                               |  1 +
 .../i386/sse4_1-round-roundeven-1.c           | 17 ++++++++++++++
 .../i386/sse4_1-round-roundeven-2.c           | 15 ++++++++++++
 9 files changed, 70 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/sse4_1-round-roundeven-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse4_1-round-roundeven-2.c

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 5149d901a96..e44866e2a60 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -2056,7 +2056,7 @@ mathfn_built_in_2 (tree type, combined_fn fn)
     CASE_MATHFN (REMQUO)
     CASE_MATHFN_FLOATN (RINT)
     CASE_MATHFN_FLOATN (ROUND)
-    CASE_MATHFN (ROUNDEVEN)
+    CASE_MATHFN_FLOATN (ROUNDEVEN)
     CASE_MATHFN (SCALB)
     CASE_MATHFN (SCALBLN)
     CASE_MATHFN (SCALBN)
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 647bcbef050..46d19c88c1f 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13565,6 +13565,11 @@ ix86_i387_mode_needed (int entity, rtx_insn *insn)
 
   switch (entity)
     {
+    case I387_ROUNDEVEN:
+      if (mode == I387_CW_ROUNDEVEN)
+        return mode;
+      break;
+
     case I387_TRUNC:
       if (mode == I387_CW_TRUNC)
         return mode;
@@ -13599,6 +13604,7 @@ ix86_mode_needed (int entity, rtx_insn *insn)
       return ix86_dirflag_mode_needed (insn);
     case AVX_U128:
       return ix86_avx_u128_mode_needed (insn);
+    case I387_ROUNDEVEN:
     case I387_TRUNC:
     case I387_FLOOR:
     case I387_CEIL:
@@ -13659,6 +13665,7 @@ ix86_mode_after (int entity, int mode, rtx_insn *insn)
       return mode;
     case AVX_U128:
       return ix86_avx_u128_mode_after (mode, insn);
+    case I387_ROUNDEVEN:
     case I387_TRUNC:
     case I387_FLOOR:
     case I387_CEIL:
@@ -13711,6 +13718,7 @@ ix86_mode_entry (int entity)
       return ix86_dirflag_mode_entry ();
     case AVX_U128:
       return ix86_avx_u128_mode_entry ();
+    case I387_ROUNDEVEN:
     case I387_TRUNC:
     case I387_FLOOR:
     case I387_CEIL:
@@ -13748,6 +13756,7 @@ ix86_mode_exit (int entity)
       return X86_DIRFLAG_ANY;
     case AVX_U128:
       return ix86_avx_u128_mode_exit ();
+    case I387_ROUNDEVEN:
     case I387_TRUNC:
     case I387_FLOOR:
     case I387_CEIL:
@@ -13782,6 +13791,12 @@ emit_i387_cw_initialization (int mode)
 
   switch (mode)
     {
+    case I387_CW_ROUNDEVEN:
+      /* round to nearest */
+      emit_insn (gen_andhi3 (reg, reg, GEN_INT (~0x0c00)));
+      slot = SLOT_CW_ROUNDEVEN;
+      break;
+
     case I387_CW_TRUNC:
       /* round toward zero (truncate) */
       emit_insn (gen_iorhi3 (reg, reg, GEN_INT (0x0c00)));
@@ -13828,6 +13843,7 @@ ix86_emit_mode_set (int entity, int mode, int prev_mode ATTRIBUTE_UNUSED,
       if (mode == AVX_U128_CLEAN)
         emit_insn (gen_avx_vzeroupper ());
       break;
+    case I387_ROUNDEVEN:
     case I387_TRUNC:
     case I387_FLOOR:
     case I387_CEIL:
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index e0a77e1fb25..0f0a6e346ac 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -2502,6 +2502,7 @@ enum ix86_stack_slot
 {
   SLOT_TEMP = 0,
   SLOT_CW_STORED,
+  SLOT_CW_ROUNDEVEN,
   SLOT_CW_TRUNC,
   SLOT_CW_FLOOR,
   SLOT_CW_CEIL,
@@ -2513,6 +2514,7 @@ enum ix86_entity
 {
   X86_DIRFLAG = 0,
   AVX_U128,
+  I387_ROUNDEVEN,
   I387_TRUNC,
   I387_FLOOR,
   I387_CEIL,
@@ -2548,7 +2550,7 @@ enum avx_u128_state
 
 #define NUM_MODES_FOR_MODE_SWITCHING \
   { X86_DIRFLAG_ANY, AVX_U128_ANY, \
-  I387_CW_ANY, I387_CW_ANY, I387_CW_ANY }
+  I387_CW_ANY, I387_CW_ANY, I387_CW_ANY, I387_CW_ANY }
 
 
 /* Avoid renaming of stack registers, as doing so in combination with
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 9951d46d8b2..7ad97882419 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -141,6 +141,7 @@
   UNSPEC_FXAM
 
   ;; x87 Rounding
+  UNSPEC_FRNDINT_ROUNDEVEN
   UNSPEC_FRNDINT_FLOOR
   UNSPEC_FRNDINT_CEIL
   UNSPEC_FRNDINT_TRUNC
@@ -303,7 +304,8 @@
 
 ;; Constants to represent rounding modes in the ROUND instruction
 (define_constants
-  [(ROUND_FLOOR 0x1)
+  [(ROUND_ROUNDEVEN 0x0)
+   (ROUND_FLOOR 0x1)
    (ROUND_CEIL 0x2)
    (ROUND_TRUNC 0x3)
    (ROUND_MXCSR 0x4)
@@ -779,7 +781,7 @@
 
 ;; Defines rounding mode of an FP operation.
 
-(define_attr "i387_cw" "trunc,floor,ceil,uninitialized,any"
+(define_attr "i387_cw" "roundeven,floor,ceil,trunc,uninitialized,any"
   (const_string "any"))
 
 ;; Define attribute to indicate AVX insns with partial XMM register update.
@@ -16212,7 +16214,8 @@
 })
 
 (define_int_iterator FRNDINT_ROUNDING
-        [UNSPEC_FRNDINT_FLOOR
+        [UNSPEC_FRNDINT_ROUNDEVEN
+         UNSPEC_FRNDINT_FLOOR
          UNSPEC_FRNDINT_CEIL
          UNSPEC_FRNDINT_TRUNC])
 
@@ -16222,21 +16225,24 @@
 
 ;; Base name for define_insn
 (define_int_attr rounding_insn
-        [(UNSPEC_FRNDINT_FLOOR "floor")
+        [(UNSPEC_FRNDINT_ROUNDEVEN "roundeven")
+         (UNSPEC_FRNDINT_FLOOR "floor")
          (UNSPEC_FRNDINT_CEIL "ceil")
          (UNSPEC_FRNDINT_TRUNC "btrunc")
          (UNSPEC_FIST_FLOOR "floor")
          (UNSPEC_FIST_CEIL "ceil")])
 
 (define_int_attr rounding
-        [(UNSPEC_FRNDINT_FLOOR "floor")
+        [(UNSPEC_FRNDINT_ROUNDEVEN "roundeven")
+         (UNSPEC_FRNDINT_FLOOR "floor")
          (UNSPEC_FRNDINT_CEIL "ceil")
          (UNSPEC_FRNDINT_TRUNC "trunc")
          (UNSPEC_FIST_FLOOR "floor")
          (UNSPEC_FIST_CEIL "ceil")])
 
 (define_int_attr ROUNDING
-        [(UNSPEC_FRNDINT_FLOOR "FLOOR")
+        [(UNSPEC_FRNDINT_ROUNDEVEN "ROUNDEVEN")
+         (UNSPEC_FRNDINT_FLOOR "FLOOR")
          (UNSPEC_FRNDINT_CEIL "CEIL")
          (UNSPEC_FRNDINT_TRUNC "TRUNC")
          (UNSPEC_FIST_FLOOR "FLOOR")
@@ -16299,8 +16305,9 @@
         || TARGET_MIX_SSE_I387)
        && (flag_fp_int_builtin_inexact || !flag_trapping_math))
    || (SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH
-       && (TARGET_SSE4_1 || flag_fp_int_builtin_inexact
-           || !flag_trapping_math))"
+       && (TARGET_SSE4_1
+           || (ROUND_<ROUNDING> != ROUND_ROUNDEVEN
+               && (flag_fp_int_builtin_inexact || !flag_trapping_math))))"
 {
   if (SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH
       && (TARGET_SSE4_1 || flag_fp_int_builtin_inexact || !flag_trapping_math))
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 9461693bcd1..b5a6ca33223 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -238,6 +238,7 @@ DEF_INTERNAL_FLT_FLOATN_FN (FLOOR, ECF_CONST, floor, unary)
 DEF_INTERNAL_FLT_FLOATN_FN (NEARBYINT, ECF_CONST, nearbyint, unary)
 DEF_INTERNAL_FLT_FLOATN_FN (RINT, ECF_CONST, rint, unary)
 DEF_INTERNAL_FLT_FLOATN_FN (ROUND, ECF_CONST, round, unary)
+DEF_INTERNAL_FLT_FLOATN_FN (ROUNDEVEN, ECF_CONST, roundeven, unary)
 DEF_INTERNAL_FLT_FLOATN_FN (TRUNC, ECF_CONST, btrunc, unary)
 
 /* Binary math functions.  */
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 5283e6753f2..0860b38badb 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -271,6 +271,7 @@ OPTAB_D (fnms_optab, "fnms$a4")
 
 OPTAB_D (rint_optab, "rint$a2")
 OPTAB_D (round_optab, "round$a2")
+OPTAB_D (roundeven_optab, "roundeven$a2")
 OPTAB_D (floor_optab, "floor$a2")
 OPTAB_D (ceil_optab, "ceil$a2")
 OPTAB_D (btrunc_optab, "btrunc$a2")
diff --git a/gcc/reg-stack.c b/gcc/reg-stack.c
index 710f14a9544..0f0089acdea 100644
--- a/gcc/reg-stack.c
+++ b/gcc/reg-stack.c
@@ -1817,6 +1817,7 @@ subst_stack_regs_pat (rtx_insn *insn, stack_ptr regstack, rtx pat)
 
           case UNSPEC_FRNDINT:
           case UNSPEC_F2XM1:
+          case UNSPEC_FRNDINT_ROUNDEVEN:
           case UNSPEC_FRNDINT_FLOOR:
           case UNSPEC_FRNDINT_CEIL:
           case UNSPEC_FRNDINT_TRUNC:
diff --git a/gcc/testsuite/gcc.target/i386/sse4_1-round-roundeven-1.c b/gcc/testsuite/gcc.target/i386/sse4_1-round-roundeven-1.c
new file mode 100644
index 00000000000..36332630618
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/sse4_1-round-roundeven-1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse4.1" } */
+
+__attribute__((noinline, noclone)) double
+f1 (double x)
+{
+  return __builtin_roundeven (x);
+}
+
+__attribute__((noinline, noclone)) float
+f2 (float x)
+{
+  return __builtin_roundevenf (x);
+}
+
+/* { dg-final { scan-assembler-times "roundsd\[^\n\r\]*xmm" 1 } } */
+/* { dg-final { scan-assembler-times "roundss\[^\n\r\]*xmm" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/sse4_1-round-roundeven-2.c b/gcc/testsuite/gcc.target/i386/sse4_1-round-roundeven-2.c
new file mode 100644
index 00000000000..9505796dafb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/sse4_1-round-roundeven-2.c
@@ -0,0 +1,15 @@
+/* { dg-do run } */
+/* { dg-require-effective-target sse4 } */
+/* { dg-options "-O2 -msse4.1" } */
+
+#include "sse4_1-check.h"
+#include "sse4_1-round-roundeven-1.c"
+
+static void
+sse4_1_test (void)
+{
+  if (f1 (0.5) != 0.0 || f1 (1.5) != 2.0 || f1 (-0.5) != 0.0 || f1 (-1.5) != -2.0)
+    abort ();
+  if (f2 (0.5f) != 0.0f || f2 (1.5f) != 2.0f || f2 (-0.5f) != 0.0f || f2 (-1.5f) != -2.0f)
+    abort ();
+}
-- 
2.22.0