From: Richard Biener
Date: Wed, 26 Jun 2024 15:51:39 +0200
Subject: Re: [PATCH v1] Internal-fn: Support new IFN SAT_TRUNC for unsigned scalar int
To: pan2.li@intel.com
Cc: gcc-patches@gcc.gnu.org, juzhe.zhong@rivai.ai, kito.cheng@gmail.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com
References: <20240626014559.765149-1-pan2.li@intel.com>
In-Reply-To: <20240626014559.765149-1-pan2.li@intel.com>

On Wed, Jun 26, 2024 at 3:46 AM <pan2.li@intel.com> wrote:
>
> From: Pan Li
>
> This patch would like to add the middle-end presentation for the
> saturation truncation. Aka set the result of truncated value to
> the max value when overflow. It will take the pattern similar
> as below.
>
> Form 1:
>   #define DEF_SAT_U_TRUC_FMT_1(WT, NT)   \
>   NT __attribute__((noinline))           \
>   sat_u_truc_##T##_fmt_1 (WT x)          \
>   {                                      \
>     bool overflow = x > (WT)(NT)(-1);    \
>     return ((NT)x) | (NT)-overflow;      \
>   }
>
> For example, truncated uint16_t to uint8_t, we have
>
> * SAT_TRUNC (254)   => 254
> * SAT_TRUNC (255)   => 255
> * SAT_TRUNC (256)   => 255
> * SAT_TRUNC (65536) => 255
>
> Given below SAT_TRUNC from uint64_t to uint32_t.
>
> DEF_SAT_U_TRUC_FMT_1 (uint64_t, uint32_t)
>
> Before this patch:
> __attribute__((noinline))
> uint32_t sat_u_truc_T_fmt_1 (uint64_t x)
> {
>   _Bool overflow;
>   unsigned int _1;
>   unsigned int _2;
>   unsigned int _3;
>   uint32_t _6;
>
> ;;   basic block 2, loop depth 0
> ;;    pred:       ENTRY
>   overflow_5 = x_4(D) > 4294967295;
>   _1 = (unsigned int) x_4(D);
>   _2 = (unsigned int) overflow_5;
>   _3 = -_2;
>   _6 = _1 | _3;
>   return _6;
> ;;    succ:       EXIT
>
> }
>
> After this patch:
> __attribute__((noinline))
> uint32_t sat_u_truc_T_fmt_1 (uint64_t x)
> {
>   uint32_t _6;
>
> ;;   basic block 2, loop depth 0
> ;;    pred:       ENTRY
>   _6 = .SAT_TRUNC (x_4(D)); [tail call]
>   return _6;
> ;;    succ:       EXIT
>
> }
>
> The below tests are passed for this patch:
> *. The rv64gcv fully regression tests.
> *. The rv64gcv build with glibc.
> *. The x86 bootstrap tests.
> *. The x86 fully regression tests.
>
> gcc/ChangeLog:
>
>         * internal-fn.def (SAT_TRUNC): Add new signed IFN sat_trunc as
>         unary_convert.
>         * match.pd: Add new matching pattern for unsigned int sat_trunc.
>         * optabs.def (OPTAB_CL): Add unsigned and signed optab.
>         * tree-ssa-math-opts.cc (gimple_unsigend_integer_sat_trunc): Add
>         new decl for the matching pattern generated func.
>         (match_unsigned_saturation_trunc): Add new func impl to match
>         the .SAT_TRUNC.
>         (math_opts_dom_walker::after_dom_children): Add .SAT_TRUNC match
>         function under BIT_IOR_EXPR case.
>         * tree.cc (integer_half_truncated_all_ones_p): Add new func impl
>         to filter the truncated threshold.
>         * tree.h (integer_half_truncated_all_ones_p): Add new func decl.
>
> Signed-off-by: Pan Li
> ---
>  gcc/internal-fn.def       |  2 ++
>  gcc/match.pd              | 12 +++++++++++-
>  gcc/optabs.def            |  3 +++
>  gcc/tree-ssa-math-opts.cc | 32 ++++++++++++++++++++++++++++++++
>  gcc/tree.cc               | 22 ++++++++++++++++++++++
>  gcc/tree.h                |  6 ++++++
>  6 files changed, 76 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> index a8c83437ada..915d329c05a 100644
> --- a/gcc/internal-fn.def
> +++ b/gcc/internal-fn.def
> @@ -278,6 +278,8 @@ DEF_INTERNAL_SIGNED_OPTAB_FN (MULHRS, ECF_CONST | ECF_NOTHROW, first,
>  DEF_INTERNAL_SIGNED_OPTAB_FN (SAT_ADD, ECF_CONST, first, ssadd, usadd, binary)
>  DEF_INTERNAL_SIGNED_OPTAB_FN (SAT_SUB, ECF_CONST, first, sssub, ussub, binary)
>
> +DEF_INTERNAL_SIGNED_OPTAB_FN (SAT_TRUNC, ECF_CONST, first, sstrunc, ustrunc, unary_convert)
> +
>  DEF_INTERNAL_COND_FN (ADD, ECF_CONST, add, binary)
>  DEF_INTERNAL_COND_FN (SUB, ECF_CONST, sub, binary)
>  DEF_INTERNAL_COND_FN (MUL, ECF_CONST, smul, binary)
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 3d0689c9312..d4062434cc7 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -39,7 +39,8 @@ along with GCC; see the file COPYING3. If not see
>     HONOR_NANS
>     uniform_vector_p
>     expand_vec_cmp_expr_p
> -   bitmask_inv_cst_vector_p)
> +   bitmask_inv_cst_vector_p
> +   integer_half_truncated_all_ones_p)
>
>  /* Operator lists. */
>  (define_operator_list tcc_comparison
> @@ -3210,6 +3211,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
>        && types_match (type, @0, @1))))
>
> +/* Unsigned saturation truncate, case 1 (), sizeof (WT) > sizeof (NT).
> +   SAT_U_TRUNC = (NT)x | (NT)(-(X > (WT)(NT)(-1))). */
> +(match (unsigend_integer_sat_trunc @0) unsigned
> + (bit_ior:c (negate (convert (gt @0 integer_half_truncated_all_ones_p)))
> +   (convert @0))
> + (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
> +      && TYPE_UNSIGNED (TREE_TYPE (@0))
> +      && tree_int_cst_lt (TYPE_SIZE (type), TYPE_SIZE (TREE_TYPE (@0))))))

This TYPE_SIZE relation doesn't match integer_half_truncated_all_ones_p,
which works on TYPE_PRECISION. Don't you maybe want to scrap
integer_half_truncated_all_ones_p as too restrictive and instead
verify that TYPE_PRECISION (type) is less than the precision of @0
and that the INTEGER_CST compared against matches 'type's precision mask?

> +
>  /* x > y && x != XXX_MIN --> x > y
>     x > y && x == XXX_MIN --> false . */
>  (for eqne (eq ne)
> diff --git a/gcc/optabs.def b/gcc/optabs.def
> index bc2611abdc2..4eaffe96c19 100644
> --- a/gcc/optabs.def
> +++ b/gcc/optabs.def
> @@ -63,6 +63,9 @@ OPTAB_CX(fractuns_optab, "fractuns$Q$b$I$a2")
>  OPTAB_CL(satfract_optab, "satfract$b$Q$a2", SAT_FRACT, "satfract", gen_satfract_conv_libfunc)
>  OPTAB_CL(satfractuns_optab, "satfractuns$I$b$Q$a2", UNSIGNED_SAT_FRACT, "satfractuns", gen_satfractuns_conv_libfunc)
>
> +OPTAB_CL(ustrunc_optab, "ustrunc$b$a2", US_TRUNCATE, "ustrunc", gen_satfract_conv_libfunc)
> +OPTAB_CL(sstrunc_optab, "sstrunc$b$a2", SS_TRUNCATE, "sstrunc", gen_satfract_conv_libfunc)

Those libfuncs do not exist, so use NULL for them.

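To illustrate the match.pd comment above, here is a minimal standalone sketch in plain C. It is not GCC code: precisions are modeled as bit counts of at most 64, and the helper names low_mask, sat_trunc_threshold_ok and half_truncated_all_ones are hypothetical. It only shows why comparing the constant against the narrow type's all-ones mask is less restrictive than a "half the precision" rule.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not a GCC API: all-ones value of an unsigned
   type with PREC bits (PREC <= 64).  Stands in for a precision mask.  */
static uint64_t
low_mask (unsigned prec)
{
  return prec >= 64 ? UINT64_MAX : (UINT64_C (1) << prec) - 1;
}

/* Hypothetical check: accept "x > CST" as the overflow test of an
   unsigned saturating truncation from WIDE_PREC to NARROW_PREC bits.
   The narrow type must really be narrower, and CST must be exactly
   the narrow type's maximum.  */
static bool
sat_trunc_threshold_ok (unsigned wide_prec, unsigned narrow_prec,
                        uint64_t cst)
{
  return narrow_prec < wide_prec && cst == low_mask (narrow_prec);
}

/* For comparison: a "half precision" rule, which only accepts the
   2:1 case (for example 64 -> 32) and rejects 64 -> 16 or 64 -> 8.  */
static bool
half_truncated_all_ones (unsigned wide_prec, uint64_t cst)
{
  return cst == low_mask (wide_prec / 2);
}
```

Under these assumptions a uint64_t to uint16_t truncation (constant 65535) passes sat_trunc_threshold_ok (64, 16, 65535) but fails half_truncated_all_ones (64, 65535).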
> +
>  OPTAB_CD(sfixtrunc_optab, "fix_trunc$F$b$I$a2")
>  OPTAB_CD(ufixtrunc_optab, "fixuns_trunc$F$b$I$a2")
>
> diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
> index 57085488722..64bc70c29b3 100644
> --- a/gcc/tree-ssa-math-opts.cc
> +++ b/gcc/tree-ssa-math-opts.cc
> @@ -4088,6 +4088,7 @@ arith_overflow_check_p (gimple *stmt, gimple *cast_stmt, gimple *&use_stmt,
>
>  extern bool gimple_unsigned_integer_sat_add (tree, tree*, tree (*)(tree));
>  extern bool gimple_unsigned_integer_sat_sub (tree, tree*, tree (*)(tree));
> +extern bool gimple_unsigend_integer_sat_trunc (tree, tree*, tree (*)(tree));
>
>  static void
>  build_saturation_binary_arith_call (gimple_stmt_iterator *gsi, internal_fn fn,
> @@ -4216,6 +4217,36 @@ match_unsigned_saturation_sub (gimple_stmt_iterator *gsi, gphi *phi)
>                                       ops[0], ops[1]);
>  }
>
> +/*
> + * Try to match saturation unsigned sub.
> + *   uint16_t x_4(D);
> + *   uint8_t _6;
> + *   overflow_5 = x_4(D) > 255;
> + *   _1 = (unsigned char) x_4(D);
> + *   _2 = (unsigned char) overflow_5;
> + *   _3 = -_2;
> + *   _6 = _1 | _3;
> + *   =>
> + *   _6 = .SAT_TRUNC (x_4(D));
> + * */
> +static void
> +match_unsigned_saturation_trunc (gimple_stmt_iterator *gsi, gassign *stmt)
> +{
> +  tree ops[1];
> +  tree lhs = gimple_assign_lhs (stmt);
> +  tree type = TREE_TYPE (lhs);
> +
> +  if (gimple_unsigend_integer_sat_trunc (lhs, ops, NULL)
> +      && direct_internal_fn_supported_p (IFN_SAT_TRUNC,
> +                                         tree_pair (type, TREE_TYPE (ops[0])),
> +                                         OPTIMIZE_FOR_BOTH))
> +    {
> +      gcall *call = gimple_build_call_internal (IFN_SAT_TRUNC, 1, ops[0]);
> +      gimple_call_set_lhs (call, lhs);
> +      gsi_replace (gsi, call, /* update_eh_info */ true);
> +    }
> +}
> +
>  /* Recognize for unsigned x
>     x = y - z;
>     if (x > y)
> @@ -6188,6 +6219,7 @@ math_opts_dom_walker::after_dom_children (basic_block bb)
>
>             case BIT_IOR_EXPR:
>               match_unsigned_saturation_add (&gsi, as_a<gassign *> (stmt));
> +             match_unsigned_saturation_trunc (&gsi, as_a<gassign *> (stmt));
>               /* fall-through */
>             case BIT_XOR_EXPR:
>               match_uaddc_usubc (&gsi, stmt, code);
> diff --git a/gcc/tree.cc b/gcc/tree.cc
> index 2d2d5b6db6e..4572e6fc42b 100644
> --- a/gcc/tree.cc
> +++ b/gcc/tree.cc
> @@ -2944,6 +2944,28 @@ integer_all_onesp (const_tree expr)
>      == wi::to_wide (expr));
>  }
>
> +/* Return true if EXPR is an integer constant of all ones with half
> +   truncated in precision. Or return false. For example:
> +   uint16_t a = 255;   // true.
> +   uint16_t b = 0;     // false.
> +   uint16_t c = 65545; // false. */
> +bool
> +integer_half_truncated_all_ones_p (const_tree expr)
> +{
> +  if (TREE_CODE (expr) != INTEGER_CST)
> +    return false;
> +
> +  unsigned precision = TYPE_PRECISION (TREE_TYPE (expr));
> +
> +  gcc_assert (precision <= 64);
> +
> +  unsigned trunc_prec = precision / 2;
> +  wide_int trunc_max = wi::uhwi ((uint64_t)-1 >> (64 - trunc_prec), precision);

There is wi::mask, which isn't limited to a maximum of 64 bits.

> +  wide_int expr_int = wi::to_wide (expr, precision);
> +
> +  return trunc_max == expr_int;
> +}
> +
>  /* Return true if EXPR is the integer constant minus one, or a location
>     wrapper for such a constant. */
>
> diff --git a/gcc/tree.h b/gcc/tree.h
> index 28e8e71b036..0237826dd23 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -5205,6 +5205,12 @@ extern bool integer_each_onep (const_tree);
>
>  extern bool integer_all_onesp (const_tree);
>
> +/* integer_half_truncated_all_ones_p (tree x) will return true if x is
> +   the integer constant that the half truncated bits are all 1.
> +   For example, uint16_t type with 255 constant integer will be true. */
> +
> +extern bool integer_half_truncated_all_ones_p (const_tree expr);
> +
>  /* integer_minus_onep (tree x) is nonzero if X is an integer constant of
>     value -1. */
>
> --
> 2.34.1
>
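
For reference, the Form 1 idiom from the quoted description can be checked in isolation. The following is a minimal standalone C sketch, not part of the patch: it instantiates the macro body by hand for uint16_t to uint8_t and compares it against a plain clamp. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Form 1 from the patch description, written out by hand for
   WT = uint16_t and NT = uint8_t.  */
static uint8_t
sat_u_trunc_u16_u8_fmt_1 (uint16_t x)
{
  bool overflow = x > (uint16_t) (uint8_t) (-1);  /* x > 255 */
  return ((uint8_t) x) | (uint8_t) -overflow;     /* OR in 0xff on overflow */
}

/* Reference semantics: clamp to the maximum of the narrow type.  */
static uint8_t
sat_u_trunc_u16_u8_ref (uint16_t x)
{
  return x > UINT8_MAX ? UINT8_MAX : (uint8_t) x;
}

/* Return true if both versions agree on every uint16_t input.  */
static bool
forms_agree (void)
{
  for (uint32_t i = 0; i <= UINT16_MAX; i++)
    if (sat_u_trunc_u16_u8_fmt_1 ((uint16_t) i)
        != sat_u_trunc_u16_u8_ref ((uint16_t) i))
      return false;
  return true;
}
```

The branch-free form works because -overflow is either 0 or all ones after the narrowing cast, so the OR either leaves the truncated value alone or forces it to the narrow type's maximum.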