From: Richard Biener
Date: Thu, 16 May 2024 10:10:13 +0200
Subject: Re: [PATCH v5 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned scalar int
To: "Li, Pan2"
Cc: Tamar Christina, "gcc-patches@gcc.gnu.org", "juzhe.zhong@rivai.ai", "kito.cheng@gmail.com", "Liu, Hongtao"
References: <20240515021407.1287623-1-pan2.li@intel.com>

On Wed, May 15, 2024 at 1:36 PM Li, Pan2 wrote:
>
> > LGTM but you'll need an OK from Richard,
> > Thanks for working on this!
>
> Thanks Tamar for the help and coaching, let's wait for Richard for a while,😊!

OK.

Thanks for the patience,
Richard.

> Pan
>
> -----Original Message-----
> From: Tamar Christina
> Sent: Wednesday, May 15, 2024 5:12 PM
> To: Li, Pan2; gcc-patches@gcc.gnu.org
> Cc: juzhe.zhong@rivai.ai; kito.cheng@gmail.com; richard.guenther@gmail.com; Liu, Hongtao
> Subject: RE: [PATCH v5 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned scalar int
>
> Hi Pan,
>
> Thanks!
>
> > -----Original Message-----
> > From: pan2.li@intel.com
> > Sent: Wednesday, May 15, 2024 3:14 AM
> > To: gcc-patches@gcc.gnu.org
> > Cc: juzhe.zhong@rivai.ai; kito.cheng@gmail.com; Tamar Christina;
> > richard.guenther@gmail.com; hongtao.liu@intel.com; Pan Li
> > Subject: [PATCH v5 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned scalar int
> >
> > From: Pan Li
> >
> > This patch adds the middle-end representation for the saturating
> > add, i.e. the result of the add is set to the maximum value on
> > overflow. It matches a pattern similar to the one below.
> >
> > SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x))
> >
> > Taking uint8_t as an example, we will have:
> >
> > * SAT_ADD (1, 254) => 255.
> > * SAT_ADD (1, 255) => 255.
> > * SAT_ADD (2, 255) => 255.
> > * SAT_ADD (255, 255) => 255.
> >
> > Given the example below for the unsigned scalar integer uint64_t:
> >
> > uint64_t sat_add_u64 (uint64_t x, uint64_t y)
> > {
> >   return (x + y) | (- (uint64_t)((uint64_t)(x + y) < x));
> > }
> >
> > Before this patch:
> > uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
> > {
> >   long unsigned int _1;
> >   _Bool _2;
> >   long unsigned int _3;
> >   long unsigned int _4;
> >   uint64_t _7;
> >   long unsigned int _10;
> >   __complex__ long unsigned int _11;
> >
> > ;;   basic block 2, loop depth 0
> > ;;    pred:       ENTRY
> >   _11 = .ADD_OVERFLOW (x_5(D), y_6(D));
> >   _1 = REALPART_EXPR <_11>;
> >   _10 = IMAGPART_EXPR <_11>;
> >   _2 = _10 != 0;
> >   _3 = (long unsigned int) _2;
> >   _4 = -_3;
> >   _7 = _1 | _4;
> >   return _7;
> > ;;    succ:       EXIT
> >
> > }
> >
> > After this patch:
> > uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
> > {
> >   uint64_t _7;
> >
> > ;;   basic block 2, loop depth 0
> > ;;    pred:       ENTRY
> >   _7 = .SAT_ADD (x_5(D), y_6(D)); [tail call]
> >   return _7;
> > ;;    succ:       EXIT
> > }
> >
> > The tests below pass with this patch:
> > 1. The riscv full regression tests.
> > 2. The x86 bootstrap tests.
> > 3. The x86 full regression tests.
> >
> >         PR target/51492
> >         PR target/112600
> >
> > gcc/ChangeLog:
> >
> >         * internal-fn.cc (commutative_binary_fn_p): Add type IFN_SAT_ADD
> >         to the return true switch case(s).
> >         * internal-fn.def (SAT_ADD): Add new signed optab SAT_ADD.
> >         * match.pd: Add unsigned SAT_ADD match(es).
> >         * optabs.def (OPTAB_NL): Remove fixed-point limitation for
> >         us/ssadd.
> >         * tree-ssa-math-opts.cc (gimple_unsigned_integer_sat_add): New
> >         extern func decl generated in match.pd match.
> >         (match_saturation_arith): New func impl to match the saturation arith.
> >         (math_opts_dom_walker::after_dom_children): Try match saturation
> >         arith when IOR expr.
> >
>
> LGTM but you'll need an OK from Richard,
>
> Thanks for working on this!
>
> Tamar
>
> > Signed-off-by: Pan Li
> > ---
> >  gcc/internal-fn.cc        |  1 +
> >  gcc/internal-fn.def       |  2 ++
> >  gcc/match.pd              | 51 +++++++++++++++++++++++++++++++++++++++
> >  gcc/optabs.def            |  4 +--
> >  gcc/tree-ssa-math-opts.cc | 32 ++++++++++++++++++++++++
> >  5 files changed, 88 insertions(+), 2 deletions(-)
> >
> > diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> > index 0a7053c2286..73045ca8c8c 100644
> > --- a/gcc/internal-fn.cc
> > +++ b/gcc/internal-fn.cc
> > @@ -4202,6 +4202,7 @@ commutative_binary_fn_p (internal_fn fn)
> >      case IFN_UBSAN_CHECK_MUL:
> >      case IFN_ADD_OVERFLOW:
> >      case IFN_MUL_OVERFLOW:
> > +    case IFN_SAT_ADD:
> >      case IFN_VEC_WIDEN_PLUS:
> >      case IFN_VEC_WIDEN_PLUS_LO:
> >      case IFN_VEC_WIDEN_PLUS_HI:
> > diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> > index 848bb9dbff3..25badbb86e5 100644
> > --- a/gcc/internal-fn.def
> > +++ b/gcc/internal-fn.def
> > @@ -275,6 +275,8 @@ DEF_INTERNAL_SIGNED_OPTAB_FN (MULHS, ECF_CONST
> > | ECF_NOTHROW, first,
> >  DEF_INTERNAL_SIGNED_OPTAB_FN (MULHRS, ECF_CONST | ECF_NOTHROW,
> > first,
> >                                smulhrs, umulhrs, binary)
> >
> > +DEF_INTERNAL_SIGNED_OPTAB_FN (SAT_ADD, ECF_CONST, first, ssadd, usadd,
> > binary)
> > +
> >  DEF_INTERNAL_COND_FN (ADD, ECF_CONST, add, binary)
> >  DEF_INTERNAL_COND_FN (SUB, ECF_CONST, sub, binary)
> >  DEF_INTERNAL_COND_FN (MUL, ECF_CONST, smul, binary)
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index 07e743ae464..0f9c34fa897 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -3043,6 +3043,57 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >         || POINTER_TYPE_P (itype))
> >        && wi::eq_p (wi::to_wide (int_cst), wi::max_value (itype))))))
> >
> > +/* Unsigned Saturation Add */
> > +(match (usadd_left_part_1 @0 @1)
> > + (plus:c @0 @1)
> > + (if (INTEGRAL_TYPE_P (type)
> > +      && TYPE_UNSIGNED (TREE_TYPE (@0))
> > +      && types_match (type, TREE_TYPE (@0))
> > +      && types_match (type, TREE_TYPE (@1)))))
> > +
> > +(match (usadd_left_part_2 @0 @1)
> > + (realpart (IFN_ADD_OVERFLOW:c @0 @1))
> > + (if (INTEGRAL_TYPE_P (type)
> > +      && TYPE_UNSIGNED (TREE_TYPE (@0))
> > +      && types_match (type, TREE_TYPE (@0))
> > +      && types_match (type, TREE_TYPE (@1)))))
> > +
> > +(match (usadd_right_part_1 @0 @1)
> > + (negate (convert (lt (plus:c @0 @1) @0)))
> > + (if (INTEGRAL_TYPE_P (type)
> > +      && TYPE_UNSIGNED (TREE_TYPE (@0))
> > +      && types_match (type, TREE_TYPE (@0))
> > +      && types_match (type, TREE_TYPE (@1)))))
> > +
> > +(match (usadd_right_part_1 @0 @1)
> > + (negate (convert (gt @0 (plus:c @0 @1))))
> > + (if (INTEGRAL_TYPE_P (type)
> > +      && TYPE_UNSIGNED (TREE_TYPE (@0))
> > +      && types_match (type, TREE_TYPE (@0))
> > +      && types_match (type, TREE_TYPE (@1)))))
> > +
> > +(match (usadd_right_part_2 @0 @1)
> > + (negate (convert (ne (imagpart (IFN_ADD_OVERFLOW:c @0 @1))
> > integer_zerop)))
> > + (if (INTEGRAL_TYPE_P (type)
> > +      && TYPE_UNSIGNED (TREE_TYPE (@0))
> > +      && types_match (type, TREE_TYPE (@0))
> > +      && types_match (type, TREE_TYPE (@1)))))
> > +
> > +/* We cannot merge or overload usadd_left_part_1 and usadd_left_part_2
> > +   because the sub part of left_part_2 cannot work with right_part_1.
> > +   For example, left_part_2 pattern focus one .ADD_OVERFLOW but the
> > +   right_part_1 has nothing to do with .ADD_OVERFLOW.  */
> > +
> > +/* Unsigned saturation add, case 1 (branchless):
> > +   SAT_U_ADD = (X + Y) | - ((X + Y) < X) or
> > +   SAT_U_ADD = (X + Y) | - (X > (X + Y)).  */
> > +(match (unsigned_integer_sat_add @0 @1)
> > + (bit_ior:c (usadd_left_part_1 @0 @1) (usadd_right_part_1 @0 @1)))
> > +
> > +/* Unsigned saturation add, case 2 (branchless with .ADD_OVERFLOW).  */
> > +(match (unsigned_integer_sat_add @0 @1)
> > + (bit_ior:c (usadd_left_part_2 @0 @1) (usadd_right_part_2 @0 @1)))
> > +
> >  /* x > y && x != XXX_MIN --> x > y
> >     x > y && x == XXX_MIN --> false . */
> >  (for eqne (eq ne)
> > diff --git a/gcc/optabs.def b/gcc/optabs.def
> > index ad14f9328b9..3f2cb46aff8 100644
> > --- a/gcc/optabs.def
> > +++ b/gcc/optabs.def
> > @@ -111,8 +111,8 @@ OPTAB_NX(add_optab, "add$F$a3")
> >  OPTAB_NX(add_optab, "add$Q$a3")
> >  OPTAB_VL(addv_optab, "addv$I$a3", PLUS, "add", '3', gen_intv_fp_libfunc)
> >  OPTAB_VX(addv_optab, "add$F$a3")
> > -OPTAB_NL(ssadd_optab, "ssadd$Q$a3", SS_PLUS, "ssadd", '3',
> > gen_signed_fixed_libfunc)
> > -OPTAB_NL(usadd_optab, "usadd$Q$a3", US_PLUS, "usadd", '3',
> > gen_unsigned_fixed_libfunc)
> > +OPTAB_NL(ssadd_optab, "ssadd$a3", SS_PLUS, "ssadd", '3',
> > gen_signed_fixed_libfunc)
> > +OPTAB_NL(usadd_optab, "usadd$a3", US_PLUS, "usadd", '3',
> > gen_unsigned_fixed_libfunc)
> >  OPTAB_NL(sub_optab, "sub$P$a3", MINUS, "sub", '3', gen_int_fp_fixed_libfunc)
> >  OPTAB_NX(sub_optab, "sub$F$a3")
> >  OPTAB_NX(sub_optab, "sub$Q$a3")
> > diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
> > index e8c804f09b7..62da1c5ee08 100644
> > --- a/gcc/tree-ssa-math-opts.cc
> > +++ b/gcc/tree-ssa-math-opts.cc
> > @@ -4086,6 +4086,36 @@ arith_overflow_check_p (gimple *stmt, gimple
> > *cast_stmt, gimple *&use_stmt,
> >    return 0;
> >  }
> >
> > +extern bool gimple_unsigned_integer_sat_add (tree, tree*, tree (*)(tree));
> > +
> > +/*
> > + * Try to match saturation arith pattern(s).
> > + *   1. SAT_ADD (unsigned)
> > + *      _7 = _4 + _6;
> > + *      _8 = _4 > _7;
> > + *      _9 = (long unsigned int) _8;
> > + *      _10 = -_9;
> > + *      _12 = _7 | _10;
> > + *      =>
> > + *      _12 = .SAT_ADD (_4, _6);  */
> > +static void
> > +match_saturation_arith (gimple_stmt_iterator *gsi, gassign *stmt)
> > +{
> > +  gcall *call = NULL;
> > +
> > +  tree ops[2];
> > +  tree lhs = gimple_assign_lhs (stmt);
> > +
> > +  if (gimple_unsigned_integer_sat_add (lhs, ops, NULL)
> > +      && direct_internal_fn_supported_p (IFN_SAT_ADD, TREE_TYPE (lhs),
> > +                                         OPTIMIZE_FOR_BOTH))
> > +    {
> > +      call = gimple_build_call_internal (IFN_SAT_ADD, 2, ops[0], ops[1]);
> > +      gimple_call_set_lhs (call, lhs);
> > +      gsi_replace (gsi, call, true);
> > +    }
> > +}
> > +
> >  /* Recognize for unsigned x
> >     x = y - z;
> >     if (x > y)
> > @@ -6048,6 +6078,8 @@ math_opts_dom_walker::after_dom_children
> > (basic_block bb)
> >                break;
> >
> >              case BIT_IOR_EXPR:
> > +              match_saturation_arith (&gsi, as_a<gassign *> (stmt));
> > +              /* fall-through  */
> >              case BIT_XOR_EXPR:
> >                match_uaddc_usubc (&gsi, stmt, code);
> >                break;
> > --
> > 2.34.1
>
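
For readers who want to check the two source forms discussed in the patch outside of GCC, here is a minimal C sketch. It is only an illustration under stated assumptions: sat_add_u8, sat_add_u64_ovf and sat_add_self_check are hypothetical names that do not appear in the patch, and __builtin_add_overflow is used as a stand-in for the .ADD_OVERFLOW form seen in the "before" dump. It says nothing about whether a given target actually expands .SAT_ADD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Case 1 from match.pd: (X + Y) | -((X + Y) < X).
   If the sum wraps around, (sum < x) is 1, its negation is all ones,
   and the OR forces the result to the type's maximum value.  */
static uint8_t
sat_add_u8 (uint8_t x, uint8_t y)
{
  uint8_t sum = (uint8_t) (x + y);
  return (uint8_t) (sum | (uint8_t) -(uint8_t) (sum < x));
}

/* Same form for uint64_t, as written in the commit message.  */
static uint64_t
sat_add_u64 (uint64_t x, uint64_t y)
{
  return (x + y) | (-(uint64_t) ((uint64_t) (x + y) < x));
}

/* Case 2, mirroring the "before" dump: the wrapped sum ORed with the
   negated overflow flag.  __builtin_add_overflow is a GCC/Clang
   extension, used here only as an assumed source-level counterpart
   of .ADD_OVERFLOW.  */
static uint64_t
sat_add_u64_ovf (uint64_t x, uint64_t y)
{
  uint64_t sum;
  bool overflow = __builtin_add_overflow (x, y, &sum);
  return sum | -(uint64_t) overflow;
}

/* Checks the uint8_t examples listed in the commit message and that
   both uint64_t forms agree on a few boundary values.  */
static void
sat_add_self_check (void)
{
  assert (sat_add_u8 (1, 254) == 255);
  assert (sat_add_u8 (1, 255) == 255);
  assert (sat_add_u8 (2, 255) == 255);
  assert (sat_add_u8 (255, 255) == 255);

  assert (sat_add_u64 (UINT64_MAX, 1) == sat_add_u64_ovf (UINT64_MAX, 1));
  assert (sat_add_u64 (40, 2) == sat_add_u64_ovf (40, 2));
}
```

Calling sat_add_self_check () from a test driver exercises both forms. The casts in sat_add_u8 are there because uint8_t operands are promoted to int before the addition and the negation.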