From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lf1-x131.google.com (mail-lf1-x131.google.com [IPv6:2a00:1450:4864:20::131]) by sourceware.org (Postfix) with ESMTPS id 15C12384DD3E for ; Wed, 22 May 2024 13:24:12 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 15C12384DD3E Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 15C12384DD3E Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::131 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1716384254; cv=none; b=nzGhj0YQUd/zxD7JJSChZLF/OKRibUKran7MlMjs5lQS7cmSRFrlyrUL8DIr2M5zZxJ0F0EU9npuIUoXPA7cpiPd6z2xCf03MiT9Jam1Q8DhAmKp1czYef3L++Zphj+2eAdsD7ptsh6phE7cUdwfzSFlgRItmxyJeuruns8Kvhc= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1716384254; c=relaxed/simple; bh=2ZjnSJPk5UBPAl9zTTS8wY/jpVuPtuerEfF7eDLca1s=; h=DKIM-Signature:MIME-Version:From:Date:Message-ID:Subject:To; b=LxBMDPFhKb8RCtKYuQ/546LBWkUPK1rWUQT0B1FVJb/hCV/8xildgrRVH/usmY5D68skAQ8MIBusJv2qCPC2cJdc79NNNzOeLKVRVz0JasjWzscxGJ/66NogpuwRLtKPVbtaaL6e5S0QPPKZEkrGhgsbS0JLqoU3w2AQgtlNaTk= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-lf1-x131.google.com with SMTP id 2adb3069b0e04-51ffff16400so10574443e87.2 for ; Wed, 22 May 2024 06:24:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1716384250; x=1716989050; darn=gcc.gnu.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=TCVW4MKx2LTWjdAoTn72iCoUQWSCCj9gJeRLnN7op4c=; b=f9YxN7ib3hDJ9MlwWhwew24l9+lq5rGAV99QAU1VTzrvCfH4dzkJvC1B4pvWIl/kil djNZovd1TYIANEaAxQNAJLQTO325FtzImyZ4L3IzycfySoavCJRqYbe57qgN6WspOzBF Ko4eyggL0kHTvM/66kdInGfE5tTHya/uArxc/bJcPPMamOqFeNanJ4m8CMHzKlCp+ZnU 1pXv6DfskUBNDplfo48meJreC8xgvbZmhTN9OoUrMlx25jc1i5iu4cLOw7QZFwVcWiul bIX6u6QMe03BBH3t/vZaZGK+MAZugVSp8EKTceyt1fQjjFuG8HAb8VykkQSxltLMMYLj v3uQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716384250; x=1716989050; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=TCVW4MKx2LTWjdAoTn72iCoUQWSCCj9gJeRLnN7op4c=; b=EUde92OyyEycJFaJ6lRPJznX4y1KtdRBX5tV38wMvrx7cpI+jk7D4Hc0fglqrYtgay dJTLAqrpAI/ldXz7x+91VYEXsAOsih+Yw6ry25HOcjToTgGK+y09LS3rqj2Io1FLcam0 y6qdIc8K9ykyJ+NLZkZZEoG4hUk2iEsQXmUaoMfVh5ZmCO87xixatV/bpkYSCeZUcB+I cAw7X0gKp8Tct8VIfuTXiplWC+e7vBUk505VqTfZR8FTozGVH4tW/+NziE+jJ/ijfyke xe4Sq627hwr9p4IB2Xq62l76Gg4s958SeVFXV9ZHWv9L4pweNH8U3xg2bmv3P0PtbyKS /Zfw== X-Forwarded-Encrypted: i=1; AJvYcCVr7z7Jj+ygDjGqPbLNf8nP+RlMCivAGu/NWqau5Y/WC0EvjtEGTiaxwDuek38pqASuVf0VMWiuX5xP12lF1n5fn+MwYkLk4A== X-Gm-Message-State: AOJu0YzViSDTTxt3sWUh62+/Vm8zFK+8kbl4cPVpFLVcxuvLezSn+xHX i43czOXIdoeNefuebav+kDNvUuGYThIWj47bpfHMzWXtEj0v1kh8bO3V1LNWVSFkKRpRSNirDFg DpT/3eyTvfNEYmYmxqp7BtEglcDI= X-Google-Smtp-Source: AGHT+IFE6mIbVB7HuAVpCal9E7u/QRTZgIxz7qgJgv5VErPEuF4AcW8+uiXX3Jd3sDvyhb2tqQoM1DuWtL+1eaeFeA4= X-Received: by 2002:ac2:44aa:0:b0:51b:528e:ce7d with SMTP id 2adb3069b0e04-526bf7384demr1560134e87.34.1716384250376; Wed, 22 May 2024 06:24:10 -0700 (PDT) MIME-Version: 1.0 References: <20240520110048.2827900-1-pan2.li@intel.com> In-Reply-To: From: Richard Biener Date: Wed, 22 May 2024 15:23:59 +0200 Message-ID: Subject: Re: [PATCH v1 1/2] Match: Support branch form for unsigned SAT_ADD To: Tamar Christina Cc: "pan2.li@intel.com" , "gcc-patches@gcc.gnu.org" , "juzhe.zhong@rivai.ai" , "kito.cheng@gmail.com" Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-7.4 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Mon, May 20, 2024 at 1:50=E2=80=AFPM Tamar Christina wrote: > > Hi Pan, > > > -----Original Message----- > > From: pan2.li@intel.com > > Sent: Monday, May 20, 2024 12:01 PM > > To: gcc-patches@gcc.gnu.org > > Cc: juzhe.zhong@rivai.ai; kito.cheng@gmail.com; Tamar Christina > > ; richard.guenther@gmail.com; Pan Li > > > > Subject: [PATCH v1 1/2] Match: Support branch form for unsigned SAT_ADD > > > > From: Pan Li > > > > This patch would like to support the branch form for unsigned > > SAT_ADD. For example as below: > > > > uint64_t > > sat_add (uint64_t x, uint64_t y) > > { > > return (uint64_t) (x + y) >=3D x ? (x + y) : -1; > > } > > > > Different to the branchless version, we leverage the simplify to > > convert the branch version of SAT_ADD into branchless if and only > > if the backend has supported the IFN_SAT_ADD. Thus, the backend has > > the ability to choose branch or branchless implementation of .SAT_ADD. > > For example, some target can take care of branches code more optimally= . > > > > When the target implement the IFN_SAT_ADD for unsigned and before this > > patch: > > uint64_t sat_add_u_1_uint64_t (uint64_t x, uint64_t y) > > { > > long unsigned int _1; > > uint64_t _2; > > __complex__ long unsigned int _6; > > long unsigned int _7; > > > > ;; basic block 2, loop depth 0 > > ;; pred: ENTRY > > _6 =3D .ADD_OVERFLOW (x_3(D), y_4(D)); > > _1 =3D REALPART_EXPR <_6>; > > _7 =3D IMAGPART_EXPR <_6>; > > if (_7 =3D=3D 0) > > goto ; [65.00%] > > else > > goto ; [35.00%] > > ;; succ: 4 > > ;; 3 > > > > ;; basic block 3, loop depth 0 > > ;; pred: 2 > > ;; succ: 4 > > > > ;; basic block 4, loop depth 0 > > ;; pred: 3 > > ;; 2 > > # _2 =3D PHI <18446744073709551615(3), _1(2)> > > return _2; > > ;; succ: EXIT > > > > } > > > > After this patch: > > uint64_t sat_add (uint64_t x, uint64_t y) > > { > > long unsigned int _9; > > > > ;; basic block 2, loop depth 0 > > ;; pred: ENTRY > > _9 =3D .SAT_ADD (x_3(D), y_4(D)); [tail call] > > return _9; > > ;; succ: EXIT > > } > > > > The below test suites are passed for this patch: > > * The x86 bootstrap test. > > * The x86 fully regression test. > > * The riscv fully regression test. > > > > gcc/ChangeLog: > > > > * match.pd: Add new simplify to convert branch SAT_ADD into > > branchless, if and only if backend implement the IFN. > > > > Signed-off-by: Pan Li > > --- > > gcc/match.pd | 18 ++++++++++++++++++ > > 1 file changed, 18 insertions(+) > > > > diff --git a/gcc/match.pd b/gcc/match.pd > > index 0f9c34fa897..0547b57b3a3 100644 > > --- a/gcc/match.pd > > +++ b/gcc/match.pd > > @@ -3094,6 +3094,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > > (match (unsigned_integer_sat_add @0 @1) > > (bit_ior:c (usadd_left_part_2 @0 @1) (usadd_right_part_2 @0 @1))) > > > > +#if GIMPLE > > + > > +/* Simplify the branch version of SAT_ADD into branchless if and only = if > > + the backend has supported the IFN_SAT_ADD. Thus, the backend has t= he > > + ability to choose branch or branchless implementation of .SAT_ADD. = */ This comment or part of the description above should say this simplifies (x + y) >=3D x ? (x + y) : -1 as (x + y) | (-(typeof(x))((x + y) < x)) > > +(simplify > > + (cond (ge (plus:c@2 @0 @1) @0) @2 integer_minus_onep) > > + (if (direct_internal_fn_supported_p (IFN_SAT_ADD, type, > > OPTIMIZE_FOR_BOTH)) > > + (bit_ior @2 (negate (convert (lt @2 @0)))))) > > + > > +(simplify > > + (cond (le @0 (plus:c@2 @0 @1)) @2 integer_minus_onep) > > + (if (direct_internal_fn_supported_p (IFN_SAT_ADD, type, > > OPTIMIZE_FOR_BOTH)) > > + (bit_ior @2 (negate (convert (lt @2 @0)))))) and this should probably be (gt @2 @0)? This misses INTEGER_TYPE_P constraints and it's supposed to be only for TYPE_UNSIGNED? > > + > > +#endif > > Thanks, this looks good to me! > > I'll leave it up to Richard to approve, > Richard: The reason for the direct_internal_fn_supported_p is because som= e > targets said that they currently handle the branch version better due to = the lack > of some types. At the time I reason it's just a target expansion bug but= didn't hear anything. > > To be honest, it feels to me like we should do this unconditionally, and = just have the targets > that get faster branch version to handle it during expand? Since the patc= h series provides > a canonicalized version now. I'm not sure this is a good canonical form. __imag .ADD_OVERFLOW (x, y) ? __real .ADD_OVERFLOW (x, y) : -1 would be better IMO. It can be branch-less by using a COND_EXPR. > This means we can also better support targets that have the vector optab = but not the scalar one > as the above check would fail for these targets. > > What do you think? > > Thanks, > Tamar > > > + > > /* x > y && x !=3D XXX_MIN --> x > y > > x > y && x =3D=3D XXX_MIN --> false . */ > > (for eqne (eq ne) > > -- > > 2.34.1 >