From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id DB5AA3849ADB; Thu, 16 May 2024 12:09:11 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org DB5AA3849ADB
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1715861351;
	bh=aG4Bh4EYZegFHf7+YauFELoIrurr3OR6U10ubbQHKMM=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=GyZqDTbLucmnbpRIkXWQT4yGXUyLlcxq1+F8TjPPYXw3YxIK3KYBjOlF91vVaq4Eg
	 HmyTFBW5H80zqrMHyeBupaunJVi/BM5svvsDRmkTF9+hu1c8ulqQbdCYmCGOPpgSZw
	 vyt/JgkcmaTfutUpsSLzShrwLaZvMRcSAk18joik=
From: "cvs-commit at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/51492] vectorizer does not support saturated
 arithmetic patterns
Date: Thu, 16 May 2024 12:09:10 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 4.6.2
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: cvs-commit at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-51492-4-GYQS3nGr4z@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-51492-4@http.gcc.gnu.org/bugzilla/>
References: <bug-51492-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51492
--- Comment #19 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Pan Li <panli@gcc.gnu.org>:

https://gcc.gnu.org/g:52b0536710ff3f3ace72ab00ce9ef6c630cd1183

commit r15-576-g52b0536710ff3f3ace72ab00ce9ef6c630cd1183
Author: Pan Li <pan2.li@intel.com>
Date:   Wed May 15 10:14:05 2024 +0800

    Internal-fn: Support new IFN SAT_ADD for unsigned scalar int

    This patch adds a middle-end representation for saturating addition,
    i.e. an add whose result is clamped to the type's maximum value on
    overflow.  It matches a pattern like the one below.

    SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x))

    Taking uint8_t as an example, we have:

    * SAT_ADD (1, 254)   => 255.
    * SAT_ADD (1, 255)   => 255.
    * SAT_ADD (2, 255)   => 255.
    * SAT_ADD (255, 255) => 255.
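
    The uint8_t cases above can be checked with a minimal sketch of the
    same pattern.  The function name sat_add_u8 is hypothetical and not
    part of the patch; the body only instantiates the formula with
    TYPE = uint8_t.

```c
#include <stdint.h>

/* Branchless unsigned saturating add for uint8_t, following the pattern
   (x + y) | (-(TYPE)((TYPE)(x + y) < x)).
   The name sat_add_u8 is hypothetical, chosen for illustration only.  */
uint8_t sat_add_u8 (uint8_t x, uint8_t y)
{
  /* The cast truncates the promoted int sum back to 8 bits (wraps mod 256).  */
  uint8_t sum = (uint8_t)(x + y);
  /* A wrapped sum is smaller than x; negating the 0/1 comparison result
     and truncating gives 0xFF on overflow and 0x00 otherwise.  */
  uint8_t mask = (uint8_t)-(uint8_t)(sum < x);
  /* OR-ing with 0xFF forces the maximum; OR-ing with 0 keeps the sum.  */
  return sum | mask;
}
```

    Calling sat_add_u8 (1, 254), sat_add_u8 (1, 255), sat_add_u8 (2, 255)
    and sat_add_u8 (255, 255) each yields 255, matching the list above.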

    Consider the example below for the unsigned scalar integer type uint64_t:

    uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
    {
      return (x + y) | (- (uint64_t)((uint64_t)(x + y) < x));
    }

    Before this patch:
    uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
    {
      long unsigned int _1;
      _Bool _2;
      long unsigned int _3;
      long unsigned int _4;
      uint64_t _7;
      long unsigned int _10;
      __complex__ long unsigned int _11;

    ;;   basic block 2, loop depth 0
    ;;    pred:       ENTRY
      _11 = .ADD_OVERFLOW (x_5(D), y_6(D));
      _1 = REALPART_EXPR <_11>;
      _10 = IMAGPART_EXPR <_11>;
      _2 = _10 != 0;
      _3 = (long unsigned int) _2;
      _4 = -_3;
      _7 = _1 | _4;
      return _7;
    ;;    succ:       EXIT

    }

    After this patch:
    uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
    {
      uint64_t _7;

    ;;   basic block 2, loop depth 0
    ;;    pred:       ENTRY
      _7 = .SAT_ADD (x_5(D), y_6(D)); [tail call]
      return _7;
    ;;    succ:       EXIT
    }
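
    The "before" dump shows the comparison already expressed through
    .ADD_OVERFLOW.  As a rough sketch, the source pattern can be compared
    against an explicit overflow check written with the GCC/Clang builtin
    __builtin_add_overflow.  Both function names below are hypothetical,
    and whether the second form is also recognized as .SAT_ADD by this
    patch is not stated here; the sketch only shows that the two compute
    the same value.

```c
#include <stdint.h>

/* The bitwise pattern from the commit message, with TYPE = uint64_t.
   Hypothetical name, for illustration only.  */
uint64_t sat_add_u64_pattern (uint64_t x, uint64_t y)
{
  return (x + y) | (-(uint64_t)((uint64_t)(x + y) < x));
}

/* Reference version using an explicit overflow check.
   Hypothetical name, for illustration only.  */
uint64_t sat_add_u64_checked (uint64_t x, uint64_t y)
{
  uint64_t sum;
  /* __builtin_add_overflow stores the wrapped sum and returns true
     when the mathematical result does not fit in uint64_t.  */
  if (__builtin_add_overflow (x, y, &sum))
    return UINT64_MAX;
  return sum;
}
```

    For any pair of inputs the two functions return the same result, for
    instance UINT64_MAX for (UINT64_MAX, 1) and 3 for (1, 2).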

    The following tests pass with this patch:
    1. The riscv full regression tests.
    2. The x86 bootstrap tests.
    3. The x86 full regression tests.

            PR target/51492
            PR target/112600

    gcc/ChangeLog:

            * internal-fn.cc (commutative_binary_fn_p): Add type IFN_SAT_ADD
            to the return true switch case(s).
            * internal-fn.def (SAT_ADD): Add new signed optab SAT_ADD.
            * match.pd: Add unsigned SAT_ADD match(es).
            * optabs.def (OPTAB_NL): Remove fixed-point limitation for
            us/ssadd.
            * tree-ssa-math-opts.cc (gimple_unsigned_integer_sat_add): New
            extern func decl generated in match.pd match.
            (match_saturation_arith): New func impl to match the saturation
            arith.
            (math_opts_dom_walker::after_dom_children): Try match saturation
            arith when IOR expr.

    Signed-off-by: Pan Li <pan2.li@intel.com>