From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 81DD13858296; Tue, 17 Jan 2023 11:19:06 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 81DD13858296
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1673954346;
	bh=V7jNPEOpWcrz49twAp4bmN+BRVvvm9gGhsv3qu7P5RQ=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=UCi/dXndx7hOic1leIi1czu4LU4aHU7W8GQdb66lecvZSFAsaxDUEJmu9HsJfoyDN
	 8amNMaU0yiz6OJc57sG3AkCSmiyWxBkD1PKDRWaf/L7vV+/beowGYQn7d8Zz91+Y3i
	 fO3WjyyG/iWsdudQSvGAuXyoJQsAFwOYTrMwgkrc=
From: "cvs-commit at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/106523] [10/11/12/13 Regression] forwprop
 miscompile
Date: Tue, 17 Jan 2023 11:19:05 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 13.0
X-Bugzilla-Keywords: wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Who: cvs-commit at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: jakub at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 10.5
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-106523-4-Ka5hM5xpLz@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-106523-4@http.gcc.gnu.org/bugzilla/>
References: <bug-106523-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106523
--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:001121e8921d5d1a439ce0e64ab04c5959b0bfd8

commit r13-5223-g001121e8921d5d1a439ce0e64ab04c5959b0bfd8
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Tue Jan 17 12:14:25 2023 +0100

    forwprop: Fix up rotate pattern matching [PR106523]

    The comment above simplify_rotate roughly describes what patterns
    are matched into what:
       We are looking for X with unsigned type T with bitsize B, OP being
       +, | or ^, some type T2 wider than T.  For:
       (X << CNT1) OP (X >> CNT2)                           iff CNT1 + CNT2 == B
       ((T) ((T2) X << CNT1)) OP ((T) ((T2) X >> CNT2))     iff CNT1 + CNT2 == B

       transform these into:
       X r<< CNT1

       Or for:
       (X << Y) OP (X >> (B - Y))
       (X << (int) Y) OP (X >> (int) (B - Y))
       ((T) ((T2) X << Y)) OP ((T) ((T2) X >> (B - Y)))
       ((T) ((T2) X << (int) Y)) OP ((T) ((T2) X >> (int) (B - Y)))
       (X << Y) | (X >> ((-Y) & (B - 1)))
       (X << (int) Y) | (X >> (int) ((-Y) & (B - 1)))
       ((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1))))
       ((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))

       transform these into (last 2 only if ranger can prove Y < B):
       X r<< Y

       Or for:
       (X << (Y & (B - 1))) | (X >> ((-Y) & (B - 1)))
       (X << (int) (Y & (B - 1))) | (X >> (int) ((-Y) & (B - 1)))
       ((T) ((T2) X << (Y & (B - 1)))) | ((T) ((T2) X >> ((-Y) & (B - 1))))
       ((T) ((T2) X << (int) (Y & (B - 1)))) \
         | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))

       transform these into:
       X r<< (Y & (B - 1))

    The following testcase shows that 2 of these are problematic.
    If T2 is wider than T, then the 2 which use (-Y) & (B - 1) on one
    of the shift counts but Y on the other can do something different
    from a rotate.  E.g.:
    __attribute__((noipa)) unsigned char
    f7 (unsigned char x, unsigned int y)
    {
      unsigned int t = x;
      return (t << y) | (t >> ((-y) & 7));
    }
    If y is in [0, 7], then it is a normal rotate, and if y is in [32, ~0U]
    then it is UB, but for y in [9, 31] the left shift in this case
    will never leave any bits in the result, while in a rotate they are
    left there.  Say for y 5 and x 0xaa the expression gives
    0x55, which is the same thing as a rotate, while for y 19 and x 0xaa
    it gives 0x5, whereas a rotate gives 0x55.
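
The two data points above can be reproduced with a small sketch.  It assumes
B = 8 and B2 = 32 (unsigned char and a 32-bit unsigned int); the helper names
shift_expr, rotl8 and check_examples are made up for illustration and do not
appear in the patch or the testsuite.

```c
#include <assert.h>

/* The expression from the testcase: the shifts happen in the wider type
   (unsigned int) and the result is truncated back to 8 bits.
   Only valid for y < 32; larger y is undefined behavior.  */
static unsigned char
shift_expr (unsigned char x, unsigned int y)
{
  unsigned int t = x;
  return (unsigned char) ((t << y) | (t >> ((-y) & 7)));
}

/* A true 8-bit rotate left by y modulo 8.  */
static unsigned char
rotl8 (unsigned char x, unsigned int y)
{
  unsigned int n = y & 7;
  return (unsigned char) ((x << n) | (x >> ((8 - n) & 7)));
}

/* Checks the values quoted in the commit message.  */
static void
check_examples (void)
{
  /* y = 5: expression and rotate agree.  */
  assert (shift_expr (0xaa, 5) == 0x55 && rotl8 (0xaa, 5) == 0x55);
  /* y = 19: the left shift contributes nothing to the low 8 bits.  */
  assert (shift_expr (0xaa, 19) == 0x05 && rotl8 (0xaa, 19) == 0x55);
}
```

Calling check_examples () from main should pass silently if the reasoning
above holds on the target.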
    Now, I believe the
       ((T) ((T2) X << Y)) OP ((T) ((T2) X >> (B - Y)))
       ((T) ((T2) X << (int) Y)) OP ((T) ((T2) X >> (int) (B - Y)))
    forms are ok, because B - Y still needs to be a valid shift count,
    and if Y > B then B - Y should be either negative or very large
    positive (for unsigned types).
    And similarly the last 2 cases above which use & (B - 1) on both
    shift operands are definitely ok.

    The following patch disables the
       ((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1))))
       ((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))
    unless ranger says Y is not in [B, B2 - 1] range.
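
As a rough illustration of how a caller can make the range known, the
following sketch bounds the shift count with __builtin_unreachable before the
shifts.  It assumes B = 8; the function name f7_bounded and the exact form of
the check are hypothetical and are not copied from the testsuite changes.

```c
/* Sketch only: states y < 8 so that value-range information at the shifts
   excludes [8, 31].  Calling this with y >= 8 is undefined behavior.  */
static unsigned char
f7_bounded (unsigned char x, unsigned int y)
{
  if (y >= 8)
    __builtin_unreachable ();
  unsigned int t = x;
  return (unsigned char) ((t << y) | (t >> ((-y) & 7)));
}
```

With y restricted this way the expression is an ordinary 8-bit rotate for
every permitted input, whether or not the compiler chooses to emit a rotate
instruction for it.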

    And, looking at it again this morning, the Y equal to B case is
    actually still fine.  If Y is equal to 0, then it is
    (T) (((T2) X << 0) | ((T2) X >> 0))
    and so X; for Y == B it is
    (T) (((T2) X << B) | ((T2) X >> 0))
    which is the same as
    (T) (0 | ((T2) X >> 0))
    which is also X.  So instead of the [B, B2 - 1] range we could use
    [B + 1, B2 - 1].  And, if we wanted to go further, even multiples
    of B are ok if they are smaller than B2, so we could construct a
    detailed int_range_max if we wanted.
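
The remark about multiples of B can be checked exhaustively for one concrete
case.  The sketch below assumes B = 8 and B2 = 32 and compares the wide-shift
expression against an 8-bit rotate for every x; the function name
matches_rotate_for_all_x is made up for illustration.

```c
#include <stdbool.h>

/* Returns true if, for every 8-bit x, the expression
   (T) (((T2) x << y) | ((T2) x >> ((-y) & 7)))
   equals an 8-bit rotate left by y & 7.
   Only meaningful for y < 32; larger y is undefined behavior.  */
static bool
matches_rotate_for_all_x (unsigned int y)
{
  for (unsigned int x = 0; x < 256; x++)
    {
      unsigned int n = y & 7;
      unsigned char expr = (unsigned char) ((x << y) | (x >> ((-y) & 7)));
      unsigned char rot = (unsigned char) ((x << n) | (x >> ((8 - n) & 7)));
      if (expr != rot)
        return false;
    }
  return true;
}
```

Under these assumptions the function should return true for y in [0, 8] and
for y equal to 16 or 24, and false for every other y below 32.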

    2023-01-17  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/106523
            * tree-ssa-forwprop.cc (simplify_rotate): For the
            patterns with (-Y) & (B - 1) in one operand's shift
            count and Y in another, if T2 has wider precision than T,
            punt if Y could have a value in [B, B2 - 1] range.

            * c-c++-common/rotate-2.c (f5, f6, f7, f8, f13, f14, f15, f16,
            f37, f38, f39, f40, f45, f46, f47, f48): Add assertions using
            __builtin_unreachable about shift count.
            * c-c++-common/rotate-2b.c: New test.
            * c-c++-common/rotate-4.c (f5, f6, f7, f8, f13, f14, f15, f16,
            f37, f38, f39, f40, f45, f46, f47, f48): Add assertions using
            __builtin_unreachable about shift count.
            * c-c++-common/rotate-4b.c: New test.
            * gcc.c-torture/execute/pr106523.c: New test.