From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 48) id 81DD13858296; Tue, 17 Jan 2023 11:19:06 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 81DD13858296 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1673954346; bh=V7jNPEOpWcrz49twAp4bmN+BRVvvm9gGhsv3qu7P5RQ=; h=From:To:Subject:Date:In-Reply-To:References:From; b=UCi/dXndx7hOic1leIi1czu4LU4aHU7W8GQdb66lecvZSFAsaxDUEJmu9HsJfoyDN 8amNMaU0yiz6OJc57sG3AkCSmiyWxBkD1PKDRWaf/L7vV+/beowGYQn7d8Zz91+Y3i fO3WjyyG/iWsdudQSvGAuXyoJQsAFwOYTrMwgkrc= From: "cvs-commit at gcc dot gnu.org" To: gcc-bugs@gcc.gnu.org Subject: [Bug tree-optimization/106523] [10/11/12/13 Regression] forwprop miscompile Date: Tue, 17 Jan 2023 11:19:05 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: tree-optimization X-Bugzilla-Version: 13.0 X-Bugzilla-Keywords: wrong-code X-Bugzilla-Severity: normal X-Bugzilla-Who: cvs-commit at gcc dot gnu.org X-Bugzilla-Status: ASSIGNED X-Bugzilla-Resolution: X-Bugzilla-Priority: P2 X-Bugzilla-Assigned-To: jakub at gcc dot gnu.org X-Bugzilla-Target-Milestone: 10.5 X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 List-Id: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D106523 --- Comment #6 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:001121e8921d5d1a439ce0e64ab04c5959b0bfd8 commit r13-5223-g001121e8921d5d1a439ce0e64ab04c5959b0bfd8 Author: Jakub Jelinek Date: Tue Jan 17 12:14:25 2023 +0100 forwprop: Fix up rotate pattern matching [PR106523] The comment above simplify_rotate roughly describes what patterns are matched into what: We are looking for X with unsigned type T with bitsize B, OP being +, | or ^, some type T2 wider than T. For: (X << CNT1) OP (X >> CNT2) iff CNT1 + CNT2= =3D=3D B ((T) ((T2) X << CNT1)) OP ((T) ((T2) X >> CNT2)) iff CNT1 + CNT2= =3D=3D B transform these into: X r<< CNT1 Or for: (X << Y) OP (X >> (B - Y)) (X << (int) Y) OP (X >> (int) (B - Y)) ((T) ((T2) X << Y)) OP ((T) ((T2) X >> (B - Y))) ((T) ((T2) X << (int) Y)) OP ((T) ((T2) X >> (int) (B - Y))) (X << Y) | (X >> ((-Y) & (B - 1))) (X << (int) Y) | (X >> (int) ((-Y) & (B - 1))) ((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1)))) ((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1)))) transform these into (last 2 only if ranger can prove Y < B): X r<< Y Or for: (X << (Y & (B - 1))) | (X >> ((-Y) & (B - 1))) (X << (int) (Y & (B - 1))) | (X >> (int) ((-Y) & (B - 1))) ((T) ((T2) X << (Y & (B - 1)))) | ((T) ((T2) X >> ((-Y) & (B - 1)))) ((T) ((T2) X << (int) (Y & (B - 1)))) \ | ((T) ((T2) X >> (int) ((-Y) & (B - 1)))) transform these into: X r<< (Y & (B - 1)) The following testcase shows that 2 of these are problematic. If T2 is wider than T, then the 2 which yse (-Y) & (B - 1) on one of the shift counts but Y on the can do something different from rotate. E.g.: __attribute__((noipa)) unsigned char f7 (unsigned char x, unsigned int y) { unsigned int t =3D x; return (t << y) | (t >> ((-y) & 7)); } if y is [0, 7], then it is a normal rotate, and if y is in [32, ~0U] then it is UB, but for y in [9, 31] the left shift in this case will never leave any bits in the result, while in a rotate they are left there. Say for y 5 and x 0xaa the expression gives 0x55 which is the same thing as rotate, while for y 19 and x 0xaa 0x5, which is different. Now, I believe the ((T) ((T2) X << Y)) OP ((T) ((T2) X >> (B - Y))) ((T) ((T2) X << (int) Y)) OP ((T) ((T2) X >> (int) (B - Y))) forms are ok, because B - Y still needs to be a valid shift count, and if Y > B then B - Y should be either negative or very large positive (for unsigned types). And similarly the last 2 cases above which use & (B - 1) on both shift operands are definitely ok. The following patch disables the ((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1)))) ((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1)))) unless ranger says Y is not in [B, B2 - 1] range. And, looking at it again this morning, actually the Y equal to B case is still fine, if Y is equal to 0, then it is (T) (((T2) X << 0) | ((T2) X >> 0)) and so X, for Y =3D=3D B it is (T) (((T2) X << B) | ((T2) X >> 0)) which is the same as (T) (0 | ((T2) X >> 0)) which is also X. So instead of the [B, B2 - 1] range we could use [B + 1, B2 - 1]. And, if we wanted to go further, even multiplies of B are ok if they are smaller than B2, so we could construct a detail= ed int_range_max if we wanted. 2023-01-17 Jakub Jelinek PR tree-optimization/106523 * tree-ssa-forwprop.cc (simplify_rotate): For the patterns with (-Y) & (B - 1) in one operand's shift count and Y in another, if T2 has wider precision than T, punt if Y could have a value in [B, B2 - 1] range. * c-c++-common/rotate-2.c (f5, f6, f7, f8, f13, f14, f15, f16, f37, f38, f39, f40, f45, f46, f47, f48): Add assertions using __builtin_unreachable about shift count. * c-c++-common/rotate-2b.c: New test. * c-c++-common/rotate-4.c (f5, f6, f7, f8, f13, f14, f15, f16, f37, f38, f39, f40, f45, f46, f47, f48): Add assertions using __builtin_unreachable about shift count. * c-c++-common/rotate-4b.c: New test. * gcc.c-torture/execute/pr106523.c: New test.=