From mboxrd@z Thu Jan  1 00:00:00 1970
Received: by sourceware.org (Postfix, from userid 2153)
	id 7BB4238618FF; Tue, 20 Apr 2021 23:29:50 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 7BB4238618FF
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="utf-8"
From: Jakub Jelinek
To: gcc-cvs@gcc.gnu.org
Subject: [gcc r9-9390] combine: Fix up simplify_shift_const_1 for nested ROTATEs [PR97386]
X-Act-Checkin: gcc
X-Git-Author: Jakub Jelinek
X-Git-Refname: refs/heads/releases/gcc-9
X-Git-Oldrev: 2913a8f35b7100e8632d2c10dc4126a636cbc9d9
X-Git-Newrev: 1471e383f4909e7d6bd548d010eb96afcf2d241e
Message-Id: <20210420232950.7BB4238618FF@sourceware.org>
Date: Tue, 20 Apr 2021 23:29:50 +0000 (GMT)
X-BeenThere: gcc-cvs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-cvs mailing list
X-List-Received-Date: Tue, 20 Apr 2021 23:29:50 -0000

https://gcc.gnu.org/g:1471e383f4909e7d6bd548d010eb96afcf2d241e

commit r9-9390-g1471e383f4909e7d6bd548d010eb96afcf2d241e
Author: Jakub Jelinek
Date:   Tue Oct 13 19:13:26 2020 +0200

    combine: Fix up simplify_shift_const_1 for nested ROTATEs [PR97386]

    The following testcases are miscompiled (the first one since my
    improvements to rotate discovery on GIMPLE, the other one for many
    years) because combiner optimizes nested ROTATEs with narrowing SUBREG
    in between (i.e. the outer rotate is performed in shorter precision
    than the inner one) to just one ROTATE of the rotated constant.
    While that (under certain conditions) can work for shifts, it can't
    work for rotates where we can only do that with rotates of the same
    precision.

    2020-10-13  Jakub Jelinek

            PR rtl-optimization/97386
            * combine.c (simplify_shift_const_1): Don't optimize nested
            ROTATEs if they have different modes.

            * gcc.c-torture/execute/pr97386-1.c: New test.
            * gcc.c-torture/execute/pr97386-2.c: New test.
    (cherry picked from commit 1a98b22b0468214ae8463d075dacaeea1d46df15)

Diff:
---
 gcc/combine.c                                   |  7 +++++--
 gcc/testsuite/gcc.c-torture/execute/pr97386-1.c | 16 ++++++++++++++++
 gcc/testsuite/gcc.c-torture/execute/pr97386-2.c | 20 ++++++++++++++++++++
 3 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index ab8463bf27b..c2dd7eef0e4 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -11011,8 +11011,11 @@ simplify_shift_const_1 (enum rtx_code code, machine_mode result_mode,
             break;
           /* For ((int) (cstLL >> count)) >> cst2 just give up.  Queuing
              up outer sign extension (often left and right shift) is
-             hardly more efficient than the original.  See PR70429.  */
-          if (code == ASHIFTRT && int_mode != int_result_mode)
+             hardly more efficient than the original.  See PR70429.
+             Similarly punt for rotates with different modes.
+             See PR97386.  */
+          if ((code == ASHIFTRT || code == ROTATE)
+              && int_mode != int_result_mode)
             break;
 
           rtx count_rtx = gen_int_shift_amount (int_result_mode, count);
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr97386-1.c b/gcc/testsuite/gcc.c-torture/execute/pr97386-1.c
new file mode 100644
index 00000000000..c50e0380a65
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr97386-1.c
@@ -0,0 +1,16 @@
+/* PR rtl-optimization/97386 */
+
+__attribute__((noipa)) unsigned char
+foo (unsigned int c)
+{
+  return __builtin_bswap16 ((unsigned long long) (0xccccLLU << c | 0xccccLLU >> ((-c) & 63)));
+}
+
+int
+main ()
+{
+  unsigned char x = foo (0);
+  if (__CHAR_BIT__ == 8 && __SIZEOF_SHORT__ == 2 && x != 0xcc)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr97386-2.c b/gcc/testsuite/gcc.c-torture/execute/pr97386-2.c
new file mode 100644
index 00000000000..e61829d71ab
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr97386-2.c
@@ -0,0 +1,20 @@
+/* PR rtl-optimization/97386 */
+
+__attribute__((noipa)) unsigned
+foo (int x)
+{
+  unsigned long long a = (0x800000000000ccccULL << x) | (0x800000000000ccccULL >> (64 - x));
+  unsigned int b = a;
+  return (b << 24) | (b >> 8);
+}
+
+int
+main ()
+{
+  if (__CHAR_BIT__ == 8
+      && __SIZEOF_INT__ == 4
+      && __SIZEOF_LONG_LONG__ == 8
+      && foo (1) != 0x99000199U)
+    __builtin_abort ();
+  return 0;
+}