From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 1984) id DC8493858D20; Thu, 9 Mar 2023 19:44:20 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org DC8493858D20 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1678391060; bh=uTKQdT37qManq0Z701TDNtt+619q7lAkfDiXjOPWP3I=; h=From:To:Subject:Date:From; b=sXO8pAtwo2LZ4+SBgnXK4qLX0OmkbtFUYgAUy+Me8ErsxzWKjnXZDqsJHYrsr180g x9OYewbIJu9oifJwUbgl5g85xfYUlgoyoIlGZUDjH6vabdOAMFbiuZBkko0+By7ATZ CVWHPm+bODL9P7ZU6yvvENhRWIHiv2yAJOFu7N3k= MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="utf-8" From: Tamar Christina To: gcc-cvs@gcc.gnu.org Subject: [gcc r13-6563] AArch64: Fix codegen regressions around tbz. X-Act-Checkin: gcc X-Git-Author: Tamar Christina X-Git-Refname: refs/heads/master X-Git-Oldrev: 96abc8222464fb1c4fba9f2ffe3fd1b081a9196e X-Git-Newrev: 8e26ac4749c5ddf827e18a846b1010b091f76fa2 Message-Id: <20230309194420.DC8493858D20@sourceware.org> Date: Thu, 9 Mar 2023 19:44:20 +0000 (GMT) List-Id: https://gcc.gnu.org/g:8e26ac4749c5ddf827e18a846b1010b091f76fa2 commit r13-6563-g8e26ac4749c5ddf827e18a846b1010b091f76fa2 Author: Tamar Christina Date: Thu Mar 9 19:42:18 2023 +0000 AArch64: Fix codegen regressions around tbz. We were analyzing code quality after recent changes and have noticed that the tbz support somehow managed to increase the number of branches overall rather than decreased them. While investigating this we figured out that the problem is that when an existing & exists in gimple and the instruction is generated because of the range information gotten from the ANDed constant that we end up with the situation that you get a NOP AND in the RTL expansion. This is not a problem as CSE will take care of it normally. The issue is when this original AND was done in a location where PRE or FRE "lift" the AND to a different basic block. This triggers a problem when the resulting value is not single use. Instead of having an AND and tbz, we end up generating an AND + TST + BR if the mode is HI or QI. This CSE across BB was a problem before but this change made it worse. Our branch patterns rely on combine being able to fold AND or zero_extends into the instructions. To work around this (since a proper fix is outside of the scope of stage-4) we are limiting the new tbranch optab to only HI and QI mode values. This isn't a problem because these two modes are modes for which we don't have CBZ support, so they are the problematic cases to begin with. Additionally booleans are QI. The second thing we're doing is limiting the only legal bitpos to pos 0. i.e. only the bottom bit. This such that we prevent the double ANDs as much as possible. Now most other cases, i.e. where we had an explicit & in the source code are still handled correctly by the anonymous (*tb1) pattern that was added along with tbranch support. This means we don't expand the superflous AND here, and while it doesn't fix the problem that in the cross BB case we loss tbz, it also doesn't make things worse. With these tweaks we've now reduced the number of insn uniformly was originally expected. gcc/ChangeLog: * config/aarch64/aarch64.md (tbranch_3): Restrict to SHORT and bottom bit only. gcc/testsuite/ChangeLog: * gcc.target/aarch64/tbz_2.c: New test. * gcc.target/aarch64/tbz_3.c: New test. Diff: --- gcc/config/aarch64/aarch64.md | 4 +- gcc/testsuite/gcc.target/aarch64/tbz_2.c | 130 +++++++++++++++++++++++++++++++ gcc/testsuite/gcc.target/aarch64/tbz_3.c | 18 +++++ 3 files changed, 150 insertions(+), 2 deletions(-) diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 0b326d497b4..af9087508ac 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -949,8 +949,8 @@ (define_expand "tbranch_3" [(set (pc) (if_then_else - (EQL (match_operand:ALLI 0 "register_operand") - (match_operand 1 "aarch64_simd_shift_imm_")) + (EQL (match_operand:SHORT 0 "register_operand") + (match_operand 1 "const0_operand")) (label_ref (match_operand 2 "")) (pc)))] "" diff --git a/gcc/testsuite/gcc.target/aarch64/tbz_2.c b/gcc/testsuite/gcc.target/aarch64/tbz_2.c new file mode 100644 index 00000000000..ec128b58f35 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/tbz_2.c @@ -0,0 +1,130 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-O2 -std=c99 -fno-unwind-tables -fno-asynchronous-unwind-tables" } */ +/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */ + +#include + +void h(void); + +/* +** g1: +** cbnz w0, .L[0-9]+ +** ret +** ... +*/ +void g1(int x) +{ + if (__builtin_expect (x, 0)) + h (); +} + +/* +** g2: +** tbnz x0, 0, .L[0-9]+ +** ret +** ... +*/ +void g2(int x) +{ + if (__builtin_expect (x & 1, 0)) + h (); +} + +/* +** g3: +** tbnz x0, 3, .L[0-9]+ +** ret +** ... +*/ +void g3(int x) +{ + if (__builtin_expect (x & 8, 0)) + h (); +} + +/* +** g4: +** tbnz w0, #31, .L[0-9]+ +** ret +** ... +*/ +void g4(int x) +{ + if (__builtin_expect (x & (1 << 31), 0)) + h (); +} + +/* +** g5: +** tst w0, 255 +** bne .L[0-9]+ +** ret +** ... +*/ +void g5(char x) +{ + if (__builtin_expect (x, 0)) + h (); +} + +/* +** g6: +** tbnz w0, 0, .L[0-9]+ +** ret +** ... +*/ +void g6(char x) +{ + if (__builtin_expect (x & 1, 0)) + h (); +} + +/* +** g7: +** tst w0, 3 +** bne .L[0-9]+ +** ret +** ... +*/ +void g7(char x) +{ + if (__builtin_expect (x & 3, 0)) + h (); +} + +/* +** g8: +** tbnz w0, 7, .L[0-9]+ +** ret +** ... +*/ +void g8(char x) +{ + if (__builtin_expect (x & (1 << 7), 0)) + h (); +} + +/* +** g9: +** tbnz w0, 0, .L[0-9]+ +** ret +** ... +*/ +void g9(bool x) +{ + if (__builtin_expect (x, 0)) + h (); +} + +/* +** g10: +** tbnz w0, 0, .L[0-9]+ +** ret +** ... +*/ +void g10(bool x) +{ + if (__builtin_expect (x & 1, 0)) + h (); +} + diff --git a/gcc/testsuite/gcc.target/aarch64/tbz_3.c b/gcc/testsuite/gcc.target/aarch64/tbz_3.c new file mode 100644 index 00000000000..74f758d33e2 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/tbz_3.c @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +void g(int); + +void +f (unsigned int x, _Bool y) +{ + for (int i = 0; i < 100; ++i) + { + if ((x >> 31) | y) + g (1); + for (int j = 0; j < 100; ++j) + g (2); + } +} + +/* { dg-final { scan-assembler-times {and\t} 1 } } */