From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 6 Sep 2022 11:42:34 +0200
MIME-Version: 1.0
To: GCC Patches, Richard Biener, pinskia@gmail.com
From: Robin Dapp
Subject: [PATCH] expand: Convert cst - x into cst xor x.
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit

Hi,

posting this separately from PR91213 now.  I wrote an s390 test; the
same could most likely be done for x86, which would give it broader
coverage.

Depending on the backend, it might be better to convert cst - x into
cst xor x if cst + 1 is a power of two and 0 <= x <= cst.  This patch
expands both sequences, compares their costs, and emits the cheaper
one, keeping the subtraction when the costs tie.

Does this look like a viable approach?

Bootstrapped and regtested on s390[x].  I held off on x86 tests until
after a first round of comments.

Regards
 Robin

gcc/ChangeLog:

	PR middle-end/91213
	* expr.cc (expand_expr_real_2): Call new function.
	(maybe_optimize_cst_sub): New function.
	* expr.h (maybe_optimize_cst_sub): Declare.

gcc/testsuite/ChangeLog:

	* gcc.target/s390/cst-minus-var.c: New test.
---
 gcc/expr.cc                                   | 79 +++++++++++++++++++
 gcc/expr.h                                    |  2 +
 gcc/testsuite/gcc.target/s390/cst-minus-var.c | 55 +++++++++++++
 3 files changed, 136 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/cst-minus-var.c

diff --git a/gcc/expr.cc b/gcc/expr.cc
index 80bb1b8a4c5b..80f25720d7b6 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -9397,6 +9397,21 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
           return simplify_gen_binary (MINUS, mode, op0, op1);
         }
 
+      /* Convert const - A to const xor A if integer_pow2p (const + 1)
+         and 0 <= A <= const.  */
+      if (code == MINUS_EXPR
+          && SCALAR_INT_MODE_P (mode)
+          && TREE_CODE (treeop0) == INTEGER_CST
+          && TREE_CODE (TREE_TYPE (treeop1)) == INTEGER_TYPE
+          && wi::exact_log2 (wi::to_widest (treeop0) + 1) != -1)
+        {
+          rtx res = maybe_optimize_cst_sub (code, treeop0, treeop1,
+                                            mode, unsignedp, type,
+                                            target, subtarget);
+          if (res)
+            return res;
+        }
+
       /* No sense saving up arithmetic to be done
          if it's all in the wrong mode to form part of an address.
          And force_operand won't know whether to sign-extend or
@@ -12692,6 +12707,70 @@ maybe_optimize_mod_cmp (enum tree_code code, tree *arg0, tree *arg1)
   return code == EQ_EXPR ? LE_EXPR : GT_EXPR;
 }
 
+/* Convert const - A to const xor A if integer_pow2p (const + 1)
+   and 0 <= A <= const.  */
+
+rtx
+maybe_optimize_cst_sub (enum tree_code code, tree treeop0, tree treeop1,
+                        machine_mode mode, int unsignedp, tree type,
+                        rtx target, rtx subtarget)
+{
+  gcc_checking_assert (code == MINUS_EXPR);
+  gcc_checking_assert (SCALAR_INT_MODE_P (mode));
+  gcc_checking_assert (TREE_CODE (treeop0) == INTEGER_CST);
+  gcc_checking_assert (TREE_CODE (TREE_TYPE (treeop1)) == INTEGER_TYPE);
+  gcc_checking_assert (wi::exact_log2 (wi::to_widest (treeop0) + 1) != -1);
+
+  if (!optimize)
+    return NULL_RTX;
+
+  optab this_optab;
+  rtx op0, op1;
+
+  if (wi::leu_p (tree_nonzero_bits (treeop1), tree_nonzero_bits (treeop0)))
+    {
+      expand_operands (treeop0, treeop1, subtarget, &op0, &op1,
+                       EXPAND_NORMAL);
+      bool speed_p = optimize_insn_for_speed_p ();
+      do_pending_stack_adjust ();
+      start_sequence ();
+      this_optab = optab_for_tree_code (MINUS_EXPR, type,
+                                        optab_default);
+      rtx subi = expand_binop (mode, this_optab, op0, op1, target,
+                               unsignedp, OPTAB_LIB_WIDEN);
+
+      rtx_insn *sub_insns = get_insns ();
+      end_sequence ();
+      start_sequence ();
+      this_optab = optab_for_tree_code (BIT_XOR_EXPR, type,
+                                        optab_default);
+      rtx xori = expand_binop (mode, this_optab, op0, op1, target,
+                               unsignedp, OPTAB_LIB_WIDEN);
+      rtx_insn *xor_insns = get_insns ();
+      end_sequence ();
+      unsigned sub_cost = seq_cost (sub_insns, speed_p);
+      unsigned xor_cost = seq_cost (xor_insns, speed_p);
+      /* If costs are the same then use as tie breaker the other
+         factor.  */
+      if (sub_cost == xor_cost)
+        {
+          sub_cost = seq_cost (sub_insns, !speed_p);
+          xor_cost = seq_cost (xor_insns, !speed_p);
+        }
+
+      if (sub_cost <= xor_cost)
+        {
+          emit_insn (sub_insns);
+          return subi;
+        }
+
+      emit_insn (xor_insns);
+      return xori;
+    }
+
+  return NULL_RTX;
+}
+
 /* Optimize x - y < 0 into x < 0 if x - y has undefined overflow.  */
 
 void
diff --git a/gcc/expr.h b/gcc/expr.h
index 08b59b8d869a..43ea11042d26 100644
--- a/gcc/expr.h
+++ b/gcc/expr.h
@@ -326,6 +326,8 @@ extern tree string_constant (tree, tree *, tree *, tree *);
 extern tree byte_representation (tree, tree *, tree *, tree *);
 
 extern enum tree_code maybe_optimize_mod_cmp (enum tree_code, tree *, tree *);
+extern rtx maybe_optimize_cst_sub (enum tree_code, tree, tree,
+                                   machine_mode, int, tree, rtx, rtx);
 extern void maybe_optimize_sub_cmp_0 (enum tree_code, tree *, tree *);
 
 /* Two different ways of generating switch statements.  */
diff --git a/gcc/testsuite/gcc.target/s390/cst-minus-var.c b/gcc/testsuite/gcc.target/s390/cst-minus-var.c
new file mode 100644
index 000000000000..c713624a9784
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/cst-minus-var.c
@@ -0,0 +1,55 @@
+/* Check that we can convert const - x to const xor x if
+   const + 1 is a power of two and 0 <= x <= const.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -mzarch" } */
+/* { dg-final { scan-assembler-times "xr\t" 8 { target { ! s390_z10_hw } } } } */
+/* { dg-final { scan-assembler-times "xilf\t" 8 { target { s390_z10_hw } } } } */
+
+unsigned long long foo (unsigned long long a)
+{
+  if (a > 1) __builtin_unreachable();
+  return 1 - a;
+}
+
+unsigned long long bar (unsigned long long a)
+{
+  if (a > 65535) __builtin_unreachable();
+  return 65535 - a;
+}
+
+unsigned long long baz (unsigned long long a)
+{
+  if (a > 4294967295) __builtin_unreachable();
+  return 4294967295 - a;
+}
+
+int fooi (int a)
+{
+  if (a > 127 || a < 0) __builtin_unreachable();
+  return 127 - a;
+}
+
+int bari (int a)
+{
+  if (a > 65535 || a < 0) __builtin_unreachable();
+  return 65535 - a;
+}
+
+long bazl (long a)
+{
+  if (a > 4294967295 || a < 0) __builtin_unreachable();
+  return 4294967295 - a;
+}
+
+short foos (short a)
+{
+  if (a > 31 || a < 0) __builtin_unreachable();
+  return 31 - a;
+}
+
+short bars (int a)
+{
+  if (a > 65535 || a < 0) __builtin_unreachable();
+  return 65535 - a;
+}
-- 
2.31.1