From: Surya Kumari Jangala
Date: Wed, 4 Jan 2023 13:58:19 +0530
To: GCC Patches
Cc: Peter Bergner, Segher Boessenkool, meissner@linux.ibm.com
Subject: [PATCH] swap: Fix incorrect lane extraction by vec_extract() [PR106770]

swap: Fix incorrect lane extraction by vec_extract() [PR106770]

In the routine rs6000_analyze_swaps(), special handling of swappable
instructions is done even if the webs that contain the swappable
instructions are not optimized, i.e., the webs do not contain any
permuting load/store instructions along with the associated register
swap instructions. Doing special handling in such webs will result in
the extracted lane being adjusted unnecessarily for vec_extract.

Modifying swappable instructions is also incorrect in webs where
loads/stores on quad word aligned addresses are changed to lvx/stvx.
Similarly, in webs where swap(load(vector constant)) instructions are
replaced with load(swapped vector constant), the swappable
instructions should not be modified.

2023-01-04  Surya Kumari Jangala

gcc/
	PR rtl-optimization/106770
	* rs6000-p8swap.cc (rs6000_analyze_swaps): .

gcc/testsuite/
	PR rtl-optimization/106770
	* gcc.target/powerpc/pr106770.c: New test.
---

diff --git a/gcc/config/rs6000/rs6000-p8swap.cc b/gcc/config/rs6000/rs6000-p8swap.cc
index 19fbbfb67dc..7ed39251df9 100644
--- a/gcc/config/rs6000/rs6000-p8swap.cc
+++ b/gcc/config/rs6000/rs6000-p8swap.cc
@@ -179,6 +179,9 @@ class swap_web_entry : public web_entry_base
   unsigned int special_handling : 4;
   /* Set if the web represented by this entry cannot be optimized.  */
   unsigned int web_not_optimizable : 1;
+  /* Set if the web represented by this entry has been optimized, ie,
+     register swaps of permuting loads/stores have been removed.  */
+  unsigned int web_is_optimized : 1;
   /* Set if this insn should be deleted.  */
   unsigned int will_delete : 1;
 };
@@ -2627,22 +2630,43 @@ rs6000_analyze_swaps (function *fun)
   /* For each load and store in an optimizable web (which implies
      the loads and stores are permuting), find the associated
      register swaps and mark them for removal.  Due to various
-     optimizations we may mark the same swap more than once.  Also
-     perform special handling for swappable insns that require it.  */
+     optimizations we may mark the same swap more than once.  Fix up
+     the non-permuting loads and stores by converting them into
+     permuting ones.  */
   for (i = 0; i < e; ++i)
     if ((insn_entry[i].is_load || insn_entry[i].is_store)
         && insn_entry[i].is_swap)
       {
         swap_web_entry* root_entry
           = (swap_web_entry*)((&insn_entry[i])->unionfind_root ());
-        if (!root_entry->web_not_optimizable)
+        if (!root_entry->web_not_optimizable) {
           mark_swaps_for_removal (insn_entry, i);
+          root_entry->web_is_optimized = true;
+        }
       }
-    else if (insn_entry[i].is_swappable && insn_entry[i].special_handling)
+    else if (insn_entry[i].is_swappable
+             && (insn_entry[i].special_handling == SH_NOSWAP_LD ||
+                 insn_entry[i].special_handling == SH_NOSWAP_ST))
+      {
+        swap_web_entry* root_entry
+          = (swap_web_entry*)((&insn_entry[i])->unionfind_root ());
+        if (!root_entry->web_not_optimizable) {
+          handle_special_swappables (insn_entry, i);
+          root_entry->web_is_optimized = true;
+        }
+      }
+
+  /* Perform special handling for swappable insns that require it.
+     Note that special handling should be done only for those
+     swappable insns that are present in webs optimized above.  */
+  for (i = 0; i < e; ++i)
+    if (insn_entry[i].is_swappable && insn_entry[i].special_handling &&
+        !(insn_entry[i].special_handling == SH_NOSWAP_LD ||
+          insn_entry[i].special_handling == SH_NOSWAP_ST))
       {
         swap_web_entry* root_entry
           = (swap_web_entry*)((&insn_entry[i])->unionfind_root ());
-        if (!root_entry->web_not_optimizable)
+        if (root_entry->web_is_optimized)
           handle_special_swappables (insn_entry, i);
       }
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr106770.c b/gcc/testsuite/gcc.target/powerpc/pr106770.c
new file mode 100644
index 00000000000..84e9aead975
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr106770.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mdejagnu-cpu=power8 -O3 " } */
+/* { dg-final { scan-assembler-times "xxpermdi" 2 } } */
+
+/* Test case to resolve PR106770 */
+
+#include <altivec.h>
+
+int cmp2(double a, double b)
+{
+  vector double va = vec_promote(a, 1);
+  vector double vb = vec_promote(b, 1);
+  vector long long vlt = (vector long long)vec_cmplt(va, vb);
+  vector long long vgt = (vector long long)vec_cmplt(vb, va);
+  vector signed long long vr = vec_sub(vlt, vgt);
+
+  return vec_extract(vr, 1);
+}
+
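
For readers who want to see the intended control flow outside of GCC, the two passes can be modelled in a few lines of standalone C++. This is only a rough sketch under stated assumptions, not the code in rs6000-p8swap.cc: the `Entry` struct, the `root` index (standing in for `unionfind_root ()`), the `adjusted` flag (standing in for a call to `handle_special_swappables`), and the function name `analyze` are all hypothetical simplifications.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for the special-handling codes; only the values
// needed for this illustration are modelled.
enum Special { SH_NONE, SH_NOSWAP_LD, SH_NOSWAP_ST, SH_EXTRACT };

// Hypothetical, reduced model of swap_web_entry.
struct Entry {
  std::size_t root;          // index of the web root (models unionfind_root ())
  bool is_permuting_mem;     // models (is_load || is_store) && is_swap
  bool is_swappable;
  Special special;
  bool web_not_optimizable;  // meaningful on the root entry only
  bool web_is_optimized;     // meaningful on the root entry only
  bool adjusted;             // set where the real pass would modify the insn
};

static bool is_noswap (const Entry &e)
{
  return e.special == SH_NOSWAP_LD || e.special == SH_NOSWAP_ST;
}

void analyze (std::vector<Entry> &insns)
{
  // Pass 1: a web counts as optimized only if it contains a permuting
  // load/store or a non-permuting load/store that gets converted.
  for (Entry &e : insns)
    {
      assert (e.root < insns.size ());
      Entry &root = insns[e.root];
      if (e.is_permuting_mem)
        {
          if (!root.web_not_optimizable)
            root.web_is_optimized = true;   // swaps would be marked for removal
        }
      else if (e.is_swappable && is_noswap (e))
        {
          if (!root.web_not_optimizable)
            {
              e.adjusted = true;            // load/store converted to permuting
              root.web_is_optimized = true;
            }
        }
    }

  // Pass 2: the remaining special handling (for example lane adjustment
  // for an extract) is applied only inside webs optimized in pass 1.
  for (Entry &e : insns)
    if (e.is_swappable && e.special != SH_NONE && !is_noswap (e)
        && insns[e.root].web_is_optimized)
      e.adjusted = true;
}
```

In this model, an extract that sits alone in a web with no permuting loads or stores is left untouched, while an extract that shares a web with a permuting load is adjusted, which is the distinction the patch description draws.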