Message-ID: <11198028-5b04-4ebd-9374-a78dc85376a8@linux.ibm.com>
Date: Tue, 5 Dec 2023 23:31:34 +0530
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH v2] rs6000: Add new pass for replacement of contiguous addresses vector load lxv with lxvp
Content-Language: en-US
From: Ajit Agarwal
To: "Kewen.Lin"
Cc: Segher Boessenkool, David Edelsohn, Peter Bergner, Michael Meissner, GCC Patches
References: <77426697-1571-e180-add9-cfb6d10f8424@linux.ibm.com>
 <57d3fbcb-98b6-4658-8d08-e30f8c68a18c@linux.ibm.com>
In-Reply-To: <57d3fbcb-98b6-4658-8d08-e30f8c68a18c@linux.ibm.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit

Hello Kewen:

On 05/12/23 7:13 pm, Ajit Agarwal wrote:
> Hello Kewen:
>
> On 04/12/23 7:31 am, Kewen.Lin wrote:
>> Hi Ajit,
>>
>> on 2023/12/1 17:10, Ajit Agarwal wrote:
>>> Hello Kewen:
>>>
>>> On 24/11/23 3:01 pm, Kewen.Lin wrote:
>>>> Hi Ajit,
>>>>
>>>> Don't forget to CC David (CC-ed) :), some comments are inlined below.
>>>>
>>>>> on 2023/10/8 03:04, Ajit Agarwal wrote:
>>>>> Hello All:
>>>>>
>>>>> This patch add new pass to replace contiguous addresses vector load lxv with mma instruction
>>>>> lxvp.
>>>>
>>>> IMHO the current binding lxvp (and lxvpx, stxvp{x,}) to MMA looks wrong, it's only
>>>> Power10 and VSX required, these instructions should perform well without MMA support.
>>>> So one patch to separate their support from MMA seems to go first.
>>>>
>>>
>>> I will make the changes for Power10 and VSX.
>>>
>>>>> This patch addresses one regressions failure in ARM architecture.
>>>>
>>>> Could you explain this? I don't see any test case for this.
>>>
>>> I have submitted v1 of the patch and there were regressions failure for Linaro.
>>> I have fixed in version V2.
>>
>> OK, thanks for clarifying. So some unexpected changes on generic code in v1
>> caused the failure exposed on arm.
>>
>>>
>>>
>>>> Besides, it seems a bad idea to put this pass after reload? as register allocation
>>>> finishes, this pairing has to be restricted by the reg No. (I didn't see any
>>>> checking on the reg No. relationship for paring btw.)
>>>>
>>>
>>> Adding before reload pass deletes one of the lxv and replaced with lxvp. This
>>> fails in reload pass while freeing reg_eqivs as ira populates them and then
>>
>> I can't find reg_eqivs, I guessed you meant reg_equivs and moved this pass right before
>> pass_reload (between pass_ira and pass_reload)? IMHO it's unexpected as those two passes
>> are closely correlated. I was expecting to put it somewhere before ira.
>
> Yes they are tied together and moving before reload will not work.
>
>>
>>> vecload pass deletes some of insns and while freeing in reload pass as insn
>>> is already deleted in vecload pass reload pass segfaults.
>>>
>>> Moving vecload pass before ira will not make register pairs with lxvp and
>>> in ira and that will be a problem.
>>
>> Could you elaborate the obstacle for moving such pass before pass_ira?
>>
>> Basing on the status quo, the lxvp is bundled with OOmode, then I'd expect
>> we can generate OOmode move (load) and use the components with unspec (or
>> subreg with Peter's patch) to replace all the previous use places, it looks
>> doable to me.
>
> Moving before ira passes, we delete the offset lxv and generate lxvp and replace all
> the uses, that I am doing. But the offset lxvp register generated by ira are not
> register pair and generate random register and hence we cannot generate lxvp.
>
> For example one lxv is generated with register 32 and other pair is generated
> with register 45 by ira if we move it before ira passes.

With the pass moved before pass_ira, the following code is generated:

	lxvp %vs32,0(%r4)
	xvf32ger 0,%vs34,%vs32
	xvf32gerpp 0,%vs34,%vs45
	xxmfacc 0
	stxvp %vs2,0(%r3)
	stxvp %vs0,32(%r3)
	blr

The lxvp loads the pair vs32/vs33, but ira assigns vs45 instead of vs33 to the
second use (the xvf32gerpp operand), so that use does not read the register
loaded by lxvp.

Thanks & Regards
Ajit

> Thanks & Regards
> Ajit
>>
>
>>>
>>> Making after reload pass is the only solution I see as ira and reload pass
>>> makes register pairs and vecload pass will be easier with generation of
>>> lxvp.
>>>
>>> Please suggest.
>>>
>>>> Looking forward to the comments from Segher/David/Peter/Mike etc.
>>
>> Still looking forward. :)
>>
>> BR,
>> Kewen
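
For readers following the register-number discussion in this thread, here is a
minimal sketch of the kind of check a post-allocation pairing would need. It is
not code from the patch: the struct vec_load type, the can_pair name, and the
exact rule (same base register, displacements 16 bytes apart, destination
registers forming an even/odd consecutive pair) are assumptions made for
illustration. Which register receives which half is endian-dependent and is
deliberately ignored here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of one 16-byte vector load (lxv). */
struct vec_load {
    int vsx_reg;   /* destination VSX register number, e.g. 32 for %vs32 */
    int base_reg;  /* base address register number, e.g. 4 for %r4 */
    long offset;   /* byte displacement from the base register */
};

/* Returns true if the two loads could be covered by a single 32-byte
   paired load under the assumed rule: same base register, displacements
   exactly 16 bytes apart, and destination registers that form an
   even/odd consecutive pair (n, n + 1 with n even). */
static bool can_pair(const struct vec_load *a, const struct vec_load *b)
{
    assert(a != 0 && b != 0);

    int lo_reg = a->vsx_reg < b->vsx_reg ? a->vsx_reg : b->vsx_reg;
    int hi_reg = a->vsx_reg < b->vsx_reg ? b->vsx_reg : a->vsx_reg;
    long delta = a->offset - b->offset;

    if (a->base_reg != b->base_reg)
        return false;
    if (delta != 16 && delta != -16)
        return false;
    return lo_reg % 2 == 0 && hi_reg == lo_reg + 1;
}
```

With the registers from the example above, loads into vs32 and vs33 at 0(%r4)
and 16(%r4) pass this check, while vs32 and vs45 do not, which is the situation
described for the pass placed before pass_ira.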