From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <5467f44c-7b16-444d-a292-8da237f7a9f5@linux.ibm.com>
Date: Thu, 7 Dec 2023 16:31:37 +0530
Subject: Re: [PATCH v2] rs6000: Add new pass for replacement of contiguous addresses vector load lxv with lxvp
To: "Kewen.Lin"
Cc: Segher Boessenkool, David Edelsohn, Peter Bergner, Michael Meissner, GCC Patches
References: <77426697-1571-e180-add9-cfb6d10f8424@linux.ibm.com> <57d3fbcb-98b6-4658-8d08-e30f8c68a18c@linux.ibm.com> <11198028-5b04-4ebd-9374-a78dc85376a8@linux.ibm.com>
From: Ajit Agarwal

Hello Kewen:

On 06/12/23 7:52 am, Kewen.Lin wrote:
> on 2023/12/6 02:01, Ajit Agarwal wrote:
>> Hello Kewen:
>>
>>
>> On 05/12/23 7:13 pm, Ajit Agarwal wrote:
>>> Hello Kewen:
>>>
>>> On 04/12/23 7:31 am, Kewen.Lin wrote:
>>>> Hi Ajit,
>>>>
>>>> on 2023/12/1 17:10, Ajit Agarwal wrote:
>>>>> Hello Kewen:
>>>>>
>>>>> On 24/11/23 3:01 pm, Kewen.Lin wrote:
>>>>>> Hi Ajit,
>>>>>>
>>>>>> Don't forget to CC David (CC-ed) :), some
>>>>>> comments are inlined below.
>>>>>>
>>>>>> on 2023/10/8 03:04, Ajit Agarwal wrote:
>>>>>>> Hello All:
>>>>>>>
>>>>>>> This patch add new pass to replace contiguous addresses vector load lxv with mma instruction
>>>>>>> lxvp.
>>>>>>
>>>>>> IMHO the current binding lxvp (and lxvpx, stxvp{x,}) to MMA looks wrong, it's only
>>>>>> Power10 and VSX required, these instructions should perform well without MMA support.
>>>>>> So one patch to separate their support from MMA seems to go first.
>>>>>>
>>>>>
>>>>> I will make the changes for Power10 and VSX.
>>>>>
>>>>>>> This patch addresses one regressions failure in ARM architecture.
>>>>>>
>>>>>> Could you explain this? I don't see any test case for this.
>>>>>
>>>>> I have submitted v1 of the patch and there were regressions failure for Linaro.
>>>>> I have fixed in version V2.
>>>>
>>>> OK, thanks for clarifying. So some unexpected changes on generic code in v1
>>>> caused the failure exposed on arm.
>>>>
>>>>>
>>>>>
>>>>>> Besides, it seems a bad idea to put this pass after reload? as register allocation
>>>>>> finishes, this pairing has to be restricted by the reg No. (I didn't see any
>>>>>> checking on the reg No. relationship for paring btw.)
>>>>>>
>>>>>
>>>>> Adding before reload pass deletes one of the lxv and replaced with lxvp. This
>>>>> fails in reload pass while freeing reg_eqivs as ira populates them and then
>>>>
>>>> I can't find reg_eqivs, I guessed you meant reg_equivs and moved this pass right before
>>>> pass_reload (between pass_ira and pass_reload)? IMHO it's unexpected as those two passes
>>>> are closely correlated. I was expecting to put it somewhere before ira.
>>>
>>> Yes they are tied together and moving before reload will not work.
>>>
>>>>
>>>>> vecload pass deletes some of insns and while freeing in reload pass as insn
>>>>> is already deleted in vecload pass reload pass segfaults.
>>>>>
>>>>> Moving vecload pass before ira will not make register pairs with lxvp and
>>>>> in ira and that will be a problem.
>>>>
>>>> Could you elaborate the obstacle for moving such pass before pass_ira?
>>>>
>>>> Basing on the status quo, the lxvp is bundled with OOmode, then I'd expect
>>>> we can generate OOmode move (load) and use the components with unspec (or
>>>> subreg with Peter's patch) to replace all the previous use places, it looks
>>>> doable to me.
>>>
>>> Moving before ira passes, we delete the offset lxv and generate lxvp and replace all
>>> the uses, that I am doing. But the offset lxvp register generated by ira are not
>>> register pair and generate random register and hence we cannot generate lxvp.
>>>
>>> For example one lxv is generated with register 32 and other pair is generated
>>> with register 45 by ira if we move it before ira passes.
>>
>> It generates the following.
>> lxvp %vs32,0(%r4)
>> xvf32ger 0,%vs34,%vs32
>> xvf32gerpp 0,%vs34,%vs45
>
> What do the RTL insns for these insns look like?
>
> I'd expect you use UNSPEC_MMA_EXTRACT to extract V16QI from the result of lxvp,
> the current define_insn_and_split "*vsx_disassemble_pair" should be able to take
> care of it further (eg: reg and regoff).
>

Yes, with UNSPEC_MMA_EXTRACT it generates lxvp with a register pair, instead of
the random register assigned by ira and the reload pass. But extra moves still
get generated.

I am working further on this and will send a new version of the patch with all
the comments incorporated.

Thanks & Regards
Ajit

> BR,
> Kewen
>
>> xxmfacc 0
>> stxvp %vs2,0(%r3)
>> stxvp %vs0,32(%r3)
>> blr
>>
>>
>> Instead of vs33 ira generates vs45 if we move before pass_ira.
>>
>> Thanks & Regards
>> Ajit
>>
>>
>>> Thanks & Regards
>>> Ajit
>>>>
>>>
>>>>>
>>>>> Making after reload pass is the only solution I see as ira and reload pass
>>>>> makes register pairs and vecload pass will be easier with generation of
>>>>> lxvp.
>>>>>
>>>>> Please suggest.
>>>>>
>>>>>> Looking forward to the comments from Segher/David/Peter/Mike etc.
>>>>
>>>> Still looking forward. :)
>>>>
>>>> BR,
>>>> Kewen
>
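
For readers following the thread, the register-number restriction discussed
above (vs32 paired with vs45 instead of vs33) can be sketched as a small
standalone check. This is only an illustration, not code from the patch: the
VecLoad struct and all function names are hypothetical, and it assumes that
lxvp fills an even/odd pair of adjacent VSX registers from 32 contiguous
bytes. Which half lands in which register depends on endianness, so the
sketch accepts either order.

```cpp
#include <cstdint>

// Hypothetical description of one 16-byte vector load (lxv).
// This is an illustration only; it is not a GCC data structure.
struct VecLoad {
  int dest_vsr;         // destination VSX register number (0..63)
  int base_gpr;         // base address register number
  std::int64_t offset;  // byte displacement from the base register
};

// True when both loads use the same base register and read adjacent
// 16-byte quadwords, in either order.
bool contiguous(const VecLoad &a, const VecLoad &b) {
  const std::int64_t delta = a.offset - b.offset;
  return a.base_gpr == b.base_gpr && (delta == 16 || delta == -16);
}

// True when the two register numbers are an even register and the odd
// register immediately after it (for example vs32 and vs33).
bool even_odd_pair(int r1, int r2) {
  const int lo = r1 < r2 ? r1 : r2;
  const int hi = r1 < r2 ? r2 : r1;
  return lo % 2 == 0 && hi == lo + 1;
}

// After register allocation, two lxv loads could only be merged into one
// lxvp if the addresses are contiguous AND the already-assigned registers
// happen to form a valid pair.
bool can_fuse_after_ra(const VecLoad &a, const VecLoad &b) {
  return contiguous(a, b) && even_odd_pair(a.dest_vsr, b.dest_vsr);
}
```

With the registers from the example in the thread, loads into vs32 and vs33
at offsets 0 and 16 from r4 pass the check, while vs32 and vs45 do not. A
pass that runs before ira cannot test register numbers this way because only
pseudos exist at that point, which is why the thread turns to an OOmode load
plus UNSPEC_MMA_EXTRACT so that the allocator itself is asked for a pair.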