Message-ID: <57d3fbcb-98b6-4658-8d08-e30f8c68a18c@linux.ibm.com>
Date: Tue, 5 Dec 2023 19:13:19 +0530
Subject: Re: [PATCH v2] rs6000: Add new pass for replacement of contiguous addresses vector load lxv with lxvp
To: "Kewen.Lin"
Cc: Segher Boessenkool, David Edelsohn, Peter Bergner, Michael Meissner, GCC Patches
References: <77426697-1571-e180-add9-cfb6d10f8424@linux.ibm.com>
From: Ajit Agarwal

Hello Kewen:

On 04/12/23 7:31 am, Kewen.Lin wrote:
> Hi Ajit,
>
> on 2023/12/1 17:10, Ajit Agarwal wrote:
>> Hello Kewen:
>>
>> On 24/11/23 3:01 pm, Kewen.Lin wrote:
>>> Hi Ajit,
>>>
>>> Don't forget to CC David (CC-ed) :), some comments are inlined below.
>>>
>>> on 2023/10/8 03:04, Ajit Agarwal wrote:
>>>> Hello All:
>>>>
>>>> This patch add new pass to replace contiguous addresses vector load lxv with mma instruction
>>>> lxvp.
>>>
>>> IMHO the current binding lxvp (and lxvpx, stxvp{x,}) to MMA looks wrong, it's only
>>> Power10 and VSX required, these instructions should perform well without MMA support.
>>> So one patch to separate their support from MMA seems to go first.
>>>
>>
>> I will make the changes for Power10 and VSX.
>>
>>>> This patch addresses one regressions failure in ARM architecture.
>>>
>>> Could you explain this? I don't see any test case for this.
>>
>> I have submitted v1 of the patch and there were regressions failure for Linaro.
>> I have fixed in version V2.
>
> OK, thanks for clarifying. So some unexpected changes on generic code in v1
> caused the failure exposed on arm.
>
>>
>>
>>> Besides, it seems a bad idea to put this pass after reload? as register allocation
>>> finishes, this pairing has to be restricted by the reg No. (I didn't see any
>>> checking on the reg No. relationship for paring btw.)
>>>
>>
>> Adding before reload pass deletes one of the lxv and replaced with lxvp. This
>> fails in reload pass while freeing reg_eqivs as ira populates them and then
>
> I can't find reg_eqivs, I guessed you meant reg_equivs and moved this pass right before
> pass_reload (between pass_ira and pass_reload)? IMHO it's unexpected as those two passes
> are closely correlated. I was expecting to put it somewhere before ira.

Yes, pass_ira and pass_reload are tied together, so placing the pass right
before reload will not work.

>
>> vecload pass deletes some of insns and while freeing in reload pass as insn
>> is already deleted in vecload pass reload pass segfaults.
>>
>> Moving vecload pass before ira will not make register pairs with lxvp and
>> in ira and that will be a problem.
>
> Could you elaborate the obstacle for moving such pass before pass_ira?
>
> Basing on the status quo, the lxvp is bundled with OOmode, then I'd expect
> we can generate OOmode move (load) and use the components with unspec (or
> subreg with Peter's patch) to replace all the previous use places, it looks
> doable to me.

When the pass runs before ira, we delete the offset lxv, generate an lxvp and
replace all the uses; that is what I am doing. But the registers that ira then
allocates for the two loads do not form a register pair. They are arbitrary
registers, and hence we cannot generate lxvp. For example, if we move the pass
before ira, one lxv is allocated register 32 and the other is allocated
register 45.

Thanks & Regards
Ajit

>
>>
>> Making after reload pass is the only solution I see as ira and reload pass
>> makes register pairs and vecload pass will be easier with generation of
>> lxvp.
>>
>> Please suggest.
>>
>>> Looking forward to the comments from Segher/David/Peter/Mike etc.
>
> Still looking forward. :)
>
> BR,
> Kewen