Date: Thu, 18 Jan 2024 23:19:00 -0500
From: Michael Meissner
To: Ajit Agarwal
Cc: Richard Biener, "Kewen.Lin", Vladimir Makarov, Michael Meissner,
    Segher Boessenkool, Peter Bergner, David Edelsohn, gcc-patches,
    Richard Sandiford
Subject: Re: [PATCH V1] rs6000: New pass for replacement of adjacent (load) lxv with lxvp

On Mon, Jan 15, 2024 at 06:25:13PM +0530, Ajit Agarwal wrote:
> Also Mike and Kewwn suggested to use this pass before IRA register
> allocator. They are in To List. They have other concerns doing after
> register allocator.
>
> They have responded in other mail Chain.

The problem with doing it after register allocation is that it limits
the hit rate to the situation where the register allocator happened to
guess right and allocated adjacent registers.

Note, the PowerPC has some twists:

    1)	load/store vector pair must use an even/odd VSX register pair.

    2)	Some instructions only operate on traditional FPR registers
	(VSX registers 0..31) and others only operate on traditional
	Altivec registers (VSX registers 32..63).  I.e. if you are
	doing a load vector pair, and you are going to do say a V2DI
	vector add, you need to load the vector pair into Altivec
	registers to avoid having to do a copy operation.

In general, I tend to feel that stuffing things into a larger register
and then using SUBREG will often generate other moves.  On the PowerPC
right now, we can't even use SUBREG of OOmode (the 256-bit opaque
type), but Peter has patches to deal with some of the issues.

But at the moment, we don't have support for expressing this load such
that register allocation can handle it.  Rather than using a large
register mode, I tend to feel that we should enhance match_parallel so
that register allocation can allocate the registers sequentially.

Now, I haven't looked at match_parallel for 15-20 years, but my sense
was it only worked for fixed registers generated elsewhere (such as for
the load/store string instruction support).

I.e.
rather than doing something like:

	(set (reg:OO )
	     (mem:OO ))
	(set (reg:V2DF )
	     (subreg:V2DF (reg:OO ) 0))
	(set (reg:V2DF )
	     (subreg:V2DF (reg:OO ) 16))
	; do stuff involving v2df_reg1 and v2df_reg2
	(clobber (reg:OO ))
	(set (subreg:V2DF (reg:OO ) 0)
	     (reg:V2DF ))
	(set (subreg:V2DF (reg:OO ) 16)
	     (reg:V2DF ))
	(set (mem:OO )
	     (reg:OO ))

We would do:

	(parallel [(set (reg:V2DF )
			(mem:V2DF ))
		   (set (reg:V2DF )
			(mem:V2DF ))])
	; do stuff involving v2df_reg1 and v2df_reg2
	(parallel [(set (mem:V2DF )
			(reg:V2DF ))
		   (set (mem:V2DF )
			(reg:V2DF ))])

Now in those two parallels above, we would need to use match_parallel
to ensure that the registers are allocated sequentially (and on the
PowerPC, start on an even VSX register), and that the addresses are
bumped up by 16 bytes.

Ideally, the combiner should try to combine things, but it may be
simpler to use a separate MD pass.

It would be nice if we had a standard constraint mechanism like % that
says use % but add 1/2/3/etc. to the register number if it is a REG, or
a size*number added to a memory address if it is a MEM.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meissner@linux.ibm.com
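
The pairing conditions discussed in this message (sequential registers
starting on an even VSX register, with the second address bumped by 16
bytes) can be illustrated with a small standalone check. This is only a
sketch and not GCC code: `VecLoad`, `can_form_vector_pair` and
`is_altivec_reg` are hypothetical names, and it assumes VSX registers
are numbered 0..63 with 0..31 overlapping the traditional FPRs and
32..63 the traditional Altivec registers.

```cpp
#include <cstdint>

// Hypothetical model of one 16-byte vector load: the VSX register it
// targets (assumed numbering 0..63) and its byte offset from a common base.
struct VecLoad {
    int vsx_reg;
    std::int64_t offset;
};

// Assumed numbering: VSX 0..31 are the traditional FPRs,
// VSX 32..63 are the traditional Altivec registers.
constexpr bool is_vsx_reg(int r) { return r >= 0 && r <= 63; }
constexpr bool is_altivec_reg(int r) { return r >= 32 && r <= 63; }

// Two loads could be fused into one vector-pair load only if the first
// register is even, the second is the next register, and the second
// address is exactly 16 bytes past the first.  Because the first register
// is even, both registers always land in the same half (FPR or Altivec).
constexpr bool can_form_vector_pair(const VecLoad& first,
                                    const VecLoad& second) {
    return is_vsx_reg(first.vsx_reg)
        && is_vsx_reg(second.vsx_reg)
        && first.vsx_reg % 2 == 0
        && second.vsx_reg == first.vsx_reg + 1
        && second.offset == first.offset + 16;
}

// Compile-time illustrations of the rule.
static_assert(can_form_vector_pair(VecLoad{34, 0}, VecLoad{35, 16}),
              "even/odd pair with a 16-byte step qualifies");
static_assert(!can_form_vector_pair(VecLoad{33, 0}, VecLoad{34, 16}),
              "a pair starting on an odd register does not qualify");
```

A check like this only succeeds after register allocation if the
allocator happened to pick such registers, which is the hit-rate
limitation described above. A pair meant to feed a V2DI add would also
want `is_altivec_reg` to hold for its first register.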