From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <46ec2582-b3a8-52c9-584a-9d282d26fb79@linux.ibm.com>
Date: Tue, 21 Feb 2023 18:00:52 +0530
Subject: Re: [PATCH] rs6000: fmr gets used instead of faster xxlor [PR93571]
To: Segher Boessenkool
Cc: gcc-patches, bergner@linux.ibm.com
References: <16fa34b8-ad8a-20f2-b285-3b3f5bf5d5b2@linux.ibm.com> <20230217172319.GL25951@gate.crashing.org> <5f800afe-f162-0d56-3c78-6ee93610e201@linux.ibm.com> <20230221110448.GP25951@gate.crashing.org>
From: Ajit Agarwal
In-Reply-To: <20230221110448.GP25951@gate.crashing.org>

Hello Segher:

On 21/02/23 4:34 pm, Segher Boessenkool wrote:
> Hi!
>
> On Tue, Feb 21, 2023 at 02:18:25PM +0530, Ajit Agarwal wrote:
>> This patch replaces fmr instruction 6 cycles with 2 cycles xxlor instruction
>> for p7 and p8 architecture.
>>
>> I have implemented with switch and cases otherwise it is difficult to accommodate
>> xxlor with p7 and p8 and fmr for other architectures.
>
> Please don't use a switch, it isn't needed.  Instead use the "isa"
> attribute (with p7v here), and put the preferred alternative first.

I am not sure how this is possible without a switch, using only "isa".

>
>> rs6000: fmr gets used instead of faster xxlor [PR93571]
>
> rs6000: Use xxlor instead of fmr where possible
>
>> This patch replaces 6 cycles fmr instruction with xxlor
>> 2 cycles in p8 and p7 architecture.
>
> No, it also does it on all later architectures.
>
> Do you have any actual timings (i.e. from hardware, not documentation)?
>
>> * config/rs6000/rs6000.md (*movdf_hardfloat64): Replace fmr with xxlor instruction.
>
> Line too long.  And, that is not what the patch does.  Changelog should
> be totally boring just saying what the patch changes.  If the patch
> changes things other than what the changelog says your reviewer will
> think something went missing somewhere :-)

I will correct this.

>
>> -  "@
>> -   stfd%U0%X0 %1,%0
>> -   lfd%U1%X1 %0,%1
>> -   xxlor %0,%1,%1
>
> That is not what is currently in trunk, so your patch cannot apply.
>
>> +  switch (which_alternative) {
>> +       case 0 : return "stfd%U0%X0 %1,%0";
>> +       case 1 : return "lfd%U1%X1 %0,%1";
>
> Formatting is all incorrect.
> We don't need or want a switch at all, but
> correct would be:
>   switch (which_alternative)
>     {
>     case 0:
>       return "stfd%U0%X0 %1,%0";
>     case 1:
>       return "lfd%U1%X1 %0,%1";
>
> etc.

I will correct that.

>
>> +       case 2 : if ((TARGET_VSX || TARGET_P8_VECTOR)
>> +                     && !TARGET_P9_VECTOR
>> +                     && !TARGET_POWER10)
>> +                     return "xxlor %0,%1,%1";
>> +                   else
>> +                     return "fmr %0,%1";
>
> Ah, so you are excluding p9 and p10 here.  Hrm.  That should be written
> TARGET_VSX && !TARGET_P9_VECTOR, none of the rest is needed; but is that
> a good idea at all?
>
> Please use %xN for VSX arguments whenever possible.  If this alternative
> allows only the low numbered vector registers, that is a hint that you
> probably should write this differently (and %xN is harmless then).
>
>> +       return "unreachable";
>
> No, never do that.  There is "gcc_unreachable ()" if you need it.
>

I will also correct this.

> So, let's first do actual timings, and see if it is better on p9 and
> p10 as well (or at least not worse).
>
>
> Segher

Thanks & Regards
Ajit
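
For reference, the review points on the switch (GNU-style layout, the
simplified TARGET_VSX && !TARGET_P9_VECTOR condition, %xN operands for the
VSX instruction, and no "unreachable" string) can be collected in a small
standalone C sketch.  This is not the rs6000.md change itself, and the review
prefers the "isa" attribute over a switch altogether.  The names
movdf_template, have_vsx, have_p9_vector, sketch_unreachable and
check_movdf_template are hypothetical; the two booleans stand in for
TARGET_VSX and TARGET_P9_VECTOR, and abort () stands in for
gcc_unreachable (), which exists only inside the compiler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for gcc_unreachable (), which is only available
   inside GCC itself.  */
#define sketch_unreachable() abort ()

/* Hypothetical model of the first three alternatives quoted in the thread.
   HAVE_VSX and HAVE_P9_VECTOR stand in for TARGET_VSX and
   TARGET_P9_VECTOR.  */
const char *
movdf_template (int which_alternative, bool have_vsx, bool have_p9_vector)
{
  switch (which_alternative)
    {
    case 0:
      return "stfd%U0%X0 %1,%0";
    case 1:
      return "lfd%U1%X1 %0,%1";
    case 2:
      /* Condition written the way the review suggests; %xN is used for
         the VSX operands.  */
      if (have_vsx && !have_p9_vector)
        return "xxlor %x0,%x1,%x1";
      return "fmr %0,%1";
    default:
      /* Abort instead of returning a bogus template string.  */
      sketch_unreachable ();
    }
}

/* Small self-check for the sketch above.  */
void
check_movdf_template (void)
{
  assert (strcmp (movdf_template (2, true, false), "xxlor %x0,%x1,%x1") == 0);
  assert (strcmp (movdf_template (2, true, true), "fmr %0,%1") == 0);
  assert (strcmp (movdf_template (2, false, false), "fmr %0,%1") == 0);
}
```

Whether excluding p9 and p10 is a good idea at all is the open question in
the thread, so the condition in case 2 is only illustrative until hardware
timings are available.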