From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by sourceware.org (Postfix) with ESMTPS id AAD603858D1E for ; Wed, 28 Jun 2023 08:18:56 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org AAD603858D1E Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=linux.ibm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linux.ibm.com Received: from pps.filterd (m0353724.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 35S8GVUw003624; Wed, 28 Jun 2023 08:18:56 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=message-id : date : mime-version : from : subject : to : cc : references : in-reply-to : content-type : content-transfer-encoding; s=pp1; bh=FN36aApfZtCsnrm+zrY+zSRvF1u5SulRRuGd6MRSMls=; b=KHMwY8a/lUpw8YTZBeObfRZLV4FYxfchqefUO+dC53lkLqYG3NXAlzKkGGpFXZbbDKPp 8ziK/c6wXZGzG3pUBf1CWIn47BxOXj9HR5gqbEv3u/issa0agCp9OyI+Jux3vFpkCdPJ o2/feCekrzctkI3u+MMmpXCRP4OJJIHYgSkmNAa0k9aYPto4Pk5vWAthcKPp16zv6gTg Ul3mhGhL1xDyXWpHnZQvv7IUtluBQ0JLOHaSObhe2nHLSJHER0srUhkSPfHBi5UUeZG7 Rw9EtUlg8UyhSu9P/u/DoJKWqA74R9WwKhnyxteMnAwMCm0C8YVoKV8+vB/KcfDxK5E3 YA== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3rgh7681em-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 28 Jun 2023 08:18:55 +0000 Received: from m0353724.ppops.net (m0353724.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 35S8IloV010712; Wed, 28 Jun 2023 08:18:55 GMT Received: from ppma03ams.nl.ibm.com (62.31.33a9.ip4.static.sl-reverse.com [169.51.49.98]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3rgh7681e0-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 28 Jun 2023 08:18:55 +0000 Received: from pps.filterd (ppma03ams.nl.ibm.com [127.0.0.1]) by ppma03ams.nl.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 35S7VCMZ032460; Wed, 28 Jun 2023 08:18:53 GMT Received: from smtprelay01.fra02v.mail.ibm.com ([9.218.2.227]) by ppma03ams.nl.ibm.com (PPS) with ESMTPS id 3rdr452dwh-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 28 Jun 2023 08:18:53 +0000 Received: from smtpav04.fra02v.mail.ibm.com (smtpav04.fra02v.mail.ibm.com [10.20.54.103]) by smtprelay01.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 35S8Io4337683672 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 28 Jun 2023 08:18:50 GMT Received: from smtpav04.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 4088720043; Wed, 28 Jun 2023 08:18:50 +0000 (GMT) Received: from smtpav04.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 4DFBA20040; Wed, 28 Jun 2023 08:18:48 +0000 (GMT) Received: from [9.197.237.226] (unknown [9.197.237.226]) by smtpav04.fra02v.mail.ibm.com (Postfix) with ESMTP; Wed, 28 Jun 2023 08:18:47 +0000 (GMT) Message-ID: Date: Wed, 28 Jun 2023 16:18:46 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Thunderbird/91.6.1 From: "Kewen.Lin" Subject: Re: [PATCH V3, rs6000] Disable generation of scalar modulo instructions To: Pat Haugen Cc: GCC Patches , Segher Boessenkool , David Edelsohn , Peter Bergner References: <68d7fbb3-59b3-6a59-a8ac-773d5d9c6817@linux.ibm.com> Content-Language: en-US In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-TM-AS-GCONF: 00 X-Proofpoint-GUID: 0B17USwFtSmXyf8EADA3pW1fb3Fw8RoJ X-Proofpoint-ORIG-GUID: rTYZUKPi_5pZ4541P_U7J6IKS4uOMPUm X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.957,Hydra:6.0.591,FMLib:17.11.176.26 definitions=2023-06-28_04,2023-06-27_01,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 mlxscore=0 mlxlogscore=999 priorityscore=1501 suspectscore=0 bulkscore=0 lowpriorityscore=0 malwarescore=0 clxscore=1015 phishscore=0 adultscore=0 impostorscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2305260000 definitions=main-2306280070 X-Spam-Status: No, score=-10.4 required=5.0 tests=BAYES_00,BODY_8BITS,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,NICE_REPLY_A,RCVD_IN_MSPIKE_H5,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hi Pat, > On 6/27/23 1:52 PM, Pat Haugen via Gcc-patches wrote: >> Updated from prior version to address review comments (update rs6000_rtx_cost, >> update scan strings of mod-1.c/mod-2.c)l. >> >> Disable generation of scalar modulo instructions. >> >> It was recently discovered that the scalar modulo instructions can suffer >> noticeable performance issues for certain input values. This patch disables >> their generation since the equivalent div/mul/sub sequence does not suffer >> the same problem. >> >> Bootstrapped and regression tested on powerpc64/powerpc64le. >> Ok for master and backports after burn in? >> >> -Pat >> >> >> 2023-06-27  Pat Haugen  >> >> gcc/ >>      * config/rs6000/rs6000.cc (rs6000_rtx_costs): Check if disabling >>      scalar modulo. >>      * config/rs6000/rs6000.h (RS6000_DISABLE_SCALAR_MODULO): New. >>      * config/rs6000/rs6000.md (mod3, *mod3): Disable. >>      (define_expand umod3): New. >>      (define_insn umod3): Rename to *umod3 and disable. >>      (umodti3, modti3): Disable. >> >> gcc/testsuite/ >>      * gcc.target/powerpc/clone1.c: Add xfails. >>      * gcc.target/powerpc/clone3.c: Likewise. >>      * gcc.target/powerpc/mod-1.c: Update scan strings and add xfails. >>      * gcc.target/powerpc/mod-2.c: Likewise. >>      * gcc.target/powerpc/p10-vdivq-vmodq.c: Add xfails. >> > > Attaching patch since my mailer apparently messed up some formatting again. > > -Pat > > no-modulo.diff > > diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc > index 546c353029b..2dae217bf64 100644 > --- a/gcc/config/rs6000/rs6000.cc > +++ b/gcc/config/rs6000/rs6000.cc > @@ -22127,7 +22127,9 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code, > *total = rs6000_cost->divsi; > } > /* Add in shift and subtract for MOD unless we have a mod instruction. */ > - if (!TARGET_MODULO && (code == MOD || code == UMOD)) > + if ((!TARGET_MODULO > + || (RS6000_DISABLE_SCALAR_MODULO && SCALAR_INT_MODE_P (mode))) > + && (code == MOD || code == UMOD)) > *total += COSTS_N_INSNS (2); > return false; > > diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h > index 3503614efbd..22595f6ebd7 100644 > --- a/gcc/config/rs6000/rs6000.h > +++ b/gcc/config/rs6000/rs6000.h > @@ -2492,3 +2492,9 @@ while (0) > rs6000_asm_output_opcode (STREAM); \ > } \ > while (0) > + > +/* Disable generation of scalar modulo instructions due to performance issues > + with certain input values. This can be removed in the future when the > + issues have been resolved. */ > +#define RS6000_DISABLE_SCALAR_MODULO 1 > + > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md > index b0db8ae508d..6c2f237a539 100644 > --- a/gcc/config/rs6000/rs6000.md > +++ b/gcc/config/rs6000/rs6000.md > @@ -3421,6 +3421,17 @@ (define_expand "mod3" > FAIL; > > operands[2] = force_reg (mode, operands[2]); > + > + if (RS6000_DISABLE_SCALAR_MODULO) > + { > + temp1 = gen_reg_rtx (mode); > + temp2 = gen_reg_rtx (mode); > + > + emit_insn (gen_div3 (temp1, operands[1], operands[2])); > + emit_insn (gen_mul3 (temp2, temp1, operands[2])); > + emit_insn (gen_sub3 (operands[0], operands[1], temp2)); > + DONE; > + } > } > else > { > @@ -3440,17 +3451,42 @@ (define_insn "*mod3" > [(set (match_operand:GPR 0 "gpc_reg_operand" "=&r,r") > (mod:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r") > (match_operand:GPR 2 "gpc_reg_operand" "r,r")))] > - "TARGET_MODULO" > + "TARGET_MODULO && !RS6000_DISABLE_SCALAR_MODULO" > "mods %0,%1,%2" > [(set_attr "type" "div") > (set_attr "size" "")]) > > +;; This define_expand can be removed when RS6000_DISABLE_SCALAR_MODULO is > +;; removed. > +(define_expand "umod3" > + [(set (match_operand:GPR 0 "gpc_reg_operand") > + (umod:GPR (match_operand:GPR 1 "gpc_reg_operand") > + (match_operand:GPR 2 "gpc_reg_operand")))] > + "" > +{ > + rtx temp1; > + rtx temp2; > + > + if (!TARGET_MODULO) > + FAIL; > Nit: Sorry that I didn't catch this in the previous review, it seems we can just put TARGET_MODULO in the condition of define_expand. Besides, it seems the declarations of temp1 and temp2 can be merged into their assignments below. The others look pretty good to me, thanks! As I know this patch is still on Segher's list to review, I'd defer the final approval to Segher. BR, Kewen