From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by sourceware.org (Postfix) with ESMTPS id 109263858D33 for ; Thu, 4 May 2023 21:19:31 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 109263858D33 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=linux.ibm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linux.ibm.com Received: from pps.filterd (m0353724.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 344LDhSc021131; Thu, 4 May 2023 21:19:30 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=message-id : date : mime-version : subject : to : references : from : in-reply-to : content-type : content-transfer-encoding; s=pp1; bh=IzFyaA1nXNuyDNkG/qdHvbEs9yolQrDJ/0tuSzxY3og=; b=pIAdpo+ncascN+bPA+YIk4N5NjjBx3DvF1bJYY/lqqoRiPCCi5ouMixSaf6CxP82p2hP j4bcxfIQVDed3j5gc1k74m31V5Iudmvu1rnkl1cL866VXN2swYzjIOMfUFyeKY4oQY+t dH4cb1RP7YZw1FSO+40MCv8BSglG/wPKWMQbArr31IGFiLQs1sAY7rkznRJPGqO+7v/t w+CLI4SApA2Q8eOW0PqPhU4QuPladDzUn0Ww7FY40qNTBSMHCKeki8xXL8Rnx9ae2ESx y4LJQvYXK9mG5u7s4OP71DBgkCHuSL+sDHp90em+V1N+rkvxk7ODiGFHDRhYymrlQj0h GA== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3qcmee05a4-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 04 May 2023 21:19:30 +0000 Received: from m0353724.ppops.net (m0353724.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 344LFbDu027369; Thu, 4 May 2023 21:19:29 GMT Received: from ppma04dal.us.ibm.com (7a.29.35a9.ip4.static.sl-reverse.com [169.53.41.122]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3qcmee059n-2 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 04 May 2023 21:19:29 +0000 Received: from pps.filterd (ppma04dal.us.ibm.com [127.0.0.1]) by ppma04dal.us.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 344IHjx8008421; Thu, 4 May 2023 20:25:15 GMT Received: from smtprelay02.wdc07v.mail.ibm.com ([9.208.129.120]) by ppma04dal.us.ibm.com (PPS) with ESMTPS id 3q8tv8j3yu-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 04 May 2023 20:25:15 +0000 Received: from smtpav05.wdc07v.mail.ibm.com (smtpav05.wdc07v.mail.ibm.com [10.39.53.232]) by smtprelay02.wdc07v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 344KP8S160162416 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 4 May 2023 20:25:08 GMT Received: from smtpav05.wdc07v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 421F95805F; Thu, 4 May 2023 20:25:08 +0000 (GMT) Received: from smtpav05.wdc07v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 93C7B58053; Thu, 4 May 2023 20:25:07 +0000 (GMT) Received: from [9.77.146.143] (unknown [9.77.146.143]) by smtpav05.wdc07v.mail.ibm.com (Postfix) with ESMTPS; Thu, 4 May 2023 20:25:07 +0000 (GMT) Message-ID: <60d29215-e734-20b1-9cc3-6b673d60fa2a@linux.ibm.com> Date: Thu, 4 May 2023 15:25:06 -0500 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Thunderbird/102.10.1 Subject: Re: [PATCH V2, rs6000] Disable generation of scalar modulo instructions To: GCC Patches , Segher Boessenkool , Peter Bergner , David Edelsohn , "Kewen.Lin" References: <36819cc8-c948-1426-2429-7fe4f6b67c94@linux.ibm.com> Content-Language: en-US From: Pat Haugen In-Reply-To: <36819cc8-c948-1426-2429-7fe4f6b67c94@linux.ibm.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: McXIPYcmOxaiAs0Ac-KpECsIskZxb49Q X-Proofpoint-GUID: qCv_y-g1UGFjZAhcqbaj38MmAZOB0ZM8 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.942,Hydra:6.0.573,FMLib:17.11.170.22 definitions=2023-05-04_13,2023-05-04_01,2023-02-09_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 adultscore=0 suspectscore=0 phishscore=0 lowpriorityscore=0 clxscore=1015 impostorscore=0 priorityscore=1501 bulkscore=0 malwarescore=0 mlxlogscore=999 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2303200000 definitions=main-2305040171 X-Spam-Status: No, score=-13.1 required=5.0 tests=BAYES_00,BODY_8BITS,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,NICE_REPLY_A,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Ping. On 4/18/23 7:22 AM, Pat Haugen via Gcc-patches wrote: > Updated from prior patch to also disable for int128. > > > Disable generation of scalar modulo instructions. > > It was recently discovered that the scalar modulo instructions can suffer > noticeable performance issues for certain input values. This patch disables > their generation since the equivalent div/mul/sub sequence does not suffer > the same problem. > > Bootstrapped and regression tested on powerpc64/powerpc64le. > Ok for master and backports after burn in? > > -Pat > > > 2023-04-18  Pat Haugen  > > gcc/ >     * config/rs6000/rs6000.h (RS6000_DISABLE_SCALAR_MODULO): New. >     * config/rs6000/rs6000.md (mod3, *mod3): Disable. >     (define_expand umod3): New. >     (define_insn umod3): Rename to *umod3 and disable. >     (umodti3, modti3): Disable. > > gcc/testsuite/ >     * gcc.target/powerpc/clone1.c: Add xfails. >     * gcc.target/powerpc/clone3.c: Likewise. >     * gcc.target/powerpc/mod-1.c: Likewise. >     * gcc.target/powerpc/mod-2.c: Likewise. >     * gcc.target/powerpc/p10-vdivq-vmodq.c: Likewise. > > > diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h > index 3503614efbd..1cf0a0013c0 100644 > --- a/gcc/config/rs6000/rs6000.h > +++ b/gcc/config/rs6000/rs6000.h > @@ -2492,3 +2492,9 @@ while (0) >         rs6000_asm_output_opcode (STREAM);                \ >      }                                    \ >    while (0) > + > +/* Disable generation of scalar modulo instructions due to performance > issues > +   with certain input values. This can be removed in the future when the > +   issues have been resolved.  */ > +#define RS6000_DISABLE_SCALAR_MODULO 1 > + > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md > index 44f7dd509cb..4f397bc9179 100644 > --- a/gcc/config/rs6000/rs6000.md > +++ b/gcc/config/rs6000/rs6000.md > @@ -3421,6 +3421,17 @@ (define_expand "mod3" >      FAIL; > >        operands[2] = force_reg (mode, operands[2]); > + > +      if (RS6000_DISABLE_SCALAR_MODULO) > +    { > +      temp1 = gen_reg_rtx (mode); > +      temp2 = gen_reg_rtx (mode); > + > +      emit_insn (gen_div3 (temp1, operands[1], operands[2])); > +      emit_insn (gen_mul3 (temp2, temp1, operands[2])); > +      emit_insn (gen_sub3 (operands[0], operands[1], temp2)); > +      DONE; > +    } >      } >    else >      { > @@ -3440,17 +3451,42 @@ (define_insn "*mod3" >    [(set (match_operand:GPR 0 "gpc_reg_operand" "=&r,r") >          (mod:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r") >           (match_operand:GPR 2 "gpc_reg_operand" "r,r")))] > -  "TARGET_MODULO" > +  "TARGET_MODULO && !RS6000_DISABLE_SCALAR_MODULO" >    "mods %0,%1,%2" >    [(set_attr "type" "div") >     (set_attr "size" "")]) > > +;; This define_expand can be removed when RS6000_DISABLE_SCALAR_MODULO is > +;; removed. > +(define_expand "umod3" > +  [(set (match_operand:GPR 0 "gpc_reg_operand") > +    (umod:GPR (match_operand:GPR 1 "gpc_reg_operand") > +          (match_operand:GPR 2 "gpc_reg_operand")))] > +  "" > +{ > +  rtx temp1; > +  rtx temp2; > + > +  if (!TARGET_MODULO) > +    FAIL; > > -(define_insn "umod3" > +  if (RS6000_DISABLE_SCALAR_MODULO) > +    { > +      temp1 = gen_reg_rtx (mode); > +      temp2 = gen_reg_rtx (mode); > + > +      emit_insn (gen_udiv3 (temp1, operands[1], operands[2])); > +      emit_insn (gen_mul3 (temp2, temp1, operands[2])); > +      emit_insn (gen_sub3 (operands[0], operands[1], temp2)); > +      DONE; > +    } > +}) > + > +(define_insn "*umod3" >    [(set (match_operand:GPR 0 "gpc_reg_operand" "=&r,r") >          (umod:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r") >            (match_operand:GPR 2 "gpc_reg_operand" "r,r")))] > -  "TARGET_MODULO" > +  "TARGET_MODULO && !RS6000_DISABLE_SCALAR_MODULO" >    "modu %0,%1,%2" >    [(set_attr "type" "div") >     (set_attr "size" "")]) > @@ -3507,7 +3543,7 @@ (define_insn "umodti3" >    [(set (match_operand:TI 0 "altivec_register_operand" "=v") >      (umod:TI (match_operand:TI 1 "altivec_register_operand" "v") >           (match_operand:TI 2 "altivec_register_operand" "v")))] > -  "TARGET_POWER10 && TARGET_POWERPC64" > +  "TARGET_POWER10 && TARGET_POWERPC64 && !RS6000_DISABLE_SCALAR_MODULO" >    "vmoduq %0,%1,%2" >    [(set_attr "type" "vecdiv") >     (set_attr "size" "128")]) > @@ -3516,7 +3552,7 @@ (define_insn "modti3" >    [(set (match_operand:TI 0 "altivec_register_operand" "=v") >      (mod:TI (match_operand:TI 1 "altivec_register_operand" "v") >          (match_operand:TI 2 "altivec_register_operand" "v")))] > -  "TARGET_POWER10 && TARGET_POWERPC64" > +  "TARGET_POWER10 && TARGET_POWERPC64 && !RS6000_DISABLE_SCALAR_MODULO" >    "vmodsq %0,%1,%2" >    [(set_attr "type" "vecdiv") >     (set_attr "size" "128")]) > diff --git a/gcc/testsuite/gcc.target/powerpc/clone1.c > b/gcc/testsuite/gcc.target/powerpc/clone1.c > index c69fd2aa1b8..74323ca0e8c 100644 > --- a/gcc/testsuite/gcc.target/powerpc/clone1.c > +++ b/gcc/testsuite/gcc.target/powerpc/clone1.c > @@ -21,6 +21,7 @@ long mod_func_or (long a, long b, long c) >    return mod_func (a, b) | c; >  } > > -/* { dg-final { scan-assembler-times {\mdivd\M}  1 } } */ > -/* { dg-final { scan-assembler-times {\mmulld\M} 1 } } */ > -/* { dg-final { scan-assembler-times {\mmodsd\M} 1 } } */ > +/* { Fail due to RS6000_DISABLE_SCALAR_MODULO. */ > +/* { dg-final { scan-assembler-times {\mdivd\M}  1 { xfail *-*-* } } } */ > +/* { dg-final { scan-assembler-times {\mmulld\M} 1 { xfail *-*-* } } } */ > +/* { dg-final { scan-assembler-times {\mmodsd\M} 1 { xfail *-*-* } } } */ > diff --git a/gcc/testsuite/gcc.target/powerpc/clone3.c > b/gcc/testsuite/gcc.target/powerpc/clone3.c > index 911b88b781d..d3eb4dd2378 100644 > --- a/gcc/testsuite/gcc.target/powerpc/clone3.c > +++ b/gcc/testsuite/gcc.target/powerpc/clone3.c > @@ -27,7 +27,8 @@ long mod_func_or (long a, long b, long c) >    return mod_func (a, b) | c; >  } > > -/* { dg-final { scan-assembler-times {\mdivd\M}  1 } } */ > -/* { dg-final { scan-assembler-times {\mmulld\M} 1 } } */ > -/* { dg-final { scan-assembler-times {\mmodsd\M} 2 } } */ > +/* { Fail due to RS6000_DISABLE_SCALAR_MODULO. */ > +/* { dg-final { scan-assembler-times {\mdivd\M}  1 { xfail *-*-* } } } */ > +/* { dg-final { scan-assembler-times {\mmulld\M} 1 { xfail *-*-* } } } */ > +/* { dg-final { scan-assembler-times {\mmodsd\M} 2 { xfail *-*-* } } } */ >  /* { dg-final { scan-assembler-times {\mpld\M}   1 } } */ > diff --git a/gcc/testsuite/gcc.target/powerpc/mod-1.c > b/gcc/testsuite/gcc.target/powerpc/mod-1.c > index 861ba670af4..74af98f5bc3 100644 > --- a/gcc/testsuite/gcc.target/powerpc/mod-1.c > +++ b/gcc/testsuite/gcc.target/powerpc/mod-1.c > @@ -7,13 +7,14 @@ long lsmod (long a, long b) { return a%b; } >  unsigned int iumod (unsigned int a, unsigned int b) { return a%b; } >  unsigned long lumod (unsigned long a, unsigned long b) { return a%b; } > > -/* { dg-final { scan-assembler-times "modsw " 1 } } */ > -/* { dg-final { scan-assembler-times "modsd " 1 } } */ > -/* { dg-final { scan-assembler-times "moduw " 1 } } */ > -/* { dg-final { scan-assembler-times "modud " 1 } } */ > -/* { dg-final { scan-assembler-not   "mullw "   } } */ > -/* { dg-final { scan-assembler-not   "mulld "   } } */ > -/* { dg-final { scan-assembler-not   "divw "    } } */ > -/* { dg-final { scan-assembler-not   "divd "    } } */ > -/* { dg-final { scan-assembler-not   "divwu "   } } */ > -/* { dg-final { scan-assembler-not   "divdu "   } } */ > +/* { Fail due to RS6000_DISABLE_SCALAR_MODULO. */ > +/* { dg-final { scan-assembler-times "modsw " 1 { xfail *-*-* } } } */ > +/* { dg-final { scan-assembler-times "modsd " 1 { xfail *-*-* } } } */ > +/* { dg-final { scan-assembler-times "moduw " 1 { xfail *-*-* } } } */ > +/* { dg-final { scan-assembler-times "modud " 1 { xfail *-*-* } } } */ > +/* { dg-final { scan-assembler-not   "mullw "   { xfail *-*-* } } } */ > +/* { dg-final { scan-assembler-not   "mulld "   { xfail *-*-* } } } */ > +/* { dg-final { scan-assembler-not   "divw "    { xfail *-*-* } } } */ > +/* { dg-final { scan-assembler-not   "divd "    { xfail *-*-* } } } */ > +/* { dg-final { scan-assembler-not   "divwu "   { xfail *-*-* } } } */ > +/* { dg-final { scan-assembler-not   "divdu "   { xfail *-*-* } } } */ > diff --git a/gcc/testsuite/gcc.target/powerpc/mod-2.c > b/gcc/testsuite/gcc.target/powerpc/mod-2.c > index 441ec5878f1..896e2e35260 100644 > --- a/gcc/testsuite/gcc.target/powerpc/mod-2.c > +++ b/gcc/testsuite/gcc.target/powerpc/mod-2.c > @@ -5,8 +5,9 @@ >  int ismod (int a, int b) { return a%b; } >  unsigned int iumod (unsigned int a, unsigned int b) { return a%b; } > > -/* { dg-final { scan-assembler-times "modsw " 1 } } */ > -/* { dg-final { scan-assembler-times "moduw " 1 } } */ > -/* { dg-final { scan-assembler-not   "mullw "   } } */ > -/* { dg-final { scan-assembler-not   "divw "    } } */ > -/* { dg-final { scan-assembler-not   "divwu "   } } */ > +/* { Fail due to RS6000_DISABLE_SCALAR_MODULO. */ > +/* { dg-final { scan-assembler-times "modsw " 1 { xfail *-*-* } } } */ > +/* { dg-final { scan-assembler-times "moduw " 1 { xfail *-*-* } } } */ > +/* { dg-final { scan-assembler-not   "mullw "   { xfail *-*-* } } } */ > +/* { dg-final { scan-assembler-not   "divw "    { xfail *-*-* } } } */ > +/* { dg-final { scan-assembler-not   "divwu "   { xfail *-*-* } } } */ > diff --git a/gcc/testsuite/gcc.target/powerpc/p10-vdivq-vmodq.c > b/gcc/testsuite/gcc.target/powerpc/p10-vdivq-vmodq.c > index 84685e5ff43..148998c8c9d 100644 > --- a/gcc/testsuite/gcc.target/powerpc/p10-vdivq-vmodq.c > +++ b/gcc/testsuite/gcc.target/powerpc/p10-vdivq-vmodq.c > @@ -23,5 +23,6 @@ __int128 s_mod(__int128 a, __int128 b) > >  /* { dg-final { scan-assembler {\mvdivsq\M} } } */ >  /* { dg-final { scan-assembler {\mvdivuq\M} } } */ > -/* { dg-final { scan-assembler {\mvmodsq\M} } } */ > -/* { dg-final { scan-assembler {\mvmoduq\M} } } */ > +/* { Fail due to RS6000_DISABLE_SCALAR_MODULO. */ > +/* { dg-final { scan-assembler {\mvmodsq\M} { xfail *-*-* } } } */ > +/* { dg-final { scan-assembler {\mvmoduq\M} { xfail *-*-* } } } */