From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id D36BB383309E for ; Wed, 14 Dec 2022 13:50:31 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org D36BB383309E Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 190ACFEC; Wed, 14 Dec 2022 05:51:12 -0800 (PST) Received: from localhost (e121540-lin.manchester.arm.com [10.32.99.50]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 6C6D93F73B; Wed, 14 Dec 2022 05:50:30 -0800 (PST) From: Richard Sandiford To: Tamar Christina Mail-Followup-To: Tamar Christina ,Richard Earnshaw , "gcc-patches\@gcc.gnu.org" , nd , Richard Earnshaw , Marcus Shawcroft , Kyrylo Tkachov , richard.sandiford@arm.com Cc: Richard Earnshaw , "gcc-patches\@gcc.gnu.org" , nd , Richard Earnshaw , Marcus Shawcroft , Kyrylo Tkachov Subject: Re: [PATCH]AArch64 div-by-255, ensure that arguments are registers. [PR107988] References: Date: Wed, 14 Dec 2022 13:50:29 +0000 In-Reply-To: (Tamar Christina's message of "Wed, 14 Dec 2022 11:18:37 +0000") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Status: No, score=-38.2 required=5.0 tests=BAYES_00,GIT_PATCH_0,KAM_DMARC_NONE,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,KAM_LOTSOFHASH,KAM_SHORT,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Tamar Christina writes: >> -----Original Message----- >> From: Richard Sandiford >> Sent: Friday, December 9, 2022 7:08 AM >> To: Richard Earnshaw >> Cc: Tamar Christina ; gcc-patches@gcc.gnu.org; >> nd ; Richard Earnshaw ; >> Marcus Shawcroft ; Kyrylo Tkachov >> >> Subject: Re: [PATCH]AArch64 div-by-255, ensure that arguments are >> registers. [PR107988] >> >> Richard Earnshaw writes: >> > On 08/12/2022 16:39, Tamar Christina via Gcc-patches wrote: >> >> Hi All, >> >> >> >> At -O0 (as opposed to e.g. volatile) we can get into the situation >> >> where the >> >> in0 and result RTL arguments passed to the division function are >> >> memory locations instead of registers. I think we could reject these >> >> early on by checking that the gimple values are GIMPLE registers, but >> >> I think it's better to handle it. >> >> >> >> As such I force them to registers and emit a move to the memory >> >> locations and leave it up to reload to handle. This fixes the ICE >> >> and still allows the optimization in these cases, which improves the code >> quality a lot. >> >> >> >> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. >> >> >> >> Ok for master? >> >> >> >> Thanks, >> >> Tamar >> >> >> >> >> >> >> >> gcc/ChangeLog: >> >> >> >> PR target/107988 >> >> * config/aarch64/aarch64.cc >> >> (aarch64_vectorize_can_special_div_by_constant): Ensure input and >> output >> >> RTL are registers. >> >> >> >> gcc/testsuite/ChangeLog: >> >> >> >> PR target/107988 >> >> * gcc.target/aarch64/pr107988-1.c: New test. >> >> >> >> --- inline copy of patch -- >> >> diff --git a/gcc/config/aarch64/aarch64.cc >> >> b/gcc/config/aarch64/aarch64.cc index >> >> >> b8dc3f070c8afc47c85fa18768c4da92c774338f..9f96424993c4fcccce90e1b241f >> >> cb3aa97025225 100644 >> >> --- a/gcc/config/aarch64/aarch64.cc >> >> +++ b/gcc/config/aarch64/aarch64.cc >> >> @@ -24337,12 +24337,27 @@ >> aarch64_vectorize_can_special_div_by_constant (enum tree_code code, >> >> if (!VECTOR_TYPE_P (vectype)) >> >> return false; >> >> >> >> + if (!REG_P (in0)) >> >> + in0 = force_reg (GET_MODE (in0), in0); >> >> + >> >> gcc_assert (output); >> >> >> >> - if (!*output) >> >> - *output = gen_reg_rtx (TYPE_MODE (vectype)); >> >> + rtx res = NULL_RTX; >> >> + >> >> + /* Once e get to this point we cannot reject the RTL, if it's not a reg >> then >> >> + Create a new reg and write the result to the output afterwards. >> >> + */ if (!*output || !REG_P (*output)) >> >> + res = gen_reg_rtx (TYPE_MODE (vectype)); else >> >> + res = *output; >> > >> > Why not write >> > rtx res = *output >> > if (!res || !REG_P (res)) >> > res = gen_reg_rtx... >> > >> > then you don't need either the else clause or the dead NULL_RTX >> assignment. >> >> I'd prefer that we use the expand_insn interface, which already has logic for >> coercing inputs and outputs to predicates. Something like: >> >> machine_mode mode = TYPE_MODE (vectype); >> unsigned int flags = aarch64_classify_vector_mode (mode); >> if ((flags & VEC_ANY_SVE) && !TARGET_SVE2) >> return false; >> >> ... >> >> expand_operand ops[3]; >> create_output_operand (&ops[0], *output, mode); >> create_input_operand (&ops[1], in0, mode); >> create_fixed_operand (&ops[2], in1); >> expand_insn (insn_code, 3, ops); >> *output = ops[0].value; >> return true; >> >> On this function: why do we have the VECTOR_TYPE_P condition in: >> > > It was left over after checking for optabs support. It's superfluous now. > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > Ok for master? OK, thanks. Richard > Thanks, > Tamar > > gcc/ChangeLog: > > PR target/107988 > * config/aarch64/aarch64.cc > (aarch64_vectorize_can_special_div_by_constant): Ensure input and output > RTL are registers. > > gcc/testsuite/ChangeLog: > > PR target/107988 > * gcc.target/aarch64/pr107988-1.c: New test. > > --- inline copy of patch --- > > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc > index 7bb0b7602ff6474410059494dd86b7be1621dde5..e1f34ef5da170ef11727e0c99a5bd42849c5d185 100644 > --- a/gcc/config/aarch64/aarch64.cc > +++ b/gcc/config/aarch64/aarch64.cc > @@ -24395,7 +24395,8 @@ aarch64_vectorize_can_special_div_by_constant (enum tree_code code, > || !TYPE_UNSIGNED (vectype)) > return false; > > - unsigned int flags = aarch64_classify_vector_mode (TYPE_MODE (vectype)); > + machine_mode mode = TYPE_MODE (vectype); > + unsigned int flags = aarch64_classify_vector_mode (mode); > if ((flags & VEC_ANY_SVE) && !TARGET_SVE2) > return false; > > @@ -24411,15 +24412,14 @@ aarch64_vectorize_can_special_div_by_constant (enum tree_code code, > if (in0 == NULL_RTX && in1 == NULL_RTX) > return true; > > - if (!VECTOR_TYPE_P (vectype)) > - return false; > - > gcc_assert (output); > > - if (!*output) > - *output = gen_reg_rtx (TYPE_MODE (vectype)); > - > - emit_insn (gen_aarch64_bitmask_udiv3 (TYPE_MODE (vectype), *output, in0, in1)); > + expand_operand ops[3]; > + create_output_operand (&ops[0], *output, mode); > + create_input_operand (&ops[1], in0, mode); > + create_fixed_operand (&ops[2], in1); > + expand_insn (insn_code, 3, ops); > + *output = ops[0].value; > return true; > } > > diff --git a/gcc/testsuite/gcc.target/aarch64/pr107988-1.c b/gcc/testsuite/gcc.target/aarch64/pr107988-1.c > new file mode 100644 > index 0000000000000000000000000000000000000000..c4fd290271b738345173b569bdc58c092fba7fe9 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/pr107988-1.c > @@ -0,0 +1,10 @@ > +/* { dg-do compile } */ > +/* { dg-additional-options "-O0" } */ > +typedef unsigned short __attribute__((__vector_size__ (16))) V; > + > +V > +foo (V v) > +{ > + v /= 255; > + return v; > +}