From: Richard Sandiford
To: Prathamesh Kulkarni
Cc: gcc Patches
Subject: Re: [aarch64] Use op_mode instead of vmode for op0, op1 in aarch64_vectorize_vec_perm_const
Date: Thu, 14 Jul 2022 12:33:46 +0100

Prathamesh Kulkarni writes:
> Hi,
> For following test case:
>
> svint32_t foo()
> {
>   int32x4_t v = (int32x4_t) { 1, 2, 3, 4 };
>   svint32_t v2 = svld1rq_s32 (svptrue_b8(), &v[0]);
>   return v2;
> }
>
> After applying workaround
> in forwprop to not simplify VEC_PERM_EXPR in
> simplify_permutation to avoid type error in middle end (or using
> -fno-tree-forwprop)
> as mentioned in:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598390.html
>
> We get following optimized gimple:
>   v2_2 = VEC_PERM_EXPR <{ 1, 2, 3, 4 }, { 1, 2, 3, 4 }, { 0, 1, 2, 3, ... }>;
>   return v2_2;

Hmm, we really should be able to fold that to a constant.  However…

> However we hit the following ICE during expansion of vec_perm_expr
> because in aarch64_vectorize_vec_perm_const,
> op0 is VECTOR_CST, and we call force_reg (VNx4SI, op0), which is incorrect mode
> for op0. The patch fixes it by using op_mode instead of vmode in calls
> to force_reg
> for op0 and op1.
>
> during RTL pass: expand
> foo2.c: In function ‘foo’:
> foo2.c:8:10: internal compiler error: in emit_move_insn, at expr.cc:4052
>     8 |   return v2;
>       |          ^~
> 0x74789b emit_move_insn(rtx_def*, rtx_def*)
>         ../../gcc/gcc/expr.cc:4052
> 0xb8f664 force_reg(machine_mode, rtx_def*)
>         ../../gcc/gcc/explow.cc:688
> 0x134182f aarch64_vectorize_vec_perm_const
>         ../../gcc/gcc/config/aarch64/aarch64.cc:24132
> 0xe63070 expand_vec_perm_const(machine_mode, rtx_def*, rtx_def*,
> int_vector_builder > const&, machine_mode,
> rtx_def*)
>         ../../gcc/gcc/optabs.cc:6254
> 0xbb1569 expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
> expand_modifier)
>         ../../gcc/gcc/expr.cc:10273
> 0xbb6498 expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
> expand_modifier, rtx_def**, bool)
>         ../../gcc/gcc/expr.cc:10625
> 0xa897dc expand_expr
>         ../../gcc/gcc/expr.h:310
> 0xa897dc expand_return
>         ../../gcc/gcc/cfgexpand.cc:3809
> 0xa897dc expand_gimple_stmt_1
>         ../../gcc/gcc/cfgexpand.cc:3918
> 0xa897dc expand_gimple_stmt
>         ../../gcc/gcc/cfgexpand.cc:4044
> 0xa8f238 expand_gimple_basic_block
>         ../../gcc/gcc/cfgexpand.cc:6096
> 0xa91187 execute
>         ../../gcc/gcc/cfgexpand.cc:6822
>
> Is the patch OK to commit after bootstrap+test on aarch64-linux-gnu ?
>
> Thanks,
> Prathamesh
>
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 25f4cbb466d..303814b8cca 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -24129,11 +24129,11 @@ aarch64_vectorize_vec_perm_const (machine_mode vmode, machine_mode op_mode,
>    d.op_mode = op_mode;
>    d.op_vec_flags = aarch64_classify_vector_mode (d.op_mode);
>    d.target = target;
> -  d.op0 = op0 ? force_reg (vmode, op0) : NULL_RTX;
> +  d.op0 = op0 ? force_reg (op_mode, op0) : NULL_RTX;
>    if (op0 == op1)
>      d.op1 = d.op0;
>    else
> -    d.op1 = op1 ? force_reg (vmode, op1) : NULL_RTX;
> +    d.op1 = op1 ? force_reg (op_mode, op1) : NULL_RTX;
>    d.testing_p = !target;
>
>    if (!d.testing_p)

…yes, this is OK, thanks.

Richard
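
For readers following the thread, the mode mismatch behind the ICE can be pictured with a toy model. This is only a sketch under stated assumptions: Mode, Value, toy_force_reg, toy_prepare_operand and toy_demo are hypothetical stand-ins invented for illustration, not GCC's machine_mode, rtx or force_reg. It assumes only what the message above states, namely that op0 is a V4SI constant while the result mode (vmode) is VNx4SI, and that forcing the operand into a register of the wrong mode is rejected.

```cpp
// Toy model only: hypothetical stand-ins, not GCC's real types or functions.
#include <cassert>
#include <stdexcept>

enum class Mode { V4SI, VNx4SI };

struct Value {
  Mode mode;         // mode the value was created in
  bool is_constant;  // true for something like a VECTOR_CST
};

// Stand-in for force_reg: copying a value into a register of a different
// mode is treated as an internal error, loosely mirroring the failure
// reported in emit_move_insn.
Value toy_force_reg(Mode reg_mode, const Value &x) {
  if (x.mode != reg_mode)
    throw std::logic_error("mode mismatch in move");
  return Value{reg_mode, false};
}

// Returns true if forcing op0 into a register of mode `requested` succeeds.
bool toy_prepare_operand(Mode requested, const Value &op0) {
  try {
    toy_force_reg(requested, op0);
    return true;
  } catch (const std::logic_error &) {
    return false;
  }
}

// Small self-check contrasting the two choices discussed in the patch.
void toy_demo() {
  Value op0{Mode::V4SI, true};                      // constant operand in op_mode
  assert(!toy_prepare_operand(Mode::VNx4SI, op0));  // using vmode: rejected
  assert(toy_prepare_operand(Mode::V4SI, op0));     // using op_mode: accepted
}
```

Calling toy_demo() from a main function exercises both cases. In the model, as in the patch, the operand is forced using its own mode (op_mode) rather than the mode of the permutation result (vmode).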