From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 95244385C415 for ; Thu, 14 Jul 2022 11:52:40 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 95244385C415 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B68AB13D5; Thu, 14 Jul 2022 04:52:40 -0700 (PDT) Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.37]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id AD9A03F792; Thu, 14 Jul 2022 04:52:39 -0700 (PDT) From: Richard Sandiford To: Richard Biener Mail-Followup-To: Richard Biener , Prathamesh Kulkarni , gcc Patches , richard.sandiford@arm.com Cc: Prathamesh Kulkarni , gcc Patches Subject: Re: ICE after folding svld1rq to vec_perm_expr duing forwprop References: Date: Thu, 14 Jul 2022 12:52:38 +0100 In-Reply-To: (Richard Biener's message of "Thu, 14 Jul 2022 10:33:52 +0200") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-47.9 required=5.0 tests=BAYES_00, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE, WEIRD_PORT autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Jul 2022 11:52:42 -0000 Richard Biener writes: > On Thu, Jul 14, 2022 at 9:55 AM Prathamesh Kulkarni > wrote: >> >> On Wed, 13 Jul 2022 at 12:22, Richard Biener wrote: >> > >> > On Tue, Jul 12, 2022 at 9:12 PM Prathamesh Kulkarni via Gcc-patches >> > wrote: >> > > >> > > Hi Richard, >> > > For the following test: >> > > >> > > svint32_t f2(int a, int b, int c, int d) >> > > { >> > > int32x4_t v =3D (int32x4_t) {a, b, c, d}; >> > > return svld1rq_s32 (svptrue_b8 (), &v[0]); >> > > } >> > > >> > > The compiler emits following ICE with -O3 -mcpu=3Dgeneric+sve: >> > > foo.c: In function =E2=80=98f2=E2=80=99: >> > > foo.c:4:11: error: non-trivial conversion in =E2=80=98view_convert_e= xpr=E2=80=99 >> > > 4 | svint32_t f2(int a, int b, int c, int d) >> > > | ^~ >> > > svint32_t >> > > __Int32x4_t >> > > _7 =3D VIEW_CONVERT_EXPR<__Int32x4_t>(_8); >> > > during GIMPLE pass: forwprop >> > > dump file: foo.c.109t.forwprop2 >> > > foo.c:4:11: internal compiler error: verify_gimple failed >> > > 0xfda04a verify_gimple_in_cfg(function*, bool) >> > > ../../gcc/gcc/tree-cfg.cc:5568 >> > > 0xe9371f execute_function_todo >> > > ../../gcc/gcc/passes.cc:2091 >> > > 0xe93ccb execute_todo >> > > ../../gcc/gcc/passes.cc:2145 >> > > >> > > This happens because, after folding svld1rq_s32 to vec_perm_expr, we= have: >> > > int32x4_t v; >> > > __Int32x4_t _1; >> > > svint32_t _9; >> > > vector(4) int _11; >> > > >> > > : >> > > _1 =3D {a_3(D), b_4(D), c_5(D), d_6(D)}; >> > > v_12 =3D _1; >> > > _11 =3D v_12; >> > > _9 =3D VEC_PERM_EXPR <_11, _11, { 0, 1, 2, 3, ... }>; >> > > return _9; >> > > >> > > During forwprop, simplify_permutation simplifies vec_perm_expr to >> > > view_convert_expr, >> > > and the end result becomes: >> > > svint32_t _7; >> > > __Int32x4_t _8; >> > > >> > > ;; basic block 2, loop depth 0 >> > > ;; pred: ENTRY >> > > _8 =3D {a_2(D), b_3(D), c_4(D), d_5(D)}; >> > > _7 =3D VIEW_CONVERT_EXPR<__Int32x4_t>(_8); >> > > return _7; >> > > ;; succ: EXIT >> > > >> > > which causes the error duing verify_gimple since VIEW_CONVERT_EXPR >> > > has incompatible types (svint32_t, int32x4_t). >> > > >> > > The attached patch disables simplification of VEC_PERM_EXPR >> > > in simplify_permutation, if lhs and rhs have non compatible types, >> > > which resolves ICE, but am not sure if it's the correct approach ? >> > >> > It for sure papers over the issue. I think the error happens earlier, >> > the V_C_E should have been built with the type of the VEC_PERM_EXPR >> > which is the type of the LHS. But then you probably run into the >> > different sizes ICE (VLA vs constant size). I think for this case you >> > want a BIT_FIELD_REF instead of a VIEW_CONVERT_EXPR, >> > selecting the "low" part of the VLA vector. >> Hi Richard, >> Sorry I don't quite follow. In this case, we use VEC_PERM_EXPR to >> represent dup operation >> from fixed width to VLA vector. I am not sure how folding it to >> BIT_FIELD_REF will work. >> Could you please elaborate ? >> >> Also, the issue doesn't seem restricted to this case. >> The following test case also ICE's during forwprop: >> svint32_t foo() >> { >> int32x4_t v =3D (int32x4_t) {1, 2, 3, 4}; >> svint32_t v2 =3D svld1rq_s32 (svptrue_b8 (), &v[0]); >> return v2; >> } >> >> foo2.c: In function =E2=80=98foo=E2=80=99: >> foo2.c:9:1: error: non-trivial conversion in =E2=80=98vector_cst=E2=80=99 >> 9 | } >> | ^ >> svint32_t >> int32x4_t >> v2_4 =3D { 1, 2, 3, 4 }; >> >> because simplify_permutation folds >> VEC_PERM_EXPR< {1, 2, 3, 4}, {1, 2, 3, 4}, {0, 1, 2, 3, ...} > >> into: >> vector_cst {1, 2, 3, 4} >> >> and it complains during verify_gimple_assign_single because we don't >> support assignment of vector_cst to VLA vector. >> >> I guess the issue really is that currently, only VEC_PERM_EXPR >> supports lhs and rhs >> to have vector types with differing lengths, and simplifying it to >> other tree codes, like above, >> will result in type errors ? > > That might be the case - Richard should know. I don't see anything particularly special about VEC_PERM_EXPR here, or about the VLA vs. VLS thing. We would have the same issue trying to build a 128-bit vector from 2 64-bit vectors. And there are other tree codes whose input types are/can be different from their output types. So it just seems like a normal type correctness issue: a VEC_PERM_EXPR of type T needs to be replaced by something of type T. Whether T has a constant size or a variable size doesn't matter. > If so your type check > is still too late, you should instead recognize that we are permuting > a VLA vector and then refuse to go any of the non-VEC_PERM generating > paths - that probably means just allowing the code =3D=3D VEC_PERM_EXPR > case and not any of the CTOR/CST/VIEW_CONVERT_EXPR cases? Yeah. If we're talking about the match.pd code, I think only: (if (sel.series_p (0, 1, 0, 1)) { op0; } (if (sel.series_p (0, 1, nelts, 1)) { op1; } need a type compatibility check. For fold_vec_perm I think we should just rearrange: gcc_assert (known_eq (TYPE_VECTOR_SUBPARTS (type), nelts) && known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg0)), nelts) && known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg1)), nelts)); if (TREE_TYPE (TREE_TYPE (arg0)) !=3D TREE_TYPE (type) || TREE_TYPE (TREE_TYPE (arg1)) !=3D TREE_TYPE (type)) return NULL_TREE; so that the assert comes after the early-out. It would be good at some point to relax fold_vec_perm to cases where the outputs are a different length from the inputs, since the all-constant VEC_PERM_EXPR above could be folded to a VECTOR_CST. Thanks, Richard