From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 8EEE03844011 for ; Wed, 16 Dec 2020 12:49:35 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 8EEE03844011 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 41E891FB; Wed, 16 Dec 2020 04:49:35 -0800 (PST) Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.126]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 58A263F66E; Wed, 16 Dec 2020 04:49:34 -0800 (PST) From: Richard Sandiford To: Przemyslaw Wirkus Mail-Followup-To: Przemyslaw Wirkus , "gcc-patches\@gcc.gnu.org" , Richard Earnshaw , Kyrylo Tkachov , Marcus Shawcroft , richard.sandiford@arm.com Cc: "gcc-patches\@gcc.gnu.org" , Richard Earnshaw , Kyrylo Tkachov , Marcus Shawcroft Subject: Re: [PATCH][GCC][PR target/98177] aarch64: SVE: ICE in expand_direct_optab_fn References: Date: Wed, 16 Dec 2020 12:49:32 +0000 In-Reply-To: (Przemyslaw Wirkus's message of "Wed, 16 Dec 2020 11:55:44 +0000") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-12.4 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Dec 2020 12:49:37 -0000 Przemyslaw Wirkus writes: > > This is a bug in the vectoriser: the vectoriser shouldn't generate > > IFN_REDUC_MAX calls that the target doesn't support. > >=20 > > I think the problem comes from using the wrong interface to get the ind= ex > > type for a COND_REDUCTION. vectorizable_reduction has: > >=20 > > cr_index_vector_type =3D build_vector_type (cr_index_scalar_type, > > nunits_out); > >=20 > > which means that for fixed-length SVE we get a V2SI (a 64-bit Advanced = SIMD > > vector) instead of a VNx2SI (an SVE vector that stores SI elements in DI > > containers). It should be using: > >=20 > > cr_index_vector_type =3D get_same_sized_vectype (cr_index_scalar_= type, > > vectype_out); > >=20 > > instead. Same idea for the build_vector_type call in > > vect_create_epilog_for_reduction. Note that for this last bit I meant: tree vectype_unsigned =3D build_vector_type (scalar_type_unsigned, TYPE_VECTOR_SUBPARTS (vectype)); which should become: tree vectype_unsigned =3D get_same_sized_vectype (scalar_type_unsigne= d, vectype); This is the =E2=80=9Ctransform=E2=80=9D code that partners the =E2=80=9Cana= lysis=E2=80=9D code that you're patching. Changing one but not the other would cause problems if (say) the Advanced SIMD REDUC_MAX patterns were disabled. We'd then correctly pick an SVE mode like VNx4SI when doing the analysis, but generate an unsupported V4SI REDUC_MAX in vect_create_epilog_for_reduction. That in turn would trip the kind of expand-time assert that was reported in the PR, just for a different case. It's better for the modes to match up anyway: we should use a VNx4SI reduction when operating on SVE and a V4SI reducation when operating on Advanced SIMD. This is particularly true for big endian, where mixing SVE and Advanced SIMD can involve a permute. > diff --git a/gcc/testsuite/g++.target/aarch64/pr98177-1.C b/gcc/testsuite= /g++.target/aarch64/pr98177-1.C > new file mode 100644 > index 0000000000000000000000000000000000000000..a776b7352f966f6b1d870ed51= a7c94647bc46d80 > --- /dev/null > +++ b/gcc/testsuite/g++.target/aarch64/pr98177-1.C > @@ -0,0 +1,10 @@ > +/* { dg-do compile } */ > +/* { dg-options "-Ofast -march=3Darmv8.2-a+sve -msve-vector-bits=3D128" = } */ > + > +int a, b; > +short c; > +void d(long e) { > + for (int f =3D 0; f < b; f +=3D 1) > + for (short g =3D 0; g < c; g +=3D 5) > + a =3D (short)e; > +} It'd be better to put these g++.target/aarch64/sve and drop the -march option. That way we'll test with the user's specified -march or -mcpu if that -march/-mcpu already supports SVE. Same idea for the other tests (including the C ones). OK for trunk with those changes, thanks. Richard