From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 69A2B385840F for ; Thu, 4 Nov 2021 17:48:37 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 69A2B385840F Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 075771063; Thu, 4 Nov 2021 10:48:37 -0700 (PDT) Received: from localhost (unknown [10.32.98.88]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 68C2B3F719; Thu, 4 Nov 2021 10:48:36 -0700 (PDT) From: Richard Sandiford To: "Andre Vieira \(lists\)" Mail-Followup-To: "Andre Vieira \(lists\)" , Kyrylo Tkachov , "gcc-patches\@gcc.gnu.org" , richard.sandiford@arm.com Cc: Kyrylo Tkachov , "gcc-patches\@gcc.gnu.org" Subject: Re: [AArch64] Fix NEON load/store gimple lowering and big-endian testisms References: <0911f6c6-7832-026c-36c1-cbb576944a62@arm.com> Date: Thu, 04 Nov 2021 17:48:35 +0000 In-Reply-To: <0911f6c6-7832-026c-36c1-cbb576944a62@arm.com> (Andre Vieira's message of "Thu, 4 Nov 2021 09:55:37 +0000") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, BODY_8BITS, GIT_PATCH_0, KAM_DMARC_STATUS, KAM_LOTSOFHASH, KAM_SHORT, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 04 Nov 2021 17:48:39 -0000 "Andre Vieira (lists)" writes: > Hi, > > This should address the ubsan bootstrap build and big-endian testisms=20 > reported against the last NEON load/store gimple lowering patch. I also=20 > fixed a follow-up issue where the alias information was leading to a bad= =20 > codegen transformation. The NEON intrinsics specifications do not forbid= =20 > the use of memory accesses with different pointer types. In fact you=20 > will see intrinsic user code loading a int16x8_t vector from an int=20 > pointer, so we must make sure GCC is aware a NEON memory access of an=20 > 'int' pointer can alias with a 'short' pointer. > > Bootstrapped aarch64-linux-gnu (also did an ubsan bootstrap). > > Is this OK for trunk? > > gcc/ChangeLog: > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * config/aarch64/aarch64-buil= tins.c=20 > (aarch64_general_gimple_fold_builtin): Change pointer alignment and alias. > > gcc/testsuite/ChangeLog: > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * gcc.target/aarch64/fmla_int= rinsic_1.c: Fix big-endian testism. > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * gcc.target/aarch64/fmls_int= rinsic_1.c: Likewise. > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * gcc.target/aarch64/fmul_int= rinsic_1.c: Likewise. > > diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/a= arch64-builtins.c > index a815e4cfbccab692ca688ba87c71b06c304abbfb..fc8fcb02c55e22963d2a3bf77= b4749eb5b1c1561 100644 > --- a/gcc/config/aarch64/aarch64-builtins.c > +++ b/gcc/config/aarch64/aarch64-builtins.c > @@ -2486,16 +2486,22 @@ aarch64_general_gimple_fold_builtin (unsigned int= fcode, gcall *stmt, > aarch64_simd_type_info simd_type > =3D aarch64_simd_types[mem_type]; > tree elt_ptr_type =3D build_pointer_type (simd_type.eltype); > + elt_ptr_type =3D build_distinct_type_copy (elt_ptr_type); > + TYPE_REF_CAN_ALIAS_ALL (elt_ptr_type) =3D true; > tree zero =3D build_zero_cst (elt_ptr_type); > gimple_seq stmts =3D NULL; > tree base =3D gimple_convert (&stmts, elt_ptr_type, > args[0]); This conversion seems redundant. Do things work if we use args[0] directly? > + /* Use element type alignment. */ > + tree access_type > + =3D build_aligned_type (simd_type.itype, > + TYPE_ALIGN (TREE_TYPE (simd_type.itype))); I think simd_type.eltype is more natural than TREE_TYPE (simd_type.itype) here, to match the pointer target type. Same idea for the stores. > if (stmts) > gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT); > new_stmt > =3D gimple_build_assign (gimple_get_lhs (stmt), > fold_build2 (MEM_REF, > - simd_type.itype, > + access_type, > base, zero)); > } > break; > @@ -2508,17 +2514,22 @@ aarch64_general_gimple_fold_builtin (unsigned int= fcode, gcall *stmt, > aarch64_simd_type_info simd_type > =3D aarch64_simd_types[mem_type]; > tree elt_ptr_type =3D build_pointer_type (simd_type.eltype); > + elt_ptr_type =3D build_distinct_type_copy (elt_ptr_type); > + TYPE_REF_CAN_ALIAS_ALL (elt_ptr_type) =3D true; > tree zero =3D build_zero_cst (elt_ptr_type); > gimple_seq stmts =3D NULL; > tree base =3D gimple_convert (&stmts, elt_ptr_type, > args[0]); > + /* Use element type alignment. */ > + tree access_type > + =3D build_aligned_type (simd_type.itype, > + TYPE_ALIGN (TREE_TYPE (simd_type.itype))); > if (stmts) > gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT); > new_stmt > - =3D gimple_build_assign (fold_build2 (MEM_REF, > - simd_type.itype, > - base, > - zero), args[1]); > + =3D gimple_build_assign (fold_build2 (MEM_REF, access_type, > + base, zero), > + args[1]); > } > break; >=20=20 > diff --git a/gcc/testsuite/gcc.target/aarch64/fmla_intrinsic_1.c b/gcc/te= stsuite/gcc.target/aarch64/fmla_intrinsic_1.c > index adb787a8599af23847dd62dcd153d7cfe43dacc0..c1aeb06e74753052c2ee441b3= 61b92148f1b4b0a 100644 > --- a/gcc/testsuite/gcc.target/aarch64/fmla_intrinsic_1.c > +++ b/gcc/testsuite/gcc.target/aarch64/fmla_intrinsic_1.c > @@ -107,10 +107,12 @@ main (int argc, char **argv) >=20=20 > /* vfma_lane_f64. > vfma_laneq_f64. */ > -/* { dg-final { scan-assembler-times "fmadd\\td\[0-9\]+\, d\[0-9\]+\, d\= [0-9\]+\, d\[0-9\]+" 2 } } */ > +/* { dg-final { scan-assembler-times "fmadd\\td\[0-9\]+\, d\[0-9\]+\, d\= [0-9\]+\, d\[0-9\]+" 1 { target aarch64_big_endian } } } */ > +/* { dg-final { scan-assembler-times "fmadd\\td\[0-9\]+\, d\[0-9\]+\, d\= [0-9\]+\, d\[0-9\]+" 2 { target aarch64_little_endian } } } */ Could you explain this in more detail? What happens for little-endian vs. big-endian? Thanks, Richard >=20=20 > /* vfmaq_lane_f64. > vfmaq_laneq_f64. */ > -/* { dg-final { scan-assembler-times "fmla\\tv\[0-9\]+\.2d, v\[0-9\]+\.2= d, v\[0-9\]+\.d\\\[\[0-9\]+\\\]" 2 } } */ > +/* { dg-final { scan-assembler-times "fmla\\tv\[0-9\]+\.2d, v\[0-9\]+\.2= d, v\[0-9\]+\.d\\\[\[0-9\]+\\\]" 3 { target aarch64_big_endian } } } */ > +/* { dg-final { scan-assembler-times "fmla\\tv\[0-9\]+\.2d, v\[0-9\]+\.2= d, v\[0-9\]+\.d\\\[\[0-9\]+\\\]" 2 { target aarch64_little_endian } } } */ >=20=20 >=20=20 > diff --git a/gcc/testsuite/gcc.target/aarch64/fmls_intrinsic_1.c b/gcc/te= stsuite/gcc.target/aarch64/fmls_intrinsic_1.c > index 865def28c3f4d04042ab495d232bb865cabb2b50..3137ea91e809e37de589091e9= bbd43bfe4d221a1 100644 > --- a/gcc/testsuite/gcc.target/aarch64/fmls_intrinsic_1.c > +++ b/gcc/testsuite/gcc.target/aarch64/fmls_intrinsic_1.c > @@ -108,10 +108,12 @@ main (int argc, char **argv) >=20=20 > /* vfms_lane_f64. > vfms_laneq_f64. */ > -/* { dg-final { scan-assembler-times "fmsub\\td\[0-9\]+\, d\[0-9\]+\, d\= [0-9\]+\, d\[0-9\]+" 2 } } */ > +/* { dg-final { scan-assembler-times "fmsub\\td\[0-9\]+\, d\[0-9\]+\, d\= [0-9\]+\, d\[0-9\]+" 1 { target aarch64_big_endian } } } */ > +/* { dg-final { scan-assembler-times "fmsub\\td\[0-9\]+\, d\[0-9\]+\, d\= [0-9\]+\, d\[0-9\]+" 2 { target aarch64_little_endian } } } */ >=20=20 > /* vfmsq_lane_f64. > vfmsq_laneq_f64. */ > -/* { dg-final { scan-assembler-times "fmls\\tv\[0-9\]+\.2d, v\[0-9\]+\.2= d, v\[0-9\]+\.d\\\[\[0-9\]+\\\]" 2 } } */ > +/* { dg-final { scan-assembler-times "fmls\\tv\[0-9\]+\.2d, v\[0-9\]+\.2= d, v\[0-9\]+\.d\\\[\[0-9\]+\\\]" 3 { target aarch64_big_endian } } } */ > +/* { dg-final { scan-assembler-times "fmls\\tv\[0-9\]+\.2d, v\[0-9\]+\.2= d, v\[0-9\]+\.d\\\[\[0-9\]+\\\]" 2 { target aarch64_little_endian } } } */ >=20=20 >=20=20 > diff --git a/gcc/testsuite/gcc.target/aarch64/fmul_intrinsic_1.c b/gcc/te= stsuite/gcc.target/aarch64/fmul_intrinsic_1.c > index d01095e81c1e45dc1da998aa337ba551b3752ebe..7d4829c40d7042226f2f09fab= 9fdfa7c3dd211c4 100644 > --- a/gcc/testsuite/gcc.target/aarch64/fmul_intrinsic_1.c > +++ b/gcc/testsuite/gcc.target/aarch64/fmul_intrinsic_1.c > @@ -107,10 +107,12 @@ main (int argc, char **argv) >=20=20 > /* vmul_lane_f64. > Vmul_laneq_f64. */ > -/* { dg-final { scan-assembler-times "fmul\\td\[0-9\]+, d\[0-9\]+, d\[0-= 9\]+" 2 } } */ > +/* { dg-final { scan-assembler-times "fmul\\td\[0-9\]+, d\[0-9\]+, d\[0-= 9\]+" 1 { target aarch64_big_endian } } } */ > +/* { dg-final { scan-assembler-times "fmul\\td\[0-9\]+, d\[0-9\]+, d\[0-= 9\]+" 2 { target aarch64_little_endian } } } */ >=20=20 > /* vmulq_lane_f64. > vmulq_laneq_f64. */ > -/* { dg-final { scan-assembler-times "fmul\\tv\[0-9\]+\.2d, v\[0-9\]+\.2= d, v\[0-9\]+\.d\\\[\[0-9\]+\\\]" 2 } } */ > +/* { dg-final { scan-assembler-times "fmul\\tv\[0-9\]+\.2d, v\[0-9\]+\.2= d, v\[0-9\]+\.d\\\[\[0-9\]+\\\]" 3 { target aarch64_big_endian } } } */ > +/* { dg-final { scan-assembler-times "fmul\\tv\[0-9\]+\.2d, v\[0-9\]+\.2= d, v\[0-9\]+\.d\\\[\[0-9\]+\\\]" 2 { target aarch64_little_endian } } } */ >=20=20 >=20=20