From mboxrd@z Thu Jan 1 00:00:00 1970
From: Richard Sandiford
To: Tamar Christina
Cc: "gcc-patches@gcc.gnu.org", nd, Richard Earnshaw, Marcus Shawcroft, Kyrylo Tkachov
Subject: Re: [PATCH]AArch64: Use SVE unpredicated LOGICAL expressions when Advanced SIMD
 inefficient [PR109154]
Date: Thu, 09 Nov 2023 10:39:00 +0000
In-Reply-To: (Tamar Christina's message of "Wed, 8 Nov 2023 14:21:25 +0000")

Tamar Christina writes:
>> >> > +  "&& TARGET_SVE && rtx_equal_p (operands[0], operands[1])
>> >> > +   && satisfies_constraint_ (operands[2])
>> >> > +   && FP_REGNUM_P (REGNO (operands[0]))"
>> >> > +  [(const_int 0)]
>> >> > +  {
>> >> > +    rtx op1 = lowpart_subreg (mode, operands[1],
>> mode);
>> >> > +    rtx op2 = gen_const_vec_duplicate (mode, operands[2]);
>> >> > +    emit_insn (gen_3 (op1, op1, op2));
>> >> > +    DONE;
>> >> > +  }
>> >> > )
>> >>
>> >> The WIP SME patches add a %Z modifier for 'z' register prefixes,
>> >> similarly to b/h/s/d for scalar FP. With that I think the alternative can be:
>> >>
>> >> [w , 0 , ; * , sve ] \t%Z0., %Z0., #%2
>> >>
>> >> although it would be nice to keep the hex constant.
>> >
>> > My original patch added a %u for (undecorated) which just prints the
>> > register number and changed %C to also accept a single constant instead of
>> only a uniform vector.
>>
>> Not saying no to %u in future, but %Z seems more consistent with the current
>> approach. And yeah, I'd also wondered about extending %C.
>> The problem is guessing whether to print a 32-bit, 64-bit or 128-bit constant
>> for negative immediates.
>>
>
> Rebased patch,
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>  PR tree-optimization/109154
>  * config/aarch64/aarch64.md (3): Add SVE case.
>  * config/aarch64/aarch64-simd.md (ior3): Likewise.
>  * config/aarch64/iterators.md (VCONV, vconv): New.
>  * config/aarch64/predicates.md(aarch64_orr_imm_sve_advsimd): New.
>
> gcc/testsuite/ChangeLog:
>
>  PR tree-optimization/109154
>  * gcc.target/aarch64/sve/fneg-abs_1.c: Updated.
>  * gcc.target/aarch64/sve/fneg-abs_2.c: Updated.
>  * gcc.target/aarch64/sve/fneg-abs_4.c: Updated.
>
> --- inline copy of patch --
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
> index 33eceb436584ff73c7271f93639f2246d1af19e0..98c418c54a82a348c597310caa23916f9c16f9b6 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -1219,11 +1219,14 @@ (define_insn "and3"
>  (define_insn "ior3"
>    [(set (match_operand:VDQ_I 0 "register_operand")
>      (ior:VDQ_I (match_operand:VDQ_I 1 "register_operand")
> -      (match_operand:VDQ_I 2 "aarch64_reg_or_orr_imm")))]
> -  "TARGET_SIMD"
> -  {@ [ cons: =0 , 1 , 2 ]
> -     [ w , w , w ] orr\t%0., %1., %2.
> -     [ w , 0 , Do ] << aarch64_output_simd_mov_immediate (operands[2], , AARCH64_CHECK_ORR);
> +      (match_operand:VDQ_I 2 "aarch64_orr_imm_sve_advsimd")))]
> +  "TARGET_SIMD"
> +  {@ [ cons: =0 , 1 , 2; attrs: arch ]
> +     [ w , w , w ; simd ] orr\t%0., %1., %2.
> +     [ w , 0 , vsl; sve ] orr\t%Z0., %Z0., #%2
> +     [ w , 0 , Do ; simd ] \
> +       << aarch64_output_simd_mov_immediate (operands[2], , \
> +          AARCH64_CHECK_ORR);
>    }
>    [(set_attr "type" "neon_logic")]
>  )
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 4fcd71a2e9d1e8c35f35593255c4f66a68856a79..c6b1506fe7b47dd40741f26ef0cc92692008a631 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -4599,7 +4599,8 @@ (define_insn "3"
>    ""
>    {@ [ cons: =0 , 1 , 2 ; attrs: type , arch ]
>       [ r , %r , r ; logic_reg , * ] \t%0, %1, %2
> -     [ rk , r , ; logic_imm , * ] \t%0, %1, %2
> +     [ rk , ^r , ; logic_imm , * ] \t%0, %1, %2
> +     [ w , 0 , ; * , sve ] \t%Z0., %Z0., #%2
>       [ w , w , w ; neon_logic , simd ] \t%0., %1., %2.
>    }
>  )
> diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
> index 1593a8fd04f91259295d0e393cbc7973daf7bf73..d24109b4fe6a867125b9474d34d616155bc36b3f 100644
> --- a/gcc/config/aarch64/iterators.md
> +++ b/gcc/config/aarch64/iterators.md
> @@ -1435,6 +1435,19 @@ (define_mode_attr VCONQ [(V8QI "V16QI") (V16QI "V16QI")
>     (HI "V8HI") (QI "V16QI")
>     (SF "V4SF") (DF "V2DF")])
>
> +;; 128-bit container modes for the lower part of an SVE vector to the inner or
> +;; neon source mode.
> +(define_mode_attr VCONV [(SI "VNx4SI") (DI "VNx2DI")
> +   (V8QI "VNx16QI") (V16QI "VNx16QI")
> +   (V4HI "VNx8HI") (V8HI "VNx8HI")
> +   (V2SI "VNx4SI") (V4SI "VNx4SI")
> +   (V2DI "VNx2DI")])
> +(define_mode_attr vconv [(SI "vnx4si") (DI "vnx2di")
> +   (V8QI "vnx16qi") (V16QI "vnx16qi")
> +   (V4HI "vnx8hi") (V8HI "vnx8hi")
> +   (V2SI "vnx4si") (V4SI "vnx4si")
> +   (V2DI "vnx2di")])
> +
>  ;; Half modes of all vector modes.
>  (define_mode_attr VHALF [(V8QI "V4QI") (V16QI "V8QI")
>     (V4HI "V2HI") (V8HI "V4HI")

These attributes aren't needed any more (at least, not by this patch).

OK for trunk with those removed.

Thanks,
Richard

> diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
> index 01de47439744acb3708c645b98eaa607294a1f1f..a73724a7fc05636d4c0643a291f40f2609564778 100644
> --- a/gcc/config/aarch64/predicates.md
> +++ b/gcc/config/aarch64/predicates.md
> @@ -871,6 +871,11 @@ (define_predicate "aarch64_sve_logical_operand"
>    (ior (match_operand 0 "register_operand")
>         (match_operand 0 "aarch64_sve_logical_immediate")))
>
> +(define_predicate "aarch64_orr_imm_sve_advsimd"
> +  (ior (match_operand 0 "aarch64_reg_or_orr_imm")
> +       (and (match_test "TARGET_SVE")
> +            (match_operand 0 "aarch64_sve_logical_operand"))))
> +
>  (define_predicate "aarch64_sve_gather_offset_b"
>    (ior (match_operand 0 "register_operand")
>         (match_operand 0 "aarch64_sve_gather_immediate_b")))
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c
> index 0c7664e6de77a497682952653ffd417453854d52..a8b27199ff83d0eebadfc7dcf03f94e1229d76b8 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c
> @@ -6,7 +6,7 @@
>
>  /*
>  ** t1:
> -** orr v[0-9]+.2s, #128, lsl #24
> +** orr z[0-9]+.s, z[0-9]+.s, #-2147483648
>  ** ret
>  */
>  float32x2_t t1 (float32x2_t a)
> @@ -16,7 +16,7 @@ float32x2_t t1 (float32x2_t a)
>
>  /*
>  ** t2:
> -** orr v[0-9]+.4s, #128, lsl #24
> +** orr z[0-9]+.s, z[0-9]+.s, #-2147483648
>  ** ret
>  */
>  float32x4_t t2 (float32x4_t a)
> @@ -26,9 +26,7 @@ float32x4_t t2 (float32x4_t a)
>
>  /*
>  ** t3:
> -** adrp x0, .LC[0-9]+
> -** ldr q[0-9]+, \[x0, #:lo12:.LC0\]
> -** orr v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b
> +** orr z[0-9]+.d, z[0-9]+.d, #-9223372036854775808
>  ** ret
>  */
>  float64x2_t t3 (float64x2_t a)
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c
> index a60cd31b9294af2dac69eed1c93f899bd5c78fca..19a7695e605bc8aced486a9c450d1cdc6be4691a 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c
> @@ -7,8 +7,7 @@
>
>  /*
>  ** f1:
> -** movi v[0-9]+.2s, 0x80, lsl 24
> -** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> +** orr z0.s, z0.s, #-2147483648
>  ** ret
>  */
>  float32_t f1 (float32_t a)
> @@ -18,9 +17,7 @@ float32_t f1 (float32_t a)
>
>  /*
>  ** f2:
> -** mov x0, -9223372036854775808
> -** fmov d[0-9]+, x0
> -** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> +** orr z0.d, z0.d, #-9223372036854775808
>  ** ret
>  */
>  float64_t f2 (float64_t a)
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c
> index 21f2a8da2a5d44e3d01f6604ca7be87e3744d494..663d5fe17e091d128313b6b8b8dc918a01a96c4f 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c
> @@ -6,9 +6,7 @@
>
>  /*
>  ** negabs:
> -** mov x0, -9223372036854775808
> -** fmov d[0-9]+, x0
> -** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> +** orr z0.d, z0.d, #-9223372036854775808
>  ** ret
>  */
>  double negabs (double x)
> @@ -22,8 +20,7 @@ double negabs (double x)
>
>  /*
>  ** negabsf:
> -** movi v[0-9]+.2s, 0x80, lsl 24
> -** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> +** orr z0.s, z0.s, #-2147483648
>  ** ret
>  */
>  float negabsf (float x)