Date: Mon, 13 Nov 2023 07:08:54 +0000 (UTC)
From: Richard Biener
To: Andrew Pinski
Cc: Tamar Christina, Prathamesh Kulkarni, "gcc-patches@gcc.gnu.org", nd, "jlaw@ventanamicro.com"
Subject: Re: [PATCH v3 2/2]middle-end match.pd: optimize fneg (fabs (x)) to copysign (x, -1) [PR109154]

On Fri, 10 Nov 2023, Andrew Pinski wrote:

> On Fri, Nov 10, 2023 at 5:12 AM Richard Biener wrote:
> >
> > On Fri, 10 Nov 2023, Tamar Christina wrote:
> >
> > >
> > > Hi Prathamesh,
> > >
> > > Yes Arm requires SIMD for copysign. The testcases fail because they don't turn on Neon.
> > >
> > > I'll update them.
> >
> > On x86_64 with -m32 I see
> >
> > FAIL: gcc.dg/pr55152-2.c scan-tree-dump-times optimized ".COPYSIGN" 1
> > FAIL: gcc.dg/pr55152-2.c scan-tree-dump-times optimized "ABS_EXPR" 1
> > FAIL: gcc.dg/tree-ssa/abs-4.c scan-tree-dump-times optimized "= ABS_EXPR" 1
> > FAIL: gcc.dg/tree-ssa/abs-4.c scan-tree-dump-times optimized "= -" 1
> > FAIL: gcc.dg/tree-ssa/abs-4.c scan-tree-dump-times optimized "= .COPYSIGN" 2
> > FAIL: gcc.dg/tree-ssa/backprop-6.c scan-tree-dump-times backprop "Deleting[^\\\\n]* = -" 4
> > FAIL: gcc.dg/tree-ssa/backprop-6.c scan-tree-dump-times backprop "Deleting[^\\\\n]* = \\\\.COPYSIGN" 2
> > FAIL: gcc.dg/tree-ssa/backprop-6.c scan-tree-dump-times backprop "Deleting[^\\\\n]* = ABS_EXPR <" 1
> > FAIL: gcc.dg/tree-ssa/phi-opt-24.c scan-tree-dump-not phiopt2 "if"
> >
> > maybe add a copysign effective target?
>
> I get the feeling that the internal function for copysign should not
> be a direct internal function for scalar modes and call
> expand_copysign instead when expanding.
> This will fix some if not all of the issues where COPYSIGN is now
> trying to show up.

But then I'd rather have a COPYSIGN_EXPR tree code, leaving internal-fns
to optab mappings. We've discussed this and discarded any of this
as too much work right now. But yes, the situation is a bit messy
(as also discussed).

Richard.

> By the way this is most likely PR 88786 (and PR 112468 and a few
> others).

and PR 58797.
>
> Thanks,
> Andrew
> >
> > >
> > > Regards,
> > > Tamar
> > > ________________________________
> > > From: Prathamesh Kulkarni
> > > Sent: Friday, November 10, 2023 12:24 PM
> > > To: Tamar Christina
> > > Cc: gcc-patches@gcc.gnu.org; nd; rguenther@suse.de; jlaw@ventanamicro.com
> > > Subject: Re: [PATCH v3 2/2]middle-end match.pd: optimize fneg (fabs (x)) to copysign (x, -1) [PR109154]
> > >
> > > On Mon, 6 Nov 2023 at 15:50, Tamar Christina wrote:
> > > >
> > > > Hi All,
> > > >
> > > > This patch transforms fneg (fabs (x)) into copysign (x, -1) which is more
> > > > canonical and allows a target to expand this sequence efficiently. Such
> > > > sequences are common in scientific code working with gradients.
> > > >
> > > > There is an existing canonicalization of copysign (x, -1) to fneg (fabs (x))
> > > > which I remove since this is a less efficient form. The testsuite is also
> > > > updated in light of this.
> > > >
> > > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > > Hi Tamar,
> > > It seems the patch caused following regressions on arm:
> > >
> > > Running gcc:gcc.dg/dg.exp ...
> > > FAIL: gcc.dg/pr55152-2.c scan-tree-dump-times optimized ".COPYSIGN" 1
> > > FAIL: gcc.dg/pr55152-2.c scan-tree-dump-times optimized "ABS_EXPR" 1
> > >
> > > Running gcc:gcc.dg/tree-ssa/tree-ssa.exp ...
> > > FAIL: gcc.dg/tree-ssa/abs-4.c scan-tree-dump-times optimized "= -" 1
> > > FAIL: gcc.dg/tree-ssa/abs-4.c scan-tree-dump-times optimized "= .COPYSIGN" 2
> > > FAIL: gcc.dg/tree-ssa/abs-4.c scan-tree-dump-times optimized "= ABS_EXPR" 1
> > > FAIL: gcc.dg/tree-ssa/backprop-6.c scan-tree-dump-times backprop "Deleting[^\\n]* = -" 4
> > > FAIL: gcc.dg/tree-ssa/backprop-6.c scan-tree-dump-times backprop "Deleting[^\\n]* = ABS_EXPR <" 1
> > > FAIL: gcc.dg/tree-ssa/backprop-6.c scan-tree-dump-times backprop "Deleting[^\\n]* = \\.COPYSIGN" 2
> > > FAIL: gcc.dg/tree-ssa/copy-sign-2.c scan-tree-dump-times optimized ".COPYSIGN" 1
> > > FAIL: gcc.dg/tree-ssa/copy-sign-2.c scan-tree-dump-times optimized "ABS" 1
> > > FAIL: gcc.dg/tree-ssa/mult-abs-2.c scan-tree-dump-times gimple ".COPYSIGN" 4
> > > FAIL: gcc.dg/tree-ssa/mult-abs-2.c scan-tree-dump-times gimple "ABS" 4
> > > FAIL: gcc.dg/tree-ssa/phi-opt-24.c scan-tree-dump-not phiopt2 "if"
> > > Link to log files:
> > > https://ci.linaro.org/job/tcwg_gcc_check--master-arm-build/1240/artifact/artifacts/00-sumfiles/
> > >
> > > Even for following test-case:
> > > double g (double a)
> > > {
> > >   double t1 = fabs (a);
> > >   double t2 = -t1;
> > >   return t2;
> > > }
> > >
> > > It seems, the pattern gets applied but doesn't get eventually
> > > simplified to copysign(a, -1).
> > > forwprop dump shows:
> > > Applying pattern match.pd:1131, gimple-match-4.cc:4134
> > > double g (double a)
> > > {
> > >   double t2;
> > >   double t1;
> > >
> > >   :
> > >   t1_2 = ABS_EXPR ;
> > >   t2_3 = -t1_2;
> > >   return t2_3;
> > >
> > > }
> > >
> > > while on x86_64:
> > > Applying pattern match.pd:1131, gimple-match-4.cc:4134
> > > gimple_simplified to t2_3 = .COPYSIGN (a_1(D), -1.0e+0);
> > > Removing dead stmt:t1_2 = ABS_EXPR ;
> > > double g (double a)
> > > {
> > >   double t2;
> > >   double t1;
> > >
> > >   :
> > >   t2_3 = .COPYSIGN (a_1(D), -1.0e+0);
> > >   return t2_3;
> > >
> > > }
> > >
> > > Thanks,
> > > Prathamesh
> > >
> > >
> > > >
> > > > Ok for master?
> > > >
> > > > Thanks,
> > > > Tamar
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > >         PR tree-optimization/109154
> > > >         * match.pd: Add new neg+abs rule, remove inverse copysign rule.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > >         PR tree-optimization/109154
> > > >         * gcc.dg/fold-copysign-1.c: Updated.
> > > >         * gcc.dg/pr55152-2.c: Updated.
> > > >         * gcc.dg/tree-ssa/abs-4.c: Updated.
> > > >         * gcc.dg/tree-ssa/backprop-6.c: Updated.
> > > >         * gcc.dg/tree-ssa/copy-sign-2.c: Updated.
> > > >         * gcc.dg/tree-ssa/mult-abs-2.c: Updated.
> > > >         * gcc.target/aarch64/fneg-abs_1.c: New test.
> > > >         * gcc.target/aarch64/fneg-abs_2.c: New test.
> > > >         * gcc.target/aarch64/fneg-abs_3.c: New test.
> > > >         * gcc.target/aarch64/fneg-abs_4.c: New test.
> > > >         * gcc.target/aarch64/sve/fneg-abs_1.c: New test.
> > > >         * gcc.target/aarch64/sve/fneg-abs_2.c: New test.
> > > >         * gcc.target/aarch64/sve/fneg-abs_3.c: New test.
> > > >         * gcc.target/aarch64/sve/fneg-abs_4.c: New test.
> > > >
> > > > --- inline copy of patch --
> > > > diff --git a/gcc/match.pd b/gcc/match.pd
> > > > index db95931df0672cf4ef08cca36085c3aa6831519e..7a023d510c283c43a87b1795a74761b8af979b53 100644
> > > > --- a/gcc/match.pd
> > > > +++ b/gcc/match.pd
> > > > @@ -1106,13 +1106,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > > >     (hypots @0 (copysigns @1 @2))
> > > >     (hypots @0 @1))))
> > > >
> > > > -/* copysign(x, CST) -> [-]abs (x). */
> > > > -(for copysigns (COPYSIGN_ALL)
> > > > - (simplify
> > > > -  (copysigns @0 REAL_CST@1)
> > > > -  (if (REAL_VALUE_NEGATIVE (TREE_REAL_CST (@1)))
> > > > -   (negate (abs @0))
> > > > -   (abs @0))))
> > > > +/* Transform fneg (fabs (X)) -> copysign (X, -1). */
> > > > +
> > > > +(simplify
> > > > + (negate (abs @0))
> > > > + (IFN_COPYSIGN @0 { build_minus_one_cst (type); }))
> > > >
> > > >  /* copysign(copysign(x, y), z) -> copysign(x, z). */
> > > >  (for copysigns (COPYSIGN_ALL)
> > > > diff --git a/gcc/testsuite/gcc.dg/fold-copysign-1.c b/gcc/testsuite/gcc.dg/fold-copysign-1.c
> > > > index f17d65c24ee4dca9867827d040fe0a404c515e7b..f9cafd14ab05f5e8ab2f6f68e62801d21c2df6a6 100644
> > > > --- a/gcc/testsuite/gcc.dg/fold-copysign-1.c
> > > > +++ b/gcc/testsuite/gcc.dg/fold-copysign-1.c
> > > > @@ -12,5 +12,5 @@ double bar (double x)
> > > >    return __builtin_copysign (x, minuszero);
> > > >  }
> > > >
> > > > -/* { dg-final { scan-tree-dump-times "= -" 1 "cddce1" } } */
> > > > -/* { dg-final { scan-tree-dump-times "= ABS_EXPR" 2 "cddce1" } } */
> > > > +/* { dg-final { scan-tree-dump-times "__builtin_copysign" 1 "cddce1" } } */
> > > > +/* { dg-final { scan-tree-dump-times "= ABS_EXPR" 1 "cddce1" } } */
> > > > diff --git a/gcc/testsuite/gcc.dg/pr55152-2.c b/gcc/testsuite/gcc.dg/pr55152-2.c
> > > > index 54db0f2062da105a829d6690ac8ed9891fe2b588..605f202ed6bc7aa8fe921457b02ff0b88cc63ce6 100644
> > > > --- a/gcc/testsuite/gcc.dg/pr55152-2.c
> > > > +++ b/gcc/testsuite/gcc.dg/pr55152-2.c
> > > > @@ -10,4 +10,5 @@ int f(int a)
> > > >    return (a<-a)?a:-a;
> > > >  }
> > > >
> > > > -/* { dg-final { scan-tree-dump-times "ABS_EXPR" 2 "optimized" } } */
> > > > +/* { dg-final { scan-tree-dump-times "\.COPYSIGN" 1 "optimized" } } */
> > > > +/* { dg-final { scan-tree-dump-times "ABS_EXPR" 1 "optimized" } } */
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c b/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c
> > > > index 6197519faf7b55aed7bc162cd0a14dd2145210ca..e1b825f37f69ac3c4666b3a52d733368805ad31d 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c
> > > > @@ -9,5 +9,6 @@ long double abs_ld(long double x) { return __builtin_signbit(x) ? x : -x; }
> > > >
> > > >  /* __builtin_signbit(x) ? x : -x. Should be convert into - ABS_EXP */
> > > >  /* { dg-final { scan-tree-dump-not "signbit" "optimized"} } */
> > > > -/* { dg-final { scan-tree-dump-times "= ABS_EXPR" 3 "optimized"} } */
> > > > -/* { dg-final { scan-tree-dump-times "= -" 3 "optimized"} } */
> > > > +/* { dg-final { scan-tree-dump-times "= ABS_EXPR" 1 "optimized"} } */
> > > > +/* { dg-final { scan-tree-dump-times "= -" 1 "optimized"} } */
> > > > +/* { dg-final { scan-tree-dump-times "= \.COPYSIGN" 2 "optimized"} } */
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c b/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c
> > > > index 31f05716f1498dc709cac95fa20fb5796642c77e..c3a138642d6ff7be984e91fa1343cb2718db7ae1 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c
> > > > @@ -26,5 +26,6 @@ TEST_FUNCTION (float, f)
> > > >  TEST_FUNCTION (double, )
> > > >  TEST_FUNCTION (long double, l)
> > > >
> > > > -/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = -} 6 "backprop" } } */
> > > > -/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = ABS_EXPR <} 3 "backprop" } } */
> > > > +/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = -} 4 "backprop" } } */
> > > > +/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = \.COPYSIGN} 2 "backprop" } } */
> > > > +/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = ABS_EXPR <} 1 "backprop" } } */
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/copy-sign-2.c b/gcc/testsuite/gcc.dg/tree-ssa/copy-sign-2.c
> > > > index de52c5f7c8062958353d91f5031193defc9f3f91..e5d565c4b9832c00106588ef411fbd8c292a5cad 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/copy-sign-2.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/copy-sign-2.c
> > > > @@ -10,4 +10,5 @@ float f1(float x)
> > > >    float t = __builtin_copysignf (1.0f, -x);
> > > >    return x * t;
> > > >  }
> > > > -/* { dg-final { scan-tree-dump-times "ABS" 2 "optimized"} } */
> > > > +/* { dg-final { scan-tree-dump-times "ABS" 1 "optimized"} } */
> > > > +/* { dg-final { scan-tree-dump-times ".COPYSIGN" 1 "optimized"} } */
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/mult-abs-2.c b/gcc/testsuite/gcc.dg/tree-ssa/mult-abs-2.c
> > > > index a41f1baf25669a4fd301a586a49ba5e3c5b966ab..a22896b21c8b5a4d5d8e28bd8ae0db896e63ade0 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/mult-abs-2.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/mult-abs-2.c
> > > > @@ -34,4 +34,5 @@ float i1(float x)
> > > >  {
> > > >    return x * (x <= 0.f ? 1.f : -1.f);
> > > >  }
> > > > -/* { dg-final { scan-tree-dump-times "ABS" 8 "gimple"} } */
> > > > +/* { dg-final { scan-tree-dump-times "ABS" 4 "gimple"} } */
> > > > +/* { dg-final { scan-tree-dump-times "\.COPYSIGN" 4 "gimple"} } */
> > > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..f823013c3ddf6b3a266c3abfcbf2642fc2a75fa6
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c
> > > > @@ -0,0 +1,39 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O3" } */
> > > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > > > +
> > > > +#pragma GCC target "+nosve"
> > > > +
> > > > +#include
> > > > +
> > > > +/*
> > > > +** t1:
> > > > +** orr v[0-9]+.2s, #128, lsl #24
> > > > +** ret
> > > > +*/
> > > > +float32x2_t t1 (float32x2_t a)
> > > > +{
> > > > +  return vneg_f32 (vabs_f32 (a));
> > > > +}
> > > > +
> > > > +/*
> > > > +** t2:
> > > > +** orr v[0-9]+.4s, #128, lsl #24
> > > > +** ret
> > > > +*/
> > > > +float32x4_t t2 (float32x4_t a)
> > > > +{
> > > > +  return vnegq_f32 (vabsq_f32 (a));
> > > > +}
> > > > +
> > > > +/*
> > > > +** t3:
> > > > +** adrp x0, .LC[0-9]+
> > > > +** ldr q[0-9]+, \[x0, #:lo12:.LC0\]
> > > > +** orr v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b
> > > > +** ret
> > > > +*/
> > > > +float64x2_t t3 (float64x2_t a)
> > > > +{
> > > > +  return vnegq_f64 (vabsq_f64 (a));
> > > > +}
> > > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..141121176b309e4b2aa413dc55271a6e3c93d5e1
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c
> > > > @@ -0,0 +1,31 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O3" } */
> > > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > > > +
> > > > +#pragma GCC target "+nosve"
> > > > +
> > > > +#include
> > > > +#include
> > > > +
> > > > +/*
> > > > +** f1:
> > > > +** movi v[0-9]+.2s, 0x80, lsl 24
> > > > +** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> > > > +** ret
> > > > +*/
> > > > +float32_t f1 (float32_t a)
> > > > +{
> > > > +  return -fabsf (a);
> > > > +}
> > > > +
> > > > +/*
> > > > +** f2:
> > > > +** mov x0, -9223372036854775808
> > > > +** fmov d[0-9]+, x0
> > > > +** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> > > > +** ret
> > > > +*/
> > > > +float64_t f2 (float64_t a)
> > > > +{
> > > > +  return -fabs (a);
> > > > +}
> > > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..b4652173a95d104ddfa70c497f0627a61ea89d3b
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c
> > > > @@ -0,0 +1,36 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O3" } */
> > > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > > > +
> > > > +#pragma GCC target "+nosve"
> > > > +
> > > > +#include
> > > > +#include
> > > > +
> > > > +/*
> > > > +** f1:
> > > > +** ...
> > > > +** ldr q[0-9]+, \[x0\]
> > > > +** orr v[0-9]+.4s, #128, lsl #24
> > > > +** str q[0-9]+, \[x0\], 16
> > > > +** ...
> > > > +*/
> > > > +void f1 (float32_t *a, int n)
> > > > +{
> > > > +  for (int i = 0; i < (n & -8); i++)
> > > > +    a[i] = -fabsf (a[i]);
> > > > +}
> > > > +
> > > > +/*
> > > > +** f2:
> > > > +** ...
> > > > +** ldr q[0-9]+, \[x0\]
> > > > +** orr v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b
> > > > +** str q[0-9]+, \[x0\], 16
> > > > +** ...
> > > > +*/
> > > > +void f2 (float64_t *a, int n)
> > > > +{
> > > > +  for (int i = 0; i < (n & -8); i++)
> > > > +    a[i] = -fabs (a[i]);
> > > > +}
> > > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..10879dea74462d34b26160eeb0bd54ead063166b
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c
> > > > @@ -0,0 +1,39 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O3" } */
> > > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > > > +
> > > > +#pragma GCC target "+nosve"
> > > > +
> > > > +#include
> > > > +
> > > > +/*
> > > > +** negabs:
> > > > +** mov x0, -9223372036854775808
> > > > +** fmov d[0-9]+, x0
> > > > +** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> > > > +** ret
> > > > +*/
> > > > +double negabs (double x)
> > > > +{
> > > > +  unsigned long long y;
> > > > +  memcpy (&y, &x, sizeof(double));
> > > > +  y = y | (1UL << 63);
> > > > +  memcpy (&x, &y, sizeof(double));
> > > > +  return x;
> > > > +}
> > > > +
> > > > +/*
> > > > +** negabsf:
> > > > +** movi v[0-9]+.2s, 0x80, lsl 24
> > > > +** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> > > > +** ret
> > > > +*/
> > > > +float negabsf (float x)
> > > > +{
> > > > +  unsigned int y;
> > > > +  memcpy (&y, &x, sizeof(float));
> > > > +  y = y | (1U << 31);
> > > > +  memcpy (&x, &y, sizeof(float));
> > > > +  return x;
> > > > +}
> > > > +
> > > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..0c7664e6de77a497682952653ffd417453854d52
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c
> > > > @@ -0,0 +1,37 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O3" } */
> > > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > > > +
> > > > +#include
> > > > +
> > > > +/*
> > > > +** t1:
> > > > +** orr v[0-9]+.2s, #128, lsl #24
> > > > +** ret
> > > > +*/
> > > > +float32x2_t t1 (float32x2_t a)
> > > > +{
> > > > +  return vneg_f32 (vabs_f32 (a));
> > > > +}
> > > > +
> > > > +/*
> > > > +** t2:
> > > > +** orr v[0-9]+.4s, #128, lsl #24
> > > > +** ret
> > > > +*/
> > > > +float32x4_t t2 (float32x4_t a)
> > > > +{
> > > > +  return vnegq_f32 (vabsq_f32 (a));
> > > > +}
> > > > +
> > > > +/*
> > > > +** t3:
> > > > +** adrp x0, .LC[0-9]+
> > > > +** ldr q[0-9]+, \[x0, #:lo12:.LC0\]
> > > > +** orr v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b
> > > > +** ret
> > > > +*/
> > > > +float64x2_t t3 (float64x2_t a)
> > > > +{
> > > > +  return vnegq_f64 (vabsq_f64 (a));
> > > > +}
> > > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..a60cd31b9294af2dac69eed1c93f899bd5c78fca
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c
> > > > @@ -0,0 +1,29 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O3" } */
> > > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > > > +
> > > > +#include
> > > > +#include
> > > > +
> > > > +/*
> > > > +** f1:
> > > > +** movi v[0-9]+.2s, 0x80, lsl 24
> > > > +** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> > > > +** ret
> > > > +*/
> > > > +float32_t f1 (float32_t a)
> > > > +{
> > > > +  return -fabsf (a);
> > > > +}
> > > > +
> > > > +/*
> > > > +** f2:
> > > > +** mov x0, -9223372036854775808
> > > > +** fmov d[0-9]+, x0
> > > > +** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> > > > +** ret
> > > > +*/
> > > > +float64_t f2 (float64_t a)
> > > > +{
> > > > +  return -fabs (a);
> > > > +}
> > > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..1bf34328d8841de8e6b0a5458562a9f00e31c275
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c
> > > > @@ -0,0 +1,34 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O3" } */
> > > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > > > +
> > > > +#include
> > > > +#include
> > > > +
> > > > +/*
> > > > +** f1:
> > > > +** ...
> > > > +** ld1w z[0-9]+.s, p[0-9]+/z, \[x0, x2, lsl 2\]
> > > > +** orr z[0-9]+.s, z[0-9]+.s, #0x80000000
> > > > +** st1w z[0-9]+.s, p[0-9]+, \[x0, x2, lsl 2\]
> > > > +** ...
> > > > +*/
> > > > +void f1 (float32_t *a, int n)
> > > > +{
> > > > +  for (int i = 0; i < (n & -8); i++)
> > > > +    a[i] = -fabsf (a[i]);
> > > > +}
> > > > +
> > > > +/*
> > > > +** f2:
> > > > +** ...
> > > > +** ld1d z[0-9]+.d, p[0-9]+/z, \[x0, x2, lsl 3\]
> > > > +** orr z[0-9]+.d, z[0-9]+.d, #0x8000000000000000
> > > > +** st1d z[0-9]+.d, p[0-9]+, \[x0, x2, lsl 3\]
> > > > +** ...
> > > > +*/
> > > > +void f2 (float64_t *a, int n)
> > > > +{
> > > > +  for (int i = 0; i < (n & -8); i++)
> > > > +    a[i] = -fabs (a[i]);
> > > > +}
> > > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..21f2a8da2a5d44e3d01f6604ca7be87e3744d494
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c
> > > > @@ -0,0 +1,37 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O3" } */
> > > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > > > +
> > > > +#include
> > > > +
> > > > +/*
> > > > +** negabs:
> > > > +** mov x0, -9223372036854775808
> > > > +** fmov d[0-9]+, x0
> > > > +** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> > > > +** ret
> > > > +*/
> > > > +double negabs (double x)
> > > > +{
> > > > +  unsigned long long y;
> > > > +  memcpy (&y, &x, sizeof(double));
> > > > +  y = y | (1UL << 63);
> > > > +  memcpy (&x, &y, sizeof(double));
> > > > +  return x;
> > > > +}
> > > > +
> > > > +/*
> > > > +** negabsf:
> > > > +** movi v[0-9]+.2s, 0x80, lsl 24
> > > > +** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> > > > +** ret
> > > > +*/
> > > > +float negabsf (float x)
> > > > +{
> > > > +  unsigned int y;
> > > > +  memcpy (&y, &x, sizeof(float));
> > > > +  y = y | (1U << 31);
> > > > +  memcpy (&x, &y, sizeof(float));
> > > > +  return x;
> > > > +}
> > > > +
> > > >
> > > >
> > > >
> > > >
> > > > --
> > >
> >
> > --
> > Richard Biener
> > SUSE Software Solutions Germany GmbH,
> > Frankenstrasse 146, 90461 Nuernberg, Germany;
> > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
>

--
Richard Biener
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
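
For readers following the thread, the identity the quoted match.pd rule relies on, fneg (fabs (x)) == copysign (x, -1), can be checked at the source level with a small C sketch. This is not part of the patch or the testsuite. The helper names (bits_of, neg_abs, copysign_minus_one, same_bits) are hypothetical, and the sketch assumes an IEEE-754 double with the usual sign-bit semantics for fabs, unary minus and copysign.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: raw bit pattern of a double, so that -0.0 and
   +0.0 compare as different values.  Assumes a 64-bit double.  */
static uint64_t bits_of (double x)
{
  uint64_t u;
  memcpy (&u, &x, sizeof u);
  return u;
}

/* The form the quoted rule matches: negate (abs x).  */
static double neg_abs (double x)
{
  return -fabs (x);
}

/* The form the quoted rule rewrites it to: copysign (x, -1).  */
static double copysign_minus_one (double x)
{
  return copysign (x, -1.0);
}

/* Hypothetical helper: nonzero when both forms produce the same bits.  */
static int same_bits (double x)
{
  return bits_of (neg_abs (x)) == bits_of (copysign_minus_one (x));
}

/* Hypothetical driver: checks a few representative inputs.  */
static void check_examples (void)
{
  assert (same_bits (0.0));
  assert (same_bits (-0.0));
  assert (same_bits (1.5));
  assert (same_bits (-2.25));
  assert (same_bits (INFINITY));
  assert (same_bits (-INFINITY));
}
```

Calling check_examples () from a main function exercises both forms. Whether the compiler then emits .COPYSIGN or ABS_EXPR plus a negate in its dumps depends on the target, which is the point of the failures discussed above.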