From: Di Zhao OS <dizhao@os.amperecomputing.com>
To: Di Zhao OS <dizhao@os.amperecomputing.com>,
Thomas Schwinge <thomas@codesourcery.com>,
"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Cc: Richard Biener <richard.guenther@gmail.com>
Subject: RE: [PATCH v4] [tree-optimization/110279] Consider FMA in get_reassociation_width
Date: Fri, 22 Dec 2023 15:05:37 +0000
Message-ID: <SN6PR01MB42408DB958BBF389520545ADE894A@SN6PR01MB4240.prod.exchangelabs.com>
In-Reply-To: <SN6PR01MB42401BF525027F4958C97AAEE891A@SN6PR01MB4240.prod.exchangelabs.com>
[-- Attachment #1: Type: text/plain, Size: 6905 bytes --]
I have updated the fix; see the attached patch.
Is it OK for trunk?
Tested on aarch64-unknown-linux-gnu and x86_64-pc-linux-gnu.
Thanks,
Di Zhao
> -----Original Message-----
> From: Di Zhao OS <dizhao@os.amperecomputing.com>
> Sent: Sunday, December 17, 2023 8:31 PM
> To: Thomas Schwinge <thomas@codesourcery.com>; gcc-patches@gcc.gnu.org
> Cc: Richard Biener <richard.guenther@gmail.com>
> Subject: RE: [PATCH v4] [tree-optimization/110279] Consider FMA in
> get_reassociation_width
>
> Hello Thomas,
>
> > -----Original Message-----
> > From: Thomas Schwinge <thomas@codesourcery.com>
> > Sent: Friday, December 15, 2023 5:46 PM
> > To: Di Zhao OS <dizhao@os.amperecomputing.com>; gcc-patches@gcc.gnu.org
> > Cc: Richard Biener <richard.guenther@gmail.com>
> > Subject: RE: [PATCH v4] [tree-optimization/110279] Consider FMA in
> > get_reassociation_width
> >
> > Hi!
> >
> > On 2023-12-13T08:14:28+0000, Di Zhao OS <dizhao@os.amperecomputing.com> wrote:
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/pr110279-2.c
> > > @@ -0,0 +1,41 @@
> > > +/* PR tree-optimization/110279 */
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-Ofast --param tree-reassoc-width=4 --param fully-pipelined-fma=1 -fdump-tree-reassoc2-details -fdump-tree-optimized" } */
> > > +/* { dg-additional-options "-march=armv8.2-a" { target aarch64-*-* } } */
> > > +
> > > +#define LOOP_COUNT 800000000
> > > +typedef double data_e;
> > > +
> > > +#include <stdio.h>
> > > +
> > > +__attribute_noinline__ data_e
> > > +foo (data_e in)
> >
> > Pushed to master branch commit 91e9e8faea4086b3b8aef2355fc12c1559d425f6
> > "Fix 'gcc.dg/pr110279-2.c' syntax error due to '__attribute_noinline__'",
> > see attached.
> >
> > However:
> >
> > > +{
> > > + data_e a1, a2, a3, a4;
> > > + data_e tmp, result = 0;
> > > + a1 = in + 0.1;
> > > + a2 = in * 0.1;
> > > + a3 = in + 0.01;
> > > + a4 = in * 0.59;
> > > +
> > > + data_e result2 = 0;
> > > +
> > > + for (int ic = 0; ic < LOOP_COUNT; ic++)
> > > + {
> > > + /* Test that a complete FMA chain with length=4 is not broken. */
> > > + tmp = a1 + a2 * a2 + a3 * a3 + a4 * a4 ;
> > > + result += tmp - ic;
> > > + result2 = result2 / 2 - tmp;
> > > +
> > > + a1 += 0.91;
> > > + a2 += 0.1;
> > > + a3 -= 0.01;
> > > + a4 -= 0.89;
> > > +
> > > + }
> > > +
> > > + return result + result2;
> > > +}
> > > +
> > > +/* { dg-final { scan-tree-dump-not "was chosen for reassociation" "reassoc2"} } */
> > > +/* { dg-final { scan-tree-dump-times {\.FMA } 3 "optimized"} } */
>
> Thank you for the fix.
>
> > ..., I still see these latter two tree dump scans FAIL, for GCN:
> >
> > $ grep -C2 'was chosen for reassociation' pr110279-2.c.197t.reassoc2
> > 2 *: a3_40
> > 2 *: a2_39
> > Width = 4 was chosen for reassociation
> > Transforming _15 = powmult_1 + powmult_3;
> > into _63 = powmult_1 + a1_38;
> > $ grep -F .FMA pr110279-2.c.265t.optimized
> > _63 = .FMA (a2_39, a2_39, a1_38);
> > _64 = .FMA (a3_40, a3_40, powmult_5);
> >
> > ..., nvptx:
> >
> > $ grep -C2 'was chosen for reassociation' pr110279-2.c.197t.reassoc2
> > 2 *: a3_40
> > 2 *: a2_39
> > Width = 4 was chosen for reassociation
> > Transforming _15 = powmult_1 + powmult_3;
> > into _63 = powmult_1 + a1_38;
> > $ grep -F .FMA pr110279-2.c.265t.optimized
> > _63 = .FMA (a2_39, a2_39, a1_38);
> > _64 = .FMA (a3_40, a3_40, powmult_5);
>
> For these 2 targets, the reassoc_width for FMUL is 1 (the default value),
> while the testcase assumes it to be 4. The bug was introduced when I
> updated the patch but forgot to update the testcase.
>
> > ..., but also x86_64-pc-linux-gnu:
> >
> > $ grep -C2 'was chosen for reassociation' pr110279-2.c.197t.reassoc2
> > 2 *: a3_40
> > 2 *: a2_39
> > Width = 2 was chosen for reassociation
> > Transforming _15 = powmult_1 + powmult_3;
> > into _63 = powmult_1 + powmult_3;
> > $ grep -cF .FMA pr110279-2.c.265t.optimized
> > 0
>
> For x86_64 this needs "-mfma". Sorry, the compile options missed that.
> Can the change below fix these issues? I moved the tests into
> testsuite/gcc.target/aarch64, since they rely on tunings.
>
> Tested on aarch64-unknown-linux-gnu.
>
> >
> > Regards
> > Thomas
> >
> >
> > -----------------
> > Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634
> > München; limited liability company (GmbH); Managing directors: Thomas
> > Heurung, Frank Thürauf; Registered office: München; Registry court:
> > München, HRB 106955
>
> Thanks,
> Di Zhao
>
> ---
> gcc/testsuite/{gcc.dg => gcc.target/aarch64}/pr110279-1.c | 3 +--
> gcc/testsuite/{gcc.dg => gcc.target/aarch64}/pr110279-2.c | 3 +--
> 2 files changed, 2 insertions(+), 4 deletions(-)
> rename gcc/testsuite/{gcc.dg => gcc.target/aarch64}/pr110279-1.c (83%)
> rename gcc/testsuite/{gcc.dg => gcc.target/aarch64}/pr110279-2.c (78%)
>
> diff --git a/gcc/testsuite/gcc.dg/pr110279-1.c b/gcc/testsuite/gcc.target/aarch64/pr110279-1.c
> similarity index 83%
> rename from gcc/testsuite/gcc.dg/pr110279-1.c
> rename to gcc/testsuite/gcc.target/aarch64/pr110279-1.c
> index f25b6aec967..97d693f56a5 100644
> --- a/gcc/testsuite/gcc.dg/pr110279-1.c
> +++ b/gcc/testsuite/gcc.target/aarch64/pr110279-1.c
> @@ -1,6 +1,5 @@
> /* { dg-do compile } */
> -/* { dg-options "-Ofast --param avoid-fma-max-bits=512 --param tree-reassoc-width=4 -fdump-tree-widening_mul-details" } */
> -/* { dg-additional-options "-march=armv8.2-a" { target aarch64-*-* } } */
> +/* { dg-options "-Ofast -mcpu=generic --param avoid-fma-max-bits=512 --param tree-reassoc-width=4 -fdump-tree-widening_mul-details" } */
>
> #define LOOP_COUNT 800000000
> typedef double data_e;
> diff --git a/gcc/testsuite/gcc.dg/pr110279-2.c b/gcc/testsuite/gcc.target/aarch64/pr110279-2.c
> similarity index 78%
> rename from gcc/testsuite/gcc.dg/pr110279-2.c
> rename to gcc/testsuite/gcc.target/aarch64/pr110279-2.c
> index b6b69969c6b..a88cb361fdc 100644
> --- a/gcc/testsuite/gcc.dg/pr110279-2.c
> +++ b/gcc/testsuite/gcc.target/aarch64/pr110279-2.c
> @@ -1,7 +1,6 @@
> /* PR tree-optimization/110279 */
> /* { dg-do compile } */
> -/* { dg-options "-Ofast --param tree-reassoc-width=4 --param fully-pipelined-fma=1 -fdump-tree-reassoc2-details -fdump-tree-optimized" } */
> -/* { dg-additional-options "-march=armv8.2-a" { target aarch64-*-* } } */
> +/* { dg-options "-Ofast -mcpu=generic --param tree-reassoc-width=4 --param fully-pipelined-fma=1 -fdump-tree-reassoc2-details -fdump-tree-optimized" } */
>
> #define LOOP_COUNT 800000000
> typedef double data_e;
> --
> 2.25.1
[-- Attachment #2: 0001-Fix-compile-options-of-pr110279-1.c-and-pr110279-2.c.patch --]
[-- Type: application/octet-stream, Size: 2606 bytes --]
From 216976028c4d5d66b1666fe501abb869d480c214 Mon Sep 17 00:00:00 2001
From: "dzhao.ampere" <di.zhao@amperecomputing.com>
Date: Sun, 17 Dec 2023 19:33:42 +0800
Subject: [PATCH] Fix compile options of pr110279-1.c and pr110279-2.c
The two testcases are for targets that support FMA, and
pr110279-2.c assumes the reassoc_width of FMUL to be 4.
This patch adds the missing options, to fix regression test failures
on nvptx/GCN (default reassoc_width of FMUL is 1) and x86_64
(needs "-mfma").
gcc/testsuite/ChangeLog:
* gcc.dg/pr110279-1.c: Replace "-march=armv8.2-a" with
"-mcpu=generic" for aarch64; add "-mfma" for x86_64.
* gcc.dg/pr110279-2.c: Replace "-march=armv8.2-a" with
"-mcpu=generic"; limit the check to be on aarch64.
---
gcc/testsuite/gcc.dg/pr110279-1.c | 3 ++-
gcc/testsuite/gcc.dg/pr110279-2.c | 6 +++---
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/gcc/testsuite/gcc.dg/pr110279-1.c b/gcc/testsuite/gcc.dg/pr110279-1.c
index f25b6aec967..c2737418afe 100644
--- a/gcc/testsuite/gcc.dg/pr110279-1.c
+++ b/gcc/testsuite/gcc.dg/pr110279-1.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
/* { dg-options "-Ofast --param avoid-fma-max-bits=512 --param tree-reassoc-width=4 -fdump-tree-widening_mul-details" } */
-/* { dg-additional-options "-march=armv8.2-a" { target aarch64-*-* } } */
+/* { dg-additional-options "-mfma" { target i?86-*-* x86_64-*-* } } */
+/* { dg-additional-options "-mcpu=generic" { target aarch64*-*-* } } */
#define LOOP_COUNT 800000000
typedef double data_e;
diff --git a/gcc/testsuite/gcc.dg/pr110279-2.c b/gcc/testsuite/gcc.dg/pr110279-2.c
index b6b69969c6b..135e64882d1 100644
--- a/gcc/testsuite/gcc.dg/pr110279-2.c
+++ b/gcc/testsuite/gcc.dg/pr110279-2.c
@@ -1,7 +1,7 @@
/* PR tree-optimization/110279 */
/* { dg-do compile } */
/* { dg-options "-Ofast --param tree-reassoc-width=4 --param fully-pipelined-fma=1 -fdump-tree-reassoc2-details -fdump-tree-optimized" } */
-/* { dg-additional-options "-march=armv8.2-a" { target aarch64-*-* } } */
+/* { dg-additional-options "-mcpu=generic" { target aarch64*-*-* } } */
#define LOOP_COUNT 800000000
typedef double data_e;
@@ -35,5 +35,5 @@ foo (data_e in)
return result + result2;
}
-/* { dg-final { scan-tree-dump-not "was chosen for reassociation" "reassoc2"} } */
-/* { dg-final { scan-tree-dump-times {\.FMA } 3 "optimized"} } */
\ No newline at end of file
+/* { dg-final { scan-tree-dump-not "was chosen for reassociation" "reassoc2" { target aarch64*-*-* }} } */
+/* { dg-final { scan-tree-dump-times {\.FMA } 3 "optimized" { target aarch64*-*-* }} } */
\ No newline at end of file
--
2.25.1