public inbox for gcc-cvs@sourceware.org
* [gcc(refs/users/meissner/heads/work117)] Update ChangeLog.*
@ 2023-04-06 18:24 Michael Meissner
From: Michael Meissner @ 2023-04-06 18:24 UTC
To: gcc-cvs
https://gcc.gnu.org/g:fda75a1282b7d69dbc40394621e4b3b0de0d33a4
commit fda75a1282b7d69dbc40394621e4b3b0de0d33a4
Author: Michael Meissner <meissner@linux.ibm.com>
Date: Thu Apr 6 13:21:53 2023 -0400
Update ChangeLog.*
Diff:
---
gcc/ChangeLog.meissner | 37 ++++++++++++++++++++++++++++++-------
1 file changed, 30 insertions(+), 7 deletions(-)
diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index af161e66bf9..3eb363f1474 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -1,4 +1,4 @@
-==================== Branch work117, patch #1 ====================
+==================== Branch work117, patch #2 ====================
Do not generate fmaddfp and fnmsubfp
@@ -6,22 +6,45 @@ The Altivec instructions fmaddfp and fnmsubfp have different rounding behaviors
than the VSX xvmaddsp and xvnmsubsp instructions. In particular, generating
these instructions seems to break Eigen.
-2023-04-05 Michael Meissner <meissner@linux.ibm.com>
+GCC has generated the Altivec fmaddfp and fnmsubfp instructions on VSX systems
+as an alternative to the xvmadd{a,m}sp and xvnmsub{a,m}sp instructions. The
+advantage of the Altivec instructions is that they are 4 operand instructions
+(i.e. the target register does not have to overlap with one of the input
+registers), which can eliminate an extra move instruction. The disadvantage
+is that they do not round the same way as the VSX instructions.
+
+This patch eliminates the generation of the Altivec fmaddfp and fnmsubfp
+instructions as alternatives in the VSX instruction insn support, and in the
+Altivec insns it adds a test to prevent the insn from being used if VSX is
+available. I also added a test to the regression test suite.
+
+I have done bootstrap builds on power9 little endian (with both IEEE long
+double and IBM long double). I have also done the builds and tests on a power8
+big endian system (testing both 32-bit and 64-bit code generation). Chip has
+verified that it fixes the problem that Eigen encountered. Can I check this
+into the master GCC branch? After a burn-in period, can I check this patch
+into the active GCC branches?
+
+Thanks in advance.
+
+2023-04-06 Michael Meissner <meissner@linux.ibm.com>
gcc/
PR target/70243
- * config/rs6000/altivec.md (altivec_fmav4sf4): Add a test to prevent
- fmaddfp and fnmsubfp from being generated on VSX systems.
- (altivec_vnmsubfp): Likewise.
- * config/rs6000/rs6000.md (vsx_fmav4sf4): Do not generate fmaddfp or
- fnmsubfp.
+ * config/rs6000/rs6000.md (isa attribute): Add fastmath.
+ (enabled attribute): Add support for fastmath.
+ * config/rs6000/vsx.md (vsx_fmav4sf4): Set the isa attribute to
+ fastmath to disable Altivec instruction generation normally.
(vsx_nfmsv4sf4): Likewise.
gcc/testsuite/
PR target/70243
* gcc.target/powerpc/pr70243.c: New test.
+ * gcc.target/powerpc/pr70243-2.c: New test.
+
+==================== Branch work117, patch #1 was reverted ====================
==================== Branch work117, baseline ====================
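
For readers unfamiliar with the patterns named in the ChangeLog, the following is a minimal sketch of the kind of C source that a V4SF multiply-add or negated multiply-subtract pattern can match once the loop is vectorized. It is not the contents of pr70243.c; the function names `madd_f32` and `nmsub_f32` are hypothetical, and whether a given compiler actually fuses or vectorizes these loops depends on the target and options used.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not taken from pr70243.c):
   dst[i] = a[i] * b[i] + c[i] for n single-precision elements.
   With vectorization enabled, a loop of this shape is a candidate
   for a four-lane single-precision fused multiply-add.  */
void
madd_f32 (float *restrict dst, const float *restrict a,
          const float *restrict b, const float *restrict c, size_t n)
{
  assert (dst != NULL && a != NULL && b != NULL && c != NULL);
  for (size_t i = 0; i < n; i++)
    dst[i] = a[i] * b[i] + c[i];
}

/* Hypothetical helper: negated multiply-subtract,
   dst[i] = -(a[i] * b[i] - c[i]).  */
void
nmsub_f32 (float *restrict dst, const float *restrict a,
           const float *restrict b, const float *restrict c, size_t n)
{
  assert (dst != NULL && a != NULL && b != NULL && c != NULL);
  for (size_t i = 0; i < n; i++)
    dst[i] = -(a[i] * b[i] - c[i]);
}
```

With exactly representable inputs the result is the same whichever instruction is selected; the rounding difference described above only becomes visible when the exact product-plus-addend is not representable in single precision.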
* [gcc(refs/users/meissner/heads/work117)] Update ChangeLog.*
@ 2023-04-08 0:56 Michael Meissner
From: Michael Meissner @ 2023-04-08 0:56 UTC
To: gcc-cvs
https://gcc.gnu.org/g:a4b24c7844c39aaa09e2850bab1689b58636c0d0
commit a4b24c7844c39aaa09e2850bab1689b58636c0d0
Author: Michael Meissner <meissner@linux.ibm.com>
Date: Fri Apr 7 20:56:41 2023 -0400
Update ChangeLog.*
Diff:
---
gcc/ChangeLog.meissner | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index 6187129d7f6..496e305bf0b 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -2,6 +2,12 @@
Do not generate vmaddfp and vnmsubfp
+This is version 3 of the patch. This is essentially version 1 with the removal
+of changes to altivec.md, and cleanup of the comments.
+
+Version 2 generated the vmaddfp and vnmsubfp instructions if -Ofast was used,
+and those changes are deleted in this patch.
+
The Altivec instructions vmaddfp and vnmsubfp have different rounding behaviors
than the VSX xvmaddsp and xvnmsubsp instructions. In particular, generating
these instructions seems to break Eigen on big endian systems.
* [gcc(refs/users/meissner/heads/work117)] Update ChangeLog.*
@ 2023-04-07 18:35 Michael Meissner
From: Michael Meissner @ 2023-04-07 18:35 UTC
To: gcc-cvs
https://gcc.gnu.org/g:183d391c79c290b28f4e5a57e01686a9f800cd96
commit 183d391c79c290b28f4e5a57e01686a9f800cd96
Author: Michael Meissner <meissner@linux.ibm.com>
Date: Fri Apr 7 14:35:44 2023 -0400
Update ChangeLog.*
Diff:
---
gcc/ChangeLog.meissner | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)
diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index 80190b5d6b4..6187129d7f6 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -1,3 +1,33 @@
+==================== Branch work117, patch #4 ====================
+
+Do not generate vmaddfp and vnmsubfp
+
+The Altivec instructions vmaddfp and vnmsubfp have different rounding behaviors
+than the VSX xvmaddsp and xvnmsubsp instructions. In particular, generating
+these instructions seems to break Eigen on big endian systems.
+
+I have done bootstrap builds on power9 little endian (with both IEEE long
+double and IBM long double). I have also done the builds and tests on a power8
+big endian system (testing both 32-bit and 64-bit code generation). Chip has
+verified that it fixes the problem that Eigen encountered. Can I check this
+into the master GCC branch? After a burn-in period, can I check this patch
+into the active GCC branches?
+
+Thanks in advance.
+
+2023-04-07 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ PR target/70243
+ * config/rs6000/rs6000.md (vsx_fmav4sf4): Do not generate vmaddfp.
+ (vsx_nfmsv4sf4): Do not generate vnmsubfp.
+
+gcc/testsuite/
+
+ PR target/70243
+ * gcc.target/powerpc/pr70243.c: New test.
+
==================== Branch work117, patch #3 was reverted ====================
==================== Branch work117, patch #2 was reverted ====================
* [gcc(refs/users/meissner/heads/work117)] Update ChangeLog.*
@ 2023-04-07 18:27 Michael Meissner
From: Michael Meissner @ 2023-04-07 18:27 UTC
To: gcc-cvs
https://gcc.gnu.org/g:b43af8b222874969509b2b58d6cff6fbfa2f2866
commit b43af8b222874969509b2b58d6cff6fbfa2f2866
Author: Michael Meissner <meissner@linux.ibm.com>
Date: Fri Apr 7 02:37:29 2023 -0400
Update ChangeLog.*
Diff:
---
gcc/ChangeLog.meissner | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index 8ef792ed8e3..801fea03f4b 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -1,4 +1,4 @@
-==================== Branch work117, patch #2 ====================
+==================== Branch work117, patch #3 ====================
PR target/70243: Do not generate vmaddfp and vnmsubfp
* [gcc(refs/users/meissner/heads/work117)] Update ChangeLog.*
@ 2023-04-07 18:27 Michael Meissner
From: Michael Meissner @ 2023-04-07 18:27 UTC
To: gcc-cvs
https://gcc.gnu.org/g:464b39902ce871e1c62b15e19015891a118b9eef
commit 464b39902ce871e1c62b15e19015891a118b9eef
Author: Michael Meissner <meissner@linux.ibm.com>
Date: Fri Apr 7 00:38:07 2023 -0400
Update ChangeLog.*
Diff:
---
gcc/ChangeLog.meissner | 52 +++++++++++++++++++++++++++++++++++---------------
1 file changed, 37 insertions(+), 15 deletions(-)
diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index 3eb363f1474..8ef792ed8e3 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -1,22 +1,42 @@
==================== Branch work117, patch #2 ====================
-Do not generate fmaddfp and fnmsubfp
+PR target/70243: Do not generate vmaddfp and vnmsubfp
-The Altivec instructions fmaddfp and fnmsubfp have different rounding behaviors
-than the VSX xvmaddsp and xvnmsubsp instructions. In particular, generating
-these instructions seems to break Eigen.
+This is version 2 of the patch. The first version was posted on April 6th.
-GCC has generated the Altivec fmaddfp and fnmsubfp instructions on VSX systems
-as an alternative to the xvmadd{a,m}sp and xvnmsub{a,m}sp instructions. The
-advantage of the Altivec instructions is that they are 4 operand instructions
-(i.e. the target register does not have to overlap with one of the input
-registers), which can eliminate an extra move instruction. The disadvantage
-is that they do not round the same way as the VSX instructions.
+In this version, I eliminated the changes to altivec.md that added checks to
+altivec_fmav4sf4 and altivec_vnmsubfp. After writing the code, I remembered
+that VECTOR_UNIT_ALTIVEC_P, which is used by those insns, will not be true if
+the VSX instruction set is enabled, so no additional test is needed.
-This patch eliminates the generation of the Altivec fmaddfp and fnmsubfp
-instructions as alternatives in the VSX instruction insn support, and in the
-Altivec insns it adds a test to prevent the insn from being used if VSX is
-available. I also added a test to the regression test suite.
+As we discussed in a private chat room, I modified the code to generate vmaddfp
+and vnmsubfp if -Ofast (-ffast-math) is used. This allows the compiler to
+eliminate the extra move if the user does not care about strict floating point
+code generation, but it generates only the VSX instructions in the normal
+case.
+
+I reworked the examples and split them into two tests to test both the normal
+case when -Ofast is not used and when it is used.
+
+I also fixed the instructions mentioned in the comments to be the actual
+instructions (vmaddfp and vnmsubfp) instead of fmaddfp and fnmsubfp. Sorry
+about that.
+
+The AltiVec (VMX) instructions vmaddfp and vnmsubfp have different rounding
+behaviors than the VSX xvmadd{a,m}sp and xvnmsub{a,m}sp instructions. In
+particular, generating these instructions seems to break Eigen.
+
+The bug is that GCC has generated the VMX vmaddfp and vnmsubfp instructions on
+VSX systems as an alternative to the xvmadd{a,m}sp and xvnmsub{a,m}sp
+instructions. The advantage of the VMX instructions is that they are 4 operand
+instructions (i.e. the target register does not have to overlap with one of the
+input registers), so the compiler can eliminate an extra move instruction. The
+disadvantage of generating these instructions is that they do not round the
+same way as the VSX instructions.
+
+This patch will only generate the VMX vmaddfp and vnmsubfp instructions as
+alternatives in the VSX instruction insn support if -Ofast (-ffast-math) is
+used. I also added 2 tests to the regression suite.
I have done bootstrap builds on power9 little endian (with both IEEE long
double and IBM long double). I have also done the builds and tests on a power8
@@ -27,7 +47,7 @@ into the active GCC branches?
Thanks in advance.
-2023-04-06 Michael Meissner <meissner@linux.ibm.com>
+2023-04-07 Michael Meissner <meissner@linux.ibm.com>
gcc/
@@ -44,6 +64,8 @@ gcc/testsuite/
* gcc.target/powerpc/pr70243.c: New test.
* gcc.target/powerpc/pr70243-2.c: New test.
+==================== Branch work117, patch #2 was reverted ====================
+
==================== Branch work117, patch #1 was reverted ====================
==================== Branch work117, baseline ====================
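
The trade-off between the 4 operand form and the accumulate form described in this message can be modeled in plain C. This is only a sketch under stated assumptions: the function names (`fma_4op`, `fma_accumulate`, `fma_keep_addend`) and the `LANES` constant are hypothetical, arrays stand in for vector registers, and the `memcpy` stands in for the extra register move. It says nothing about rounding, only about operand overlap.

```c
#include <assert.h>
#include <string.h>

#define LANES 4  /* four single-precision lanes, as in a V4SF vector */

/* Model of a 4 operand form: the destination is independent of
   all three inputs, so the addend c survives the operation.  */
void
fma_4op (float *dst, const float *a, const float *b, const float *c)
{
  for (int i = 0; i < LANES; i++)
    dst[i] = a[i] * b[i] + c[i];
}

/* Model of a destructive accumulate form: acc is both the addend
   and the destination, so its old value is overwritten.  */
void
fma_accumulate (float *acc, const float *a, const float *b)
{
  for (int i = 0; i < LANES; i++)
    acc[i] = a[i] * b[i] + acc[i];
}

/* If the addend must stay live, the destructive form needs a copy
   first.  The memcpy models the extra move instruction.  */
void
fma_keep_addend (float *dst, const float *a, const float *b, const float *c)
{
  assert (dst != c);  /* memcpy requires non-overlapping buffers */
  memcpy (dst, c, LANES * sizeof (float));
  fma_accumulate (dst, a, b);
}
```

Both paths compute the same values; the difference is the copy that `fma_keep_addend` performs before the multiply-add, which is the move the 4 operand form avoids.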
* [gcc(refs/users/meissner/heads/work117)] Update ChangeLog.*
@ 2023-04-05 23:21 Michael Meissner
From: Michael Meissner @ 2023-04-05 23:21 UTC
To: gcc-cvs
https://gcc.gnu.org/g:5a41ee3c44a6e31a1b5a569cd7b8365df1dac977
commit 5a41ee3c44a6e31a1b5a569cd7b8365df1dac977
Author: Michael Meissner <meissner@linux.ibm.com>
Date: Wed Apr 5 19:21:45 2023 -0400
Update ChangeLog.*
Diff:
---
gcc/ChangeLog.meissner | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index 942671c488f..af161e66bf9 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -1,3 +1,28 @@
+==================== Branch work117, patch #1 ====================
+
+Do not generate fmaddfp and fnmsubfp
+
+The Altivec instructions fmaddfp and fnmsubfp have different rounding behaviors
+than the VSX xvmaddsp and xvnmsubsp instructions. In particular, generating
+these instructions seems to break Eigen.
+
+2023-04-05 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ PR target/70243
+ * config/rs6000/altivec.md (altivec_fmav4sf4): Add a test to prevent
+ fmaddfp and fnmsubfp from being generated on VSX systems.
+ (altivec_vnmsubfp): Likewise.
+ * config/rs6000/rs6000.md (vsx_fmav4sf4): Do not generate fmaddfp or
+ fnmsubfp.
+ (vsx_nfmsv4sf4): Likewise.
+
+gcc/testsuite/
+
+ PR target/70243
+ * gcc.target/powerpc/pr70243.c: New test.
+
==================== Branch work117, baseline ====================
2023-04-05 Michael Meissner <meissner@linux.ibm.com>