public inbox for gcc-cvs@sourceware.org
* [gcc(refs/users/meissner/heads/work121)] Update ChangeLog.meissner
@ 2023-06-02 19:14 Michael Meissner
  0 siblings, 0 replies; 13+ messages in thread
From: Michael Meissner @ 2023-06-02 19:14 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:1917d73c7bc6a299964803dec1279c1ffd4cc430

commit 1917d73c7bc6a299964803dec1279c1ffd4cc430
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Fri Jun 2 15:14:40 2023 -0400

    Update ChangeLog.meissner

Diff:
---
 gcc/ChangeLog.meissner | 37 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index a72fff69d53..fe59d9cd0a9 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -1,4 +1,39 @@
-==================== Branch work121, patch #40 was reverted ====================
+==================== Branch work121, patch #40 ====================
+
+PR target/89213 - Optimize vector shift by a constant.
+
+Optimize vector shifts by a constant, taking advantage of the fact that the
+shift instructions only look at the bits within the element.
+
+The PowerPC doesn't have a VSPLTID instruction.  This meant that if we are doing
+a V2DI shift of 0..15, we had to do VSPLTISW and VEXTSW2D instructions to load
+the constant into the vector register.
+
+Similarly for V4SI and V2DI, if we wanted to shift more than 15 bits, we would
+generate XXSPLTIB and VEXTSB2D or VEXTSB2W instructions to load the constant
+into the vector register.
+
+Given the vector shift instructions only look at the bottom 5 or 6 bits of the
+shift value, we can load the constant via VSPLTISW or XXSPLTIB instructions and
+eliminate the sign extend instructions (VEXTSW2D, VEXTSB2D, and VEXTSB2W).
+
+2023-06-02  Michael Meissner  <meissner@linux.ibm.com>
+
+gcc/
+
+	PR target/89213
+	* config/rs6000/altivec.md (UNSPEC_VECTOR_SHIFT): New unspec.
+	(V4SI_V2DI): New mode iterator.
+	(vshift_code): New code iterator.
+	(altivec_<code>_const_<mode>): New insns.
+	(altivec_shift_const_<mode>): New insns.
+	* config/rs6000/predicates.md (vector_shift_constant): New
+	predicate.
+
+gcc/testsuite/
+
+	PR target/89213
+	* gcc.target/powerpc/pr89213.c: New test.
 
 ==================== Branch work121, patch #37 ====================
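
The reasoning in the patch #40 entry above can be checked with a small scalar
model.  This is only a sketch: the helper names are hypothetical, the functions
model one vector element in plain C rather than using the real instructions,
and it assumes the shift uses the low 5 bits (word) or low 6 bits (doubleword)
of the count, as the entry states.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a word element shift: only the low 5 bits of the
   count are used.  */
static uint32_t
shl_word (uint32_t value, uint32_t count)
{
  return value << (count & 31u);
}

/* Hypothetical model of a doubleword element shift: only the low 6 bits of
   the count are used.  */
static uint64_t
shl_dword (uint64_t value, uint64_t count)
{
  return value << (count & 63u);
}

/* Word element after a byte splat: the immediate is replicated into every
   byte and no sign extension is applied.  */
static uint32_t
word_from_byte_splat (uint8_t imm)
{
  return (uint32_t) imm * UINT32_C (0x01010101);
}

/* Doubleword element after a byte splat, again without sign extension.  */
static uint64_t
dword_from_byte_splat (uint8_t imm)
{
  return (uint64_t) imm * UINT64_C (0x0101010101010101);
}

/* Doubleword element after a word splat of a small constant (0..15): the
   doubleword simply holds two copies of the word.  */
static uint64_t
dword_from_word_splat (uint32_t imm)
{
  return ((uint64_t) imm << 32) | imm;
}

/* The un-extended constants shift by the same amount as the plain count.  */
static void
check_shift_constants (void)
{
  const uint32_t w = UINT32_C (0x12345678);
  const uint64_t d = UINT64_C (0x123456789abcdef0);

  for (unsigned s = 0; s < 32; s++)
    assert (shl_word (w, word_from_byte_splat ((uint8_t) s)) == w << s);

  for (unsigned s = 0; s < 64; s++)
    assert (shl_dword (d, dword_from_byte_splat ((uint8_t) s)) == d << s);

  for (unsigned s = 0; s < 16; s++)
    assert (shl_dword (d, dword_from_word_splat (s)) == d << s);
}
```

Because the extra high bits of the splatted value are ignored by the shift, the
sign extend step contributes nothing to the result in this model.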


* [gcc(refs/users/meissner/heads/work121)] Update ChangeLog.meissner
@ 2023-06-03  4:07 Michael Meissner
  0 siblings, 0 replies; 13+ messages in thread
From: Michael Meissner @ 2023-06-03  4:07 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:9ad58bc668ccff75ffc89cf6a18aabeadced6dbe

commit 9ad58bc668ccff75ffc89cf6a18aabeadced6dbe
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Sat Jun 3 00:07:28 2023 -0400

    Update ChangeLog.meissner

Diff:
---
 gcc/ChangeLog.meissner | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index fe59d9cd0a9..b047e0712b2 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -17,7 +17,7 @@ Given the vector shift instructions only look at the bottom 5 or 6 bits of the
 shift value, we can load the constant via VSPLTISW or XXSPLTIB instructions and
 eliminate the sign extend instructions (VEXTSW2D, VEXTSB2D, and VEXTSB2W).
 
-2023-06-02  Michael Meissner  <meissner@linux.ibm.com>
+2023-06-03  Michael Meissner  <meissner@linux.ibm.com>
 
 gcc/
 
@@ -34,6 +34,7 @@ gcc/testsuite/
 
 	PR target/89213
 	* gcc.target/powerpc/pr89213.c: New test.
+	* gcc.target/powerpc/vec-rlmi-rlnm.c: Update insn count.
 
 ==================== Branch work121, patch #37 ====================


* [gcc(refs/users/meissner/heads/work121)] Update ChangeLog.meissner
@ 2023-06-02  4:15 Michael Meissner
  0 siblings, 0 replies; 13+ messages in thread
From: Michael Meissner @ 2023-06-02  4:15 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:938734aea522cc3a73a1d163d004f50e30688392

commit 938734aea522cc3a73a1d163d004f50e30688392
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Fri Jun 2 00:15:06 2023 -0400

    Update ChangeLog.meissner

Diff:
---
 gcc/ChangeLog.meissner | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index 2772ab6ccb6..4d9fec8a92f 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -1,3 +1,31 @@
+==================== Branch work121, patch #40 ====================
+
+PR target/89213 - Optimize vector shift by a constant.
+
+On power9 and power10 systems, optimize vector shifts by a constant, taking
+advantage of the fact that the shift instructions only look at the bits within
+the element.
+
+2023-06-02  Michael Meissner  <meissner@linux.ibm.com>
+
+gcc/
+
+	PR target/89213
+	* config/rs6000/altivec.md (UNSPEC_VECTOR_SHIFT): New unspec.
+	(VSHIFT_MODE): New mode iterator.
+	(vshift_code): New code iterator.
+	(vshift_attr): New code attribute.
+	(altivec_<mode>_<vshift_attr>_const): New pattern to optimize
+	vector long long/int shifts by a constant.
+	(altivec_<mode>_shift_const): New helper insn to load up a
+	constant used by the shift operation.
+	* config/rs6000/predicates.md (vector_shift_constant): New
+	predicate.
+
+gcc/testsuite/
+
+	PR target/89213
+	* gcc.target/powerpc/pr89213.c: New test.
+
 ==================== Branch work121, patch #37 ====================
 
 Update fold-vec-extract insn counts on 32-bit big endian


* [gcc(refs/users/meissner/heads/work121)] Update ChangeLog.meissner
@ 2023-05-18  5:13 Michael Meissner
  0 siblings, 0 replies; 13+ messages in thread
From: Michael Meissner @ 2023-05-18  5:13 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:201de78e4290e524b736fbd2e65945dec76305e3

commit 201de78e4290e524b736fbd2e65945dec76305e3
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Thu May 18 01:13:33 2023 -0400

    Update ChangeLog.meissner

Diff:
---
 gcc/ChangeLog.meissner | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index ba56e9787f1..2772ab6ccb6 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -6,12 +6,15 @@ In running tests, I noticed on big endian systems that the expected ADDIs is
 higher than the current number of ADDIs generated by the compiler.  This patch
 adjusts those counts.
 
-2023-05-17   Michael Meissner  <meissner@linux.ibm.com>
+2023-05-18   Michael Meissner  <meissner@linux.ibm.com>
 
 gcc/testsuite/
 
-	* gcc.target/powerpc/fold-vec-extract-float.p8.c: Update insn count for
+	* gcc.target/powerpc/fold-vec-extract-char.p7.c: Update insn count for
 	32-bit.
+	* gcc.target/powerpc/fold-vec-extract-double.p7.c: Likewise.
+	* gcc.target/powerpc/fold-vec-extract-float.p7.c: Likewise.
+	* gcc.target/powerpc/fold-vec-extract-float.p8.c: Likewise.
 	* gcc.target/powerpc/fold-vec-extract-int.p7.c: Likewise.
 	* gcc.target/powerpc/fold-vec-extract-int.p8.c: Likewise.
 	* gcc.target/powerpc/fold-vec-extract-short.p7.c: Likewise.


* [gcc(refs/users/meissner/heads/work121)] Update ChangeLog.meissner
@ 2023-05-17 21:26 Michael Meissner
  0 siblings, 0 replies; 13+ messages in thread
From: Michael Meissner @ 2023-05-17 21:26 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:e699c67d84e9511c658cc40a7d34ec9c2947ad32

commit e699c67d84e9511c658cc40a7d34ec9c2947ad32
Author: MichaelMeissner <github@the-meissners.org>
Date:   Wed May 17 17:26:17 2023 -0400

    Update ChangeLog.meissner

Diff:
---
 gcc/ChangeLog.meissner | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index e646ec9b076..ba56e9787f1 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -1,3 +1,22 @@
+==================== Branch work121, patch #37 ====================
+
+Update fold-vec-extract insn counts on 32-bit big endian
+
+In running tests, I noticed on big endian systems that the expected number of
+ADDIs is higher than the number of ADDIs currently generated by the compiler.
+This patch adjusts those counts.
+
+2023-05-17   Michael Meissner  <meissner@linux.ibm.com>
+
+gcc/testsuite/
+
+	* gcc.target/powerpc/fold-vec-extract-float.p8.c: Update insn count for
+	32-bit.
+	* gcc.target/powerpc/fold-vec-extract-int.p7.c: Likewise.
+	* gcc.target/powerpc/fold-vec-extract-int.p8.c: Likewise.
+	* gcc.target/powerpc/fold-vec-extract-short.p7.c: Likewise.
+	* gcc.target/powerpc/fold-vec-extract-short.p8.c: Likewise.
+
 ==================== Branch work121, patch #36 ====================
 
 Optimize variable element vec_extract to be converted to floating point


* [gcc(refs/users/meissner/heads/work121)] Update ChangeLog.meissner
@ 2023-05-15 23:31 Michael Meissner
  0 siblings, 0 replies; 13+ messages in thread
From: Michael Meissner @ 2023-05-15 23:31 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:bea291a30c2cf2cb9a1222177759294bdb2d0e11

commit bea291a30c2cf2cb9a1222177759294bdb2d0e11
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Mon May 15 19:31:14 2023 -0400

    Update ChangeLog.meissner

Diff:
---
 gcc/ChangeLog.meissner | 270 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 270 insertions(+)

diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index 2ac5ac844e9..e646ec9b076 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -1,3 +1,273 @@
+==================== Branch work121, patch #36 ====================
+
+Optimize variable element vec_extract to be converted to floating point
+
+This patch optimizes vec_extract with a variable element number of the following
+types to be converted to floating point by loading the value directly to the
+vector register, and then doing the conversion instead of loading the value to a
+GPR and then doing a direct move:
+
+vector int
+vector unsigned int
+vector unsigned short
+vector unsigned char
+
+2023-05-18   Michael Meissner  <meissner@linux.ibm.com>
+
+gcc/
+
+	* config/rs6000/vsx.md (vsx_extract_v4si_var_load_to_<uns><mode>): New
+	insn.
+	(vsx_extract_<VSX_EXTRACT_I2:mode>_var_load_to_uns<SFDF:mode>): New
+	insn.
+
+gcc/testsuite/
+
+	* gcc.target/powerpc/vec-extract-mem-int-6.c: New file.
+	* gcc.target/powerpc/vec-extract-mem-int_7.c: New file.
+
+==================== Branch work121, patch #35 ====================
+
+Allow constant element vec_extract to be converted to floating point
+
+This patch allows vec_extract of the following types to be converted to
+floating point by loading the value directly to the vector register, and then
+doing the conversion instead of loading the value to a GPR and then doing a
+direct move:
+
+vector int
+vector unsigned int
+vector unsigned short
+vector unsigned char
+
+2023-05-15   Michael Meissner  <meissner@linux.ibm.com>
+
+gcc/
+
+	* config/rs6000/rs6000.md (fp_int_extend): New code attribute.
+	* config/rs6000/vsx.md (vsx_extract_v4si_load_to_<uns><mode>): New
+	insn.
+	(vsx_extract_<VSX_EXTRACT_I2:mode>_load_to_uns<SFDF:mode>): New insn.
+	(vsx_extract_v4si_var_load_to_<uns><mode>): New insn.
+	(vsx_extract_<VSX_EXTRACT_I2:mode>_var_load_to_uns<SFDF:mode>): New
+	insn.
+
+gcc/testsuite/
+
+	* gcc.target/powerpc/vec-extract-mem-char-2.c: New file.
+	* gcc.target/powerpc/vec-extract-mem-int-4.c: New file.
+	* gcc.target/powerpc/vec-extract-mem-int_5.c: New file.
+	* gcc.target/powerpc/vec-extract-mem-short-4.c: New file.
+
+==================== Branch work121, patch #34 ====================
+
+Allow variable element vec_extract to be sign or zero extended
+
+This patch allows vec_extract of V4SI, V8HI, and V16QI vector types with a
+variable element number to be loaded with sign or zero extension, and GCC will
+not generate a separate zero/sign extension instruction.
+
+2023-05-15   Michael Meissner  <meissner@linux.ibm.com>
+
+gcc/
+
+	* config/rs6000/vsx.md (vsx_extract_v4si_var_load_to_<su>di): New insn.
+	(vsx_extract_<VSX_EXTRACT_I2:mode>_var_load_to_u<GPR:mode>): New insn.
+	(vsx_extract_v8hi_var_load_to_s<mode>): New insn.
+
+gcc/testsuite/
+
+	* gcc.target/powerpc/vec-extract-mem-int-3.c: New file.
+	* gcc.target/powerpc/vec-extract-mem-short-3.c: New file.
+
+==================== Branch work121, patch #33 ====================
+
+Optimize vec_extract of small integer vectors with constant element from memory
+
+This patch optimizes vec_extract of small integer vectors (V16QI/V8HI/V4SI)
+using a constant element where the vector is in memory.  This patch combines the
+sign or zero extension of the vec_extract with the load.
+
+In addition, it also optimizes the load with sign/zero extension when the result
+is going into vector registers and the machine supports such combined load
+operations.
+
+V4SImode is optimized for both sign and zero extension to DImode where the
+target is either a GPR or a vector register.
+
+V8HImode is optimized for sign extension to either SImode or DImode when the
+target is a GPR register.  V8HImode is also optimized for zero extension when
+the target is a vector register on a power9/power10 system.  While you can do an
+unsigned HImode to a vector register and then do a VEXTSH2D instruction, I
+believe it is better to do a LHA and then a direct move operation.
+
+V16QImode is optimized for zero extension to either SImode or DImode when the
+target is a GPR register.  V16QImode is also optimized for zero extension when
+the target is a vector register on a power9/power10 system.  Since the PowerPC
+machine does not have a signed load byte instruction, I didn't implement a
+combine insn for this case.
+
+2023-05-15   Michael Meissner  <meissner@linux.ibm.com>
+
+gcc/
+
+	* config/rs6000/vsx.md (vsx_extract_v4si_load_to_<su>di): New combiner
+	insn.
+	(vsx_extract_<VSX_EXTRACT_I2:mode>_load_to_u<GPR:mode>): Likewise.
+	(vsx_extract_v8hi_load_to_s<mode>): Likewise.
+
+gcc/testsuite/
+
+	* gcc.target/powerpc/vec-extract-mem-char-1.c: New test.
+	* gcc.target/powerpc/vec-extract-mem-int-1.c: New test.
+	* gcc.target/powerpc/vec-extract-mem-int-2.c: New test.
+	* gcc.target/powerpc/vec-extract-mem-short-1.c: New test.
+	* gcc.target/powerpc/vec-extract-mem-short-2.c: New test.
+
+==================== Branch work121, patch #32 ====================
+
+Add alternatives for vec_extract with constant element loading from memory.
+
+This patch expands the alternatives for doing vec_extract of V4SI, V8HI, and
+V16QI vectors with a constant offset when the vector is in memory.  If the
+element number is 0 or we are using offsettable addressing for loading up GPR
+registers we don't need to allocate a temporary base register.  We can fold the
+offset from the vec_extract into the normal address.
+
+I also added alternatives to load the values into vector registers.  If we load
+the value into vector registers, we require X-form addressing.
+
+2023-05-12   Michael Meissner  <meissner@linux.ibm.com>
+
+gcc/
+
+	* config/rs6000/vsx.md (VSX_EX_ISA): New mode attribute.
+	(vsx_extract_<mode>_load): Add more alternatives for memory options.
+	Allow the load to load up vector registers if needed.
+
+==================== Branch work121, patch #31 ====================
+
+Optimize vec_extract of V4SF with variable element number being converted to DF
+
+This patch adds a combiner insn to include the conversion of float to double
+within the memory address when vec_extract of V4SF with a variable element
+number is done.
+
+It also removes the '?' from the 'r' constraint so that if the SFmode is needed
+in a GPR, it doesn't have to load it to the vector unit, store it on the stack,
+and reload it into a GPR register.
+
+2023-05-12   Michael Meissner  <meissner@linux.ibm.com>
+
+gcc/
+
+	* config/rs6000/vsx.md (vsx_extract_v4sf_var_load): Remove '?' from 'r'
+	constraint.
+	(vsx_extract_v4sf_var_load_to_df): New insn.
+
+gcc/testsuite/
+
+	* gcc.target/powerpc/vec-extract-mem-float-2.c: New file.
+
+==================== Branch work121, patch #30 ====================
+
+Optimize vec_extract of V4SF from memory with constant element numbers.
+
+This patch updates vec_extract of V4SF from memory with constant element
+numbers.
+
+I went through the alternatives, and I added alternatives to denote when we
+don't need to allocate a temporary base register.  These cases include
+extracting element 0, and extracting elements 1-3 where we can use offsettable
+addresses.
+
+I added alternatives for power8 and power9 units to account for the expanded
+addressing on these machines (power8 can load SFmode into Altivec registers with
+x-form addressing, and power9 can use offsettable addressing to load up Altivec
+registers).
+
+This patch corrects the ISA for loading SF values to altivec registers to be
+power8 vector, and not power7.
+
+This patch adds a combiner insn to combine loading up a SF element and
+converting it to double.
+
+It also removes the '?' from the 'r' constraint so that if the SFmode is needed
+in a GPR, it doesn't have to load it to the vector unit, store it, and then
+reload it into the GPR register.
+
+2023-05-12   Michael Meissner  <meissner@linux.ibm.com>
+
+gcc/
+
+	* config/rs6000/vsx.md (vsx_extract_v4sf_load): Fix ISA for loading
+	up SFmode values with x-form addresses.  Remove ? from 'r' constraint.
+	Add more alternatives to prevent requiring a temporary base register if
+	we don't need the temporary.
+	(vsx_extract_v4sf_load_to_df): New insn.
+
+gcc/testsuite/
+
+	* gcc.target/powerpc/vec-extract-mem-float-1.c: New file.
+
+==================== Branch work121, patch #22 ====================
+
+Fix typo in insn name.
+
+In doing other work, I noticed that there was an insn:
+
+	vsx_extract_v4sf_<mode>_load
+
+Which did not have an iterator.  I removed the useless <mode>.
+
+2023-05-12   Michael Meissner  <meissner@linux.ibm.com>
+
+gcc/
+
+	* config/rs6000/vsx.md (vsx_extract_v4sf_load): Rename from
+	vsx_extract_v4sf_<mode>_load.
+
+==================== Branch work121, patch #21 ====================
+
+Improve 64->128 bit zero extension on PowerPC
+
+2023-05-12   Michael Meissner  <meissner@linux.ibm.com>
+
+gcc/
+
+	PR target/108958
+	* config/rs6000/rs6000.md (zero_extendditi2): New insn.
+
+gcc/testsuite/
+
+	PR target/108958
+	* gcc.target/powerpc/zero-extend-di-ti.c: New test.
+
+==================== Branch work121, patch #20 ====================
+
+Optimize vec_splats of vec_extract for long long and double.
+
+This patch optimizes cases like:
+
+	vector double v1, v2;
+	/* ... */
+	v2 = vec_splats (vec_extract (v1, 0));	/* or  */
+	v2 = vec_splats (vec_extract (v1, 1));
+
+2023-05-12   Michael Meissner  <meissner@linux.ibm.com>
+
+gcc/
+
+	PR target/99293
+	* config/rs6000/vsx.md (vsx_splat_extract_<mode>): New combiner
+	insn.
+
+gcc/testsuite/
+
+	PR target/99293
+	* gcc.target/powerpc/pr99293.c: New test.
+	* gcc.target/powerpc/builtins-1.c: Update insn count.
+
 ==================== Branch work121, rebase up to 5/12/2023 ====================
 
 ==================== Branch work121, patch #5 ====================
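
The splat-of-extract pattern named in the patch #20 entry above can be written
as a small compilable function.  This is a sketch only: it uses the GCC/Clang
generic vector extension as a portable stand-in for the vec_splats and
vec_extract built-ins from altivec.h, the names v2df, splat_extract, and
check_splat_extract are hypothetical, and it does not reproduce the test file
listed in the ChangeLog.

```c
#include <assert.h>

/* Two-element double vector, via the generic vector extension.  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Stand-in for vec_splats (vec_extract (v1, n)): pick element N of V1 and
   replicate it into both lanes of the result.  */
static v2df
splat_extract (v2df v1, int n)
{
  double element = v1[n];
  return (v2df) { element, element };
}

/* Both lanes of the result hold the selected element.  */
static void
check_splat_extract (void)
{
  v2df v1 = { 1.5, 2.5 };
  v2df lo = splat_extract (v1, 0);
  v2df hi = splat_extract (v1, 1);

  assert (lo[0] == 1.5 && lo[1] == 1.5);
  assert (hi[0] == 2.5 && hi[1] == 2.5);
}
```

Source of this shape is what a combiner insn such as the one listed for
patch #20 would be expected to match; the code the compiler actually emits
depends on the target and options.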


* [gcc(refs/users/meissner/heads/work121)] Update ChangeLog.meissner
@ 2023-05-12 16:56 Michael Meissner
  0 siblings, 0 replies; 13+ messages in thread
From: Michael Meissner @ 2023-05-12 16:56 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:e3528acf7be0f10cd4b6536101efbb032fd4a285

commit e3528acf7be0f10cd4b6536101efbb032fd4a285
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Fri May 12 12:56:03 2023 -0400

    Update ChangeLog.meissner

Diff:
---
 gcc/ChangeLog.meissner | 34 +++++++++++++++++++++++++++-------
 1 file changed, 27 insertions(+), 7 deletions(-)

diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index b8ffd73b0c4..2ac5ac844e9 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -1,3 +1,5 @@
+==================== Branch work121, rebase up to 5/12/2023 ====================
+
 ==================== Branch work121, patch #5 ====================
 
 PR target/105325: Fix memory constraints for power10 fusion
@@ -9,7 +11,16 @@ did not use the stricter predicates and constraints, and if the user used the
 -fstack-protector option, it would generate a non-prefixed load instruction
 whose offset was too big if the stack is large.
 
-2023-05-09   Michael Meissner  <meissner@linux.ibm.com>
+This patch has been tested on:
+
+    * Little endian power9 with both IEEE and IBM long double
+    * Little endian power10
+    * Big endian power8 using both 32-bit and 64-bit code generation.
+
+Can I check this into the master branch?  Assuming I can check this in, I will
+also commit to the active GCC branches after a burn-in period.
+
+2023-05-10   Michael Meissner  <meissner@linux.ibm.com>
 
 gcc/
 
@@ -25,17 +36,26 @@ gcc/testsuite/
 	* g++.target/powerpc/pr105325.C: New test.
 	* gcc.target/powerpc/fusion-p10-ldcmpi.c: Adjust insn counts.
 
-
 ==================== Branch work121, patch #4 ====================
 
 PR target/105325, Rewrite genfusion.pl's gen_ld_cmpi_p10 function.
 
-This patch rewrites the gen_ld_cmpi_p10 function to be clearer.  The resulting
-fusion.md file that this patch generates is exactly the same as the previous
-version of genfusion.pl.  The next patch in this series will fix PR
-target/105325.
+This patch rewrites the gen_ld_cmpi_p10 function in genfusion.pl to be clearer.
+The resulting fusion.md file that this patch generates is exactly the same
+output that the previous version of genfusion.pl generated.  The next patch in
+this series will fix PR target/105325 (provide correct predicates and
+constraints for power10 fusion of load and compare immediate).
+
+This patch has been tested on:
+
+    * Little endian power9 with both IEEE and IBM long double
+    * Little endian power10
+    * Big endian power8 using both 32-bit and 64-bit code generation.
+
+Can I check this into the master branch?  Assuming I can check this in, I will
+also commit to the active GCC branches after a burn-in period.
 
-2023-05-09   Michael Meissner  <meissner@linux.ibm.com>
+2023-05-10   Michael Meissner  <meissner@linux.ibm.com>
 
 gcc/


* [gcc(refs/users/meissner/heads/work121)] Update ChangeLog.meissner
@ 2023-05-12 16:51 Michael Meissner
  0 siblings, 0 replies; 13+ messages in thread
From: Michael Meissner @ 2023-05-12 16:51 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:bf53f84838cd9b6b6dacec0c48b6ed0b3151f462

commit bf53f84838cd9b6b6dacec0c48b6ed0b3151f462
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Tue May 9 16:45:43 2023 -0400

    Update ChangeLog.meissner

Diff:
---
 gcc/ChangeLog.meissner | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index 0a3f9b83ff1..b8ffd73b0c4 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -1,6 +1,6 @@
-==================== Branch work121, patch #2 ====================
+==================== Branch work121, patch #5 ====================
 
-PR target/105325: Fix power10_fusion
+PR target/105325: Fix memory constraints for power10 fusion
 
 This patch applies stricter predicates and constraints for LD and LWA
 instructions with power10 fusion.  These instructions are DS-form instructions,
@@ -9,7 +9,7 @@ did not use the stricter predicates and constraints, and if the user used the
 -fstack-protector option, it would generate a non-prefixed load instruction
 whose offset was too big if the stack is large.
 
-2023-05-08   Michael Meissner  <meissner@linux.ibm.com>
+2023-05-09   Michael Meissner  <meissner@linux.ibm.com>
 
 gcc/
 
@@ -26,23 +26,30 @@ gcc/testsuite/
 	* gcc.target/powerpc/fusion-p10-ldcmpi.c: Adjust insn counts.
 
 
-==================== Branch work121, patch #1 ====================
+==================== Branch work121, patch #4 ====================
 
-Rewrite genfusion.pl's gen_ld_cmpi_p10 function.
+PR target/105325, Rewrite genfusion.pl's gen_ld_cmpi_p10 function.
 
 This patch rewrites the gen_ld_cmpi_p10 function to be clearer.  The resulting
-fusion.md file is exactly the same in this patch.  The next patch in this series
-will fix PR target/105325.
+fusion.md file that this patch generates is exactly the same as the previous
+version of genfusion.pl.  The next patch in this series will fix PR
+target/105325.
 
-2023-05-08   Michael Meissner  <meissner@linux.ibm.com>
+2023-05-09   Michael Meissner  <meissner@linux.ibm.com>
 
 gcc/
 
 	PR target/105325
-	* config/rs6000/genfusion.pl (print_ld_cmpi_p10): New function, split
-	off from gen_ld_cmpi_p10.
+	* config/rs6000/genfusion.pl (mode_to_ldst_char): Delete.
+	(print_ld_cmpi_p10): New function, split off from gen_ld_cmpi_p10.
 	(gen_ld_cmpi_p10): Rewrite completely.
 
+==================== Branch work121, patch #3 was reverted ====================
+
+==================== Branch work121, patch #2 was reverted ====================
+
+==================== Branch work121, patch #1 was reverted ====================
+
 ==================== Branch work121, baseline ====================
 
 2023-05-08   Michael Meissner  <meissner@linux.ibm.com>


* [gcc(refs/users/meissner/heads/work121)] Update ChangeLog.meissner
@ 2023-05-12 16:51 Michael Meissner
  0 siblings, 0 replies; 13+ messages in thread
From: Michael Meissner @ 2023-05-12 16:51 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:e3c9958de2df8230e78e54011b8df4e2a17f2353

commit e3c9958de2df8230e78e54011b8df4e2a17f2353
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Mon May 8 19:55:01 2023 -0400

    Update ChangeLog.meissner

Diff:
---
 gcc/ChangeLog.meissner | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index e6965fbb1f7..0a3f9b83ff1 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -16,7 +16,7 @@ gcc/
 	PR target/105325
 	* config/rs6000/genfusion.pl (print_ld_cmpi_p10): Use "YZ" constraints
 	for DS-form loads.  Set the sign_extend attribute for loads that do sign
-	extension.  Use the lwa_operand predicator for the LWA instruction.
+	extension.  Use the lwa_operand predicate for the LWA instruction.
 	* config/rs6000/fusion.md: Regenerate.
 
 gcc/testsuite/


* [gcc(refs/users/meissner/heads/work121)] Update ChangeLog.meissner
@ 2023-05-12 16:50 Michael Meissner
  0 siblings, 0 replies; 13+ messages in thread
From: Michael Meissner @ 2023-05-12 16:50 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:819015381d3ee559ad9d835ec8d526a1d1ee89f1

commit 819015381d3ee559ad9d835ec8d526a1d1ee89f1
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Mon May 8 19:46:48 2023 -0400

    Update ChangeLog.meissner

Diff:
---
 gcc/ChangeLog.meissner | 45 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index 3658ec88e7f..e6965fbb1f7 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -1,3 +1,48 @@
+==================== Branch work121, patch #2 ====================
+
+PR target/105325: Fix power10_fusion
+
+This patch applies stricter predicates and constraints for LD and LWA
+instructions with power10 fusion.  These instructions are DS-form instructions,
+which means that the bottom 2 bits of the address must be 0.  In the past, we
+did not use the stricter predicates and constraints, and if the user used the
+-fstack-protector option, it would generate a non-prefixed load instruction
+whose offset was too big if the stack is large.
+
+2023-05-08   Michael Meissner  <meissner@linux.ibm.com>
+
+gcc/
+
+	PR target/105325
+	* config/rs6000/genfusion.pl (print_ld_cmpi_p10): Use "YZ" constraints
+	for DS-form loads.  Set the sign_extend attribute for loads that do sign
+	extension.  Use the lwa_operand predicator for the LWA instruction.
+	* config/rs6000/fusion.md: Regenerate.
+
+gcc/testsuite/
+
+	PR target/105325
+	* g++.target/powerpc/pr105325.C: New test.
+	* gcc.target/powerpc/fusion-p10-ldcmpi.c: Adjust insn counts.
+
+
+==================== Branch work121, patch #1 ====================
+
+Rewrite genfusion.pl's gen_ld_cmpi_p10 function.
+
+This patch rewrites the gen_ld_cmpi_p10 function to be clearer.  The resulting
+fusion.md file is exactly the same in this patch.  The next patch in this series
+will fix PR target/105325.
+
+2023-05-08   Michael Meissner  <meissner@linux.ibm.com>
+
+gcc/
+
+	PR target/105325
+	* config/rs6000/genfusion.pl (print_ld_cmpi_p10): New function, split
+	off from gen_ld_cmpi_p10.
+	(gen_ld_cmpi_p10): Rewrite completely.
+
 ==================== Branch work121, baseline ====================
 
 2023-05-08   Michael Meissner  <meissner@linux.ibm.com>
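
The DS-form restriction described in the PR target/105325 entry above can be
expressed as a simple check.  This is a sketch under stated assumptions: the
function name ds_form_offset_ok is hypothetical, and it is not the lwa_operand
predicate or the "YZ" constraint from the rs6000 back end.  It only encodes
the rule that a DS-form displacement is a signed 16-bit value whose bottom
2 bits are 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true if OFFSET can be encoded as the displacement of a
   non-prefixed DS-form load such as LD or LWA: it must fit in a signed
   16-bit field and be a multiple of 4.  */
static bool
ds_form_offset_ok (int64_t offset)
{
  return offset >= -32768 && offset <= 32767 && (offset & 3) == 0;
}

/* A few representative offsets.  */
static void
check_ds_form_offsets (void)
{
  assert (ds_form_offset_ok (0));
  assert (ds_form_offset_ok (32764));
  assert (ds_form_offset_ok (-32768));
  assert (!ds_form_offset_ok (6));      /* not a multiple of 4 */
  assert (!ds_form_offset_ok (32768));  /* too big for 16 bits */
}
```

An offset that fails this check, for example one into a large stack frame,
needs a different addressing form, which is the situation the stricter
predicates and constraints guard against.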


* [gcc(refs/users/meissner/heads/work121)] Update ChangeLog.meissner
@ 2023-05-09 20:45 Michael Meissner
  0 siblings, 0 replies; 13+ messages in thread
From: Michael Meissner @ 2023-05-09 20:45 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:b086465da6d844e55d793399576d617c3c86ff73

commit b086465da6d844e55d793399576d617c3c86ff73
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Tue May 9 16:45:43 2023 -0400

    Update ChangeLog.meissner

Diff:
---
 gcc/ChangeLog.meissner | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index 0a3f9b83ff1..b8ffd73b0c4 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -1,6 +1,6 @@
-==================== Branch work121, patch #2 ====================
+==================== Branch work121, patch #5 ====================
 
-PR target/105325: Fix power10_fusion
+PR target/105325: Fix memory constraints for power10 fusion
 
 This patch applies stricter predicates and constraints for LD and LWA
 instructions with power10 fusion.  These instructions are DS-form instructions,
@@ -9,7 +9,7 @@ did not use the stricter predicates and constraints, and if the user used the
 -fstack-protector option, it would generate a non-prefixed load instruction
 whose offset was too big if the stack is large.
 
-2023-05-08   Michael Meissner  <meissner@linux.ibm.com>
+2023-05-09   Michael Meissner  <meissner@linux.ibm.com>
 
 gcc/
 
@@ -26,23 +26,30 @@ gcc/testsuite/
 	* gcc.target/powerpc/fusion-p10-ldcmpi.c: Adjust insn counts.
 
 
-==================== Branch work121, patch #1 ====================
+==================== Branch work121, patch #4 ====================
 
-Rewrite genfusion.pl's gen_ld_cmpi_p10 function.
+PR target/105325, Rewrite genfusion.pl's gen_ld_cmpi_p10 function.
 
 This patch rewrites the gen_ld_cmpi_p10 function to be clearer.  The resulting
-fusion.md file is exactly the same in this patch.  The next patch in this series
-will fix PR target/105325.
+fusion.md file that this patch generates is exactly the same as the previous
+version of genfusion.pl.  The next patch in this series will fix PR
+target/105325.
 
-2023-05-08   Michael Meissner  <meissner@linux.ibm.com>
+2023-05-09   Michael Meissner  <meissner@linux.ibm.com>
 
 gcc/
 
 	PR target/105325
-	* config/rs6000/genfusion.pl (print_ld_cmpi_p10): New function, split
-	off from gen_ld_cmpi_p10.
+	* config/rs6000/genfusion.pl (mode_to_ldst_char): Delete.
+	(print_ld_cmpi_p10): New function, split off from gen_ld_cmpi_p10.
 	(gen_ld_cmpi_p10): Rewrite completely.
 
+==================== Branch work121, patch #3 was reverted ====================
+
+==================== Branch work121, patch #2 was reverted ====================
+
+==================== Branch work121, patch #1 was reverted ====================
+
 ==================== Branch work121, baseline ====================
 
 2023-05-08   Michael Meissner  <meissner@linux.ibm.com>

* [gcc(refs/users/meissner/heads/work121)] Update ChangeLog.meissner
@ 2023-05-08 23:55 Michael Meissner
  0 siblings, 0 replies; 13+ messages in thread
From: Michael Meissner @ 2023-05-08 23:55 UTC
  To: gcc-cvs

https://gcc.gnu.org/g:dba283af20b512ec7da687e5c1afbd03af238b5d

commit dba283af20b512ec7da687e5c1afbd03af238b5d
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Mon May 8 19:55:01 2023 -0400

    Update ChangeLog.meissner

Diff:
---
 gcc/ChangeLog.meissner | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index e6965fbb1f7..0a3f9b83ff1 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -16,7 +16,7 @@ gcc/
 	PR target/105325
 	* config/rs6000/genfusion.pl (print_ld_cmpi_p10): Use "YZ" constraints
 	for DS-form loads.  Set the sign_extend attribute for loads that do sign
-	extension.  Use the lwa_operand predicator for the LWA instruction.
+	extension.  Use the lwa_operand predicate for the LWA instruction.
 	* config/rs6000/fusion.md: Regenerate.
 
 gcc/testsuite/

* [gcc(refs/users/meissner/heads/work121)] Update ChangeLog.meissner
@ 2023-05-08 23:46 Michael Meissner
  0 siblings, 0 replies; 13+ messages in thread
From: Michael Meissner @ 2023-05-08 23:46 UTC
  To: gcc-cvs

https://gcc.gnu.org/g:cd2f91a22920c4cf8713f9edec707a350fc0bbfd

commit cd2f91a22920c4cf8713f9edec707a350fc0bbfd
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Mon May 8 19:46:48 2023 -0400

    Update ChangeLog.meissner

Diff:
---
 gcc/ChangeLog.meissner | 45 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index 3658ec88e7f..e6965fbb1f7 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -1,3 +1,48 @@
+==================== Branch work121, patch #2 ====================
+
+PR target/105325: Fix power10_fusion
+
+This patch applies stricter predicates and constraints for LD and LWA
+instructions with power10 fusion.  These instructions are DS-form instructions,
+which means that the bottom 2 bits of the address must be 0.  In the past, we
+did not use the stricter predicates and constraints, and if the user used the
+-fstack-protector option, it would generate a non-prefixed load instruction
+whose offset was too big if the stack is large.
+
+2023-05-08   Michael Meissner  <meissner@linux.ibm.com>
+
+gcc/
+
+	PR target/105325
+	* config/rs6000/genfusion.pl (print_ld_cmpi_p10): Use "YZ" constraints
+	for DS-form loads.  Set the sign_extend attribute for loads that do sign
+	extension.  Use the lwa_operand predicator for the LWA instruction.
+	* config/rs6000/fusion.md: Regenerate.
+
+gcc/testsuite/
+
+	PR target/105325
+	* g++.target/powerpc/pr105325.C: New test.
+	* gcc.target/powerpc/fusion-p10-ldcmpi.c: Adjust insn counts.
+
+
+==================== Branch work121, patch #1 ====================
+
+Rewrite genfusion.pl's gen_ld_cmpi_p10 function.
+
+This patch rewrites the gen_ld_cmpi_p10 function to be clearer.  The resulting
+fusion.md file is exactly the same in this patch.  The next patch in this series
+will fix PR target/105325.
+
+2023-05-08   Michael Meissner  <meissner@linux.ibm.com>
+
+gcc/
+
+	PR target/105325
+	* config/rs6000/genfusion.pl (print_ld_cmpi_p10): New function, split
+	off from gen_ld_cmpi_p10.
+	(gen_ld_cmpi_p10): Rewrite completely.
+
 ==================== Branch work121, baseline ====================
 
 2023-05-08   Michael Meissner  <meissner@linux.ibm.com>
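
The DS-form restriction described in the PR target/105325 entry above can be
illustrated with a small C sketch.  It assumes the usual Power ISA encoding
for non-prefixed DS-form loads such as LD and LWA: a 14-bit DS field with two
implied zero bits, giving a signed 16-bit displacement that must be a multiple
of 4.  The helper name ds_form_offset_ok is hypothetical and is not part of
GCC; the real checks live in the rs6000 predicates and constraints named in
the ChangeLog.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC: return true if OFFSET could be
   encoded as the displacement of a non-prefixed DS-form load (LD, LWA).  */
static bool
ds_form_offset_ok (int64_t offset)
{
  /* The DS field is shifted left by 2 bits, so the bottom 2 bits of the
     displacement must be 0.  */
  if ((offset & 3) != 0)
    return false;

  /* The shifted field is a signed 16-bit displacement.  */
  return offset >= -32768 && offset <= 32767;
}
```

With a large stack frame, a stack-protector slot could sit at an offset such
as 65536, for which ds_form_offset_ok returns false.  That is the kind of
offset the stricter predicates and constraints are meant to keep out of a
non-prefixed load.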
