public inbox for libstdc++-cvs@sourceware.org
* [gcc(refs/users/meissner/heads/work125-vpair)] Merge commit 'refs/users/meissner/heads/work125-vpair' of git+ssh://gcc.gnu.org/git/gcc into me/work
@ 2023-07-14 19:52 Michael Meissner
From: Michael Meissner @ 2023-07-14 19:52 UTC (permalink / raw)
To: gcc-cvs, libstdc++-cvs
https://gcc.gnu.org/g:630e1ba4d622b53da62dc7d07bcfa842da5f2162
commit 630e1ba4d622b53da62dc7d07bcfa842da5f2162
Merge: 9846b58a20f c53c1e9f353
Author: Michael Meissner <meissner@linux.ibm.com>
Date: Fri Jul 14 15:51:50 2023 -0400
Merge commit 'refs/users/meissner/heads/work125-vpair' of git+ssh://gcc.gnu.org/git/gcc into me/work125-vpair
Diff:
gcc/ChangeLog.meissner | 2 +-
gcc/REVISION | 2 +-
gcc/c-family/ChangeLog.meissner | 2 +-
gcc/c/ChangeLog.meissner | 2 +-
gcc/cp/ChangeLog.meissner | 2 +-
gcc/fortran/ChangeLog.meissner | 2 +-
gcc/testsuite/ChangeLog.meissner | 2 +-
libgcc/ChangeLog.meissner | 2 +-
libstdc++-v3/ChangeLog.meissner | 2 +-
9 files changed, 9 insertions(+), 9 deletions(-)
diff --cc gcc/ChangeLog.meissner
index 306c070821c,8c6f8f14248..5051fa02d69
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@@ -1,364 -1,4 +1,364 @@@
+==================== Branch work125, patch #9 ====================
+
+PR target/89213 - Optimize vector shift by a constant.
+
+Optimize vector shifts by a constant, taking advantage of the fact that the
+shift instructions only look at the low-order bits of each element.
+
+The PowerPC doesn't have a VSPLTID instruction. This meant that if we are doing
+a V2DI shift of 0..15, we had to do VSPLTISW and VEXTSW2D instructions to load
+the constant into the vector register.
+
+Similarly for V4SI and V2DI, if we wanted to shift more than 15 bits, we would
+generate XXSPLTIB and VEXTSB2D or VEXTSB2W instructions to load the constant
+into the vector register.
+
+Given the vector shift instructions only look at the bottom 5 or 6 bits of the
+shift value, we can load the constant via VSPLTISW or XXSPLTIB instructions and
+eliminate the sign extend instructions (VEXTSW2D, VEXTSB2D, and VEXTSB2W).
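
The reasoning above can be checked with a small scalar model. This is only a sketch, not code from the patch: the helper names (lane_shl32, lane_shl64, splat_byte_word, splat_byte_dword, splat_word_dword) are hypothetical, and each function models a single vector lane in plain C under the assumption stated in the text, namely that a word shift uses only the low 5 bits of the count and a doubleword shift only the low 6 bits.

```c
#include <assert.h>
#include <stdint.h>

/* One 32-bit lane of a vector shift left: only the low 5 bits of the
   per-lane count are used (assumption taken from the text above).  */
static uint32_t lane_shl32 (uint32_t value, uint32_t count)
{
  return value << (count & 31u);
}

/* One 64-bit lane of a vector shift left: only the low 6 bits count.  */
static uint64_t lane_shl64 (uint64_t value, uint64_t count)
{
  return value << (count & 63u);
}

/* Contents of a word lane after splatting byte B into every byte,
   with no sign extension afterwards.  */
static uint32_t splat_byte_word (uint8_t b)
{
  return (uint32_t) b * 0x01010101u;
}

/* Contents of a doubleword lane after splatting byte B into every byte.  */
static uint64_t splat_byte_dword (uint8_t b)
{
  return (uint64_t) b * 0x0101010101010101ull;
}

/* Contents of a doubleword lane after splatting a sign-extended 5-bit
   immediate (-16..15) into each word, with no further extension.  */
static uint64_t splat_word_dword (int imm)
{
  uint32_t w = (uint32_t) (int32_t) imm;
  return ((uint64_t) w << 32) | w;
}

/* The un-extended splat gives the same shift as the exact count.  */
static void check_shift_model (void)
{
  assert (lane_shl32 (1u, splat_byte_word (20)) == (1u << 20));
  assert (lane_shl64 (1ull, splat_byte_dword (40)) == (1ull << 40));
  assert (lane_shl64 (1ull, splat_word_dword (7)) == (1ull << 7));
}
```

In this model the upper bits of the splatted lane are garbage as far as the shift is concerned, which is why the sign-extend step contributes nothing to the result.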
+
+I have tested this patch on the following systems and there was no degradation.
+Can I check it into the trunk branch?
+
+ * Power10, LE, --with-cpu=power10, IBM 128-bit long double
+ * Power9, LE, --with-cpu=power9, IBM 128-bit long double
+ * Power9, LE, --with-cpu=power9, IEEE 128-bit long double
+ * Power9, LE, --with-cpu=power9, 64-bit default long double
+ * Power9, BE, --with-cpu=power9, IBM 128-bit long double
+ * Power8, BE, --with-cpu=power8, IBM 128-bit long double
+
+2023-07-14 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ PR target/89213
+ * config/rs6000/altivec.md (UNSPEC_VECTOR_SHIFT): New unspec.
+ (V4SI_V2DI): New mode iterator.
+ (vshift_code): New code iterator.
+ (altivec_<code>_const_<mode>): New insns.
+ (altivec_shift_const_<mode>): New insns.
+ * config/rs6000/predicates.md (vector_shift_constant): New
+ predicate.
+
+gcc/testsuite/
+
+ PR target/89213
+ * gcc.target/powerpc/pr89213.c: New test.
+ * gcc.target/powerpc/vec-rlmi-rlnm.c: Update insn count.
+
+==================== Branch work125, patch #8 ====================
+
+Update fold-vec-extract insn counts on 32-bit big endian
+
+In running tests, I noticed that on big endian systems the expected number of
+ADDI instructions is higher than the number the compiler currently generates.
+This patch adjusts those counts.
+
+I have tested this patch on the following systems and there was no degradation.
+Can I check it into the trunk branch?
+
+ * Power10, LE, --with-cpu=power10, IBM 128-bit long double
+ * Power9, LE, --with-cpu=power9, IBM 128-bit long double
+ * Power9, LE, --with-cpu=power9, IEEE 128-bit long double
+ * Power9, LE, --with-cpu=power9, 64-bit default long double
+ * Power9, BE, --with-cpu=power9, IBM 128-bit long double
+ * Power8, BE, --with-cpu=power8, IBM 128-bit long double
+
+2023-07-14 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/testsuite/
+
+ * gcc.target/powerpc/fold-vec-extract-char.p7.c: Update insn count for
+ 32-bit.
+ * gcc.target/powerpc/fold-vec-extract-double.p7.c: Likewise.
+ * gcc.target/powerpc/fold-vec-extract-float.p7.c: Likewise.
+ * gcc.target/powerpc/fold-vec-extract-float.p8.c: Likewise.
+ * gcc.target/powerpc/fold-vec-extract-int.p7.c: Likewise.
+ * gcc.target/powerpc/fold-vec-extract-int.p8.c: Likewise.
+ * gcc.target/powerpc/fold-vec-extract-short.p7.c: Likewise.
+ * gcc.target/powerpc/fold-vec-extract-short.p8.c: Likewise.
+
+==================== Branch work125, patch #7 ====================
+
+Allow constant element vec_extract to be converted to floating point
+
+This patch allows vec_extract of the following types to be converted to
+floating point by loading the value directly to the vector register, and then
+doing the conversion instead of loading the value to a GPR and then doing a
+direct move:
+
+vector int
+vector unsigned int
+vector unsigned short
+vector unsigned char
+
+I have tested this patch on the following systems and there was no degradation.
+Can I check it into the trunk branch?
+
+ * Power10, LE, --with-cpu=power10, IBM 128-bit long double
+ * Power9, LE, --with-cpu=power9, IBM 128-bit long double
+ * Power9, LE, --with-cpu=power9, IEEE 128-bit long double
+ * Power9, LE, --with-cpu=power9, 64-bit default long double
+ * Power9, BE, --with-cpu=power9, IBM 128-bit long double
+ * Power8, BE, --with-cpu=power8, IBM 128-bit long double
+
+2023-07-14 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ * config/rs6000/rs6000.md (fp_int_extend): New code attribute.
+ * config/rs6000/vsx.md (vsx_extract_v4si_load_to_<uns><mode>): New
+ insn.
+ (vsx_extract_<VSX_EXTRACT_I2:mode>_load_to_uns<SFDF:mode>): New insn.
+
+gcc/testsuite/
+
+ * gcc.target/powerpc/vec-extract-mem-char-2.c: New file.
+ * gcc.target/powerpc/vec-extract-mem-int-2.c: New file.
+ * gcc.target/powerpc/vec-extract-mem-int_3.c: New file.
+ * gcc.target/powerpc/vec-extract-mem-short-2.c: New file.
+
+==================== Branch work125, patch #6 ====================
+
+Add alternatives for vec_extract with constant element loading from memory.
+
+This patch expands the alternatives for doing vec_extract of V4SI, V8HI, and
+V16QI vectors with a constant offset when the vector is in memory. If the
+element number is 0 or we are using offsettable addressing for loading up GPR
+registers we don't need to allocate a temporary base register. We can fold the
+offset from the vec_extract into the normal address.
+
+I also added alternatives to load the values into vector registers. If we load
+the value into vector registers, we require X-form addressing.
+
+I added the VSX_EX_ISA mode attribute to distinguish that we can load 32-bit
+integers on a power8 system to vector registers, but we need a power9 system to
+be able to load 8-bit or 16-bit integers.
+
+In general, loading up small integer values with vec_extract into the vector
+registers explicitly is likely not done that much. However, this will be needed
+in later patches when we want to combine loading up a small integer value into a
+vector register with sign/zero extension. This happens when we want to do a
+vec_extract of a small integer value and convert it to floating point.
+
+I have tested this patch on the following systems and there was no degradation.
+Can I check it into the trunk branch?
+
+ * Power10, LE, --with-cpu=power10, IBM 128-bit long double
+ * Power9, LE, --with-cpu=power9, IBM 128-bit long double
+ * Power9, LE, --with-cpu=power9, IEEE 128-bit long double
+ * Power9, LE, --with-cpu=power9, 64-bit default long double
+ * Power9, BE, --with-cpu=power9, IBM 128-bit long double
+ * Power8, BE, --with-cpu=power8, IBM 128-bit long double
+
+2023-07-14 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ * config/rs6000/vsx.md (VSX_EX_ISA): New mode attribute.
+ (vsx_extract_<mode>_load): Add more alternatives for memory options.
+ Allow the load to load up vector registers if needed.
+
+gcc/testsuite/
+
+ * gcc.target/powerpc/vec-extract-mem-char-1.c: New test.
+ * gcc.target/powerpc/vec-extract-mem-int-1.c: New test.
+ * gcc.target/powerpc/vec-extract-mem-short-1.c: New test.
+
+==================== Branch work125, patch #5 ====================
+
+Optimize vec_extract of V4SF with variable element number being converted to DF
+
+This patch adds a combiner insn that folds the float to double conversion into
+the load from memory when vec_extract of V4SF with a variable element number is
+done.
+
+It also removes the '?' from the 'r' constraint so that if the SFmode is needed
+in a GPR, it doesn't have to load it to the vector unit, store it on the stack,
+and reload it into a GPR register.
+
+2023-07-11 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ * config/rs6000/vsx.md (vsx_extract_v4sf_var_load): Remove '?' from 'r'
+ constraint.
+ (vsx_extract_v4sf_var_load_to_df): New insn.
+
+gcc/testsuite/
+
+ * gcc.target/powerpc/vec-extract-mem-float-2.c: New file.
+
+==================== Branch work125, patch #4 ====================
+
+Optimize vec_extract of V4SF from memory with constant element numbers.
+
+This patch updates vec_extract of V4SF from memory with constant element
+numbers.
+
+I went through the alternatives, and I added alternatives to denote when we
+don't need to allocate a temporary base register. These cases include
+extracting element 0, and extracting elements 1-3 where we can use offsettable
+addresses.
+
+I added alternatives for power8 and power9 units to account for the expanded
+addressing on these machines (power8 can load SFmode into Altivec registers with
+x-form addressing, and power9 can use offsettable addressing to load up Altivec
+registers).
+
+This patch corrects the ISA test for loading SF values to altivec registers to
+be power8 vector, and not power7.
+
+This patch adds a combiner insn to combine loading up a SF element and
+converting it to double.
+
+It also removes the '?' from the 'r' constraint so that if the SFmode is needed
+in a GPR, it doesn't have to load it to the vector unit, store it, and then
+reload it into the GPR register.
+
+I have tested this patch on the following systems and there was no degradation.
+Can I check it into the trunk branch?
+
+ * Power10, LE, --with-cpu=power10, IBM 128-bit long double
+ * Power9, LE, --with-cpu=power9, IBM 128-bit long double
+ * Power9, LE, --with-cpu=power9, IEEE 128-bit long double
+ * Power9, LE, --with-cpu=power9, 64-bit default long double
+ * Power9, BE, --with-cpu=power9, IBM 128-bit long double
+ * Power8, BE, --with-cpu=power8, IBM 128-bit long double
+
+2023-07-14 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ * config/rs6000/vsx.md (vsx_extract_v4sf_load): Fix ISA for loading
+ up SFmode values with x-form addresses. Remove ? from 'r' constraint.
+ Add more alternatives to prevent requiring a temporary base register if
+ we don't need the temporary.
+ (vsx_extract_v4sf_load_to_df): New insn.
+
+gcc/testsuite/
+
+ * gcc.target/powerpc/vec-extract-mem-float-1.c: New file.
+
+==================== Branch work125, patch #3 ====================
+
+Fix typo in insn name.
+
+In doing other work, I noticed that there was an insn:
+
+ vsx_extract_v4sf_<mode>_load
+
+Which did not have an iterator. I removed the useless <mode>.
+
+I have tested this patch on the following systems and there was no degradation.
+Can I check it into the trunk branch?
+
+ * Power10, LE, --with-cpu=power10, IBM 128-bit long double
+ * Power9, LE, --with-cpu=power9, IBM 128-bit long double
+ * Power9, LE, --with-cpu=power9, IEEE 128-bit long double
+ * Power9, LE, --with-cpu=power9, 64-bit default long double
+ * Power9, BE, --with-cpu=power9, IBM 128-bit long double
+ * Power8, BE, --with-cpu=power8, IBM 128-bit long double
+
+2023-07-14 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ * config/rs6000/vsx.md (vsx_extract_v4sf_load): Rename from
+ vsx_extract_v4sf_<mode>_load.
+
+==================== Branch work125, patch #2 ====================
+
+Improve 64->128 bit zero extension on PowerPC (PR target/108958)
+
+If we are converting an unsigned DImode to a TImode value, and the TImode value
+will go in a vector register, GCC currently does the DImode to TImode conversion
+in GPR registers, and then moves the value to the vector register via a mtvsrdd
+instruction.
+
+This patch adds a new zero_extendditi2 insn which optimizes moving a GPR to a
+vector register using the mtvsrdd instruction with RA=0, and using lxvrdx to
+load a 64-bit value into the bottom 64-bits of the vector register.
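
A scalar model may help picture the register-to-register case. This is a sketch only, not the insn from the patch: the type ti_model and the functions mtvsrdd_model and zero_extend_di_to_ti are hypothetical names, and the model assumes the usual description of mtvsrdd, in which an RA field of 0 supplies zero for the upper doubleword instead of reading a register.

```c
#include <assert.h>
#include <stdint.h>

/* A 128-bit value as two doublewords; HI is the upper 64 bits.  */
typedef struct
{
  uint64_t hi;
  uint64_t lo;
} ti_model;

/* Model of mtvsrdd XT,RA,RB.  RA_FIELD is the register number encoded
   in the instruction; when it is 0 the upper doubleword is zero rather
   than the contents of a register (assumption stated above).  */
static ti_model mtvsrdd_model (unsigned ra_field, uint64_t ra_value,
                               uint64_t rb_value)
{
  ti_model t;
  t.hi = (ra_field == 0) ? 0 : ra_value;
  t.lo = rb_value;
  return t;
}

/* Zero extension of a 64-bit value to 128 bits as a single move:
   RA=0 gives the zero upper half, RB carries the value.  */
static ti_model zero_extend_di_to_ti (uint64_t x)
{
  return mtvsrdd_model (0, 0, x);
}

static void check_zero_extend_model (void)
{
  ti_model t = zero_extend_di_to_ti (0xfedcba9876543210ull);
  assert (t.hi == 0);
  assert (t.lo == 0xfedcba9876543210ull);
}
```

The point of the model is that no separate instruction is needed to clear the upper half; the RA=0 encoding already supplies it.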
+
+I have tested this patch on the following systems and there was no degradation.
+Can I check it into the trunk branch?
+
+ * Power10, LE, --with-cpu=power10, IBM 128-bit long double
+ * Power9, LE, --with-cpu=power9, IBM 128-bit long double
+ * Power9, LE, --with-cpu=power9, IEEE 128-bit long double
+ * Power9, LE, --with-cpu=power9, 64-bit default long double
+ * Power9, BE, --with-cpu=power9, IBM 128-bit long double
+ * Power8, BE, --with-cpu=power8, IBM 128-bit long double
+
+2023-07-14 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ PR target/108958
+ * config/rs6000/rs6000.md (zero_extendditi2): New insn.
+
+gcc/testsuite/
+
+ PR target/108958
+ * gcc.target/powerpc/pr108958.c: New test.
+
+==================== Branch work125, patch #1 ====================
+
+Optimize vec_splats of vec_extract for V2DI/V2DF (PR target/99293)
+
+This patch optimizes cases like:
+
+ vector double v1, v2;
+ /* ... */
+ v2 = vec_splats (vec_extract (v1, 0)); /* or */
+ v2 = vec_splats (vec_extract (v1, 1));
+
+Previously:
+
+ vector long long
+ splat_dup_l_0 (vector long long v)
+ {
+ return __builtin_vec_splats (__builtin_vec_extract (v, 0));
+ }
+
+would generate:
+
+ mfvsrld 9,34
+ mtvsrdd 34,9,9
+ blr
+
+With this patch, GCC generates:
+
+ xxpermdi 34,34,34,3
+ blr
+
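
To see why a single xxpermdi is enough, here is a scalar model of the instruction. It is a sketch under stated assumptions, not code from the patch: vsr_model and xxpermdi_model are hypothetical names, doublewords are numbered in the register's big-endian order (dw[0] is the most significant), and the model follows the usual description of xxpermdi, where the high bit of DM selects a doubleword of the first source and the low bit selects a doubleword of the second.

```c
#include <assert.h>
#include <stdint.h>

/* A vector register as two doublewords, dw[0] most significant.  */
typedef struct
{
  uint64_t dw[2];
} vsr_model;

/* Model of xxpermdi XT,XA,XB,DM: result doubleword 0 is taken from XA
   (selected by the high bit of DM), result doubleword 1 from XB
   (selected by the low bit of DM).  */
static vsr_model xxpermdi_model (vsr_model a, vsr_model b, unsigned dm)
{
  vsr_model t;
  t.dw[0] = a.dw[(dm >> 1) & 1u];
  t.dw[1] = b.dw[dm & 1u];
  return t;
}

/* With both sources equal, DM=3 replicates doubleword 1 and DM=0
   replicates doubleword 0.  */
static void check_xxpermdi_model (void)
{
  vsr_model v = { { 0x1111u, 0x2222u } };
  vsr_model hi = xxpermdi_model (v, v, 0);
  vsr_model lo = xxpermdi_model (v, v, 3);
  assert (hi.dw[0] == 0x1111u && hi.dw[1] == 0x1111u);
  assert (lo.dw[0] == 0x2222u && lo.dw[1] == 0x2222u);
}
```

In the sequence shown above, DM=3 copies the same low doubleword that the earlier mfvsrld/mtvsrdd pair moved through a GPR, so the splat stays entirely in the vector unit.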
+
+I have tested this patch on the following systems and there was no degradation.
+Can I check it into the trunk branch?
+
+ * Power10, LE, --with-cpu=power10, IBM 128-bit long double
+ * Power9, LE, --with-cpu=power9, IBM 128-bit long double
+ * Power9, LE, --with-cpu=power9, IEEE 128-bit long double
+ * Power9, LE, --with-cpu=power9, 64-bit default long double
+ * Power9, BE, --with-cpu=power9, IBM 128-bit long double
+ * Power8, BE, --with-cpu=power8, IBM 128-bit long double
+
+2023-07-14 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ PR target/99293
+ * config/rs6000/vsx.md (vsx_splat_extract_<mode>): New combiner
+ insn.
+
+gcc/testsuite/
+
+ PR target/99293
+ * gcc.target/powerpc/pr99293.c: New test.
+ * gcc.target/powerpc/builtins-1.c: Update insn count.
+
+
- ==================== Branch work125, baseline ====================
+ ==================== Branch work125-vpair, baseline ====================
2023-07-14 Michael Meissner <meissner@linux.ibm.com>