From: Michael Meissner <meissner@gcc.gnu.org>
To: gcc-cvs@gcc.gnu.org
Subject: [gcc(refs/users/meissner/heads/work120)] Update ChangeLog.meissner
Date: Mon, 1 May 2023 23:14:00 +0000 (GMT)
Message-ID: <20230501231400.752463858D1E@sourceware.org>

https://gcc.gnu.org/g:41c363568baf9c9a2051c341e4d0c041fdbc8ec5

commit 41c363568baf9c9a2051c341e4d0c041fdbc8ec5
Author: Michael Meissner <meissner@linux.ibm.com>
Date: Mon May 1 19:13:57 2023 -0400

    Update ChangeLog.meissner

Diff:
---
 gcc/ChangeLog.meissner | 319 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 319 insertions(+)

diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index d93fc70d3f1..b4314aa4954 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -1,3 +1,322 @@
+==================== Branch work120, patch #30 was reverted ====================
+
+==================== Branch work120, patch #27 ====================
+
+Optimize variable element vec_extract to be converted to floating point
+
+This patch optimizes vec_extract with a variable element number of the following
+types to be converted to floating point by loading the value directly to the
+vector register, and then doing the conversion instead of loading the value to a
+GPR and then doing a direct move:
+
+vector int
+vector unsigned int
+vector unsigned short
+vector unsigned char
+
+2023-04-28 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ * config/rs6000/vsx.md (vsx_extract_v4si_var_load_to_<uns><mode>): New
+ insn.
+ (vsx_extract_<VSX_EXTRACT_I2:mode>_var_load_to_uns<SFDF:mode>): New
+ insn.
+
+gcc/testsuite/
+
+ * gcc.target/powerpc/vec-extract-mem-int-6.c: New file.
+ * gcc.target/powerpc/vec-extract-mem-int_7.c: New file.
+
+
+==================== Branch work120, patch #26 ====================
+
+Allow constant element vec_extract to be converted to floating point
+
+This patch allows vec_extract of the following types to be converted to
+floating point by loading the value directly to the vector register, and then
+doing the conversion instead of loading the value to a GPR and then doing a
+direct move:
+
+vector int
+vector unsigned int
+vector unsigned short
+vector unsigned char
+
+2023-05-01 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ * config/rs6000/rs6000.md (fp_int_extend): New code attribute.
+ * config/rs6000/vsx.md (vsx_extract_v4si_load_to_<uns><mode>): New
+ insn.
+ (vsx_extract_<VSX_EXTRACT_I2:mode>_load_to_uns<SFDF:mode>): New insn.
+
+gcc/testsuite/
+
+ * gcc.target/powerpc/vec-extract-mem-char-2.c: New file.
+ * gcc.target/powerpc/vec-extract-mem-int-4.c: New file.
+ * gcc.target/powerpc/vec-extract-mem-int_5.c: New file.
+ * gcc.target/powerpc/vec-extract-mem-short-4.c: New file.
+
+==================== Branch work120, patch #25 ====================
+
+Allow variable element vec_extract to be sign or zero extended
+
+This patch allows vec_extract of V4SI, V8HI, and V16QI vector types with a
+variable element number to be loaded with sign or zero extension, and GCC will
+not generate a separate zero/sign extension instruction.
+
+2023-05-01 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ * config/rs6000/vsx.md (vsx_extract_v4si_var_load_to_<su>di): New insn.
+ (vsx_extract_<VSX_EXTRACT_I2:mode>_var_load_to_u<GPR:mode>): New insn.
+ (vsx_extract_v8hi_var_load_to_s<mode>): New insn.
+
+gcc/testsuite/
+
+ * gcc.target/powerpc/vec-extract-mem-int-3.c: New file.
+ * gcc.target/powerpc/vec-extract-mem-short-3.c: New file.
+
+==================== Branch work120, patch #24 ====================
+
+Allow variable element vec_extract to be loaded into vector registers.
+
+This patch allows vec_extract of V4SI, V8HI, and V16QI vector types with a
+variable element number to be loaded into vector registers directly.
+
+2023-05-01 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ * config/rs6000/vsx.md (vsx_extract_<mode>_var_load): Allow vector
+ registers to be loaded.
+
+==================== Branch work120, patch #23 ====================
+
+Optimize sign and zero extension of vec_extract from memory with constant element
+
+This patch allows vec_extract of V4SI, V8HI, and V16QI vector types with a
+constant element number to be zero extended. It also allows vec_extract of V4SI
+and V8HI vector types with constant element number to be sign extended.
+
+2023-05-01 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ * config/rs6000/vsx.md (vsx_extract_v4si_load_to_<su>d): New insn.
+ (vsx_extract_<VSX_EXTRACT_I2:mode>_load_to_u<GPR:mode>): New insn.
+ (vsx_extract_v8hi_load_to_s<mode>): New insn.
+
+gcc/testsuite/
+
+ * gcc.target/powerpc/vec-extract-mem-char-1.c: New file.
+ * gcc.target/powerpc/vec-extract-mem-int-1.c: New file.
+ * gcc.target/powerpc/vec-extract-mem-int-2.c: New file.
+ * gcc.target/powerpc/vec-extract-mem-short-1.c: New file.
+ * gcc.target/powerpc/vec-extract-mem-short-2.c: New file.
+
+==================== Branch work120, patch #22 ====================
+
+Allow constant element vec_extract to be loaded into vector registers.
+
+This patch allows vec_extract of V4SI, V8HI, and V16QI vector types with a
+constant element number to be loaded into vector registers directly.
+
+This patch also adds support for optimizing 0 element number to not need a base
+register temporary. Likewise, if we have an offsettable address, we don't need
+to allocate a scratch register.
+
+2023-05-01 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ * config/rs6000/vsx.md (VSX_EX_ISA): New mode attribute.
+ (vsx_extract_<mode>_load): Allow vector registers to be loaded. Add
+ optimizations for loading up element 0 and/or with an offsettable
+ address.
+
+==================== Branch work120, patch #21 ====================
+
+Optimize vec_extract of V4SF with variable element number being converted to DF
+
+This patch adds a combiner insn to include the conversion of float to double
+within the memory address when vec_extract of V4SF with a variable element
+number is done.
+
+It also removes the '?' from the 'r' constraint so that if the SFmode is needed
+in a GPR, it doesn't have to load it to the vector unit and then store it.
+
+2023-05-01 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ * config/rs6000/vsx.md (vsx_extract_v4sf_var_load): Remove '?' from 'r'
+ constraint.
+ (vsx_extract_v4sf_var_load_to_df): New insn.
+
+gcc/testsuite/
+
+ * gcc.target/powerpc/vec-extract-mem-float-2.c: New file.
+
+==================== Branch work120, patch #20 ====================
+
+Optimize vec_extract of V4SF from memory with constant element numbers.
+
+This patch updates vec_extract of V4SF from memory with constant element
+numbers.
+
+This patch corrects the ISA for loading SF values to altivec registers to be
+power8 vector, and not power7.
+
+This patch adds a combiner patch to combine loading up a SF element and
+converting it to double.
+
+It also removes the '?' from the 'r' constraint so that if the SFmode is needed
+in a GPR, it doesn't have to load it to the vector unit and then store it.
+
+2023-05-01 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ * gcc/config/rs6000/vsx.md (vsx_extract_v4sf_load): Fix ISA for loading
+ up SFmode values with x-form addresses. Remove '?' from 'r' constraint.
+ (vsx_extract_v4sf_load_to_df): New insn.
+
+gcc/testsuite/
+
+ * gcc.target/powerpc/vec-extract-mem-float-1.c: New file.
+
+==================== Branch work120, patch #3 ====================
+
+Fix typo in insn name.
+
+In doing other work, I noticed that there was an insn:
+
+ vsx_extract_v4sf_<mode>_load
+
+which did not have an iterator. I removed the useless <mode>.
+
+2023-05-01 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ * config/rs6000/vsx.md (vsx_extract_v4sf_load): Rename from
+ vsx_extract_v4sf_<mode>_load.
+
+==================== Branch work120, patch #2 ====================
+
+Improve 64->128 bit zero extension on PowerPC
+
+2023-05-01 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ PR target/108958
+ * gcc/config/rs6000.md (zero_extendditi2): New insn.
+
+gcc/testsuite/
+
+ PR target/108958
+ * gcc.target/powerpc/zero-extend-di-ti.c: New test.
+
+==================== Branch work120, patch #1 ====================
+
+Fix splat of extract for long long and double.
+
+2023-05-01 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ PR target/99293
+ * gcc/config/rs6000/vsx.md (vsx_splat_extract_<mode>): New combiner
+ insn.
+
+gcc/testsuite/
+
+ PR target/99293
+ * gcc.target/powerpc/pr99293.c: New test.
+ * gcc.target/powerpc/builtins-1.c: Update insn count.
+
+==================== Branch work120, patch #1 ====================
+
+PR target/105325: Make load/cmp fusion know about prefixed loads.
+
+I posted a version of this patch on March 21st, a second version on March 24th,
+and the third version on March 28th.
+
+The V4 patch just adds a new condition to the new test case. Previously, I was
+using 'powerpc_prefixed_addr' to determine whether the GCC compiler would
+automatically generate prefixed addresses. The V4 version also adds a check
+for 'power10_ok'. Power10_ok is needed in case the compiler could generate
+prefixed addresses, but the assembler does not support prefixed instructions.
+
+The V3 patch makes some code changes suggested in the genfusion.pl code from
+the last 2 patch submissions. The fusion.md that is produced by genfusion.pl
+is the same in all 3 versions.
+
+In V3, I changed the genfusion.pl to match the suggestion for code layout. I
+also used the correct comment for each of the instructions (in the 2nd patch,
+when I rewrote the comments about ld and lwa being DS format instructions,
+I had put the ld comment in the section handling lwa, and vice versa).
+
+In V3, I also removed lp64 from the new test. When I first added the prefixed
+code, it was only done for 64-bit, but now it is allowed for 32-bit. However,
+the case that shows up (lwa) would not hit in 32-bit, since it only generates
+lwz and not lwa. It also would not generate ld. But the test does pass when
+it is built with -m32.
+
+The issue with the original bug is that the power10 load GPR + cmpi -1/0/1
+fusion optimization generates illegal assembler code.
+
+Ultimately the code was dying because the fusion load + compare -1/0/1 patterns
+did not handle the possibility that the load might be prefixed.
+
+The main cause is that the constraints for the individual loads in the fusion
+did not match the machine. In particular, LWA is a DS format instruction when
+it is unprefixed. The code also did not set the prefixed attribute correctly.
+
+This patch rewrites the genfusion.pl script so that it will have more accurate
+constraints for the LWA and LD instructions (which are DS instructions). The
+updated genfusion.pl was then run to update fusion.md. Finally, the code for
+the "prefixed" attribute is modified so that it considers load + compare
+immediate patterns to be like the normal load insns in checking whether
+operand[1] is a prefixed instruction.
+
+I have tested this code on a power9 little endian system (with long double
+being IEEE 128-bit and IBM 128-bit), a power10 little endian system, and a
+power8 big endian system, testing both 32-bit and 64-bit code generation.
+
+For the V4 changes I also built the compiler on a big endian system with an
+older assembler, and I verified that the pr105325.C test was listed as
+unsupported.
+
+Can I put this code into the master branch, and after a waiting period, apply
+it to the GCC 12 and GCC 11 branches (the bug does show up in those branches,
+and the patch applies without change)?
+
+2023-05-01 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ PR target/105325
+ * gcc/config/rs6000/genfusion.pl (gen_ld_cmpi_p10): Improve generation
+ of the ld and lwa instructions which use the DS encoding instead of D.
+ Use the YZ constraint for these loads. Handle prefixed loads better.
+ Set the sign_extend attribute as appropriate.
+ * gcc/config/rs6000/fusion.md: Regenerate.
+ * gcc/config/rs6000/rs6000.md (prefixed attribute): Add fused_load_cmpi
+ instructions to the list of instructions that might have a prefixed load
+ instruction.
+
+gcc/testsuite/
+
+ PR target/105325
+ * g++.target/powerpc/pr105325.C: New test.
+ * gcc.target/powerpc/fusion-p10-ldcmpi.c: Adjust insn counts.
+
 ==================== Branch work120, baseline ====================

 2023-05-01 Michael Meissner <meissner@linux.ibm.com>
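
The PR target/105325 entry above depends on which displacements each load encoding can hold. As a rough illustration only (this is not code from genfusion.pl or rs6000.md), the sketch below classifies a load offset under three assumptions taken from the Power ISA: D-form loads such as lwz take a signed 16-bit displacement, DS-form loads such as ld and lwa take a signed 16-bit displacement whose low two bits must be zero, and Power10 prefixed loads take a signed 34-bit displacement. All enum and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; not taken from the GCC sources.  */
enum load_form { FORM_D, FORM_DS };

enum offset_class {
    OFFSET_UNPREFIXED,    /* fits the plain D-form or DS-form encoding */
    OFFSET_NEEDS_PREFIX,  /* only a prefixed (34-bit) load can encode it */
    OFFSET_OUT_OF_RANGE   /* needs the address computed in a register */
};

/* Return true if OFFSET fits in a signed field that is BITS wide.  */
static bool fits_signed(int64_t offset, unsigned bits)
{
    int64_t limit = (int64_t)1 << (bits - 1);
    return offset >= -limit && offset < limit;
}

/* Classify OFFSET for a load that uses the given unprefixed FORM.  */
enum offset_class classify_offset(enum load_form form, int64_t offset)
{
    bool unprefixed_ok = fits_signed(offset, 16);

    /* Assumption: DS-form requires the low two bits to be zero.  */
    if (form == FORM_DS && (offset & 3) != 0)
        unprefixed_ok = false;

    if (unprefixed_ok)
        return OFFSET_UNPREFIXED;
    if (fits_signed(offset, 34))
        return OFFSET_NEEDS_PREFIX;
    return OFFSET_OUT_OF_RANGE;
}
```

Under these assumptions an offset of 3 is fine for lwz but cannot be encoded by an unprefixed lwa, which is the kind of mismatch between constraints and machine that the patch description attributes the bug to.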