From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 1005) id E9C1D3858D20; Tue, 23 Jan 2024 05:23:40 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org E9C1D3858D20 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1705987420; bh=TxL0XCoKIp/fgUJ0UmTsmGD+vYTq8tbwPwwJv4KcaLc=; h=From:To:Subject:Date:From; b=leHXMyj4MLrXKxQrScD/lkH1k6H6cAiCwGx3CMrNMyiGDJ46qNneS5Qg3qg37MHt4 /AUP3xdm0F+97Rw6n4gw+YMmps73Av3vVGXfQRWK0uu1TTARa8mQJKnZIlM5dYBRGp j6S+VcPDVuiBC1ShpR8sqYekeCnUo/yqLpsW/+NE= Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit From: Michael Meissner To: gcc-cvs@gcc.gnu.org Subject: [gcc(refs/users/meissner/heads/work154-dmf)] Update ChangeLog.* X-Act-Checkin: gcc X-Git-Author: Michael Meissner X-Git-Refname: refs/users/meissner/heads/work154-dmf X-Git-Oldrev: e60fac84f708fb11afbc10830e33fc52c50ffbc3 X-Git-Newrev: 361050c46fc9888d6e6c2975e1292dfa8115c375 Message-Id: <20240123052340.E9C1D3858D20@sourceware.org> Date: Tue, 23 Jan 2024 05:23:40 +0000 (GMT) List-Id: https://gcc.gnu.org/g:361050c46fc9888d6e6c2975e1292dfa8115c375 commit 361050c46fc9888d6e6c2975e1292dfa8115c375 Author: Michael Meissner Date: Tue Jan 23 00:23:37 2024 -0500 Update ChangeLog.* Diff: --- gcc/ChangeLog.dmf | 626 +++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 624 insertions(+), 2 deletions(-) diff --git a/gcc/ChangeLog.dmf b/gcc/ChangeLog.dmf index c67a6e04a44..afa42df1c39 100644 --- a/gcc/ChangeLog.dmf +++ b/gcc/ChangeLog.dmf @@ -1,6 +1,628 @@ +==================== Branch work154-dmf, patch #109 ==================== + +Add paddis support. + +2024-01-23 Michael Meissner + +gcc/ + + * config/rs6000/constraints.md (eU): New constraint. + (eV): Likewise. + * config/rs6000/predicates.md (paddis_operand): New predicate. + (paddis_paddi_operand): Likewise. + (add_operand): Add paddis support. 
+ * config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS): Add -mpaddis + support. + (POWERPC_MASKS): Likewise. + * config/rs6000/rs6000.cc (num_insns_constant_gpr): Add -mpaddis + support. + (num_insns_constant_multi): Likewise. + (print_operand): Add %B for paddis support. + (rs6000_opt_masks): Add -mpaddis. + * config/rs6000/rs6000.h (SIGNED_INTEGER_32BIT_P): New macro. + * config/rs6000/rs6000.md (isa attribute): Add -mpaddis support. + (enabled attribute): Likewise. + (add3): Likewise. + (adddi3 splitter): New splitter for paddis. + (movdi_internal64): Add -mpaddis support. + (movdi splitter): New splitter for -mpaddis. + * config/rs6000/rs6000.opt (-mpaddis): New switch. + +==================== Branch work154-dmf, patch #108 ==================== + +Add saturating subtract built-ins. + +This patch adds support for a saturating subtract built-in function that may be +added to a future PowerPC processor. Note, if it is added, the name of the +built-in function may change before GCC 13 is released. If the name changes, +we will submit a patch changing the name. + +I also added support for providing dense math built-in functions, even though +at present, we have not added any new built-in functions for dense math. It is +likely we will want to add new dense math built-in functions as the dense math +support is fleshed out. + +The patches have been tested on both little and big endian systems. Can I check +it into the master branch? + +2024-01-23 Michael Meissner + +gcc/ + + * config/rs6000/rs6000-builtin.cc (rs6000_invalid_builtin): Add support + for flagging invalid use of future built-in functions. + (rs6000_builtin_is_supported): Add support for future built-in + functions. + * config/rs6000/rs6000-builtins.def (__builtin_saturate_subtract32): New + built-in function for -mcpu=future. + (__builtin_saturate_subtract64): Likewise. + * config/rs6000/rs6000-gen-builtins.cc (enum bif_stanza): Add stanzas + for -mcpu=future built-ins. + (stanza_map): Likewise. 
+ (enable_string): Likewise. + (struct attrinfo): Likewise. + (parse_bif_attrs): Likewise. + (write_decls): Likewise. + * config/rs6000/rs6000.md (sat_sub3): Add saturating subtract + built-in insn declarations. + (sat_sub3_dot): Likewise. + (sat_sub3_dot2): Likewise. + * doc/extend.texi (Future PowerPC built-ins): New section. + +gcc/testsuite/ + + * gcc.target/powerpc/subfus-1.c: New test. + * gcc.target/powerpc/subfus-2.c: Likewise. + +==================== Branch work154-dmf, patch #107 ==================== + +Support load/store vector with right length. + +This patch adds support for new instructions that may be added to the PowerPC +architecture in the future to enhance the load and store vector with length +instructions. + +The current instructions (lxvl, lxvll, stxvl, and stxvll) are inconvenient to use +since the count for the number of bytes must be in the top 8 bits of the GPR +register, instead of the bottom 8 bits. This meant that code generating these +instructions typically had to do a shift left by 56 bits to get the count into +the right position. In a future version of the PowerPC architecture, new +variants of these instructions might be added that expect the count to be in +the bottom 8 bits of the GPR register. These patches add this support to GCC +if the user uses the -mcpu=future option. + +I discovered that the code in rs6000-string.cc to generate ISA 3.1 lxvl/stxvl or +future lxvll/stxvll instructions would generate these instructions on 32-bit. +However, the patterns for these instructions are only available on 64-bit systems. So +I added a check for 64-bit support before generating the instructions. + +The patches have been tested on both little and big endian systems. Can I check +it into the master branch? + +2024-01-23 Michael Meissner + +gcc/ + + * config/rs6000/rs6000-string.cc (expand_block_move): Do not generate + lxvl and stxvl on 32-bit. 
+ * config/rs6000/vsx.md (lxvl): If -mcpu=future, generate the lxvl with + the shift count automatically used in the insn. + (lxvrl): New insn for -mcpu=future. + (lxvrll): Likewise. + (stxvl): If -mcpu=future, generate the stxvl with the shift count + automatically used in the insn. + (stxvrl): New insn for -mcpu=future. + (stxvrll): Likewise. + +gcc/testsuite/ + + * gcc.target/powerpc/lxvrl.c: New test. + * lib/target-supports.exp (check_effective_target_powerpc_future_ok): + New effective target. + +==================== Branch work154-dmf, patch #106 ==================== + +PowerPC: Add support for 1,024 bit DMR registers. + +This patch is a preliminary patch to add the full 1,024 bit dense math registers +(DMRs) for -mcpu=future. The MMA 512-bit accumulators map onto the top of the +DMR register. + +This patch only adds the new 1,024 bit register support. It does not add +support for any instructions that need 1,024 bit registers instead of 512 bit +registers. + +I used the new mode 'TDOmode' to be the opaque mode used for 1,024 bit +registers. The 'wD' constraint added in previous patches is used for these +registers. I added support to do load and store of DMRs via the VSX registers, +since there are no load/store dense math instructions. I added the new keyword +'__dmr' to create 1,024 bit types that can be loaded into DMRs. At present, I +don't have aliases for __dmr512 and __dmr1024 that we've discussed internally. + +The patches have been tested on both little and big endian systems. Can I check +it into the master branch? + +2024-01-23 Michael Meissner + +gcc/ + + * config/rs6000/mma.md (UNSPEC_DM_INSERT512_UPPER): New unspec. + (UNSPEC_DM_INSERT512_LOWER): Likewise. + (UNSPEC_DM_EXTRACT512): Likewise. + (UNSPEC_DMR_RELOAD_FROM_MEMORY): Likewise. + (UNSPEC_DMR_RELOAD_TO_MEMORY): Likewise. + (movtdo): New define_expand and define_insn_and_split to implement 1,024 + bit DMR registers. + (movtdo_insert512_upper): New insn. + (movtdo_insert512_lower): Likewise. 
+ (movtdo_extract512): Likewise. + (reload_dmr_from_memory): Likewise. + (reload_dmr_to_memory): Likewise. + * config/rs6000/rs6000-builtin.cc (rs6000_type_string): Add DMR + support. + (rs6000_init_builtins): Add support for __dmr keyword. + * config/rs6000/rs6000-call.cc (rs6000_return_in_memory): Add support + for TDOmode. + (rs6000_function_arg): Likewise. + * config/rs6000/rs6000-modes.def (TDOmode): New mode. + * config/rs6000/rs6000.cc (rs6000_hard_regno_nregs_internal): Add + support for TDOmode. + (rs6000_hard_regno_mode_ok_uncached): Likewise. + (rs6000_hard_regno_mode_ok): Likewise. + (rs6000_modes_tieable_p): Likewise. + (rs6000_debug_reg_global): Likewise. + (rs6000_setup_reg_addr_masks): Likewise. + (rs6000_init_hard_regno_mode_ok): Add support for TDOmode. Setup reload + hooks for DMR mode. + (reg_offset_addressing_ok_p): Add support for TDOmode. + (rs6000_emit_move): Likewise. + (rs6000_secondary_reload_simple_move): Likewise. + (rs6000_secondary_reload_class): Likewise. + (rs6000_mangle_type): Add mangling for __dmr type. + (rs6000_dmr_register_move_cost): Add support for TDOmode. + (rs6000_split_multireg_move): Likewise. + (rs6000_invalid_conversion): Likewise. + * config/rs6000/rs6000.h (VECTOR_ALIGNMENT_P): Add TDOmode. + (enum rs6000_builtin_type_index): Add DMR type nodes. + (dmr_type_node): Likewise. + (ptr_dmr_type_node): Likewise. + +gcc/testsuite/ + + * gcc.target/powerpc/dm-1024bit.c: New test. + +==================== Branch work154-dmf, patch #105 ==================== + +PowerPC: Switch to dense math names for all MMA operations. + +This patch changes the assembler instruction names for MMA instructions from +the original name used in power10 to the new name when used with the dense math +system. I.e. xvf64gerpp becomes dmxvf64gerpp. The assembler will emit the +same bits for either spelling. + +The patches have been tested on both little and big endian systems. Can I check +it into the master branch? 
+ +2024-01-23 Michael Meissner + +gcc/ + + * config/rs6000/mma.md (vvi4i4i8_dm): New int attribute. + (avvi4i4i8_dm): Likewise. + (vvi4i4i2_dm): Likewise. + (avvi4i4i2_dm): Likewise. + (vvi4i4_dm): Likewise. + (avvi4i4_dm): Likewise. + (pvi4i2_dm): Likewise. + (apvi4i2_dm): Likewise. + (vvi4i4i4_dm): Likewise. + (avvi4i4i4_dm): Likewise. + (mma_): Add support for running on DMF systems, generating the dense + math instruction and using the dense math accumulators. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + +gcc/testsuite/ + + * gcc.target/powerpc/dm-double-test.c: New test. + * lib/target-supports.exp (check_effective_target_ppc_dmr_ok): New + target test. + +==================== Branch work154-dmf, patch #104 ==================== + +PowerPC: Make MMA insns support DMR registers. + +This patch changes the MMA instructions to use either FPR registers +(-mcpu=power10) or DMRs (-mcpu=future). In this patch, the existing MMA +instruction names are used. + +A macro (__PPC_DMR__) is defined if the MMA instructions use the DMRs. + +The patches have been tested on both little and big endian systems. Can I check +it into the master branch? + +2024-01-23 Michael Meissner + +gcc/ + + * config/rs6000/mma.md (mma_): New define_expand to handle + mma_ for dense math and non dense math. + (mma_ insn): Restrict to non dense math. + (mma_xxsetaccz): Convert to define_expand to handle non dense math and + dense math. + (mma_xxsetaccz_vsx): Rename from mma_xxsetaccz and restrict usage to non + dense math. + (mma_xxsetaccz_dm): Dense math version of mma_xxsetaccz. + (mma_): Add support for dense math. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. 
+ (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Define + __PPC_DMR__ if we have dense math instructions. + * config/rs6000/rs6000.cc (print_operand): Make %A handle only DMRs if + dense math and only FPRs if not dense math. + (rs6000_split_multireg_move): Do not generate the xxmtacc instruction to + prime the DMR registers or the xxmfacc instruction to de-prime + instructions if we have dense math register support. + +==================== Branch work154-dmf, patch #103 ==================== + +PowerPC: Add support for accumulators in DMR registers. + +The MMA subsystem added the notion of accumulator registers as an optional +feature of ISA 3.1 (power10). In ISA 3.1, these accumulators overlapped with +the traditional floating point registers 0..31, but logically the accumulator +registers were separate from the FPR registers. In ISA 3.1, it was anticipated +that in future systems, the accumulator registers may not overlap with the FPR +registers. This patch adds the support for dense math registers as separate +registers. + +This particular patch does not change the MMA support to use the accumulators +within the dense math registers. This patch just adds the basic support for +having separate DMRs. The next patch will switch the MMA support to use the +accumulators if -mcpu=future is used. + +For testing purposes, I added an undocumented option '-mdense-math' to enable +or disable the dense math support. + +This patch adds a new constraint (wD). If MMA is selected but dense math is +not selected (i.e. -mcpu=power10), the wD constraint will allow access to +accumulators that overlap with the VSX vector registers 0..31. If both MMA and +dense math are selected (i.e. -mcpu=future), the wD constraint will only allow +dense math registers. + +This patch modifies the existing %A output modifier. 
If MMA is selected but +dense math is not selected, then the %A output modifier converts the VSX register +number to the accumulator number, by dividing it by 4. If both MMA and dense +math are selected, then %A will map the separate DMR registers into 0..7. + +The intention is that user code using extended asm can be modified to run on +both MMA without dense math and MMA with dense math: + + 1) If possible, don't use extended asm, but instead use the MMA built-in + functions; + + 2) If you do need to write extended asm, change the d constraints + targeting accumulators to use wD; + + 3) Only use the built-in zero, assemble and disassemble functions to create + or move data between vector quad types and dense math accumulators. + I.e. do not use the xxmfacc, xxmtacc, and xxsetaccz instructions directly in the + extended asm code. The reason is these instructions assume there is a + 1-to-1 correspondence between 4 adjacent FPR registers and an + accumulator that overlaps with those registers. With accumulators + now being separate registers, there no longer is a 1-to-1 + correspondence. + +It is possible that the mangling for DMRs and the GDB register numbers may +change in the future. + +2024-01-23 Michael Meissner + +gcc/ + + * config/rs6000/constraints.md (wD constraint): New constraint. + * config/rs6000/mma.md (UNSPEC_DM_ASSEMBLE_ACC): New unspec. + (movxo): Convert into define_expand. + (movxo_vsx): Version of movxo where accumulators overlap with VSX vector + registers 0..31. + (movxo_dm): Version of movxo that supports separate dense math + accumulators. + (mma_assemble_acc): Add dense math support to define_expand. + (mma_assemble_acc_vsx): Rename from mma_assemble_acc, and restrict it to + non dense math systems. + (mma_assemble_acc_dm): Dense math version of mma_assemble_acc. + (mma_disassemble_acc): Add dense math support to define_expand. + (mma_disassemble_acc_vsx): Rename from mma_disassemble_acc, and restrict + it to non dense math systems. 
+ (mma_disassemble_acc_dm): Dense math version of mma_disassemble_acc. + * config/rs6000/predicates.md (dmr_operand): New predicate. + (accumulator_operand): Likewise. + * config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS): Add -mdense-math. + (POWERPC_MASKS): Likewise. + * config/rs6000/rs6000.cc (enum rs6000_reg_type): Add DMR_REG_TYPE. + (enum rs6000_reload_reg_type): Add RELOAD_REG_DMR. + (LAST_RELOAD_REG_CLASS): Add support for DMR registers and the wD + constraint. + (reload_reg_map): Likewise. + (rs6000_reg_names): Likewise. + (alt_reg_names): Likewise. + (rs6000_hard_regno_nregs_internal): Likewise. + (rs6000_hard_regno_mode_ok_uncached): Likewise. + (rs6000_debug_reg_global): Likewise. + (rs6000_setup_reg_addr_masks): Likewise. + (rs6000_init_hard_regno_mode_ok): Likewise. + (rs6000_option_override_internal): Add checking for -mdense-math. + (rs6000_secondary_reload_memory): Add support for DMR registers. + (rs6000_secondary_reload_simple_move): Likewise. + (rs6000_preferred_reload_class): Likewise. + (rs6000_secondary_reload_class): Likewise. + (print_operand): Make %A handle both FPRs and DMRs. + (rs6000_dmr_register_move_cost): New helper function. + (rs6000_register_move_cost): Add support for DMR registers. + (rs6000_memory_move_cost): Likewise. + (rs6000_compute_pressure_classes): Likewise. + (rs6000_debugger_regno): Likewise. + (rs6000_opt_masks): Add -mdense-math. + (rs6000_split_multireg_move): Add support for DMRs. + * config/rs6000/rs6000.h (UNITS_PER_DMR_WORD): New macro. + (FIRST_PSEUDO_REGISTER): Update for DMRs. + (FIXED_REGISTERS): Add DMRs. + (CALL_REALLY_USED_REGISTERS): Likewise. + (REG_ALLOC_ORDER): Likewise. + (enum reg_class): Add DM_REGS. + (REG_CLASS_NAMES): Likewise. + (REG_CLASS_CONTENTS): Likewise. + * config/rs6000/rs6000.md (FIRST_DMR_REGNO): New constant. + (LAST_DMR_REGNO): Likewise. + (isa attribute): Add 'dm' and 'not_dm' attributes. + (enabled attribute): Support 'dm' and 'not_dm' attributes. 
+ * config/rs6000/rs6000.opt (-mdense-math): New switch. + * doc/md.texi (PowerPC constraints): Document wD constraint. + +==================== Branch work154-dmf, patch #102 ==================== + +PowerPC: Make -mcpu=future enable -mblock-ops-vector-pair. + +This patch re-enables generating load and store vector pair instructions when +doing certain memory copy operations when -mcpu=future is used. + +During power10 development, it was determined that using store vector pair +instructions was problematic in a few cases, so we disabled generating load +and store vector pair instructions for memory operations by default. This patch +re-enables generating these instructions if -mcpu=future is used. + +The patches have been tested on both little and big endian systems. Can I check +it into the master branch? + +2024-01-23 Michael Meissner + +gcc/ + + * config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS): Add + -mblock-ops-vector-pair. + (POWERPC_MASKS): Likewise. + +==================== Branch work154-dmf, patch #101 ==================== + +PowerPC: Add -mcpu=future. + +This patch implements support for a potential future PowerPC cpu. Features +added with -mcpu=future may or may not be added to new PowerPC processors. + +This patch adds support for the -mcpu=future option. If you use -mcpu=future, +the macro __ARCH_PWR_FUTURE__ is defined, and the assembler .machine directive +"future" is used. Future patches in this series will add support for new +instructions that may be present in future PowerPC processors. + +This particular patch does not add any new features. It exists as groundwork +for future patches that support a possible future PowerPC processor. + +This patch does not implement any differences in tuning when -mcpu=future is +used compared to -mcpu=power10. If -mcpu=future is used, GCC will use power10 +tuning. 
If you explicitly use -mtune=future, you will get a warning that +-mtune=future is not supported, and default tuning will be set for power10. + +The patches have been tested on both little and big endian systems. Can I check +it into the master branch? + +2024-01-23 Michael Meissner + +gcc/ + + * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Define + __ARCH_PWR_FUTURE__ if -mcpu=future. + * config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS): New macro. + (POWERPC_MASKS): Add -mcpu=future support. + * config/rs6000/rs6000-opts.h (enum processor_type): Add + PROCESSOR_FUTURE. + * config/rs6000/rs6000-tables.opt: Regenerate. + * config/rs6000/rs6000.cc (rs600_cpu_index_lookup): New helper + function. + (rs6000_option_override_internal): Make -mcpu=future set + -mtune=power10. If the user explicitly uses -mtune=future, give a + warning and reset the tuning to power10. + (rs6000_option_override_internal): Use power10 costs for future + machine. + (rs6000_machine_from_flags): Add support for -mcpu=future. + (rs6000_opt_masks): Likewise. + * config/rs6000/rs6000.h (ASM_CPU_SUPPORT): Likewise. + * config/rs6000/rs6000.md (cpu attribute): Likewise. + * config/rs6000/rs6000.opt (-mfuture): New undocumented debug switch. + * doc/invoke.texi (IBM RS/6000 and PowerPC Options): Document -mcpu=future. + +==================== Branch work154-dmf, work154 patch #2 ==================== + +PR target/112886, Add %S to print_operand for vector pair support. + +In looking at support for load vector pair and store vector pair for the +PowerPC in GCC, I noticed that we were missing a print_operand output modifier +if you are dealing with vector pairs to print the 2nd register in the vector +pair. + +If the instruction inside of the asm used the Altivec encoding, then we could +use the %L modifier: + + __vector_pair *p, *q, *r; + // ... 
+ __asm__ ("vaddudm %0,%1,%2\n\tvaddudm %L0,%L1,%L2" + : "=v" (*p) + : "v" (*q), "v" (*r)); + +Likewise if we know the value to be in a traditional FPR register, %L will +work for instructions that use the VSX encoding: + + __vector_pair *p, *q, *r; + // ... + __asm__ ("xvadddp %x0,%x1,%x2\n\txvadddp %L0,%L1,%L2" + : "=f" (*p) + : "f" (*q), "f" (*r)); + +But if we have a value that is in a traditional Altivec register, and the +instruction uses the VSX encoding, %L will give a value between 0 and 31, when it +should give a value between 32 and 63. + +This patch adds %S that acts like %x, except that it adds 1 to the +register number. + +I have tested this on power10 and power9 little endian systems and on a power9 +big endian system. There were no regressions in the patch. Can I apply it to +the trunk? + +It would be nice if I could apply it to the open branches. Can I backport it +after a burn-in period? + +2024-01-23 Michael Meissner + +gcc/ + + PR target/112886 + * config/rs6000/rs6000.cc (print_operand): Add %S output modifier. + * doc/md.texi (Modifiers): Mention %S can be used like %x. + +gcc/testsuite/ + + PR target/112886 + * gcc.target/powerpc/pr112886.c: New test. + +==================== Branch work154-dmf, work154 patch #1 ==================== + +Power10: Add options to disable load and store vector pair. + +This is version 2 of the patch to add -mno-load-vector-pair and +-mno-store-vector-pair undocumented tuning switches. + +The difference between the first version of the patch and this version is that +I added explicit RTL abi attributes for when the compiler can generate the load +vector pair and store vector pair instructions. By having this attribute, the +movoo insn has separate alternatives for when we generate the instruction and +when we want to split the instruction into 2 separate vector loads or stores. 
+ +In the first version of the patch, I had previously provided built-in functions +that would always generate load vector pair and store vector pair instructions +even if these instructions are normally disabled. I found these built-ins +weren't specified like the other vector pair built-ins, and I didn't include +documentation for the built-in functions. If we want such built-in functions, +we can add them as a separate patch later. + +In addition, since both versions of the patch add #pragma target and attribute +support to change the results for individual functions, we can select on a +function by function basis what the defaults for load/store vector pair are. + +The original text for the patch is: + +In working on some future patches that involve utilizing vector pair +instructions, I wanted to be able to tune my program to enable or disable using +the vector pair load or store operations while still keeping the other +operations on the vector pair. + +This patch adds two undocumented tuning options. The -mno-load-vector-pair +option would tell GCC to generate two load vector instructions instead of a +single load vector pair. The -mno-store-vector-pair option would tell GCC to +generate two store vector instructions instead of a single store vector pair. + +If -mno-load-vector-pair is used, GCC will not generate the indexed +stxvpx instruction. Similarly if -mno-store-vector-pair is used, GCC will not +generate the indexed lxvpx instruction. The reason for this is to enable +splitting the {,p}lxvp or {,p}stxvp instructions after reload without needing a +scratch GPR register. + +The default for -mcpu=power10 is that both load vector pair and store vector +pair are enabled. + +I added code so that the user code can modify these settings using either a +'#pragma GCC target' directive or __attribute__((__target__(...))) in the +function declaration. + +I added tests for the switches, #pragma, and attribute options. 
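The per-function selection described above could look roughly like the following. This is only a minimal sketch, not code from the patch: the target strings "no-store-vector-pair" and "no-load-vector-pair" are assumed to mirror the switch names, and HAVE_VECTOR_PAIR_SWITCHES is a hypothetical guard macro so that the file still builds with compilers that lack these switches.

```c
#include <string.h>

/* Hypothetical guard: define HAVE_VECTOR_PAIR_SWITCHES only when building
   with a PowerPC GCC that provides -mload-vector-pair and
   -mstore-vector-pair.  The target strings are an assumption that mirrors
   the switch names; they are not confirmed by the message above.  */
#ifdef HAVE_VECTOR_PAIR_SWITCHES
#define NO_STORE_VECTOR_PAIR \
  __attribute__ ((__target__ ("no-store-vector-pair")))
#else
#define NO_STORE_VECTOR_PAIR
#endif

/* Attribute form: the setting applies to this one function.  */
NO_STORE_VECTOR_PAIR
void
copy32_attr (unsigned char *dst, const unsigned char *src)
{
  memcpy (dst, src, 32);
}

/* Pragma form: the setting applies to every function defined between
   push_options and pop_options.  */
#ifdef HAVE_VECTOR_PAIR_SWITCHES
#pragma GCC push_options
#pragma GCC target ("no-load-vector-pair")
#endif

void
copy32_pragma (unsigned char *dst, const unsigned char *src)
{
  memcpy (dst, src, 32);
}

#ifdef HAVE_VECTOR_PAIR_SWITCHES
#pragma GCC pop_options
#endif

/* Returns 1 if both copy routines reproduce the source buffer.  */
int
copies_match (void)
{
  unsigned char src[32], a[32], b[32];
  for (int i = 0; i < 32; i++)
    src[i] = (unsigned char) i;
  copy32_attr (a, src);
  copy32_pragma (b, src);
  return memcmp (a, src, 32) == 0 && memcmp (b, src, 32) == 0;
}
```

Both forms only change which instructions the compiler may choose; the results of the functions are the same either way. Whether a 32-byte copy is expanded with vector pair instructions at all depends on other options, such as -mblock-ops-vector-pair.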
+ +I have built this on both little endian power10 systems and big endian power9 +systems doing the normal bootstrap and test. There were no regressions in any +of the tests, and the new tests passed. Can I check this patch into the master +branch? + +2024-01-23 Michael Meissner + +gcc/ + + * config/rs6000/mma.md (movoo): Add support for -mno-load-vector-pair and + -mno-store-vector-pair. + * config/rs6000/rs6000-cpus.def (OTHER_POWER10_MASKS): Add support for + -mload-vector-pair and -mstore-vector-pair. + (POWERPC_MASKS): Likewise. + * config/rs6000/rs6000.cc (rs6000_setup_reg_addr_masks): Only allow + indexed mode for OOmode if we are generating both load vector pair and + store vector pair instructions. + (rs6000_option_override_internal): Add support for -mno-load-vector-pair + and -mno-store-vector-pair. + (rs6000_opt_masks): Likewise. + * config/rs6000/rs6000.md (isa attribute): Add lxvp and stxvp + attributes. + (enabled attribute): Likewise. + * config/rs6000/rs6000.opt (-mload-vector-pair): New option. + (-mstore-vector-pair): Likewise. + +gcc/testsuite/ + + * gcc.target/powerpc/vector-pair-attribute.c: New test. + * gcc.target/powerpc/vector-pair-pragma.c: New test. + * gcc.target/powerpc/vector-pair-switch1.c: New test. + * gcc.target/powerpc/vector-pair-switch2.c: New test. + * gcc.target/powerpc/vector-pair-switch3.c: New test. + * gcc.target/powerpc/vector-pair-switch4.c: New test. + ==================== Branch work154-dmf, baseline ==================== -2024-01-22 Michael Meissner +Add ChangeLog.dmf and update REVISION. - Clone branch +2024-01-23 Michael Meissner + +gcc/ + * ChangeLog.dmf: New file for branch. + * REVISION: Update. + +2024-01-23 Michael Meissner + + Clone branch