From: Michael Meissner <meissner@gcc.gnu.org>
To: gcc-cvs@gcc.gnu.org
Subject: [gcc(refs/users/meissner/heads/dmf004)] Use lxvl and stxvl for small variable memcpy moves.
Date: Thu, 17 Nov 2022 21:54:14 +0000 (GMT)
Message-ID: <20221117215414.E251B384F6C8@sourceware.org>

https://gcc.gnu.org/g:65b1ab1e15183dc06bc00e0ad0ae546731ed513b

commit 65b1ab1e15183dc06bc00e0ad0ae546731ed513b
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Mon Nov 14 19:56:25 2022 -0500

    Use lxvl and stxvl for small variable memcpy moves.

    This patch adds support to generate inline code for block copy with a variable
    size if the size is 16 bytes or less. If the size is more than 16 bytes, just
    call memcpy.

    To handle variable sizes, I found we need DImode versions of the two insns for
    copying memory (cpymem<mode> and movmem<mode>).

    2022-11-14  Michael Meissner  <meissner@linux.ibm.com>

    gcc/

            * config/rs6000/rs6000-string.cc (expand_block_move): Add support for using
            lxvl and stxvl to move up to 16 bytes inline without calling memcpy.
            * config/rs6000/rs6000.md (cpymem<mode>): Expand cpymemsi to also provide
            cpymemdi to handle DImode sizes as well as SImode sizes.
            (movmem<mode>): Expand movmemsi to also provide movmemdi to handle DImode
            sizes as well as SImode sizes.

Diff:
---
 gcc/config/rs6000/rs6000-string.cc | 49 ++++++++++++++++++++++++++++++++++++--
 gcc/config/rs6000/rs6000.md        | 12 +++++-----
 2 files changed, 53 insertions(+), 8 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-string.cc b/gcc/config/rs6000/rs6000-string.cc
index cd8ee8c2f7e..596fbc634f4 100644
--- a/gcc/config/rs6000/rs6000-string.cc
+++ b/gcc/config/rs6000/rs6000-string.cc
@@ -2760,9 +2760,54 @@ expand_block_move (rtx operands[], bool might_overlap)
   rtx stores[MAX_MOVE_REG];
   int num_reg = 0;
 
-  /* If this is not a fixed size move, just call memcpy */
+  /* If this is not a fixed size move, see if we can use load/store vector with
+     length to handle multiple bytes.  Don't do the optimization if -Os.
+     Otherwise, just call memcpy.  */
   if (! constp)
-    return 0;
+    {
+      if (TARGET_BLOCK_OPS_UNALIGNED_VSX && TARGET_P9_VECTOR && TARGET_64BIT
+          && !optimize_size)
+        {
+          rtx join_label = gen_label_rtx ();
+          rtx inline_label = gen_label_rtx ();
+          rtx dest_addr = copy_addr_to_reg (XEXP (orig_dest, 0));
+          rtx src_addr = copy_addr_to_reg (XEXP (orig_src, 0));
+
+          /* Call memcpy if the size is too large.  */
+          bytes_rtx = force_reg (Pmode, bytes_rtx);
+          rtx cr = gen_reg_rtx (CCUNSmode);
+          rtx max_size = GEN_INT (16);
+          emit_insn (gen_rtx_SET (cr,
+                                  gen_rtx_COMPARE (CCUNSmode, bytes_rtx,
+                                                   max_size)));
+
+          do_ifelse (CCUNSmode, LEU, NULL_RTX, NULL_RTX, cr,
+                     inline_label, profile_probability::likely ());
+
+          tree fun = builtin_decl_explicit (BUILT_IN_MEMCPY);
+          emit_library_call_value (XEXP (DECL_RTL (fun), 0),
+                                   NULL_RTX, LCT_NORMAL, Pmode,
+                                   dest_addr, Pmode,
+                                   src_addr, Pmode,
+                                   bytes_rtx, Pmode);
+
+          rtx join_ref = gen_rtx_LABEL_REF (VOIDmode, join_label);
+          emit_jump_insn (gen_rtx_SET (pc_rtx, join_ref));
+          emit_barrier ();
+
+          emit_label (inline_label);
+
+          /* Move the final 0..16 bytes.  */
+          rtx vreg = gen_reg_rtx (V16QImode);
+          emit_insn (gen_lxvl (vreg, src_addr, bytes_rtx));
+          emit_insn (gen_stxvl (vreg, dest_addr, bytes_rtx));
+
+          emit_label (join_label);
+          return 1;
+        }
+
+      return 0;
+    }
 
   /* This must be a fixed size alignment */
   gcc_assert (CONST_INT_P (align_rtx));
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index e9dfb138603..12bae0d32a7 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -9880,11 +9880,11 @@
 ;; Argument 2 is the length
 ;; Argument 3 is the alignment
 
-(define_expand "cpymemsi"
+(define_expand "cpymem<mode>"
   [(parallel [(set (match_operand:BLK 0 "")
                    (match_operand:BLK 1 ""))
-              (use (match_operand:SI 2 ""))
-              (use (match_operand:SI 3 ""))])]
+              (use (match_operand:GPR 2 ""))
+              (use (match_operand:GPR 3 ""))])]
   ""
 {
   if (expand_block_move (operands, false))
@@ -9899,11 +9899,11 @@
 ;; Argument 2 is the length
 ;; Argument 3 is the alignment
 
-(define_expand "movmemsi"
+(define_expand "movmem<mode>"
   [(parallel [(set (match_operand:BLK 0 "")
                    (match_operand:BLK 1 ""))
-              (use (match_operand:SI 2 ""))
-              (use (match_operand:SI 3 ""))])]
+              (use (match_operand:GPR 2 ""))
+              (use (match_operand:GPR 3 ""))])]
   ""
 {
   if (expand_block_move (operands, true))
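
For readers who want the emitted control flow without reading the RTL calls, here is a rough C model of what the new variable-size path in expand_block_move appears to generate. It is only a sketch: block_copy_model and copy_up_to_16 are hypothetical names that do not exist in GCC, and the byte loop merely stands in for the lxvl/stxvl pair, which on real hardware moves the 0..16 bytes with one vector load and one vector store.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the lxvl/stxvl pair: copies exactly n bytes,
   where n is assumed to be in the range 0..16.  The real instructions do
   this with one vector load and one vector store instead of a loop.  */
static void
copy_up_to_16 (void *dest, const void *src, size_t n)
{
  unsigned char *d = dest;
  const unsigned char *s = src;

  for (size_t i = 0; i < n; i++)
    d[i] = s[i];
}

/* Hypothetical model of the code emitted for a variable-size copy:
   an unsigned compare of the length against 16, the inline path when
   the length is 16 or less, and a call to memcpy otherwise.  */
void *
block_copy_model (void *dest, const void *src, size_t n)
{
  if (n <= 16)
    copy_up_to_16 (dest, src, n);   /* inline_label: lxvl + stxvl */
  else
    memcpy (dest, src, n);          /* fall back to the library call */

  return dest;                      /* join_label */
}
```

The model places the inline path first because the patch marks the branch to inline_label as likely. A length of zero also takes the inline path and copies nothing.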