public inbox for gcc-patches@gcc.gnu.org
From: Christoph Muellner <christoph.muellner@vrull.eu>
To: gcc-patches@gcc.gnu.org, Kito Cheng <kito.cheng@sifive.com>,
	Jim Wilson <jim.wilson.gcc@gmail.com>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Andrew Waterman <andrew@sifive.com>,
	Philipp Tomsich <philipp.tomsich@vrull.eu>,
	Jeff Law <jeffreyalaw@gmail.com>,
	Vineet Gupta <vineetg@rivosinc.com>
Cc: "Christoph Müllner" <christoph.muellner@vrull.eu>
Subject: [PATCH 5/7] riscv: Use by-pieces to do overlapping accesses in block_move_straight
Date: Mon, 14 Nov 2022 00:05:19 +0100
Message-ID: <20221113230521.712693-6-christoph.muellner@vrull.eu>
In-Reply-To: <20221113230521.712693-1-christoph.muellner@vrull.eu>

From: Christoph Müllner <christoph.muellner@vrull.eu>

The current implementation of riscv_block_move_straight() emits a
sequence of load/store pairs of maximum width (e.g. 8 bytes for RV64).
The remainder is handed over to move_by_pieces(), which emits code based
on target settings such as slow_unaligned_access and
overlap_op_by_pieces.

move_by_pieces() will emit overlapping memory accesses of maximum
width only if the given length exceeds the size of one access
(e.g. 15 bytes for 8-byte accesses).
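
As an illustration of the overlapping accesses mentioned above, here is
a minimal C sketch (not GCC code) of a 15-byte copy done with two
8-byte accesses whose ranges overlap by one byte.  The function name
copy15_overlapping is hypothetical, and memcpy() into a uint64_t only
stands in for an unaligned 8-byte load or store.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy exactly 15 bytes using two 8-byte
   accesses.  The first access covers bytes [0..8), the second
   covers bytes [7..15), so byte 7 is transferred twice.  */
static void
copy15_overlapping (unsigned char *dst, const unsigned char *src)
{
  uint64_t lo, hi;

  /* Both loads happen before any store, as in a straight block move.  */
  memcpy (&lo, src, 8);      /* bytes 0..7  */
  memcpy (&hi, src + 7, 8);  /* bytes 7..14 */

  memcpy (dst, &lo, 8);
  memcpy (dst + 7, &hi, 8);  /* rewrites byte 7 with the same value */
}
```

Without overlapping, the same 15 bytes would need four accesses of
decreasing width (8 + 4 + 2 + 1).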

This patch changes riscv_block_move_straight() such that it leaves a
remainder within the interval [delta..2*delta) instead of [0..delta),
so that overlapping memory accesses may be emitted (if the requirements
for them are met).
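
The change in the remainder interval can be sketched with a small
stand-alone C model of the two loop conditions.  This is an
illustration only, not the GCC implementation: struct split,
split_old() and split_new() are hypothetical names, and size_t stands
in for unsigned HOST_WIDE_INT.

```c
#include <stddef.h>

/* Hypothetical result type: number of full-width load/store pairs
   emitted by the loop, and the bytes left for move_by_pieces().  */
struct split
{
  size_t pairs;
  size_t remainder;
};

/* Old loop condition: offset + delta <= length.
   Leaves a remainder in [0..delta).  Assumes delta > 0.  */
static struct split
split_old (size_t length, size_t delta)
{
  struct split s;
  size_t offset = 0;

  s.pairs = 0;
  for (; offset + delta <= length; offset += delta)
    s.pairs++;
  s.remainder = length - offset;
  return s;
}

/* New loop condition: offset + 2 * delta <= length.
   Leaves a remainder in [delta..2*delta) when length >= delta.
   Assumes delta > 0.  */
static struct split
split_new (size_t length, size_t delta)
{
  struct split s;
  size_t offset = 0;

  s.pairs = 0;
  for (; offset + 2 * delta <= length; offset += delta)
    s.pairs++;
  s.remainder = length - offset;
  return s;
}
```

For length = 29 and delta = 8, split_old() yields 3 pairs and a 5-byte
remainder, while split_new() yields 2 pairs and a 13-byte remainder,
which is large enough to be covered by two overlapping 8-byte accesses.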

gcc/ChangeLog:

	* config/riscv/riscv-string.cc (riscv_block_move_straight):
	  Adjust range for emitted load/store pairs.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
---
 gcc/config/riscv/riscv-string.cc              |  8 ++++----
 .../gcc.target/riscv/memcpy-overlapping.c     | 19 ++++++++-----------
 2 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index 6882f0be269..1137df475be 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -57,18 +57,18 @@ riscv_block_move_straight (rtx dest, rtx src, unsigned HOST_WIDE_INT length)
   delta = bits / BITS_PER_UNIT;
 
   /* Allocate a buffer for the temporary registers.  */
-  regs = XALLOCAVEC (rtx, length / delta);
+  regs = XALLOCAVEC (rtx, length / delta - 1);
 
   /* Load as many BITS-sized chunks as possible.  Use a normal load if
      the source has enough alignment, otherwise use left/right pairs.  */
-  for (offset = 0, i = 0; offset + delta <= length; offset += delta, i++)
+  for (offset = 0, i = 0; offset + 2 * delta <= length; offset += delta, i++)
     {
       regs[i] = gen_reg_rtx (mode);
       riscv_emit_move (regs[i], adjust_address (src, mode, offset));
     }
 
   /* Copy the chunks to the destination.  */
-  for (offset = 0, i = 0; offset + delta <= length; offset += delta, i++)
+  for (offset = 0, i = 0; offset + 2 * delta <= length; offset += delta, i++)
     riscv_emit_move (adjust_address (dest, mode, offset), regs[i]);
 
   /* Mop up any left-over bytes.  */
@@ -166,7 +166,7 @@ riscv_expand_block_move (rtx dest, rtx src, rtx length)
 
       if (hwi_length <= (RISCV_MAX_MOVE_BYTES_STRAIGHT / factor))
 	{
-	  riscv_block_move_straight (dest, src, INTVAL (length));
+	  riscv_block_move_straight (dest, src, hwi_length);
 	  return true;
 	}
       else if (optimize && align >= BITS_PER_WORD)
diff --git a/gcc/testsuite/gcc.target/riscv/memcpy-overlapping.c b/gcc/testsuite/gcc.target/riscv/memcpy-overlapping.c
index ffb7248bfd1..ef95bfb879b 100644
--- a/gcc/testsuite/gcc.target/riscv/memcpy-overlapping.c
+++ b/gcc/testsuite/gcc.target/riscv/memcpy-overlapping.c
@@ -25,26 +25,23 @@ COPY_N(15)
 /* Emits 2x {ld,sd} and 1x {lw,sw}.  */
 COPY_N(19)
 
-/* Emits 3x ld and 3x sd.  */
+/* Emits 3x {ld,sd}.  */
 COPY_N(23)
 
 /* The by-pieces infrastructure handles up to 24 bytes.
    So the code below is emitted via cpymemsi/block_move_straight.  */
 
-/* Emits 3x {ld,sd} and 1x {lhu,lbu,sh,sb}.  */
+/* Emits 3x {ld,sd} and 1x {lw,sw}.  */
 COPY_N(27)
 
-/* Emits 3x {ld,sd} and 1x {lw,lbu,sw,sb}.  */
+/* Emits 4x {ld,sd}.  */
 COPY_N(29)
 
-/* Emits 3x {ld,sd} and 2x {lw,sw}.  */
+/* Emits 4x {ld,sd}.  */
 COPY_N(31)
 
-/* { dg-final { scan-assembler-times "ld\t" 21 } } */
-/* { dg-final { scan-assembler-times "sd\t" 21 } } */
+/* { dg-final { scan-assembler-times "ld\t" 23 } } */
+/* { dg-final { scan-assembler-times "sd\t" 23 } } */
 
-/* { dg-final { scan-assembler-times "lw\t" 5 } } */
-/* { dg-final { scan-assembler-times "sw\t" 5 } } */
-
-/* { dg-final { scan-assembler-times "lbu\t" 2 } } */
-/* { dg-final { scan-assembler-times "sb\t" 2 } } */
+/* { dg-final { scan-assembler-times "lw\t" 3 } } */
+/* { dg-final { scan-assembler-times "sw\t" 3 } } */
-- 
2.38.1

