public inbox for gcc-patches@gcc.gnu.org
* [PATCH, rs6000] don't use unaligned vsx for memset of less than 32 bytes
@ 2018-06-25 15:41 Aaron Sawdey
  2018-06-26 16:01 ` Segher Boessenkool
From: Aaron Sawdey @ 2018-06-25 15:41 UTC
  To: GCC Patches

In gcc 8 I added support for unaligned vsx in the builtin expansion of
memset(x,0,y). It turns out that for memset of less than 32 bytes, this
doesn't really help much, and it also runs into an egregious
load-hit-store case in CPU2006 components gcc and hmmer.

This patch reverts to the previous (gcc 7) behavior for memset of 16-31
bytes, which is to use vsx stores only if the destination is 16-byte
aligned. For 32 bytes or more, unaligned vsx stores will still be used.
Performance testing of the memset expansion shows that not much is
given up by using scalar stores for 16-31 bytes, and CPU2006 runs show
the performance regression is fixed.

Regstrap passes on powerpc64le. OK for trunk and backport to 8?

Thanks,
   Aaron

2018-06-25  Aaron Sawdey  <acsawdey@linux.ibm.com>

	* config/rs6000/rs6000-string.c (expand_block_clear): Don't use
	unaligned vsx for 16B memset.


-- 
Aaron Sawdey, Ph.D.  acsawdey@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain

[-- Attachment #2: memset1632.patch --]
[-- Type: text/x-patch, Size: 557 bytes --]

Index: gcc/config/rs6000/rs6000-string.c
===================================================================
--- gcc/config/rs6000/rs6000-string.c	(revision 261808)
+++ gcc/config/rs6000/rs6000-string.c	(working copy)
@@ -90,7 +90,9 @@
       machine_mode mode = BLKmode;
       rtx dest;
 
-      if (bytes >= 16 && TARGET_ALTIVEC && (align >= 128 || TARGET_EFFICIENT_UNALIGNED_VSX))
+      if (TARGET_ALTIVEC
+	  && ((bytes >= 16 && align >= 128)
+	      || (bytes >= 32 && TARGET_EFFICIENT_UNALIGNED_VSX)))
 	{
 	  clear_bytes = 16;
 	  mode = V4SImode;
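
For readers who want to see the effect of the condition change outside
of GCC, here is a minimal standalone sketch of the old and new
predicates. It is not the code in rs6000-string.c: the struct and the
two function names are hypothetical, the target macros are modeled as
plain booleans, and align is assumed to be in bits (so 128 means 16-byte
aligned, matching the description in the message above).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the TARGET_ALTIVEC and
   TARGET_EFFICIENT_UNALIGNED_VSX macros tested in the patch.  */
struct target_flags
{
  bool altivec;
  bool efficient_unaligned_vsx;
};

/* Condition before the patch: any clear of 16 bytes or more may use
   a 16-byte vector store if the destination is 16-byte aligned or
   the target has efficient unaligned VSX.  ALIGN is assumed to be
   in bits.  */
static bool
use_vector_store_old (unsigned bytes, unsigned align, struct target_flags t)
{
  return bytes >= 16 && t.altivec
	 && (align >= 128 || t.efficient_unaligned_vsx);
}

/* Condition after the patch: the unaligned case now requires at
   least 32 bytes; the aligned case is unchanged.  */
static bool
use_vector_store_new (unsigned bytes, unsigned align, struct target_flags t)
{
  return t.altivec
	 && ((bytes >= 16 && align >= 128)
	     || (bytes >= 32 && t.efficient_unaligned_vsx));
}
```

With both flags set, a 16-byte clear of a destination with only 1-byte
alignment satisfies the old predicate but not the new one, while an
aligned 16-byte clear and an unaligned 32-byte clear satisfy both.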


* Re: [PATCH, rs6000] don't use unaligned vsx for memset of less than 32 bytes
  2018-06-25 15:41 [PATCH, rs6000] don't use unaligned vsx for memset of less than 32 bytes Aaron Sawdey
@ 2018-06-26 16:01 ` Segher Boessenkool
From: Segher Boessenkool @ 2018-06-26 16:01 UTC
  To: Aaron Sawdey; +Cc: GCC Patches

Hi!

On Mon, Jun 25, 2018 at 10:41:32AM -0500, Aaron Sawdey wrote:
> In gcc 8 I added support for unaligned vsx in the builtin expansion of
> memset(x,0,y). It turns out that for memset of less than 32 bytes, this
> doesn't really help much, and it also runs into an egregious
> load-hit-store case in CPU2006 components gcc and hmmer.
> 
> This patch reverts to the previous (gcc 7) behavior for memset of 16-31
> bytes, which is to use vsx stores only if the destination is 16-byte
> aligned. For 32 bytes or more, unaligned vsx stores will still be used.
> Performance testing of the memset expansion shows that not much is
> given up by using scalar stores for 16-31 bytes, and CPU2006 runs show
> the performance regression is fixed.
> 
> Regstrap passes on powerpc64le. OK for trunk and backport to 8?

Yes, okay for both.  Thanks!


Segher


> 2018-06-25  Aaron Sawdey  <acsawdey@linux.ibm.com>
> 
> 	* config/rs6000/rs6000-string.c (expand_block_clear): Don't use
> 	unaligned vsx for 16B memset.
