public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH][rs6000][PR target/82190] fix mem size info in rtl generated by memcmp and strncmp/strcmp builtin expansion
@ 2017-12-12 19:42 Aaron Sawdey
  2017-12-12 19:51 ` Jakub Jelinek
  0 siblings, 1 reply; 5+ messages in thread
From: Aaron Sawdey @ 2017-12-12 19:42 UTC (permalink / raw)
  To: gcc-patches; +Cc: Segher Boessenkool, David Edelsohn, Bill Schmidt

[-- Attachment #1: Type: text/plain, Size: 1051 bytes --]

In the code I put in gcc7 for expanding memcmp and strcmp/strncmp as
builtins, I used set_mem_size to set the size of loads to only the
bytes I was actually going to compare. However this is really incorrect
and the test case for 82190 was failing because alias analysis believed
the bogus size and though there was no conflict between an 8byte load
used for comparing 6 bytes and a later store to the 7th byte. As a
result it eliminated that load from the 7 byte compare that followed
later.

This patch changes the set_mem_size calls in expand_block_move and
expand_strn_compare to set the size to the size of the load being done
regardless of how many bytes are being used.

OK for trunk if bootstrap/regtest passes on ppc64le?

2017-12-12  Aaron Sawdey  <acsawdey@linux.vnet.ibm.com>

	PR target/82190
	* config/rs6000/rs6000-string.c (expand_block_move,
	expand_strn_compare): fix set_mem_size() calls.

-- 
Aaron Sawdey, Ph.D.  acsawdey@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain

[-- Attachment #2: pr82190.patch --]
[-- Type: text/x-patch, Size: 1492 bytes --]

Index: gcc/config/rs6000/rs6000-string.c
===================================================================
--- gcc/config/rs6000/rs6000-string.c	(revision 255585)
+++ gcc/config/rs6000/rs6000-string.c	(working copy)
@@ -1247,6 +1247,9 @@
   if (bytes > rs6000_block_move_inline_limit)
     return 0;
 
+  bool isP8 = (rs6000_cpu == PROCESSOR_POWER8);
+  bool isP9 = (rs6000_cpu == PROCESSOR_POWER9);
+
   for (offset = 0; bytes > 0; offset += move_bytes, bytes -= move_bytes)
     {
       union {
@@ -1258,7 +1261,7 @@
 
       /* Altivec first, since it will be faster than a string move
 	 when it applies, and usually not significantly larger.  */
-      if (TARGET_ALTIVEC && bytes >= 16 && align >= 128)
+      if (TARGET_ALTIVEC && bytes >= 16 && (isP8 || isP9 || align >= 128))
 	{
 	  move_bytes = 16;
 	  mode = V4SImode;
Index: gcc/testsuite/gcc.dg/pr82190.c
===================================================================
--- gcc/testsuite/gcc.dg/pr82190.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/pr82190.c	(working copy)
@@ -0,0 +1,22 @@
+/* PR target/82190 */
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-optimize-strlen -fweb" } */
+
+char src[64] __attribute__ ((aligned)) = "aaaaaaa";
+char dst[64] __attribute__ ((aligned));
+
+int
+main ()
+{
+  __builtin_memcpy (dst, src, 6);
+  if (__builtin_memcmp (dst, src, 6))
+    __builtin_abort ();
+
+  __builtin_memcpy (dst, src, 7);
+  if (__builtin_memcmp (dst, src, 7))
+    __builtin_abort ();
+
+  return 0;
+}
+
+

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH][rs6000][PR target/82190] fix mem size info in rtl generated by memcmp and strncmp/strcmp builtin expansion
  2017-12-12 19:42 [PATCH][rs6000][PR target/82190] fix mem size info in rtl generated by memcmp and strncmp/strcmp builtin expansion Aaron Sawdey
@ 2017-12-12 19:51 ` Jakub Jelinek
  2017-12-12 20:01   ` Aaron Sawdey
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Jelinek @ 2017-12-12 19:51 UTC (permalink / raw)
  To: Aaron Sawdey
  Cc: gcc-patches, Segher Boessenkool, David Edelsohn, Bill Schmidt

On Tue, Dec 12, 2017 at 01:40:41PM -0600, Aaron Sawdey wrote:
> 2017-12-12  Aaron Sawdey  <acsawdey@linux.vnet.ibm.com>
> 
> 	PR target/82190
> 	* config/rs6000/rs6000-string.c (expand_block_move,
> 	expand_strn_compare): fix set_mem_size() calls.

That should be capitalized: Fix instead of fix

> --- gcc/config/rs6000/rs6000-string.c	(revision 255585)
> +++ gcc/config/rs6000/rs6000-string.c	(working copy)
> @@ -1247,6 +1247,9 @@
>    if (bytes > rs6000_block_move_inline_limit)
>      return 0;
>  
> +  bool isP8 = (rs6000_cpu == PROCESSOR_POWER8);
> +  bool isP9 = (rs6000_cpu == PROCESSOR_POWER9);
> +
>    for (offset = 0; bytes > 0; offset += move_bytes, bytes -= move_bytes)
>      {
>        union {
> @@ -1258,7 +1261,7 @@
>  
>        /* Altivec first, since it will be faster than a string move
>  	 when it applies, and usually not significantly larger.  */
> -      if (TARGET_ALTIVEC && bytes >= 16 && align >= 128)
> +      if (TARGET_ALTIVEC && bytes >= 16 && (isP8 || isP9 || align >= 128))
>  	{
>  	  move_bytes = 16;
>  	  mode = V4SImode;

Is this the patch you meant to attach?  First of all, it only changes
expand_block_move, not expand_strn_compare, and the change seems more
like an optimization of P8/P9 rather than actual fix (otherwise, wouldn't
it fail on say P7?).

> +  return 0;
> +}
> +
> +

Please avoid unnecessary trailing whitespace.

	Jakub

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH][rs6000][PR target/82190] fix mem size info in rtl generated by memcmp and strncmp/strcmp builtin expansion
  2017-12-12 19:51 ` Jakub Jelinek
@ 2017-12-12 20:01   ` Aaron Sawdey
  2017-12-12 20:12     ` Segher Boessenkool
  0 siblings, 1 reply; 5+ messages in thread
From: Aaron Sawdey @ 2017-12-12 20:01 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: gcc-patches, Segher Boessenkool, David Edelsohn, Bill Schmidt

[-- Attachment #1: Type: text/plain, Size: 1016 bytes --]

On Tue, 2017-12-12 at 20:50 +0100, Jakub Jelinek wrote:
> On Tue, Dec 12, 2017 at 01:40:41PM -0600, Aaron Sawdey wrote:
> > 2017-12-12  Aaron Sawdey  <acsawdey@linux.vnet.ibm.com>
> > 
> > 	PR target/82190
> > 	* config/rs6000/rs6000-string.c (expand_block_move,
> > 	expand_strn_compare): fix set_mem_size() calls.
> 
> That should be capitalized: Fix instead of fix

[wrong patch deleted]

> Is this the patch you meant to attach?  First of all, it only changes
> expand_block_move, not expand_strn_compare, and the change seems more
> like an optimization of P8/P9 rather than actual fix (otherwise,
> wouldn't
> it fail on say P7?).
> 
> > +  return 0;
> > +}
> > +
> > +
> 
> Please avoid unnecessary trailing whitespace.
> 

Jakub,
  Yes that is a different patch unrelated to the 82190 fix. I've
attached the correct patch.

Thanks!
   Aaron

-- 
Aaron Sawdey, Ph.D.  acsawdey@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain

[-- Attachment #2: pr82190.patch --]
[-- Type: text/x-patch, Size: 2764 bytes --]

Index: gcc/config/rs6000/rs6000-string.c
===================================================================
--- gcc/config/rs6000/rs6000-string.c   (revision 255585)
+++ gcc/config/rs6000/rs6000-string.c   (working copy)
@@ -459,7 +459,7 @@
          rtx src1_reg = copy_addr_to_reg (XEXP (src1, 0));
          src1 = replace_equiv_address (src1, src1_reg);
        }
-      set_mem_size (src1, cmp_bytes);
+      set_mem_size (src1, load_mode_size);
 
       if (!REG_P (XEXP (src2, 0)))
        {
@@ -466,7 +466,7 @@
          rtx src2_reg = copy_addr_to_reg (XEXP (src2, 0));
          src2 = replace_equiv_address (src2, src2_reg);
        }
-      set_mem_size (src2, cmp_bytes);
+      set_mem_size (src2, load_mode_size);
 
       do_load_for_compare (tmp_reg_src1, src1, load_mode);
       do_load_for_compare (tmp_reg_src2, src2, load_mode);
@@ -937,7 +937,7 @@
          rtx src1_reg = copy_addr_to_reg (XEXP (src1, 0));
          src1 = replace_equiv_address (src1, src1_reg);
        }
-      set_mem_size (src1, cmp_bytes);
+      set_mem_size (src1, load_mode_size);
 
       if (!REG_P (XEXP (src2, 0)))
        {
@@ -944,7 +944,7 @@
          rtx src2_reg = copy_addr_to_reg (XEXP (src2, 0));
          src2 = replace_equiv_address (src2, src2_reg);
        }
-      set_mem_size (src2, cmp_bytes);
+      set_mem_size (src2, load_mode_size);
 
       do_load_for_compare (tmp_reg_src1, src1, load_mode);
       do_load_for_compare (tmp_reg_src2, src2, load_mode);
@@ -1096,7 +1096,7 @@
          rtx src1_reg = copy_addr_to_reg (XEXP (src1, 0));     
          src1 = replace_equiv_address (src1, src1_reg);
        }
-      set_mem_size (src1, cmp_bytes);
+      set_mem_size (src1, load_mode_size);     
 
       if (!REG_P (XEXP (src2, 0)))
        {
@@ -1103,7 +1103,7 @@
          rtx src2_reg = copy_addr_to_reg (XEXP (src2, 0));
          src2 = replace_equiv_address (src2, src2_reg);
        }
-      set_mem_size (src2, cmp_bytes);
+      set_mem_size (src2, load_mode_size);
 
       /* Construct call to strcmp/strncmp to compare the rest of the string.  */
       if (no_length)
Index: gcc/testsuite/gcc.dg/pr82190.c
===================================================================
--- gcc/testsuite/gcc.dg/pr82190.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/pr82190.c	(working copy)
@@ -0,0 +1,20 @@
+/* PR target/82190 */
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-optimize-strlen -fweb" } */
+
+char src[64] __attribute__ ((aligned)) = "aaaaaaa";
+char dst[64] __attribute__ ((aligned));
+
+int
+main ()
+{
+  __builtin_memcpy (dst, src, 6);
+  if (__builtin_memcmp (dst, src, 6))
+    __builtin_abort ();
+
+  __builtin_memcpy (dst, src, 7);
+  if (__builtin_memcmp (dst, src, 7))
+    __builtin_abort ();
+
+  return 0;
+}

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH][rs6000][PR target/82190] fix mem size info in rtl generated by memcmp and strncmp/strcmp builtin expansion
  2017-12-12 20:01   ` Aaron Sawdey
@ 2017-12-12 20:12     ` Segher Boessenkool
  2018-01-02 23:02       ` Aaron Sawdey
  0 siblings, 1 reply; 5+ messages in thread
From: Segher Boessenkool @ 2017-12-12 20:12 UTC (permalink / raw)
  To: Aaron Sawdey; +Cc: Jakub Jelinek, gcc-patches, David Edelsohn, Bill Schmidt

On Tue, Dec 12, 2017 at 02:01:16PM -0600, Aaron Sawdey wrote:
> > > 2017-12-12  Aaron Sawdey  <acsawdey@linux.vnet.ibm.com>
> > > 
> > > 	PR target/82190
> > > 	* config/rs6000/rs6000-string.c (expand_block_move,
> > > 	expand_strn_compare): fix set_mem_size() calls.
> > 
> > That should be capitalized: Fix instead of fix
> 
> [wrong patch deleted]

>   Yes that is a different patch unrelated to the 82190 fix. I've
> attached the correct patch.

That looks better :-)  Okay for trunk, thanks!


Segher

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH][rs6000][PR target/82190] fix mem size info in rtl generated by memcmp and strncmp/strcmp builtin expansion
  2017-12-12 20:12     ` Segher Boessenkool
@ 2018-01-02 23:02       ` Aaron Sawdey
  0 siblings, 0 replies; 5+ messages in thread
From: Aaron Sawdey @ 2018-01-02 23:02 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Jakub Jelinek, gcc-patches, David Edelsohn, Bill Schmidt

On Tue, 2017-12-12 at 14:12 -0600, Segher Boessenkool wrote:
> That looks better :-)  Okay for trunk, thanks!

As we discussed on IRC before christmas, I've simplified this to use
TARGET_EFFICIENT_UNALIGNED_VSX instead of checking explicitly for P8/P9
processors. Has the same effect but is cleaner/clearer. Committed in
256112.

  Aaron

Index: gcc/config/rs6000/rs6000-string.c
===================================================================
--- gcc/config/rs6000/rs6000-string.c   (revision 256110)
+++ gcc/config/rs6000/rs6000-string.c   (working copy)
@@ -73,7 +73,7 @@
      When optimize_size, avoid any significant code bloat; calling
      memset is about 4 instructions, so allow for one instruction to
      load zero and three to do clearing.  */
-  if (TARGET_ALTIVEC && align >= 128)
+  if (TARGET_ALTIVEC && (align >= 128 || TARGET_EFFICIENT_UNALIGNED_VSX))
     clear_step = 16;
   else if (TARGET_POWERPC64 && (align >= 64 || !STRICT_ALIGNMENT))
     clear_step = 8;
@@ -90,7 +90,7 @@
       machine_mode mode = BLKmode;
       rtx dest;
  -      if (bytes >= 16 && TARGET_ALTIVEC && align >= 128)
+      if (bytes >= 16 && TARGET_ALTIVEC && (align >= 128 || TARGET_EFFICIENT_UNALIGNED_VSX))
        {
          clear_bytes = 16;
          mode = V4SImode;
@@ -1260,7 +1260,7 @@
         /* Altivec first, since it will be faster than a string move
         when it applies, and usually not significantly larger.  */
-      if (TARGET_ALTIVEC && bytes >= 16 && align >= 128)
+      if (TARGET_ALTIVEC && bytes >= 16 && (TARGET_EFFICIENT_UNALIGNED_VSX || align >= 128))
        {
          move_bytes = 16;
          mode = V4SImode;


-- 
Aaron Sawdey, Ph.D.  acsawdey@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-01-02 23:02 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-12 19:42 [PATCH][rs6000][PR target/82190] fix mem size info in rtl generated by memcmp and strncmp/strcmp builtin expansion Aaron Sawdey
2017-12-12 19:51 ` Jakub Jelinek
2017-12-12 20:01   ` Aaron Sawdey
2017-12-12 20:12     ` Segher Boessenkool
2018-01-02 23:02       ` Aaron Sawdey

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).