public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] strlen: Fix handle_builtin_string_cmp [PR96758]
@ 2020-08-24 21:30 Jakub Jelinek
  2020-08-25  7:09 ` Richard Biener
  0 siblings, 1 reply; 2+ messages in thread
From: Jakub Jelinek @ 2020-08-24 21:30 UTC (permalink / raw)
  To: Richard Biener, Jeff Law; +Cc: gcc-patches

Hi!

The following testcase is miscompiled, because handle_builtin_string_cmp
sees a strncmp call with constant last argument 4, where one of the strings
has an upper bound of 5 bytes (due to it being an array of that size) and
the other has a known string length of 1 and the result is used only in
equality comparison.
It is folded into __builtin_strncmp_eq (str1, str2, 4), which is
incorrect, because that means reading 4 bytes from both strings and
comparing that.  When one of the strings has known strlen of 1, we want to
compare just 2 bytes, not 4, as strncmp shouldn't compare any bytes beyond
the null.
So, the last argument to __builtin_strncmp_eq should be the minimum of the
provided strncmp last argument and the known string length + 1 (assuming
the other string has only a known upper bound due to array size).

Besides that, I've noticed the code has been written with the intent to also
support the case where we know exact string length of both strings (but not
the string content, so we can't compute it at compile time).  In that case,
both cstlen1 and cstlen2 are non-negative and both arysiz1 and arysiz2 are
negative.  We wouldn't optimize that, cmpsiz would be either the strncmp
last argument, or for strcmp the first string length, but varsiz would be
-1 and thus cmpsiz would be never < varsiz.  The patch fixes it by using the
correct length, in that case using the minimum of the two and for strncmp
also the last argument.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
For backporting, I think we should keep the
-  if (cmpsiz < varsiz && used_only_for_zero_equality (lhs))
+  if ((varsiz < 0 || cmpsiz < varsiz) && used_only_for_zero_equality (lhs))
change outso that we don't introduce a new optimization and just fix the
bug.

2020-08-24  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/96758
	* tree-ssa-strlen.c (handle_builtin_string_cmp): If both cstlen1
	and cstlen2 are set, set cmpsiz to their minimum, otherwise use the
	one that is set.  If bound is used and smaller than cmpsiz, set cmpsiz
	to bound.  If both cstlen1 and cstlen2 are set, perform the optimization.

	* gcc.dg/strcmpopt_12.c: New test.

--- gcc/tree-ssa-strlen.c.jj	2020-07-28 15:39:10.078755265 +0200
+++ gcc/tree-ssa-strlen.c	2020-08-24 11:31:05.457832003 +0200
@@ -4485,11 +4485,19 @@ handle_builtin_string_cmp (gimple_stmt_i
     ++cstlen2;
 
   /* The exact number of characters to compare.  */
-  HOST_WIDE_INT cmpsiz = bound < 0 ? cstlen1 < 0 ? cstlen2 : cstlen1 : bound;
+  HOST_WIDE_INT cmpsiz;
+  if (cstlen1 >= 0 && cstlen2 >= 0)
+    cmpsiz = MIN (cstlen1, cstlen2);
+  else if (cstlen1 >= 0)
+    cmpsiz = cstlen1;
+  else
+    cmpsiz = cstlen2;
+  if (bound >= 0)
+    cmpsiz = MIN (cmpsiz, bound);
   /* The size of the array in which the unknown string is stored.  */
   HOST_WIDE_INT varsiz = arysiz1 < 0 ? arysiz2 : arysiz1;
 
-  if (cmpsiz < varsiz && used_only_for_zero_equality (lhs))
+  if ((varsiz < 0 || cmpsiz < varsiz) && used_only_for_zero_equality (lhs))
     {
       /* If the known length is less than the size of the other array
 	 and the strcmp result is only used to test equality to zero,
--- gcc/testsuite/gcc.dg/strcmpopt_12.c.jj	2020-08-24 11:36:21.380357549 +0200
+++ gcc/testsuite/gcc.dg/strcmpopt_12.c	2020-08-24 11:36:35.405158915 +0200
@@ -0,0 +1,17 @@
+/* PR tree-optimization/96758 */
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+int v = 1;
+
+int
+main ()
+{
+  const char *s = v ? "a" : "b";
+  char x[5];
+  char y[5] = "a\0a";
+  __builtin_memcpy (x, y, sizeof (y));
+  if (__builtin_strncmp (x, s, 4) != 0)
+    __builtin_abort ();
+  return 0;
+}

	Jakub


^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [PATCH] strlen: Fix handle_builtin_string_cmp [PR96758]
  2020-08-24 21:30 [PATCH] strlen: Fix handle_builtin_string_cmp [PR96758] Jakub Jelinek
@ 2020-08-25  7:09 ` Richard Biener
  0 siblings, 0 replies; 2+ messages in thread
From: Richard Biener @ 2020-08-25  7:09 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Jeff Law, gcc-patches

On Mon, 24 Aug 2020, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase is miscompiled, because handle_builtin_string_cmp
> sees a strncmp call with constant last argument 4, where one of the strings
> has an upper bound of 5 bytes (due to it being an array of that size) and
> the other has a known string length of 1 and the result is used only in
> equality comparison.
> It is folded into __builtin_strncmp_eq (str1, str2, 4), which is
> incorrect, because that means reading 4 bytes from both strings and
> comparing that.  When one of the strings has known strlen of 1, we want to
> compare just 2 bytes, not 4, as strncmp shouldn't compare any bytes beyond
> the null.
> So, the last argument to __builtin_strncmp_eq should be the minimum of the
> provided strncmp last argument and the known string length + 1 (assuming
> the other string has only a known upper bound due to array size).
> 
> Besides that, I've noticed the code has been written with the intent to also
> support the case where we know exact string length of both strings (but not
> the string content, so we can't compute it at compile time).  In that case,
> both cstlen1 and cstlen2 are non-negative and both arysiz1 and arysiz2 are
> negative.  We wouldn't optimize that, cmpsiz would be either the strncmp
> last argument, or for strcmp the first string length, but varsiz would be
> -1 and thus cmpsiz would be never < varsiz.  The patch fixes it by using the
> correct length, in that case using the minimum of the two and for strncmp
> also the last argument.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

> For backporting, I think we should keep the
> -  if (cmpsiz < varsiz && used_only_for_zero_equality (lhs))
> +  if ((varsiz < 0 || cmpsiz < varsiz) && used_only_for_zero_equality (lhs))
> change outso that we don't introduce a new optimization and just fix the
> bug.

Agreed.

Richard.

> 2020-08-24  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR tree-optimization/96758
> 	* tree-ssa-strlen.c (handle_builtin_string_cmp): If both cstlen1
> 	and cstlen2 are set, set cmpsiz to their minimum, otherwise use the
> 	one that is set.  If bound is used and smaller than cmpsiz, set cmpsiz
> 	to bound.  If both cstlen1 and cstlen2 are set, perform the optimization.
> 
> 	* gcc.dg/strcmpopt_12.c: New test.
> 
> --- gcc/tree-ssa-strlen.c.jj	2020-07-28 15:39:10.078755265 +0200
> +++ gcc/tree-ssa-strlen.c	2020-08-24 11:31:05.457832003 +0200
> @@ -4485,11 +4485,19 @@ handle_builtin_string_cmp (gimple_stmt_i
>      ++cstlen2;
>  
>    /* The exact number of characters to compare.  */
> -  HOST_WIDE_INT cmpsiz = bound < 0 ? cstlen1 < 0 ? cstlen2 : cstlen1 : bound;
> +  HOST_WIDE_INT cmpsiz;
> +  if (cstlen1 >= 0 && cstlen2 >= 0)
> +    cmpsiz = MIN (cstlen1, cstlen2);
> +  else if (cstlen1 >= 0)
> +    cmpsiz = cstlen1;
> +  else
> +    cmpsiz = cstlen2;
> +  if (bound >= 0)
> +    cmpsiz = MIN (cmpsiz, bound);
>    /* The size of the array in which the unknown string is stored.  */
>    HOST_WIDE_INT varsiz = arysiz1 < 0 ? arysiz2 : arysiz1;
>  
> -  if (cmpsiz < varsiz && used_only_for_zero_equality (lhs))
> +  if ((varsiz < 0 || cmpsiz < varsiz) && used_only_for_zero_equality (lhs))
>      {
>        /* If the known length is less than the size of the other array
>  	 and the strcmp result is only used to test equality to zero,
> --- gcc/testsuite/gcc.dg/strcmpopt_12.c.jj	2020-08-24 11:36:21.380357549 +0200
> +++ gcc/testsuite/gcc.dg/strcmpopt_12.c	2020-08-24 11:36:35.405158915 +0200
> @@ -0,0 +1,17 @@
> +/* PR tree-optimization/96758 */
> +/* { dg-do run } */
> +/* { dg-options "-O2" } */
> +
> +int v = 1;
> +
> +int
> +main ()
> +{
> +  const char *s = v ? "a" : "b";
> +  char x[5];
> +  char y[5] = "a\0a";
> +  __builtin_memcpy (x, y, sizeof (y));
> +  if (__builtin_strncmp (x, s, 4) != 0)
> +    __builtin_abort ();
> +  return 0;
> +}
> 
> 	Jakub
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2020-08-25  7:09 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-24 21:30 [PATCH] strlen: Fix handle_builtin_string_cmp [PR96758] Jakub Jelinek
2020-08-25  7:09 ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).