public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: [PATCH] Fix ICU miscompilation due to tree-ssa-strlen.c bug (PR tree-optimization/88693)
@ 2019-01-05 12:20 Bernd Edlinger
  2019-01-05 12:31 ` Jakub Jelinek
  0 siblings, 1 reply; 7+ messages in thread
From: Bernd Edlinger @ 2019-01-05 12:20 UTC (permalink / raw)
  To: Jakub Jelinek, Richard Biener, gcc-patches, Jeff Law

Hi,

co-incidentally my "Make strlen range computations more conservative" patch
contained a fix on the same spot, I just did not have a test case for it:

@@ -3184,7 +3146,10 @@ get_min_string_length (tree rhs, bool *f
       && TREE_READONLY (rhs))
     rhs = DECL_INITIAL (rhs);

-  if (rhs && TREE_CODE (rhs) == STRING_CST)
+  if (rhs && TREE_CODE (rhs) == STRING_CST
+      && tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (rhs)))) == 1
+      && TREE_STRING_LENGTH (rhs) > 0
+      && TREE_STRING_POINTER (rhs) [TREE_STRING_LENGTH (rhs) - 1] == '\0')
     {
       *full_string_p = true;
       return strlen (TREE_STRING_POINTER (rhs));


additionally to your patch this tests the string is in fact a single-byte string.
since strlen returns garbage otherwise.


Bernd.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Fix ICU miscompilation due to tree-ssa-strlen.c bug (PR tree-optimization/88693)
  2019-01-05 12:20 [PATCH] Fix ICU miscompilation due to tree-ssa-strlen.c bug (PR tree-optimization/88693) Bernd Edlinger
@ 2019-01-05 12:31 ` Jakub Jelinek
  2019-01-05 13:17   ` Bernd Edlinger
  0 siblings, 1 reply; 7+ messages in thread
From: Jakub Jelinek @ 2019-01-05 12:31 UTC (permalink / raw)
  To: Bernd Edlinger; +Cc: Richard Biener, gcc-patches, Jeff Law

On Sat, Jan 05, 2019 at 12:20:24PM +0000, Bernd Edlinger wrote:
> co-incidentally my "Make strlen range computations more conservative" patch
> contained a fix on the same spot, I just did not have a test case for it:
> 
> @@ -3184,7 +3146,10 @@ get_min_string_length (tree rhs, bool *f
>        && TREE_READONLY (rhs))
>      rhs = DECL_INITIAL (rhs);
> 
> -  if (rhs && TREE_CODE (rhs) == STRING_CST)
> +  if (rhs && TREE_CODE (rhs) == STRING_CST
> +      && tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (rhs)))) == 1
> +      && TREE_STRING_LENGTH (rhs) > 0
> +      && TREE_STRING_POINTER (rhs) [TREE_STRING_LENGTH (rhs) - 1] == '\0')
>      {
>        *full_string_p = true;
>        return strlen (TREE_STRING_POINTER (rhs));
> 
> 
> additionally to your patch this tests the string is in fact a single-byte string.
> since strlen returns garbage otherwise.

Multi-byte STRING_CSTs seem to be stored in the target byte order, e.g.
L"abcde" in a x86_64-linux -> powerpc64-linux cross is still
"\000\000\000a\000\000\000b\000\000\000c\000\000\000d\000\000\000\000"
so I think strlen is exactly what we want.  The tree-ssa-strlen.c pass
doesn't track string lengths in whatever units it has, it tracks number of
non-zero bytes followed by zero byte known at certain address, or, if there
is no zero byte known, the minimum amount of non-zero bytes known at certain
address.

And, your patch has also the bad effect that it won't return any length for
the strings that aren't zero terminated.  We do want to return 9 for the
"abcdefghi" string without zero terminator at the end, just need to set
*full_string_p to false.  And, for "abcd\000fghi" string without zero
termination at the end, we do want to return 4 and set *full_string_p to
true.

	Jakub

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Fix ICU miscompilation due to tree-ssa-strlen.c bug (PR tree-optimization/88693)
  2019-01-05 12:31 ` Jakub Jelinek
@ 2019-01-05 13:17   ` Bernd Edlinger
  2019-01-05 13:23     ` Jakub Jelinek
  0 siblings, 1 reply; 7+ messages in thread
From: Bernd Edlinger @ 2019-01-05 13:17 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, gcc-patches, Jeff Law

On 1/5/19 1:31 PM, Jakub Jelinek wrote:
> On Sat, Jan 05, 2019 at 12:20:24PM +0000, Bernd Edlinger wrote:
>> co-incidentally my "Make strlen range computations more conservative" patch
>> contained a fix on the same spot, I just did not have a test case for it:
>>
>> @@ -3184,7 +3146,10 @@ get_min_string_length (tree rhs, bool *f
>>        && TREE_READONLY (rhs))
>>      rhs = DECL_INITIAL (rhs);
>>
>> -  if (rhs && TREE_CODE (rhs) == STRING_CST)
>> +  if (rhs && TREE_CODE (rhs) == STRING_CST
>> +      && tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (rhs)))) == 1
>> +      && TREE_STRING_LENGTH (rhs) > 0
>> +      && TREE_STRING_POINTER (rhs) [TREE_STRING_LENGTH (rhs) - 1] == '\0')
>>      {
>>        *full_string_p = true;
>>        return strlen (TREE_STRING_POINTER (rhs));
>>
>>
>> additionally to your patch this tests the string is in fact a single-byte string.
>> since strlen returns garbage otherwise.
> 
> Multi-byte STRING_CSTs seem to be stored in the target byte order, e.g.
> L"abcde" in a x86_64-linux -> powerpc64-linux cross is still
> "\000\000\000a\000\000\000b\000\000\000c\000\000\000d\000\000\000\000"
> so I think strlen is exactly what we want.  The tree-ssa-strlen.c pass
> doesn't track string lengths in whatever units it has, it tracks number of
> non-zero bytes followed by zero byte known at certain address, or, if there
> is no zero byte known, the minimum amount of non-zero bytes known at certain
> address.
> 

Well, yes it works, but this can only optimize invalid code like strlen((char*)L"wide").

However, optimizing invalid stuff away does both defeat warnings in later passes,
and is also not helpful for debugging the resulting code in general.

> And, your patch has also the bad effect that it won't return any length for
> the strings that aren't zero terminated.  We do want to return 9 for the
> "abcdefghi" string without zero terminator at the end, just need to set
> *full_string_p to false.  And, for "abcd\000fghi" string without zero
> termination at the end, we do want to return 4 and set *full_string_p to
> true.
> 

Right, I did not consider the nul in the middle.


Bernd.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Fix ICU miscompilation due to tree-ssa-strlen.c bug (PR tree-optimization/88693)
  2019-01-05 13:17   ` Bernd Edlinger
@ 2019-01-05 13:23     ` Jakub Jelinek
  0 siblings, 0 replies; 7+ messages in thread
From: Jakub Jelinek @ 2019-01-05 13:23 UTC (permalink / raw)
  To: Bernd Edlinger; +Cc: Richard Biener, gcc-patches, Jeff Law

On Sat, Jan 05, 2019 at 01:17:21PM +0000, Bernd Edlinger wrote:
> Well, yes it works, but this can only optimize invalid code like strlen((char*)L"wide").

There is nothing invalid on it.  Furthermore, the strlen pass doesn't
optimize just strlen, it does many optimizations that rely on knowing how
many non-zero bytes there are and if followed by zero byte.
Initialization from a multi-byte string literal is just an initialization
like any other, from array of ints, or characters, whatever.

	Jakub

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Fix ICU miscompilation due to tree-ssa-strlen.c bug (PR tree-optimization/88693)
  2019-01-04 22:54 Jakub Jelinek
  2019-01-05  1:10 ` Jeff Law
@ 2019-01-11 18:29 ` Jeff Law
  1 sibling, 0 replies; 7+ messages in thread
From: Jeff Law @ 2019-01-11 18:29 UTC (permalink / raw)
  To: Jakub Jelinek, Richard Biener; +Cc: gcc-patches

On 1/4/19 3:54 PM, Jakub Jelinek wrote:
> Hi!
> 
> The following patch fixes ICU miscompilation, where an initial part of a
> buffer is initializer from non-zero terminated string literal (in ICU
> actually from an array of non-zero chars that the FE newly turns into
> non-zero terminated string literal), but the code recently added to
> tree-ssa-strlen.c doesn't consider the possibility of string literals that
> aren't zero terminated.
> 
> STRING_CSTs actually are always zero terminated, but if TREE_STRING_LENGTH
> doesn't include the terminating character, it is just bytes.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?
> 
> I was considering using strnlen, but if all STRING_CSTs are actually zero
> terminated, it will mean it reads are most one further byte and on many
> targets strlen is more optimized than strnlen.
> 
> 2019-01-04  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR tree-optimization/88693
> 	* tree-ssa-strlen.c (get_min_string_length): Don't set *full_string_p
> 	for STRING_CSTs that don't contain any NUL characters in the first
> 	TREE_STRING_LENGTH bytes.
> 
> 	* gcc.c-torture/execute/pr88693.c: New test.
OK
jeff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Fix ICU miscompilation due to tree-ssa-strlen.c bug (PR tree-optimization/88693)
  2019-01-04 22:54 Jakub Jelinek
@ 2019-01-05  1:10 ` Jeff Law
  2019-01-11 18:29 ` Jeff Law
  1 sibling, 0 replies; 7+ messages in thread
From: Jeff Law @ 2019-01-05  1:10 UTC (permalink / raw)
  To: Jakub Jelinek, Richard Biener; +Cc: gcc-patches

On 1/4/19 3:54 PM, Jakub Jelinek wrote:
> Hi!
> 
> The following patch fixes ICU miscompilation, where an initial part of a
> buffer is initializer from non-zero terminated string literal (in ICU
> actually from an array of non-zero chars that the FE newly turns into
> non-zero terminated string literal), but the code recently added to
> tree-ssa-strlen.c doesn't consider the possibility of string literals that
> aren't zero terminated.
> 
> STRING_CSTs actually are always zero terminated, but if TREE_STRING_LENGTH
> doesn't include the terminating character, it is just bytes.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?
> 
> I was considering using strnlen, but if all STRING_CSTs are actually zero
> terminated, it will mean it reads are most one further byte and on many
> targets strlen is more optimized than strnlen.
> 
> 2019-01-04  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR tree-optimization/88693
> 	* tree-ssa-strlen.c (get_min_string_length): Don't set *full_string_p
> 	for STRING_CSTs that don't contain any NUL characters in the first
> 	TREE_STRING_LENGTH bytes.
> 
> 	* gcc.c-torture/execute/pr88693.c: New test.
Don't have time to review tonight.  But FYI, my tester flagged ICU due
to a testsuite failure.

Jeff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] Fix ICU miscompilation due to tree-ssa-strlen.c bug (PR tree-optimization/88693)
@ 2019-01-04 22:54 Jakub Jelinek
  2019-01-05  1:10 ` Jeff Law
  2019-01-11 18:29 ` Jeff Law
  0 siblings, 2 replies; 7+ messages in thread
From: Jakub Jelinek @ 2019-01-04 22:54 UTC (permalink / raw)
  To: Richard Biener, Jeff Law; +Cc: gcc-patches

Hi!

The following patch fixes ICU miscompilation, where an initial part of a
buffer is initializer from non-zero terminated string literal (in ICU
actually from an array of non-zero chars that the FE newly turns into
non-zero terminated string literal), but the code recently added to
tree-ssa-strlen.c doesn't consider the possibility of string literals that
aren't zero terminated.

STRING_CSTs actually are always zero terminated, but if TREE_STRING_LENGTH
doesn't include the terminating character, it is just bytes.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

I was considering using strnlen, but if all STRING_CSTs are actually zero
terminated, it will mean it reads are most one further byte and on many
targets strlen is more optimized than strnlen.

2019-01-04  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/88693
	* tree-ssa-strlen.c (get_min_string_length): Don't set *full_string_p
	for STRING_CSTs that don't contain any NUL characters in the first
	TREE_STRING_LENGTH bytes.

	* gcc.c-torture/execute/pr88693.c: New test.

--- gcc/tree-ssa-strlen.c.jj	2019-01-03 09:47:28.018069413 +0100
+++ gcc/tree-ssa-strlen.c	2019-01-04 16:28:47.549196405 +0100
@@ -3232,8 +3232,9 @@ get_min_string_length (tree rhs, bool *f
 
   if (rhs && TREE_CODE (rhs) == STRING_CST)
     {
-      *full_string_p = true;
-      return strlen (TREE_STRING_POINTER (rhs));
+      HOST_WIDE_INT len = strlen (TREE_STRING_POINTER (rhs));
+      *full_string_p = len < TREE_STRING_LENGTH (rhs);
+      return len;
     }
 
   return -1;
--- gcc/testsuite/gcc.c-torture/execute/pr88693.c.jj	2019-01-04 16:31:50.892218304 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr88693.c	2019-01-04 16:31:09.808885630 +0100
@@ -0,0 +1,54 @@
+/* PR tree-optimization/88693 */
+
+__attribute__((noipa)) void
+foo (char *p)
+{
+  if (__builtin_strlen (p) != 9)
+    __builtin_abort ();
+}
+
+__attribute__((noipa)) void
+quux (char *p)
+{
+  int i;
+  for (i = 0; i < 100; i++)
+    if (p[i] != 'x')
+      __builtin_abort ();
+}
+
+__attribute__((noipa)) void
+qux (void)
+{
+  char b[100];
+  __builtin_memset (b, 'x', sizeof (b));
+  quux (b);
+}
+
+__attribute__((noipa)) void
+bar (void)
+{
+  static unsigned char u[9] = "abcdefghi";
+  char b[100];
+  __builtin_memcpy (b, u, sizeof (u));
+  b[sizeof (u)] = 0;
+  foo (b);
+}
+
+__attribute__((noipa)) void
+baz (void)
+{
+  static unsigned char u[] = { 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r' };
+  char b[100];
+  __builtin_memcpy (b, u, sizeof (u));
+  b[sizeof (u)] = 0;
+  foo (b);
+}
+
+int
+main ()
+{
+  qux ();
+  bar ();
+  baz ();
+  return 0;
+}

	Jakub

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-01-11 18:29 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-05 12:20 [PATCH] Fix ICU miscompilation due to tree-ssa-strlen.c bug (PR tree-optimization/88693) Bernd Edlinger
2019-01-05 12:31 ` Jakub Jelinek
2019-01-05 13:17   ` Bernd Edlinger
2019-01-05 13:23     ` Jakub Jelinek
  -- strict thread matches above, loose matches on Subject: below --
2019-01-04 22:54 Jakub Jelinek
2019-01-05  1:10 ` Jeff Law
2019-01-11 18:29 ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).