public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] strlen: Punt on UB reads past end of string literal [PR94187]
@ 2020-03-17  8:39 Jakub Jelinek
  2020-03-17  9:05 ` Richard Biener
  2020-03-17 15:01 ` Martin Sebor
  0 siblings, 2 replies; 3+ messages in thread
From: Jakub Jelinek @ 2020-03-17  8:39 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches, Martin Sebor

Hi!

The gcc.dg/pr68785.c test which contains:
int
foo (void)
{
  return *(int *) "";
}
has UB in the program if it is ever called, but causes UB in the compiler
as well as at least in theory non-reproduceable code generation.
The problem is that nbytes is in this case 4, prep is the
TREE_STRING_POINTER of a "" string literal with TREE_STRING_LENGTH of 1 and
we do:
4890		  for (const char *p = prep; p != prep + nbytes; ++p)
4891		    if (*p)
4892		      {
4893			*allnul = false;
4894			break;
4895		      }
and so read the bytes after the STRING_CST payload, which can be random.
I think we should just punt in this case.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2020-03-16  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/94187
	* tree-ssa-strlen.c (count_nonzero_bytes): Punt if
	nchars - offset < nbytes.

--- gcc/tree-ssa-strlen.c.jj	2020-03-14 08:14:47.034742349 +0100
+++ gcc/tree-ssa-strlen.c	2020-03-16 12:23:57.523534887 +0100
@@ -4822,6 +4822,8 @@ count_nonzero_bytes (tree exp, unsigned
 	   of the access), set it here to the size of the string, including
 	   all internal and trailing nuls if the string has any.  */
 	nbytes = nchars - offset;
+      else if (nchars - offset < nbytes)
+	return false;
 
       prep = TREE_STRING_POINTER (exp) + offset;
     }


	Jakub


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] strlen: Punt on UB reads past end of string literal [PR94187]
  2020-03-17  8:39 [PATCH] strlen: Punt on UB reads past end of string literal [PR94187] Jakub Jelinek
@ 2020-03-17  9:05 ` Richard Biener
  2020-03-17 15:01 ` Martin Sebor
  1 sibling, 0 replies; 3+ messages in thread
From: Richard Biener @ 2020-03-17  9:05 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Jeff Law, GCC Patches

On Tue, Mar 17, 2020 at 9:40 AM Jakub Jelinek via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi!
>
> The gcc.dg/pr68785.c test which contains:
> int
> foo (void)
> {
>   return *(int *) "";
> }
> has UB in the program if it is ever called, but causes UB in the compiler
> as well as at least in theory non-reproduceable code generation.
> The problem is that nbytes is in this case 4, prep is the
> TREE_STRING_POINTER of a "" string literal with TREE_STRING_LENGTH of 1 and
> we do:
> 4890              for (const char *p = prep; p != prep + nbytes; ++p)
> 4891                if (*p)
> 4892                  {
> 4893                    *allnul = false;
> 4894                    break;
> 4895                  }
> and so read the bytes after the STRING_CST payload, which can be random.
> I think we should just punt in this case.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

> 2020-03-16  Jakub Jelinek  <jakub@redhat.com>
>
>         PR tree-optimization/94187
>         * tree-ssa-strlen.c (count_nonzero_bytes): Punt if
>         nchars - offset < nbytes.
>
> --- gcc/tree-ssa-strlen.c.jj    2020-03-14 08:14:47.034742349 +0100
> +++ gcc/tree-ssa-strlen.c       2020-03-16 12:23:57.523534887 +0100
> @@ -4822,6 +4822,8 @@ count_nonzero_bytes (tree exp, unsigned
>            of the access), set it here to the size of the string, including
>            all internal and trailing nuls if the string has any.  */
>         nbytes = nchars - offset;
> +      else if (nchars - offset < nbytes)
> +       return false;
>
>        prep = TREE_STRING_POINTER (exp) + offset;
>      }
>
>
>         Jakub
>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] strlen: Punt on UB reads past end of string literal [PR94187]
  2020-03-17  8:39 [PATCH] strlen: Punt on UB reads past end of string literal [PR94187] Jakub Jelinek
  2020-03-17  9:05 ` Richard Biener
@ 2020-03-17 15:01 ` Martin Sebor
  1 sibling, 0 replies; 3+ messages in thread
From: Martin Sebor @ 2020-03-17 15:01 UTC (permalink / raw)
  To: Jeff Law; +Cc: Jakub Jelinek, gcc-patches

On 3/17/20 2:39 AM, Jakub Jelinek wrote:
> Hi!
> 
> The gcc.dg/pr68785.c test which contains:
> int
> foo (void)
> {
>    return *(int *) "";
> }
> has UB in the program if it is ever called, but causes UB in the compiler
> as well as at least in theory non-reproduceable code generation.
> The problem is that nbytes is in this case 4, prep is the
> TREE_STRING_POINTER of a "" string literal with TREE_STRING_LENGTH of 1

At least as important as avoiding the Valgrind error is detecting
the bug in the code.  -Warray-bounds has all it needs to diagnose
it, but, regrettably, it doesn't.  I raised PR 94195 to make sure
it does.

Martin


  and
> we do:
> 4890		  for (const char *p = prep; p != prep + nbytes; ++p)
> 4891		    if (*p)
> 4892		      {
> 4893			*allnul = false;
> 4894			break;
> 4895		      }
> and so read the bytes after the STRING_CST payload, which can be random.
> I think we should just punt in this case.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2020-03-16  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR tree-optimization/94187
> 	* tree-ssa-strlen.c (count_nonzero_bytes): Punt if
> 	nchars - offset < nbytes.
> 
> --- gcc/tree-ssa-strlen.c.jj	2020-03-14 08:14:47.034742349 +0100
> +++ gcc/tree-ssa-strlen.c	2020-03-16 12:23:57.523534887 +0100
> @@ -4822,6 +4822,8 @@ count_nonzero_bytes (tree exp, unsigned
>   	   of the access), set it here to the size of the string, including
>   	   all internal and trailing nuls if the string has any.  */
>   	nbytes = nchars - offset;
> +      else if (nchars - offset < nbytes)
> +	return false;
>   
>         prep = TREE_STRING_POINTER (exp) + offset;
>       }
> 
> 
> 	Jakub
> 


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-03-17 15:01 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-17  8:39 [PATCH] strlen: Punt on UB reads past end of string literal [PR94187] Jakub Jelinek
2020-03-17  9:05 ` Richard Biener
2020-03-17 15:01 ` Martin Sebor

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).