* [PATCH] strlen: Punt on UB reads past end of string literal [PR94187]
@ 2020-03-17 8:39 Jakub Jelinek
2020-03-17 9:05 ` Richard Biener
2020-03-17 15:01 ` Martin Sebor
0 siblings, 2 replies; 3+ messages in thread
From: Jakub Jelinek @ 2020-03-17 8:39 UTC (permalink / raw)
To: Jeff Law; +Cc: gcc-patches, Martin Sebor
Hi!
The gcc.dg/pr68785.c test which contains:
int
foo (void)
{
return *(int *) "";
}
has UB in the program if it is ever called, but causes UB in the compiler
as well as at least in theory non-reproduceable code generation.
The problem is that nbytes is in this case 4, prep is the
TREE_STRING_POINTER of a "" string literal with TREE_STRING_LENGTH of 1 and
we do:
4890 for (const char *p = prep; p != prep + nbytes; ++p)
4891 if (*p)
4892 {
4893 *allnul = false;
4894 break;
4895 }
and so read the bytes after the STRING_CST payload, which can be random.
I think we should just punt in this case.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
2020-03-16 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94187
* tree-ssa-strlen.c (count_nonzero_bytes): Punt if
nchars - offset < nbytes.
--- gcc/tree-ssa-strlen.c.jj 2020-03-14 08:14:47.034742349 +0100
+++ gcc/tree-ssa-strlen.c 2020-03-16 12:23:57.523534887 +0100
@@ -4822,6 +4822,8 @@ count_nonzero_bytes (tree exp, unsigned
of the access), set it here to the size of the string, including
all internal and trailing nuls if the string has any. */
nbytes = nchars - offset;
+ else if (nchars - offset < nbytes)
+ return false;
prep = TREE_STRING_POINTER (exp) + offset;
}
Jakub
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] strlen: Punt on UB reads past end of string literal [PR94187]
2020-03-17 8:39 [PATCH] strlen: Punt on UB reads past end of string literal [PR94187] Jakub Jelinek
@ 2020-03-17 9:05 ` Richard Biener
2020-03-17 15:01 ` Martin Sebor
1 sibling, 0 replies; 3+ messages in thread
From: Richard Biener @ 2020-03-17 9:05 UTC (permalink / raw)
To: Jakub Jelinek; +Cc: Jeff Law, GCC Patches
On Tue, Mar 17, 2020 at 9:40 AM Jakub Jelinek via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi!
>
> The gcc.dg/pr68785.c test which contains:
> int
> foo (void)
> {
> return *(int *) "";
> }
> has UB in the program if it is ever called, but causes UB in the compiler
> as well as at least in theory non-reproduceable code generation.
> The problem is that nbytes is in this case 4, prep is the
> TREE_STRING_POINTER of a "" string literal with TREE_STRING_LENGTH of 1 and
> we do:
> 4890 for (const char *p = prep; p != prep + nbytes; ++p)
> 4891 if (*p)
> 4892 {
> 4893 *allnul = false;
> 4894 break;
> 4895 }
> and so read the bytes after the STRING_CST payload, which can be random.
> I think we should just punt in this case.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
OK.
Richard.
> 2020-03-16 Jakub Jelinek <jakub@redhat.com>
>
> PR tree-optimization/94187
> * tree-ssa-strlen.c (count_nonzero_bytes): Punt if
> nchars - offset < nbytes.
>
> --- gcc/tree-ssa-strlen.c.jj 2020-03-14 08:14:47.034742349 +0100
> +++ gcc/tree-ssa-strlen.c 2020-03-16 12:23:57.523534887 +0100
> @@ -4822,6 +4822,8 @@ count_nonzero_bytes (tree exp, unsigned
> of the access), set it here to the size of the string, including
> all internal and trailing nuls if the string has any. */
> nbytes = nchars - offset;
> + else if (nchars - offset < nbytes)
> + return false;
>
> prep = TREE_STRING_POINTER (exp) + offset;
> }
>
>
> Jakub
>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] strlen: Punt on UB reads past end of string literal [PR94187]
2020-03-17 8:39 [PATCH] strlen: Punt on UB reads past end of string literal [PR94187] Jakub Jelinek
2020-03-17 9:05 ` Richard Biener
@ 2020-03-17 15:01 ` Martin Sebor
1 sibling, 0 replies; 3+ messages in thread
From: Martin Sebor @ 2020-03-17 15:01 UTC (permalink / raw)
To: Jeff Law; +Cc: Jakub Jelinek, gcc-patches
On 3/17/20 2:39 AM, Jakub Jelinek wrote:
> Hi!
>
> The gcc.dg/pr68785.c test which contains:
> int
> foo (void)
> {
> return *(int *) "";
> }
> has UB in the program if it is ever called, but causes UB in the compiler
> as well as at least in theory non-reproduceable code generation.
> The problem is that nbytes is in this case 4, prep is the
> TREE_STRING_POINTER of a "" string literal with TREE_STRING_LENGTH of 1
At least as important as avoiding the Valgrind error is detecting
the bug in the code. -Warray-bounds has all it needs to diagnose
it, but, regrettably, it doesn't. I raised PR 94195 to make sure
it does.
Martin
and
> we do:
> 4890 for (const char *p = prep; p != prep + nbytes; ++p)
> 4891 if (*p)
> 4892 {
> 4893 *allnul = false;
> 4894 break;
> 4895 }
> and so read the bytes after the STRING_CST payload, which can be random.
> I think we should just punt in this case.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2020-03-16 Jakub Jelinek <jakub@redhat.com>
>
> PR tree-optimization/94187
> * tree-ssa-strlen.c (count_nonzero_bytes): Punt if
> nchars - offset < nbytes.
>
> --- gcc/tree-ssa-strlen.c.jj 2020-03-14 08:14:47.034742349 +0100
> +++ gcc/tree-ssa-strlen.c 2020-03-16 12:23:57.523534887 +0100
> @@ -4822,6 +4822,8 @@ count_nonzero_bytes (tree exp, unsigned
> of the access), set it here to the size of the string, including
> all internal and trailing nuls if the string has any. */
> nbytes = nchars - offset;
> + else if (nchars - offset < nbytes)
> + return false;
>
> prep = TREE_STRING_POINTER (exp) + offset;
> }
>
>
> Jakub
>
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2020-03-17 15:01 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-17 8:39 [PATCH] strlen: Punt on UB reads past end of string literal [PR94187] Jakub Jelinek
2020-03-17 9:05 ` Richard Biener
2020-03-17 15:01 ` Martin Sebor
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).