public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: PING [PATCH] warn for strlen of arrays with missing nul (PR 86552)
@ 2018-07-30 21:12 Bernd Edlinger
  2018-07-31  3:52 ` Martin Sebor
  0 siblings, 1 reply; 8+ messages in thread
From: Bernd Edlinger @ 2018-07-30 21:12 UTC (permalink / raw)
  To: Martin Sebor, gcc-patches

Hi,

>@@ -621,6 +674,12 @@ c_strlen (tree src, int only_value)
> 	maxelts = maxelts / eltsize - 1;
>       }
> 
>+  /* Unless the caller is prepared to handle it by passing in a non-null
>+     ARR, fail if the terminating nul doesn't fit in the array the string
>+     is stored in (as in const char a[3] = "123";  */
>+  if (!arr && maxelts < strelts)
>+    return NULL_TREE;
>+

this is c_strlen, how is the caller ever supposed to handle non-zero terminated strings???
especially if you do this above?

>+c_strlen (tree src, int only_value, tree *arr /* = NULL */)
> {
>   STRIP_NOPS (src);
>+
>+  /* Used to detect non-nul-terminated strings in subexpressions
>+     of a conditional expression.  When ARR is null, point it at
>+     one of the elements for simplicity.  */
>+  tree arrs[] = { NULL_TREE, NULL_TREE };
>+  if (!arr)
>+    arr = arrs;

>@@ -11427,7 +11478,9 @@ string_constant (tree arg, tree *ptr_offset)
>   unsigned HOST_WIDE_INT length = TREE_STRING_LENGTH (init);
>   length = string_length (TREE_STRING_POINTER (init), charsize,
> 			  length / charsize);
>-  if (compare_tree_int (array_size, length + 1) < 0)
>+  if (nulterm)
>+    *nulterm = array_elts > length;
>+  else if (array_elts <= length)
>     return NULL_TREE;

I don't understand why you can't use
compare_tree_int (TYPE_SIZE_UNIT (TREE_TYPE (init)), TREE_STRING_LENGTH (init))
instead of this convoluted code above ???

Sorry, this patch does not look like it is ready any time soon.


But actually I am totally puzzled by your priorities.
This is what I see right now:

1) We have missing warnings.
2) We have wrong code bugs.
3) We have apparently a specification error on the C Language standard (*)


Why are you prioritizing 1) over 2) thus blocking my attempts to fix a wrong code
issue,and why do you not tackle 3) in your WG14?



(*) which means that GCC is currently removing code from assertions
as I pointed out here: https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01695.html

This happens because GCC follows the language standards literally right now.

I would say too literally, and it proves that the language standard's logic is
flawed IMHO.

Thanks
Bernd.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: PING [PATCH] warn for strlen of arrays with missing nul (PR 86552)
  2018-07-30 21:12 PING [PATCH] warn for strlen of arrays with missing nul (PR 86552) Bernd Edlinger
@ 2018-07-31  3:52 ` Martin Sebor
  2018-08-01 14:21   ` Bernd Edlinger
  0 siblings, 1 reply; 8+ messages in thread
From: Martin Sebor @ 2018-07-31  3:52 UTC (permalink / raw)
  To: Bernd Edlinger, gcc-patches

On 07/30/2018 03:11 PM, Bernd Edlinger wrote:
> Hi,
>
>> @@ -621,6 +674,12 @@ c_strlen (tree src, int only_value)
>> 	maxelts = maxelts / eltsize - 1;
>>       }
>>
>> +  /* Unless the caller is prepared to handle it by passing in a non-null
>> +     ARR, fail if the terminating nul doesn't fit in the array the string
>> +     is stored in (as in const char a[3] = "123";  */
>> +  if (!arr && maxelts < strelts)
>> +    return NULL_TREE;
>> +
>
> this is c_strlen, how is the caller ever supposed to handle non-zero terminated strings???
> especially if you do this above?

Callers that pass in a non-null ARR handle them by issuing
a warning.  The rest get back a null result.  It should be
evident from the rest of the patch.  It can be debated what
each caller should do when it detects such a missing nul
where one is expected.  Different approaches may be more
or less appropriate for different callers/functions (e.g.,
strcpy vs strlen).

>> +c_strlen (tree src, int only_value, tree *arr /* = NULL */)
>> {
>>   STRIP_NOPS (src);
>> +
>> +  /* Used to detect non-nul-terminated strings in subexpressions
>> +     of a conditional expression.  When ARR is null, point it at
>> +     one of the elements for simplicity.  */
>> +  tree arrs[] = { NULL_TREE, NULL_TREE };
>> +  if (!arr)
>> +    arr = arrs;
>
>> @@ -11427,7 +11478,9 @@ string_constant (tree arg, tree *ptr_offset)
>>   unsigned HOST_WIDE_INT length = TREE_STRING_LENGTH (init);
>>   length = string_length (TREE_STRING_POINTER (init), charsize,
>> 			  length / charsize);
>> -  if (compare_tree_int (array_size, length + 1) < 0)
>> +  if (nulterm)
>> +    *nulterm = array_elts > length;
>> +  else if (array_elts <= length)
>>     return NULL_TREE;
>
> I don't understand why you can't use
> compare_tree_int (TYPE_SIZE_UNIT (TREE_TYPE (init)), TREE_STRING_LENGTH (init))
> instead of this convoluted code above ???
>
> Sorry, this patch does not look like it is ready any time soon.

I'm open to technical comments on the substance of my changes
but I'm not interested in your opinion of the readiness of
the patch (whatever that might mean), certainly not if you
have formed it after skimming a random handful of lines out
of a 600 line patch.

> But actually I am totally puzzled by your priorities.
> This is what I see right now:
>
> 1) We have missing warnings.
> 2) We have wrong code bugs.
> 3) We have apparently a specification error on the C Language standard (*)
>
>
> Why are you prioritizing 1) over 2) thus blocking my attempts to fix a wrong code
> issue,and why do you not tackle 3) in your WG14?

My priorities are none of your concern.

Your "attempts to fix" issues interfere with my work on a number
of projects.  You are not being helpful -- instead, by submitting
changes that you know fully well conflict with mine, you are
impeding and undermining my work.  That is why I object to them.

> (*) which means that GCC is currently removing code from assertions
> as I pointed out here: https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01695.html
>
> This happens because GCC follows the language standards literally right now.
>
> I would say too literally, and it proves that the language standard's logic is
> flawed IMHO.

I have no idea what your point is about standards, but bugs
like the one in the example, including those arising from
uninitialized arrays, could be detected with only minor
enhancements to the tree-ssa-strlen pass.  Implementing some
of this is among the projects I'm expected and expecting to
work on for GCC 9.  This patch is a small step in that
direction.

If you care about detecting bugs I would expect you to be
supportive rather than dismissive of this work, and helpful
in bringing it to fruition rather that putting it down or
questioning my priorities.  Especially since the work was
prompted by your own (valid) complaint that GCC doesn't
diagnose them.

Martin

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: PING [PATCH] warn for strlen of arrays with missing nul (PR 86552)
  2018-07-31  3:52 ` Martin Sebor
@ 2018-08-01 14:21   ` Bernd Edlinger
  2018-08-01 16:34     ` Martin Sebor
  0 siblings, 1 reply; 8+ messages in thread
From: Bernd Edlinger @ 2018-08-01 14:21 UTC (permalink / raw)
  To: Martin Sebor, gcc-patches

On 07/31/18 05:51, Martin Sebor wrote:
> On 07/30/2018 03:11 PM, Bernd Edlinger wrote:
>> Hi,
>>
>>> @@ -621,6 +674,12 @@ c_strlen (tree src, int only_value)
>>>     maxelts = maxelts / eltsize - 1;
>>>       }
>>>
>>> +  /* Unless the caller is prepared to handle it by passing in a non-null
>>> +     ARR, fail if the terminating nul doesn't fit in the array the string
>>> +     is stored in (as in const char a[3] = "123";  */
>>> +  if (!arr && maxelts < strelts)
>>> +    return NULL_TREE;
>>> +
>>
>> this is c_strlen, how is the caller ever supposed to handle non-zero terminated strings???
>> especially if you do this above?
> 
> Callers that pass in a non-null ARR handle them by issuing
> a warning.  The rest get back a null result.  It should be
> evident from the rest of the patch.  It can be debated what
> each caller should do when it detects such a missing nul
> where one is expected.  Different approaches may be more
> or less appropriate for different callers/functions (e.g.,
> strcpy vs strlen).
> 

Sorry, right in the beginning you have "if (!add) arr = arrs;"

>>> +c_strlen (tree src, int only_value, tree *arr /* = NULL */)
>>> {
>>>   STRIP_NOPS (src);
>>> +
>>> +  /* Used to detect non-nul-terminated strings in subexpressions
>>> +     of a conditional expression.  When ARR is null, point it at
>>> +     one of the elements for simplicity.  */
>>> +  tree arrs[] = { NULL_TREE, NULL_TREE };
>>> +  if (!arr)
>>> +    arr = arrs;
>>
>>> @@ -11427,7 +11478,9 @@ string_constant (tree arg, tree *ptr_offset)
>>>   unsigned HOST_WIDE_INT length = TREE_STRING_LENGTH (init);
>>>   length = string_length (TREE_STRING_POINTER (init), charsize,
>>>               length / charsize);
>>> -  if (compare_tree_int (array_size, length + 1) < 0)
>>> +  if (nulterm)
>>> +    *nulterm = array_elts > length;
>>> +  else if (array_elts <= length)
>>>     return NULL_TREE;
>>
>> I don't understand why you can't use
>> compare_tree_int (TYPE_SIZE_UNIT (TREE_TYPE (init)), TREE_STRING_LENGTH (init))
>> instead of this convoluted code above ???
>>
>> Sorry, this patch does not look like it is ready any time soon.
> 
> I'm open to technical comments on the substance of my changes
> but I'm not interested in your opinion of the readiness of
> the patch (whatever that might mean), certainly not if you
> have formed it after skimming a random handful of lines out
> of a 600 line patch.
> 

Sorry, again.  I just meant you should fix the issues, and
maybe make the patch a bit smaller.

>> But actually I am totally puzzled by your priorities.
>> This is what I see right now:
>>
>> 1) We have missing warnings.
>> 2) We have wrong code bugs.
>> 3) We have apparently a specification error on the C Language standard (*)
>>
>>
>> Why are you prioritizing 1) over 2) thus blocking my attempts to fix a wrong code
>> issue,and why do you not tackle 3) in your WG14?
> 
> My priorities are none of your concern.
> 

Sorry, again, but your priorities seem to conflict with mine.

> Your "attempts to fix" issues interfere with my work on a number
> of projects.  You are not being helpful -- instead, by submitting
> changes that you know fully well conflict with mine, you are
> impeding and undermining my work.  That is why I object to them.
> 
>> (*) which means that GCC is currently removing code from assertions
>> as I pointed out here: https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01695.html
>>
>> This happens because GCC follows the language standards literally right now.
>>
>> I would say too literally, and it proves that the language standard's logic is
>> flawed IMHO.
> 
> I have no idea what your point is about standards, but bugs
> like the one in the example, including those arising from
> uninitialized arrays, could be detected with only minor
> enhancements to the tree-ssa-strlen pass.  Implementing some
> of this is among the projects I'm expected and expecting to
> work on for GCC 9.  This patch is a small step in that
> direction.
> 
> If you care about detecting bugs I would expect you to be
> supportive rather than dismissive of this work, and helpful
> in bringing it to fruition rather that putting it down or
> questioning my priorities.  Especially since the work was
> prompted by your own (valid) complaint that GCC doesn't
> diagnose them.
> 

You don't really listen to what I am saying, I did not say
that we need another warning instead of fixing the wrong
optimization issue at hand.

But I am in good company, you don't listen to Jakub and Richi
either.


Bernd.

> Martin
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: PING [PATCH] warn for strlen of arrays with missing nul (PR 86552)
  2018-08-01 14:21   ` Bernd Edlinger
@ 2018-08-01 16:34     ` Martin Sebor
  2018-08-01 17:16       ` Bernd Edlinger
  2018-08-01 20:33       ` Martin Sebor
  0 siblings, 2 replies; 8+ messages in thread
From: Martin Sebor @ 2018-08-01 16:34 UTC (permalink / raw)
  To: Bernd Edlinger, gcc-patches

>> If you care about detecting bugs I would expect you to be
>> supportive rather than dismissive of this work, and helpful
>> in bringing it to fruition rather that putting it down or
>> questioning my priorities.  Especially since the work was
>> prompted by your own (valid) complaint that GCC doesn't
>> diagnose them.
>>
>
> You don't really listen to what I am saying, I did not say
> that we need another warning instead of fixing the wrong
> optimization issue at hand.
>
> But I am in good company, you don't listen to Jakub and Richi
> either.

I certainly intend to fix bugs I'm responsible for introducing.
I always do if given the chance.  I assume you are referring
to bug 86711 (and 86714).  Fixing the underlying problem has
been on my mind since you first mentioned it, and on my to-do
list since last week (bug 86688).  You have now submitted
a patch for both of the former, plus a follow-on patch, but
you didn't assign either of the bugs to yourself, or indicated
if the patch fixes 86688, or if you intend to work on it too.
I haven't reviewed the patches in any detail except to note
that they touch the same area as mine and likely conflict.
I'm not sure what I should do now.  Work on fixing these bugs
myself?  (I would prefer to.)  Try to rebase my work on top
of yours to see what the conflicts are and try to resolve
them them in my ongoing work?  Or just keep working on my
stuff and deal with the conflicts after your patches have
been committed?  Or continue to debate conflicting priorities
and try to resolve them first?

(Those are mostly rhetorical questions.)  The point is that
if you would just let me fix my bugs we would not have this
conundrum.  Your test cases are helpful.  But as I have said
over and over, submitting patches for the same code at the same
time and even undoing some prior work with no coordination is
a recipe for confusion and conflict.  I don't recall this
happening in the past and I don't really understand what
triggered it in this case.  This isn't an area that normally
sees a lot of activity.

Martin

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: PING [PATCH] warn for strlen of arrays with missing nul (PR 86552)
  2018-08-01 16:34     ` Martin Sebor
@ 2018-08-01 17:16       ` Bernd Edlinger
  2018-08-01 20:33       ` Martin Sebor
  1 sibling, 0 replies; 8+ messages in thread
From: Bernd Edlinger @ 2018-08-01 17:16 UTC (permalink / raw)
  To: Martin Sebor, gcc-patches

On 08/01/18 18:34, Martin Sebor wrote:
>>> If you care about detecting bugs I would expect you to be
>>> supportive rather than dismissive of this work, and helpful
>>> in bringing it to fruition rather that putting it down or
>>> questioning my priorities.  Especially since the work was
>>> prompted by your own (valid) complaint that GCC doesn't
>>> diagnose them.
>>>
>>
>> You don't really listen to what I am saying, I did not say
>> that we need another warning instead of fixing the wrong
>> optimization issue at hand.
>>
>> But I am in good company, you don't listen to Jakub and Richi
>> either.
> 
> I certainly intend to fix bugs I'm responsible for introducing.
> I always do if given the chance.  I assume you are referring
> to bug 86711 (and 86714).  Fixing the underlying problem has
> been on my mind since you first mentioned it, and on my to-do
> list since last week (bug 86688).  You have now submitted
> a patch for both of the former, plus a follow-on patch, but
> you didn't assign either of the bugs to yourself, or indicated
> if the patch fixes 86688, or if you intend to work on it too.
> I haven't reviewed the patches in any detail except to note
> that they touch the same area as mine and likely conflict.
> I'm not sure what I should do now.  Work on fixing these bugs
> myself?  (I would prefer to.)  Try to rebase my work on top
> of yours to see what the conflicts are and try to resolve
> them them in my ongoing work?  Or just keep working on my
> stuff and deal with the conflicts after your patches have
> been committed?  Or continue to debate conflicting priorities
> and try to resolve them first?
> 
> (Those are mostly rhetorical questions.)  The point is that
> if you would just let me fix my bugs we would not have this
> conundrum.  Your test cases are helpful.  But as I have said
> over and over, submitting patches for the same code at the same
> time and even undoing some prior work with no coordination is
> a recipe for confusion and conflict.  I don't recall this
> happening in the past and I don't really understand what
> triggered it in this case.  This isn't an area that normally
> sees a lot of activity.
> 

Martin,

I am totally sorry for this confusion.  I would please
ask you to do your work a bit slower, and that we please
can talk over the direction in which we want to go on.
For instance in the moment not so many new warnings, when
we actually should look at correctness and reliability issues.
I do definitely not want to revert your work, but I will have
to hedge it where it goes too far, but that does not mean that
it will be worthless.

What made my alarm bells ring is the speed in which new buggy
features, are being implemented recently, while at the same time
several global reviewers raised concerns, which would not be
honored.  That is not a good thing.

To me it is an serious problem when those global reviewers
do not seem to agree on the way these features are implemented.

To be honest, I do not believe in democracy, or majority decisions.
But I always slow down when there is no consensus, and look for a
solution that is acceptable for all the key players.


Bernd.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: PING [PATCH] warn for strlen of arrays with missing nul (PR 86552)
  2018-08-01 16:34     ` Martin Sebor
  2018-08-01 17:16       ` Bernd Edlinger
@ 2018-08-01 20:33       ` Martin Sebor
  1 sibling, 0 replies; 8+ messages in thread
From: Martin Sebor @ 2018-08-01 20:33 UTC (permalink / raw)
  To: Bernd Edlinger, gcc-patches

On 08/01/2018 10:34 AM, Martin Sebor wrote:
>>> If you care about detecting bugs I would expect you to be
>>> supportive rather than dismissive of this work, and helpful
>>> in bringing it to fruition rather that putting it down or
>>> questioning my priorities.  Especially since the work was
>>> prompted by your own (valid) complaint that GCC doesn't
>>> diagnose them.
>>>
>>
>> You don't really listen to what I am saying, I did not say
>> that we need another warning instead of fixing the wrong
>> optimization issue at hand.
>>
>> But I am in good company, you don't listen to Jakub and Richi
>> either.
>
> I certainly intend to fix bugs I'm responsible for introducing.
> I always do if given the chance.  I assume you are referring
> to bug 86711 (and 86714).  Fixing the underlying problem has
> been on my mind since you first mentioned it, and on my to-do
> list since last week (bug 86688).

I've started looking into fixing 86711 but as it turns out,
by avoiding folding non-nul-terminated strings, this patch
already fixes it as well as producing the output you expect
for the test case in 86714, and also fixes 86688.

So unless you intend to pursue your patch I will assign all
these bugs to myself, add the test cases to this patch, and
resubmit it.  (I would normally prefer to deal with each bug
independently, but since I already have a working patch that
does the right thing I'd just as soon save the time and effort
and not try to break it up).

Martin

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: PING [PATCH] warn for strlen of arrays with missing nul (PR 86552)
  2018-07-25 23:38 ` PING " Martin Sebor
@ 2018-07-30 19:18   ` Martin Sebor
  0 siblings, 0 replies; 8+ messages in thread
From: Martin Sebor @ 2018-07-30 19:18 UTC (permalink / raw)
  To: Gcc Patch List

[-- Attachment #1: Type: text/plain, Size: 1598 bytes --]

Attached is an updated version of the patch that handles more
instances of calling strlen() on a constant array that is not
a nul-terminated string.

No other functions except strlen are explicitly handled yet,
and neither are constant arrays with braced-initializer lists
like const char a[] = { 'a', 'b', 'c' };  I am testing
an independent solution for those (bug 86552).  Once those
are handled the warning will be able to detect those as well.

Tested on x86_64-linux.

On 07/25/2018 05:38 PM, Martin Sebor wrote:
> Ping: https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01124.html
>
> The fix for bug 86532 has been checked in so this enhancement
> can now be applied on top of it (with only minor adjustments).
>
> On 07/19/2018 02:08 PM, Martin Sebor wrote:
>> In the discussion of my patch for pr86532 Bernd noted that
>> GCC silently accepts constant character arrays with no
>> terminating nul as arguments to strlen (and other string
>> functions).
>>
>> The attached patch is a first step in detecting these kinds
>> of bugs in strlen calls by issuing -Wstringop-overflow.
>> The next step is to modify all other handlers of built-in
>> functions to detect the same problem (not part of this patch).
>> Yet another step is to detect these problems in arguments
>> initialized using the non-string form:
>>
>>   const char a[] = { 'a', 'b', 'c' };
>>
>> This patch is meant to apply on top of the one for bug 86532
>> (I tested it with an earlier version of that patch so there
>> is code in the context that does not appear in the latest
>> version of the other diff).
>>
>> Martin
>>
>


[-- Attachment #2: gcc-86552.diff --]
[-- Type: text/x-patch, Size: 33470 bytes --]

PR tree-optimization/86552 - missing warning for reading past the end of non-string arrays

gcc/ChangeLog:

	PR tree-optimization/86552
	* builtins.h (warn_string_no_nul): Declare..
	(c_strlen): Add argument.
	* builtins.c (warn_string_no_nul): New function.
	(fold_builtin_strlen): Add argument.  Detect missing nul.
	(fold_builtin_1): Adjust.
	(string_length): Add argument and use it.
	(c_strlen): Same.
	(expand_builtin_strlen): Detect missing nul.
	* expr.c (string_constant): Add arguments.  Detect missing nul
	terminator and outermost declaration it's missing in.
	* expr.h (string_constant): Add argument.
	* fold-const.c (c_getstr): Change argument to bool*, rename
	other arguments.
	* fold-const-call.c (fold_const_call): Detect missing nul.
	* gimple-fold.c (get_range_strlen): Add argument.
	(get_maxval_strlen): Adjust.
	* gimple-fold.h (get_range_strlen): Add argument.

gcc/testsuite/ChangeLog:

	PR tree-optimization/86552
	* gcc.dg/warn-string-no-nul.c: New test.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index aa3e0d8..f4924d5 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -150,7 +150,7 @@ static tree stabilize_va_list_loc (location_t, tree, int);
 static rtx expand_builtin_expect (tree, rtx);
 static tree fold_builtin_constant_p (tree);
 static tree fold_builtin_classify_type (tree);
-static tree fold_builtin_strlen (location_t, tree, tree);
+static tree fold_builtin_strlen (location_t, tree, tree, tree);
 static tree fold_builtin_inf (location_t, tree, int);
 static tree rewrite_call_expr (location_t, tree, int, tree, int, ...);
 static bool validate_arg (const_tree, enum tree_code code);
@@ -550,6 +550,36 @@ string_length (const void *ptr, unsigned eltsize, unsigned maxelts)
   return n;
 }
 
+/* For a call expression EXP to a function that expects a string argument,
+   issue a diagnostic due to it being a called with an argument NONSTR
+   that is a character array with no terminating NUL.  */
+
+void
+warn_string_no_nul (location_t loc, tree exp, tree fndecl, tree nonstr)
+{
+  loc = expansion_point_location_if_in_system_header (loc);
+
+  bool warned;
+  if (exp)
+    {
+      if (!fndecl)
+	fndecl = get_callee_fndecl (exp);
+      warned = warning_at (loc, OPT_Wstringop_overflow_,
+			   "%K%qD argument missing terminating nul",
+			   exp, fndecl);
+    }
+  else
+    {
+      gcc_assert (fndecl);
+      warned = warning_at (loc, OPT_Wstringop_overflow_,
+			   "%qD argument missing terminating nul",
+			   fndecl);
+    }
+
+  if (warned && DECL_P (nonstr))
+    inform (DECL_SOURCE_LOCATION (nonstr), "referenced argument declared here");
+}
+
 /* Compute the length of a null-terminated character string or wide
    character string handling character sizes of 1, 2, and 4 bytes.
    TREE_STRING_LENGTH is not the right way because it evaluates to
@@ -567,37 +597,60 @@ string_length (const void *ptr, unsigned eltsize, unsigned maxelts)
    accesses.  Note that this implies the result is not going to be emitted
    into the instruction stream.
 
+   When ARR is non-null and the string is not properly nul-terminated,
+   set *ARR to the declaration of the outermost constant object whose
+   initializer (or one of its elements) is not nul-terminated.
+
    The value returned is of type `ssizetype'.
 
    Unfortunately, string_constant can't access the values of const char
    arrays with initializers, so neither can we do so here.  */
 
 tree
-c_strlen (tree src, int only_value)
+c_strlen (tree src, int only_value, tree *arr /* = NULL */)
 {
   STRIP_NOPS (src);
+
+  /* Used to detect non-nul-terminated strings in subexpressions
+     of a conditional expression.  When ARR is null, point it at
+     one of the elements for simplicity.  */
+  tree arrs[] = { NULL_TREE, NULL_TREE };
+  if (!arr)
+    arr = arrs;
+
   if (TREE_CODE (src) == COND_EXPR
       && (only_value || !TREE_SIDE_EFFECTS (TREE_OPERAND (src, 0))))
     {
-      tree len1, len2;
-
-      len1 = c_strlen (TREE_OPERAND (src, 1), only_value);
-      len2 = c_strlen (TREE_OPERAND (src, 2), only_value);
+      tree len1 = c_strlen (TREE_OPERAND (src, 1), only_value, arrs);
+      tree len2 = c_strlen (TREE_OPERAND (src, 2), only_value, arrs + 1);
       if (tree_int_cst_equal (len1, len2))
-	return len1;
+	{
+	  *arr = arrs[0] ? arrs[0] : arrs[1];
+	  return len1;
+	}
     }
 
   if (TREE_CODE (src) == COMPOUND_EXPR
       && (only_value || !TREE_SIDE_EFFECTS (TREE_OPERAND (src, 0))))
-    return c_strlen (TREE_OPERAND (src, 1), only_value);
+    return c_strlen (TREE_OPERAND (src, 1), only_value, arr);
 
   location_t loc = EXPR_LOC_OR_LOC (src, input_location);
 
   /* Offset from the beginning of the string in bytes.  */
   tree byteoff;
-  src = string_constant (src, &byteoff);
-  if (src == 0)
-    return NULL_TREE;
+  /* Set if array is nul-terminated, false otherwise.  */
+  bool nulterm;
+  src = string_constant (src, &byteoff, &nulterm, arr);
+  if (!src)
+    {
+      *arr = arrs[0] ? arrs[0] : arrs[1];
+      return NULL_TREE;
+    }
+
+  /* Clear *ARR when the string is nul-terminated.  It should be
+     of no interest to callers.  */
+  if (nulterm)
+    *arr = NULL_TREE;
 
   /* Determine the size of the string element.  */
   unsigned eltsize
@@ -621,6 +674,12 @@ c_strlen (tree src, int only_value)
 	maxelts = maxelts / eltsize - 1;
       }
 
+  /* Unless the caller is prepared to handle it by passing in a non-null
+     ARR, fail if the terminating nul doesn't fit in the array the string
+     is stored in (as in const char a[3] = "123";  */
+  if (!arr && maxelts < strelts)
+    return NULL_TREE;
+
   /* PTR can point to the byte representation of any string type, including
      char* and wchar_t*.  */
   const char *ptr = TREE_STRING_POINTER (src);
@@ -650,7 +709,8 @@ c_strlen (tree src, int only_value)
       offsave = fold_convert (ssizetype, offsave);
       tree condexp = fold_build2_loc (loc, LE_EXPR, boolean_type_node, offsave,
 				      build_int_cst (ssizetype, len * eltsize));
-      tree lenexp = size_diffop_loc (loc, ssize_int (strelts * eltsize), offsave);
+      tree lenexp = size_diffop_loc (loc, ssize_int (strelts * eltsize),
+				     offsave);
       return fold_build3_loc (loc, COND_EXPR, ssizetype, condexp, lenexp,
 			      build_zero_cst (ssizetype));
     }
@@ -690,7 +750,7 @@ c_strlen (tree src, int only_value)
      Since ELTOFF is our starting index into the string, no further
      calculation is needed.  */
   unsigned len = string_length (ptr + eltoff * eltsize, eltsize,
-				maxelts - eltoff);
+				strelts - eltoff);
 
   return ssize_int (len);
 }
@@ -2855,7 +2915,6 @@ expand_builtin_strlen (tree exp, rtx target,
 
   struct expand_operand ops[4];
   rtx pat;
-  tree len;
   tree src = CALL_EXPR_ARG (exp, 0);
   rtx src_reg;
   rtx_insn *before_strlen;
@@ -2864,20 +2923,39 @@ expand_builtin_strlen (tree exp, rtx target,
   unsigned int align;
 
   /* If the length can be computed at compile-time, return it.  */
-  len = c_strlen (src, 0);
+  tree array;
+  tree len = c_strlen (src, 0, &array);
   if (len)
-    return expand_expr (len, target, target_mode, EXPAND_NORMAL);
+    {
+      if (array)
+	{
+	  /* Array refers to the non-nul terminated constant array
+	     whose length is attempted to be computed.  */
+	  warn_string_no_nul (EXPR_LOCATION (exp), exp, NULL_TREE, array);
+	  return NULL_RTX;
+	}
+      return expand_expr (len, target, target_mode, EXPAND_NORMAL);
+    }
 
   /* If the length can be computed at compile-time and is constant
      integer, but there are side-effects in src, evaluate
      src for side-effects, then return len.
      E.g. x = strlen (i++ ? "xfoo" + 1 : "bar");
      can be optimized into: i++; x = 3;  */
-  len = c_strlen (src, 1);
-  if (len && TREE_CODE (len) == INTEGER_CST)
+  len = c_strlen (src, 1, &array);
+  if (len)
     {
-      expand_expr (src, const0_rtx, VOIDmode, EXPAND_NORMAL);
-      return expand_expr (len, target, target_mode, EXPAND_NORMAL);
+      if (array)
+	{
+	  warn_string_no_nul (EXPR_LOCATION (exp), exp, NULL_TREE, array);
+	  return NULL_RTX;
+	}
+
+      if (TREE_CODE (len) == INTEGER_CST)
+	{
+	  expand_expr (src, const0_rtx, VOIDmode, EXPAND_NORMAL);
+	  return expand_expr (len, target, target_mode, EXPAND_NORMAL);
+	}
     }
 
   align = get_pointer_alignment (src) / BITS_PER_UNIT;
@@ -8255,19 +8333,30 @@ fold_builtin_classify_type (tree arg)
   return build_int_cst (integer_type_node, type_to_class (TREE_TYPE (arg)));
 }
 
-/* Fold a call to __builtin_strlen with argument ARG.  */
+/* Fold a strlen call to FNDECL of TYPE, and with argument ARG.  */
 
 static tree
-fold_builtin_strlen (location_t loc, tree type, tree arg)
+fold_builtin_strlen (location_t loc, tree fndecl, tree type, tree arg)
 {
   if (!validate_arg (arg, POINTER_TYPE))
     return NULL_TREE;
   else
     {
-      tree len = c_strlen (arg, 0);
-
+      tree arr = NULL_TREE;
+      tree len = c_strlen (arg, 0, &arr);
       if (len)
-	return fold_convert_loc (loc, type, len);
+	{
+	  if (loc == UNKNOWN_LOCATION && EXPR_HAS_LOCATION (arg))
+	    loc = EXPR_LOCATION (arg);
+
+	  /* To avoid warning multiple times about non-nul-terminated
+	     strings only warn if their length has been determined
+	     and it's being folded.  */
+	  if (arr)
+	    warn_string_no_nul (loc, NULL_TREE, fndecl, arr);
+
+	  return fold_convert_loc (loc, type, len);
+	}
 
       return NULL_TREE;
     }
@@ -9175,7 +9264,7 @@ fold_builtin_1 (location_t loc, tree fndecl, tree arg0)
       return fold_builtin_classify_type (arg0);
 
     case BUILT_IN_STRLEN:
-      return fold_builtin_strlen (loc, type, arg0);
+      return fold_builtin_strlen (loc, fndecl, type, arg0);
 
     CASE_FLT_FN (BUILT_IN_FABS):
     CASE_FLT_FN_FLOATN_NX (BUILT_IN_FABS):
diff --git a/gcc/builtins.h b/gcc/builtins.h
index 2e0a2f9..73b0b0b 100644
--- a/gcc/builtins.h
+++ b/gcc/builtins.h
@@ -58,7 +58,7 @@ extern bool get_pointer_alignment_1 (tree, unsigned int *,
 				     unsigned HOST_WIDE_INT *);
 extern unsigned int get_pointer_alignment (tree);
 extern unsigned string_length (const void*, unsigned, unsigned);
-extern tree c_strlen (tree, int);
+extern tree c_strlen (tree, int, tree * = NULL);
 extern void expand_builtin_setjmp_setup (rtx, rtx);
 extern void expand_builtin_setjmp_receiver (rtx);
 extern void expand_builtin_update_setjmp_buf (rtx);
@@ -103,7 +103,7 @@ extern bool target_char_cst_p (tree t, char *p);
 
 extern internal_fn associated_internal_fn (tree);
 extern internal_fn replacement_internal_fn (gcall *);
-
+extern void warn_string_no_nul (location_t, tree, tree, tree);
 extern tree max_object_size ();
 
 #endif /* GCC_BUILTINS_H */
diff --git a/gcc/expr.c b/gcc/expr.c
index de6709d..edbd7f8 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -11271,10 +11271,14 @@ is_aligning_offset (const_tree offset, const_tree exp)
 /* Return the tree node if an ARG corresponds to a string constant or zero
    if it doesn't.  If we return nonzero, set *PTR_OFFSET to the (possibly
    non-constant) offset in bytes within the string that ARG is accessing.
+   If NULTERM is non-null, consider valid even sequences of characters that
+   aren't nul-terminated strings.  In that case, set NULTERM if ARG refers
+   to such a sequence and clear it otherwise.
    The type of the offset is sizetype.  */
 
 tree
-string_constant (tree arg, tree *ptr_offset)
+string_constant (tree arg, tree *ptr_offset, bool *nulterm /* = NULL */,
+		 tree *decl /* = NULL */)
 {
   tree array;
   STRIP_NOPS (arg);
@@ -11328,7 +11332,7 @@ string_constant (tree arg, tree *ptr_offset)
 	return NULL_TREE;
 
       tree offset;
-      if (tree str = string_constant (arg0, &offset))
+      if (tree str = string_constant (arg0, &offset, nulterm, decl))
 	{
 	  /* Avoid pointers to arrays (see bug 86622).  */
 	  if (POINTER_TYPE_P (TREE_TYPE (arg))
@@ -11368,6 +11372,10 @@ string_constant (tree arg, tree *ptr_offset)
   if (TREE_CODE (array) == STRING_CST)
     {
       *ptr_offset = fold_convert (sizetype, offset);
+      if (nulterm)
+	*nulterm = true;
+      if (decl)
+	*decl = NULL_TREE;
       return array;
     }
 
@@ -11414,6 +11422,49 @@ string_constant (tree arg, tree *ptr_offset)
   if (!array_size || TREE_CODE (array_size) != INTEGER_CST)
     return NULL_TREE;
 
+  unsigned HOST_WIDE_INT array_elts = tree_to_uhwi (array_size);
+
+  /* When ARG refers to an aggregate (of arrays) try to determine
+     the size of the character array within the aggregate.  */
+  tree ref = arg;
+  tree reftype = TREE_TYPE (arg);
+
+  if (TREE_CODE (ref) == MEM_REF)
+    {
+      ref = TREE_OPERAND (ref, 0);
+      if (TREE_CODE (ref) == ADDR_EXPR)
+	{
+	  ref = TREE_OPERAND (ref, 0);
+	  reftype = TREE_TYPE (ref);
+	}
+    }
+  else
+    while (TREE_CODE (ref) == ARRAY_REF)
+      {
+	reftype = TREE_TYPE (ref);
+	ref = TREE_OPERAND (ref, 0);
+      }
+
+  if (TREE_CODE (ref) == COMPONENT_REF)
+    reftype = TREE_TYPE (ref);
+
+  while (TREE_CODE (reftype) == ARRAY_TYPE)
+    {
+      tree next = TREE_TYPE (reftype);
+      if (TREE_CODE (next) == INTEGER_TYPE)
+	{
+	  if (tree size = TYPE_SIZE_UNIT (reftype))
+	    if (tree_fits_uhwi_p (size))
+	      array_elts = tree_to_uhwi (size);
+	  break;
+	}
+
+      reftype = TREE_TYPE (reftype);
+    }
+
+  if (decl)
+    *decl = array;
+
   /* Avoid returning a string that doesn't fit in the array
      it is stored in, like
      const char a[4] = "abcde";
@@ -11427,7 +11478,9 @@ string_constant (tree arg, tree *ptr_offset)
   unsigned HOST_WIDE_INT length = TREE_STRING_LENGTH (init);
   length = string_length (TREE_STRING_POINTER (init), charsize,
 			  length / charsize);
-  if (compare_tree_int (array_size, length + 1) < 0)
+  if (nulterm)
+    *nulterm = array_elts > length;
+  else if (array_elts <= length)
     return NULL_TREE;
 
   *ptr_offset = offset;
diff --git a/gcc/expr.h b/gcc/expr.h
index cf047d4..e630979 100644
--- a/gcc/expr.h
+++ b/gcc/expr.h
@@ -288,7 +288,7 @@ expand_normal (tree exp)
 
 /* Return the tree node and offset if a given argument corresponds to
    a string constant.  */
-extern tree string_constant (tree, tree *);
+extern tree string_constant (tree, tree *, bool * = NULL, tree * = NULL);
 
 /* Two different ways of generating switch statements.  */
 extern int try_casesi (tree, tree, tree, tree, rtx, rtx, rtx, profile_probability);
diff --git a/gcc/fold-const-call.c b/gcc/fold-const-call.c
index 06a42060..849a443 100644
--- a/gcc/fold-const-call.c
+++ b/gcc/fold-const-call.c
@@ -1199,9 +1199,14 @@ fold_const_call (combined_fn fn, tree type, tree arg)
   switch (fn)
     {
     case CFN_BUILT_IN_STRLEN:
-      if (const char *str = c_getstr (arg))
-	return build_int_cst (type, strlen (str));
-      return NULL_TREE;
+      {
+	bool nulterm;
+	if (const char *str = c_getstr (arg, NULL, &nulterm))
+	  if (nulterm)
+	    return build_int_cst (type, strlen (str));
+
+	return NULL_TREE;
+      }
 
     CASE_CFN_NAN:
     CASE_FLT_FN_FLOATN_NX (CFN_BUILT_IN_NAN):
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index b318fc77..ecbc38c 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -14577,23 +14577,23 @@ fold_build_pointer_plus_hwi_loc (location_t loc, tree ptr, HOST_WIDE_INT off)
 /* Return a pointer P to a NUL-terminated string representing the sequence
    of constant characters referred to by SRC (or a subsequence of such
    characters within it if SRC is a reference to a string plus some
-   constant offset).  If STRLEN is non-null, store stgrlen(P) in *STRLEN.
-   If STRSIZE is non-null, store in *STRSIZE the size of the array
-   the string is stored in; in that case, even though P points to a NUL
-   terminated string, SRC need not refer to one.  This can happen when
-   SRC refers to a constant character array initialized to all non-NUL
-   values, as in the C declaration: char a[4] = "1234";  */
+   constant offset).  If STRSIZE is non-null, store the size of the string
+   literal in *STRSIZE, including any embedded or terminating nuls.  If
+   If NULLTERM is non-null, set *NULLTERM if the referenced string is
+   guaranteed to contain a terminating NUL.  Otherwise clear it.  This
+   can happen in the case of valid C declarations such as:
+   const char a[3] = "123";  */
 
 const char *
-c_getstr (tree src, unsigned HOST_WIDE_INT *strlen /* = NULL */,
-	  unsigned HOST_WIDE_INT *strsize /* = NULL */)
+c_getstr (tree src, unsigned HOST_WIDE_INT *strsize /* = NULL */,
+	  bool *nulterm /* = NULL */)
 {
   tree offset_node;
 
-  if (strlen)
-    *strlen = 0;
+  if (strsize)
+    *strsize = 0;
 
-  src = string_constant (src, &offset_node);
+  src = string_constant (src, &offset_node, nulterm);
   if (src == 0)
     return NULL;
 
@@ -14606,47 +14606,42 @@ c_getstr (tree src, unsigned HOST_WIDE_INT *strlen /* = NULL */,
 	offset = tree_to_uhwi (offset_node);
     }
 
-  /* STRING_LENGTH is the size of the string literal, including any
-     embedded NULs.  STRING_SIZE is the size of the array the string
+  /* STRING_SIZE is the size of the string literal, including any
+     embedded NULs.  ARRAY_SIZE is the size of the array the string
      literal is stored in.  */
-  unsigned HOST_WIDE_INT string_length = TREE_STRING_LENGTH (src);
-  unsigned HOST_WIDE_INT string_size = string_length;
+  unsigned HOST_WIDE_INT string_size = TREE_STRING_LENGTH (src);
+  unsigned HOST_WIDE_INT array_size = string_size;
   tree type = TREE_TYPE (src);
   if (tree size = TYPE_SIZE_UNIT (type))
     if (tree_fits_shwi_p (size))
-      string_size = tree_to_uhwi (size);
+      array_size = tree_to_uhwi (size);
+
+  const char *string = TREE_STRING_POINTER (src);
 
-  if (strlen)
+  if (strsize)
     {
-      /* Compute and store the length of the substring at OFFSET.
+      /* Compute and store the size of the substring at OFFSET.
 	 All offsets past the initial length refer to null strings.  */
-      if (offset <= string_length)
-	*strlen = string_length - offset;
+      if (offset <= string_size)
+	*strsize = string_size - offset;
       else
-	*strlen = 0;
+	*strsize = 0;
     }
 
-  const char *string = TREE_STRING_POINTER (src);
-
-  if (string_length == 0
-      || offset >= string_size)
+  if (string_size == 0
+      || offset >= array_size)
     return NULL;
 
-  if (strsize)
-    {
-      /* Support even constant character arrays that aren't proper
-	 NUL-terminated strings.  */
-      *strsize = string_size;
-    }
-  else if (string[string_length - 1] != '\0')
+  if (!nulterm && string[string_size - 1] != '\0')
     {
-      /* Support only properly NUL-terminated strings but handle
-	 consecutive strings within the same array, such as the six
-	 substrings in "1\0002\0003".  */
+      /* When NULTERM is null, support only properly nul-terminated
+	 strings but handle consecutive strings within the same array,
+	 such as the six substrings in "1\0002\0003".  Otherwise, let
+	 the caller deal with non-nul-terminated arrays.  */
       return NULL;
     }
 
-  return offset <= string_length ? string + offset : "";
+  return offset <= string_size ? string + offset : "";
 }
 
 /* Given a tree T, compute which bits in T may be nonzero.  */
diff --git a/gcc/fold-const.h b/gcc/fold-const.h
index 1b9ccc0..a58a4a2 100644
--- a/gcc/fold-const.h
+++ b/gcc/fold-const.h
@@ -188,7 +188,7 @@ extern tree const_unop (enum tree_code, tree, tree);
 extern tree const_binop (enum tree_code, tree, tree, tree);
 extern bool negate_mathfn_p (combined_fn);
 extern const char *c_getstr (tree, unsigned HOST_WIDE_INT * = NULL,
-			     unsigned HOST_WIDE_INT * = NULL);
+			     bool * = NULL);
 extern wide_int tree_nonzero_bits (const_tree);
 
 /* Return OFF converted to a pointer offset type suitable as offset for
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index c3fa570..9eefb37 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -1275,11 +1275,13 @@ gimple_fold_builtin_memset (gimple_stmt_iterator *gsi, tree c, tree len)
    Set *FLEXP to true if the range of the string lengths has been
    obtained from the upper bound of an array at the end of a struct.
    Such an array may hold a string that's longer than its upper bound
-   due to it being used as a poor-man's flexible array member.  */
+   due to it being used as a poor-man's flexible array member.
+   Clear *NULTERM if ARG refers to a constant array that is known
+   not be nul-terminated.  */
 
 static bool
 get_range_strlen (tree arg, tree length[2], bitmap *visited, int type,
-		  int fuzzy, bool *flexp)
+		  int fuzzy, bool *flexp, bool *nulterm)
 {
   tree var, val = NULL_TREE;
   gimple *def_stmt;
@@ -1301,7 +1303,8 @@ get_range_strlen (tree arg, tree length[2], bitmap *visited, int type,
 	      if (TREE_CODE (aop0) == INDIRECT_REF
 		  && TREE_CODE (TREE_OPERAND (aop0, 0)) == SSA_NAME)
 		return get_range_strlen (TREE_OPERAND (aop0, 0),
-					 length, visited, type, fuzzy, flexp);
+					 length, visited, type, fuzzy, flexp,
+					 nulterm);
 	    }
 	  else if (TREE_CODE (TREE_OPERAND (op, 0)) == COMPONENT_REF && fuzzy)
 	    {
@@ -1329,13 +1332,18 @@ get_range_strlen (tree arg, tree length[2], bitmap *visited, int type,
 	    return false;
 	}
       else
-	val = c_strlen (arg, 1);
+	{
+	  tree arr;
+	  val = c_strlen (arg, 1, &arr);
+	  if (val && arr)
+	    *nulterm = false;
+	}
 
       if (!val && fuzzy)
 	{
 	  if (TREE_CODE (arg) == ADDR_EXPR)
 	    return get_range_strlen (TREE_OPERAND (arg, 0), length,
-				     visited, type, fuzzy, flexp);
+				     visited, type, fuzzy, flexp, nulterm);
 
 	  if (TREE_CODE (arg) == ARRAY_REF)
 	    {
@@ -1477,7 +1485,8 @@ get_range_strlen (tree arg, tree length[2], bitmap *visited, int type,
             || gimple_assign_unary_nop_p (def_stmt))
           {
             tree rhs = gimple_assign_rhs1 (def_stmt);
-	    return get_range_strlen (rhs, length, visited, type, fuzzy, flexp);
+	    return get_range_strlen (rhs, length, visited, type, fuzzy, flexp,
+				     nulterm);
           }
 	else if (gimple_assign_rhs_code (def_stmt) == COND_EXPR)
 	  {
@@ -1486,7 +1495,7 @@ get_range_strlen (tree arg, tree length[2], bitmap *visited, int type,
 
 	    for (unsigned int i = 0; i < 2; i++)
 	      if (!get_range_strlen (ops[i], length, visited, type, fuzzy,
-				     flexp))
+				     flexp, nulterm))
 		{
 		  if (fuzzy == 2)
 		    *maxlen = build_all_ones_cst (size_type_node);
@@ -1513,7 +1522,8 @@ get_range_strlen (tree arg, tree length[2], bitmap *visited, int type,
             if (arg == gimple_phi_result (def_stmt))
               continue;
 
-	    if (!get_range_strlen (arg, length, visited, type, fuzzy, flexp))
+	    if (!get_range_strlen (arg, length, visited, type, fuzzy, flexp,
+				   nulterm))
 	      {
 		if (fuzzy == 2)
 		  *maxlen = build_all_ones_cst (size_type_node);
@@ -1545,19 +1555,27 @@ get_range_strlen (tree arg, tree length[2], bitmap *visited, int type,
    and false if PHIs and COND_EXPRs are to be handled optimistically,
    if we can determine string length minimum and maximum; it will use
    the minimum from the ones where it can be determined.
-   STRICT false should be only used for warning code.  */
+   STRICT false should be only used for warning code.
+   When non-null, clear *NULTERM if ARG refers to a constant array
+   that is known not be nul-terminated.  Otherwise set it to true.  */
 
 bool
-get_range_strlen (tree arg, tree minmaxlen[2], bool strict)
+get_range_strlen (tree arg, tree minmaxlen[2], bool strict /* = false */,
+		  bool *nulterm /* = NULL */)
 {
   bitmap visited = NULL;
 
   minmaxlen[0] = NULL_TREE;
   minmaxlen[1] = NULL_TREE;
 
+  bool nultermbuf;
+  if (!nulterm)
+    nulterm = &nultermbuf;
+  *nulterm = true;
+
   bool flexarray = false;
   if (!get_range_strlen (arg, minmaxlen, &visited, 1, strict ? 1 : 2,
-			 &flexarray))
+			 &flexarray, nulterm))
     {
       minmaxlen[0] = NULL_TREE;
       minmaxlen[1] = NULL_TREE;
@@ -1576,7 +1594,7 @@ get_maxval_strlen (tree arg, int type)
   tree len[2] = { NULL_TREE, NULL_TREE };
 
   bool dummy;
-  if (!get_range_strlen (arg, len, &visited, type, 0, &dummy))
+  if (!get_range_strlen (arg, len, &visited, type, 0, &dummy, NULL))
     len[1] = NULL_TREE;
   if (visited)
     BITMAP_FREE (visited);
@@ -3496,12 +3514,14 @@ static bool
 gimple_fold_builtin_strlen (gimple_stmt_iterator *gsi)
 {
   gimple *stmt = gsi_stmt (*gsi);
+  tree arg = gimple_call_arg (stmt, 0);
 
   wide_int minlen;
   wide_int maxlen;
 
   tree lenrange[2];
-  if (!get_range_strlen (gimple_call_arg (stmt, 0), lenrange, true)
+  bool nulterm;
+  if (!get_range_strlen (arg, lenrange, true, &nulterm)
       && lenrange[0] && TREE_CODE (lenrange[0]) == INTEGER_CST
       && lenrange[1] && TREE_CODE (lenrange[1]) == INTEGER_CST)
     {
@@ -3523,6 +3543,10 @@ gimple_fold_builtin_strlen (gimple_stmt_iterator *gsi)
 
   if (minlen == maxlen)
     {
+      if (!nulterm)
+	warn_string_no_nul (gimple_location (stmt), NULL_TREE,
+			    gimple_call_fndecl (stmt), arg);
+
       lenrange[0] = force_gimple_operand_gsi (gsi, lenrange[0], true, NULL,
 					      true, GSI_SAME_STMT);
       replace_call_with_value (gsi, lenrange[0]);
diff --git a/gcc/gimple-fold.h b/gcc/gimple-fold.h
index 04e9bfa..fe11728 100644
--- a/gcc/gimple-fold.h
+++ b/gcc/gimple-fold.h
@@ -25,7 +25,7 @@ along with GCC; see the file COPYING3.  If not see
 extern tree create_tmp_reg_or_ssa_name (tree, gimple *stmt = NULL);
 extern tree canonicalize_constructor_val (tree, tree);
 extern tree get_symbol_constant_value (tree);
-extern bool get_range_strlen (tree, tree[2], bool = false);
+extern bool get_range_strlen (tree, tree[2], bool = false, bool * = NULL);
 extern tree get_maxval_strlen (tree, int);
 extern void gimplify_and_update_call_from_tree (gimple_stmt_iterator *, tree);
 extern bool fold_stmt (gimple_stmt_iterator *);
diff --git a/gcc/testsuite/gcc.dg/warn-string-no-nul.c b/gcc/testsuite/gcc.dg/warn-string-no-nul.c
new file mode 100644
index 0000000..838528f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/warn-string-no-nul.c
@@ -0,0 +1,239 @@
+/* PR tree-optimization/86552 - missing warning for reading past the end
+   of non-string arrays
+   { dg-do compile }
+   { dg-options "-O2 -Wall -ftrack-macro-expansion=0" } */
+
+extern __SIZE_TYPE__ strlen (const char*);
+
+const char a[5] = "12345";   /* { dg-message "declared here" } */
+
+int i0 = 0;
+int i1 = 1;
+
+void sink (int, ...);
+
+#define CONCAT(a, b)   a ## b
+#define CAT(a, b)      CONCAT(a, b)
+
+#define T(str)					\
+  __attribute__ ((noipa))			\
+  void CAT (test_, __LINE__) (void) {		\
+    sink (strlen (str));			\
+  } typedef void dummy_type
+
+T (a);                /* { dg-warning "argument missing terminating nul" }  */
+T (&a[0]);            /* { dg-warning "nul" }  */
+T (&a[0] + 1);        /* { dg-warning "nul" }  */
+T (&a[1]);            /* { dg-warning "nul" }  */
+T (&a[i0]);           /* { dg-warning "nul" }  */
+T (&a[i0] + 1);       /* { dg-warning "nul" }  */
+
+
+const char b[][5] = { /* { dg-message "declared here" } */
+  "12", "123", "1234", "54321"
+};
+
+T (b[0]);
+T (b[1]);
+T (b[2]);
+T (b[3]);             /* { dg-warning "nul" }  */
+T (b[i0]);
+
+T (&b[2][1]);
+T (&b[2][1] + 1);
+T (&b[2][i0]);
+T (&b[2][1] + i0);
+
+T (&b[3][1]);         /* { dg-warning "nul" }  */
+T (&b[3][1] + 1);     /* { dg-warning "nul" }  */
+T (&b[3][i0]);        /* { dg-warning "nul" }  */
+T (&b[3][1] + i0);    /* { dg-warning "nul" }  */
+T (&b[3][i0] + i1);   /* { dg-warning "nul" }  */
+
+T (i0 ? "" : b[0]);
+T (i0 ? "" : b[1]);
+T (i0 ? "" : b[2]);
+T (i0 ? "" : b[3]);               /* { dg-warning "nul" }  */
+T (i0 ? b[0] : "");
+T (i0 ? b[1] : "");
+T (i0 ? b[2] : "");
+T (i0 ? b[3] : "");               /* { dg-warning "nul" }  */
+
+T (i0 ? "1234" : b[3]);           /* { dg-warning "nul" }  */
+T (i0 ? b[3] : "1234");           /* { dg-warning "nul" }  */
+
+T (i0 ? a : b[3]);                /* { dg-warning "nul" }  */
+T (i0 ? b[0] : b[2]);
+T (i0 ? b[2] : b[3]);             /* { dg-warning "nul" }  */
+T (i0 ? b[3] : b[2]);             /* { dg-warning "nul" }  */
+
+T (i0 ? b[0] : &b[3][0] + 1);     /* { dg-warning "nul" }  */
+T (i0 ? b[1] : &b[3][1] + i0);    /* { dg-warning "nul" }  */
+
+/* It's possible to detect the missing nul in the following two
+   expressions but GCC doesn't do it yet.  */
+T (i0 ? &b[3][1] + i0 : b[2]);    /* { dg-warning "nul" "bug" { xfail *-*-* } }  */
+T (i0 ? &b[3][i0] : &b[3][i1]);   /* { dg-warning "nul" "bug" { xfail *-*-* } }  */
+
+
+struct A { char a[5], b[5]; };
+
+const struct A s = { "1234", "12345" };
+
+T (s.a);
+T (&s.a[0]);
+T (&s.a[0] + 1);
+T (&s.a[0] + i0);
+T (&s.a[1]);
+T (&s.a[1] + 1);
+T (&s.a[1] + i0);
+
+T (s.b);              /* { dg-warning "nul" }  */
+T (&s.b[0]);          /* { dg-warning "nul" }  */
+T (&s.b[0] + 1);      /* { dg-warning "nul" }  */
+T (&s.b[0] + i0);     /* { dg-warning "nul" }  */
+T (&s.b[1]);          /* { dg-warning "nul" }  */
+T (&s.b[1] + 1);      /* { dg-warning "nul" }  */
+T (&s.b[1] + i0);     /* { dg-warning "nul" }  */
+
+struct B { struct A a[2]; };
+
+const struct B ba[] = {
+  { { { "123", "12345" }, { "12345", "123" } } },
+  { { { "12345", "123" }, { "123", "12345" } } },
+  { { { "1", "12" },      { "123", "1234" } } },
+  { { { "123", "1234" },  { "12345", "12" } } }
+};
+
+T (ba[0].a[0].a);
+T (&ba[0].a[0].a[0]);
+T (&ba[0].a[0].a[0] + 1);
+T (&ba[0].a[0].a[0] + i0);
+T (&ba[0].a[0].a[1]);
+T (&ba[0].a[0].a[1] + 1);
+T (&ba[0].a[0].a[1] + i0);
+
+T (ba[0].a[0].b);           /* { dg-warning "nul" }  */
+T (&ba[0].a[0].b[0]);       /* { dg-warning "nul" }  */
+T (&ba[0].a[0].b[0] + 1);   /* { dg-warning "nul" }  */
+T (&ba[0].a[0].b[0] + i0);  /* { dg-warning "nul" }  */
+T (&ba[0].a[0].b[1]);       /* { dg-warning "nul" }  */
+T (&ba[0].a[0].b[1] + 1);   /* { dg-warning "nul" }  */
+T (&ba[0].a[0].b[1] + i0);  /* { dg-warning "nul" }  */
+
+T (ba[0].a[1].a);           /* { dg-warning "nul" }  */
+T (&ba[0].a[1].a[0]);       /* { dg-warning "nul" }  */
+T (&ba[0].a[1].a[0] + 1);   /* { dg-warning "nul" }  */
+T (&ba[0].a[1].a[0] + i0);  /* { dg-warning "nul" }  */
+T (&ba[0].a[1].a[1]);       /* { dg-warning "nul" }  */
+T (&ba[0].a[1].a[1] + 1);   /* { dg-warning "nul" }  */
+T (&ba[0].a[1].a[1] + i0);  /* { dg-warning "nul" }  */
+
+T (ba[0].a[1].b);
+T (&ba[0].a[1].b[0]);
+T (&ba[0].a[1].b[0] + 1);
+T (&ba[0].a[1].b[0] + i0);
+T (&ba[0].a[1].b[1]);
+T (&ba[0].a[1].b[1] + 1);
+T (&ba[0].a[1].b[1] + i0);
+
+
+T (ba[1].a[0].a);           /* { dg-warning "nul" }  */
+T (&ba[1].a[0].a[0]);       /* { dg-warning "nul" }  */
+T (&ba[1].a[0].a[0] + 1);   /* { dg-warning "nul" }  */
+T (&ba[1].a[0].a[0] + i0);  /* { dg-warning "nul" }  */
+T (&ba[1].a[0].a[1]);       /* { dg-warning "nul" }  */
+T (&ba[1].a[0].a[1] + 1);   /* { dg-warning "nul" }  */
+T (&ba[1].a[0].a[1] + i0);  /* { dg-warning "nul" }  */
+
+T (ba[1].a[0].b);
+T (&ba[1].a[0].b[0]);
+T (&ba[1].a[0].b[0] + 1);
+T (&ba[1].a[0].b[0] + i0);
+T (&ba[1].a[0].b[1]);
+T (&ba[1].a[0].b[1] + 1);
+T (&ba[1].a[0].b[1] + i0);
+
+T (ba[1].a[1].a);
+T (&ba[1].a[1].a[0]);
+T (&ba[1].a[1].a[0] + 1);
+T (&ba[1].a[1].a[0] + i0);
+T (&ba[1].a[1].a[1]);
+T (&ba[1].a[1].a[1] + 1);
+T (&ba[1].a[1].a[1] + i0);
+
+T (ba[1].a[1].b);           /* { dg-warning "nul" }  */
+T (&ba[1].a[1].b[0]);       /* { dg-warning "nul" }  */
+T (&ba[1].a[1].b[0] + 1);   /* { dg-warning "nul" }  */
+T (&ba[1].a[1].b[0] + i0);  /* { dg-warning "nul" }  */
+T (&ba[1].a[1].b[1]);       /* { dg-warning "nul" }  */
+T (&ba[1].a[1].b[1] + 1);   /* { dg-warning "nul" }  */
+T (&ba[1].a[1].b[1] + i0);  /* { dg-warning "nul" }  */
+
+
+T (ba[2].a[0].a);
+T (&ba[2].a[0].a[0]);
+T (&ba[2].a[0].a[0] + 1);
+T (&ba[2].a[0].a[0] + i0);
+T (&ba[2].a[0].a[1]);
+T (&ba[2].a[0].a[1] + 1);
+T (&ba[2].a[0].a[1] + i0);
+
+T (ba[2].a[0].b);
+T (&ba[2].a[0].b[0]);
+T (&ba[2].a[0].b[0] + 1);
+T (&ba[2].a[0].b[0] + i0);
+T (&ba[2].a[0].b[1]);
+T (&ba[2].a[0].b[1] + 1);
+T (&ba[2].a[0].b[1] + i0);
+
+T (ba[2].a[1].a);
+T (&ba[2].a[1].a[0]);
+T (&ba[2].a[1].a[0] + 1);
+T (&ba[2].a[1].a[0] + i0);
+T (&ba[2].a[1].a[1]);
+T (&ba[2].a[1].a[1] + 1);
+T (&ba[2].a[1].a[1] + i0);
+
+
+T (ba[3].a[0].a);
+T (&ba[3].a[0].a[0]);
+T (&ba[3].a[0].a[0] + 1);
+T (&ba[3].a[0].a[0] + i0);
+T (&ba[3].a[0].a[1]);
+T (&ba[3].a[0].a[1] + 1);
+T (&ba[3].a[0].a[1] + i0);
+
+T (ba[3].a[0].b);
+T (&ba[3].a[0].b[0]);
+T (&ba[3].a[0].b[0] + 1);
+T (&ba[3].a[0].b[0] + i0);
+T (&ba[3].a[0].b[1]);
+T (&ba[3].a[0].b[1] + 1);
+T (&ba[3].a[0].b[1] + i0);
+
+T (ba[3].a[1].a);           /* { dg-warning "nul" }  */
+T (&ba[3].a[1].a[0]);	    /* { dg-warning "nul" }  */
+T (&ba[3].a[1].a[0] + 1);   /* { dg-warning "nul" }  */
+T (&ba[3].a[1].a[0] + i0);  /* { dg-warning "nul" }  */
+T (&ba[3].a[1].a[1]);	    /* { dg-warning "nul" }  */
+T (&ba[3].a[1].a[1] + 1);   /* { dg-warning "nul" }  */
+T (&ba[3].a[1].a[1] + i0);  /* { dg-warning "nul" }  */
+
+T (ba[3].a[1].b);
+T (&ba[3].a[1].b[0]);	
+T (&ba[3].a[1].b[0] + 1);
+T (&ba[3].a[1].b[0] + i0);
+T (&ba[3].a[1].b[1]);	
+T (&ba[3].a[1].b[1] + 1);
+T (&ba[3].a[1].b[1] + i0);
+
+
+T (i0 ? ba[0].a[0].a : ba[0].a[0].b);           /* { dg-warning "nul" }  */
+T (i0 ? ba[0].a[0].a : ba[0].a[0].b);           /* { dg-warning "nul" }  */
+
+T (i0 ? &ba[0].a[0].a[0] : &ba[3].a[1].a[0]);   /* { dg-warning "nul" }  */
+T (i0 ? &ba[3].a[1].a[1] :  ba[0].a[0].a);      /* { dg-warning "nul" }  */
+
+T (i0 ? ba[0].a[0].a : ba[0].a[1].b);
+T (i0 ? ba[0].a[1].b : ba[0].a[0].a);

^ permalink raw reply	[flat|nested] 8+ messages in thread

* PING [PATCH] warn for strlen of arrays with missing nul (PR 86552)
  2018-07-19 20:09 Martin Sebor
@ 2018-07-25 23:38 ` Martin Sebor
  2018-07-30 19:18   ` Martin Sebor
  0 siblings, 1 reply; 8+ messages in thread
From: Martin Sebor @ 2018-07-25 23:38 UTC (permalink / raw)
  To: Gcc Patch List

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01124.html

The fix for bug 86532 has been checked in so this enhancement
can now be applied on top of it (with only minor adjustments).

On 07/19/2018 02:08 PM, Martin Sebor wrote:
> In the discussion of my patch for pr86532 Bernd noted that
> GCC silently accepts constant character arrays with no
> terminating nul as arguments to strlen (and other string
> functions).
>
> The attached patch is a first step in detecting these kinds
> of bugs in strlen calls by issuing -Wstringop-overflow.
> The next step is to modify all other handlers of built-in
> functions to detect the same problem (not part of this patch).
> Yet another step is to detect these problems in arguments
> initialized using the non-string form:
>
>   const char a[] = { 'a', 'b', 'c' };
>
> This patch is meant to apply on top of the one for bug 86532
> (I tested it with an earlier version of that patch so there
> is code in the context that does not appear in the latest
> version of the other diff).
>
> Martin
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-08-01 20:33 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-30 21:12 PING [PATCH] warn for strlen of arrays with missing nul (PR 86552) Bernd Edlinger
2018-07-31  3:52 ` Martin Sebor
2018-08-01 14:21   ` Bernd Edlinger
2018-08-01 16:34     ` Martin Sebor
2018-08-01 17:16       ` Bernd Edlinger
2018-08-01 20:33       ` Martin Sebor
  -- strict thread matches above, loose matches on Subject: below --
2018-07-19 20:09 Martin Sebor
2018-07-25 23:38 ` PING " Martin Sebor
2018-07-30 19:18   ` Martin Sebor

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).