public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug middle-end/94313] New: stores into string literals sometimes silently eliminated
@ 2020-03-24 21:54 msebor at gcc dot gnu.org
  2020-03-25  7:54 ` [Bug middle-end/94313] " rguenth at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: msebor at gcc dot gnu.org @ 2020-03-24 21:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94313

            Bug ID: 94313
           Summary: stores into string literals sometimes silently
                    eliminated
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: msebor at gcc dot gnu.org
  Target Milestone: ---

Modifying a const-qualified object or a string literal is undefined in both C
and C++ and can result in subtle bugs.  Detecting attempts to do so as
requested in pr90404 will help prevent such hard-to-find bugs.  However,
reliable detection is made difficult by optimizations that silently eliminate
stores to such objects.  The test case below shows that the second of the four
invalid stores below (in g1) is silently eliminated (by DSE) while the others
are retained.  (None of them is diagnosed.)  The first two will likely crash if
the strings are stored in ROM.  The second two could lead in unexpected results
as a result of other optimizations (such as the strlen pass) that assume that
const objects cannot change.

The last revision I have access to that did not eliminate the store is r145753.
 The first revision that does is r145825, so the change was introduced sometime
in between these two.

$ cat z.c && gcc -O2 -S -Wall -Wwrite-strings -fdump-tree-optimized=/dev/stdout
z.c
void f (void*);

static char* str (void) { return (char*)"abc"; }

void g0 (int x)
{
  char *s = __builtin_strchr (str (), 'a');
  *s = 'x';   // not eliminated (okay)
  f (s);
}

void g1 (int x)
{
  char *s = __builtin_strchr (str (), x);
  *s = 'x';   // eliminated
  f (s);
}

const char a[] = "bcd";

void g2 (void)
{
  char *s = __builtin_strchr (a, 'b');
  *s = 'x';   // not eliminated (okay)
  f (s);
}

void g3 (int x)
{
  const char a[] = "cde";
  char *s = __builtin_strchr (a, 'c');
  *s = 'x';   // not eliminated (okay)
  f (s);
}


;; Function g0 (g0, funcdef_no=1, decl_uid=1935, cgraph_uid=2, symbol_order=1)

g0 (int x)
{
  <bb 2> [local count: 1073741824]:
  MEM[(char *)"abc"] = 120;
  f ("abc"); [tail call]
  return;

}



;; Function g1 (g1, funcdef_no=2, decl_uid=1939, cgraph_uid=3, symbol_order=2)

g1 (int x)
{
  char * s;

  <bb 2> [local count: 1073741824]:
  s_3 = __builtin_strchr ("abc", x_2(D));
  f (s_3); [tail call]
  return;

}



;; Function g2 (g2, funcdef_no=3, decl_uid=1944, cgraph_uid=4, symbol_order=4)

g2 ()
{
  <bb 2> [local count: 1073741824]:
  MEM[(char *)&a] = 120;
  f (&a); [tail call]
  return;

}



;; Function g3 (g3, funcdef_no=4, decl_uid=1948, cgraph_uid=5, symbol_order=5)

g3 (int x)
{
  char * s;
  const char a[4];

  <bb 2> [local count: 1073741824]:
  a = "cde";
  s_3 = __builtin_strchr (&a, 99);
  *s_3 = 120;
  f (s_3);
  a ={v} {CLOBBER};
  return;

}

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug middle-end/94313] stores into string literals sometimes silently eliminated
  2020-03-24 21:54 [Bug middle-end/94313] New: stores into string literals sometimes silently eliminated msebor at gcc dot gnu.org
@ 2020-03-25  7:54 ` rguenth at gcc dot gnu.org
  2020-03-25 16:33 ` msebor at gcc dot gnu.org
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-03-25  7:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94313

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
This is likely because points-to computes the result of the string functions,
for example in g1 it computes

s_5 = { NULL STRING }

and somehow ref_maybe_used_by_stmt_p computes false for *s and f(s).
That's from

      if (pt_solutions_intersect (gimple_call_use_set (call), &pi->pt))
        return true;

where the points-to set for s_5 demotes to

$8 = {anything = 0, nonlocal = 0, escaped = 0, ipa_escaped = 0, null = 1, 
  vars_contains_nonlocal = 0, vars_contains_escaped = 0, 
  vars_contains_escaped_heap = 0, vars_contains_restrict = 0, 
  vars_contains_interposable = 0, vars = 0x7ffff6d7c520}

so we dropped STRING which we do as follows:

          else if (vi->id == string_id)
            /* Nobody cares - STRING_CSTs are read-only entities.  */
            ;

which is of course true, since nothing can modify STRING_CSTs we can
technically ignore uses of appearant modifications.

Now the inconsistency is that we treat

  MEM[(char *)"abc"] = 120;
  f ("abc"); [tail call]

differently - a missed optimization(?).  Note that before one of my
recent fixes we didn't even treat such stmts as stores (now we do).

So I see nothing inherently wrong here.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug middle-end/94313] stores into string literals sometimes silently eliminated
  2020-03-24 21:54 [Bug middle-end/94313] New: stores into string literals sometimes silently eliminated msebor at gcc dot gnu.org
  2020-03-25  7:54 ` [Bug middle-end/94313] " rguenth at gcc dot gnu.org
@ 2020-03-25 16:33 ` msebor at gcc dot gnu.org
  2020-03-25 17:49 ` rguenther at suse dot de
  2020-03-25 20:08 ` msebor at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: msebor at gcc dot gnu.org @ 2020-03-25 16:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94313

--- Comment #2 from Martin Sebor <msebor at gcc dot gnu.org> ---
Removing invalid code not isn't wrong (as in non-conforming), but it's
decidedly unhelpful in avoiding the undefined behavior that doesn't necessarily
go away just because the invalid statement is gone.  It makes finding the
underlying bugs difficult.  My solution for pr90404 detects most of these bugs
(and can also eliminate the buggy code) except those that are removed by DSE. 
The intent of this report isn't so much to complain that GCC is doing something
wrong but to document the rationale for doing something different (i.e.,
diagnose these bugs consistently and leave it up to the user whether to
eliminate the code or replace it with a trap or something else).

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug middle-end/94313] stores into string literals sometimes silently eliminated
  2020-03-24 21:54 [Bug middle-end/94313] New: stores into string literals sometimes silently eliminated msebor at gcc dot gnu.org
  2020-03-25  7:54 ` [Bug middle-end/94313] " rguenth at gcc dot gnu.org
  2020-03-25 16:33 ` msebor at gcc dot gnu.org
@ 2020-03-25 17:49 ` rguenther at suse dot de
  2020-03-25 20:08 ` msebor at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: rguenther at suse dot de @ 2020-03-25 17:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94313

--- Comment #3 from rguenther at suse dot de <rguenther at suse dot de> ---
On Wed, 25 Mar 2020, msebor at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94313
> 
> --- Comment #2 from Martin Sebor <msebor at gcc dot gnu.org> ---
> Removing invalid code not isn't wrong (as in non-conforming), but it's
> decidedly unhelpful in avoiding the undefined behavior that doesn't necessarily
> go away just because the invalid statement is gone.  It makes finding the
> underlying bugs difficult.  My solution for pr90404 detects most of these bugs
> (and can also eliminate the buggy code) except those that are removed by DSE. 
> The intent of this report isn't so much to complain that GCC is doing something
> wrong but to document the rationale for doing something different (i.e.,
> diagnose these bugs consistently and leave it up to the user whether to
> eliminate the code or replace it with a trap or something else).

Usually the whole point of undefined behavior is that you do not
need to detect it.  When you detect undefined behavior you can try
to do something sensible (try to second guess DWIM) like for example
how we allow invalid type-punning when we see a trivial must-alias.

In this case points-to analysis does not bother to invent something
to represent not useful information because it's not useful and just
wastes resources.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug middle-end/94313] stores into string literals sometimes silently eliminated
  2020-03-24 21:54 [Bug middle-end/94313] New: stores into string literals sometimes silently eliminated msebor at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2020-03-25 17:49 ` rguenther at suse dot de
@ 2020-03-25 20:08 ` msebor at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: msebor at gcc dot gnu.org @ 2020-03-25 20:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94313

--- Comment #4 from Martin Sebor <msebor at gcc dot gnu.org> ---
An implementation is free to do whatever it wants when it finds
invalid/undefined code.  A quality implementation will also let the user know
about it so it can be fixed.  An even better one will let the user choose what
to do in response (as we discussed in Manchester, it could either trap,
substitute zero, or simply remove the code and proceed as if it was never
there; other alternatives might be possible).  This is all perfectly compatible
with your preference.  As long as the code is diagnosed it also meets my goal.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-03-25 20:08 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-24 21:54 [Bug middle-end/94313] New: stores into string literals sometimes silently eliminated msebor at gcc dot gnu.org
2020-03-25  7:54 ` [Bug middle-end/94313] " rguenth at gcc dot gnu.org
2020-03-25 16:33 ` msebor at gcc dot gnu.org
2020-03-25 17:49 ` rguenther at suse dot de
2020-03-25 20:08 ` msebor at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).