public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug middle-end/110884] New: strncmp false positive with -Wstringop-overread on coreutils
@ 2023-08-03  6:41 eggert at cs dot ucla.edu
  2023-08-03  6:51 ` [Bug middle-end/110884] " pinskia at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: eggert at cs dot ucla.edu @ 2023-08-03  6:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110884

            Bug ID: 110884
           Summary: strncmp false positive with -Wstringop-overread on
                    coreutils
           Product: gcc
           Version: 13.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: eggert at cs dot ucla.edu
  Target Milestone: ---

I have this problem with both GCC 13.2.0 (x86-64) and GCC 13.1.1 20230614 (Red
Hat 13.1.1-4) x86-64. I see that bug 104854 was filed for something similar,
but was then closed with a request for "test cases, preferably from real code".
I hope the test case below helps.

I ran into the problem when hacking on GNU coreutils, which needs to deal with
a data structure that on some platforms is a fixed size array suitable for
strnlen, strncmp, etc., and on other platforms is a null-terminated string.
Consider the following code, abstracted from coreutils:

  int strncmp (char const *, char const *, unsigned long);

  #ifdef FIXED_SIZE_BUFFERS
  enum { SIZE = 512 };
  #else
  enum { SIZE = -1 }; /* no limit */
  #endif

  _Bool
  equal_buffers (char *a, char *b)
  {
    return strncmp (a, b, SIZE) == 0;
  }

Although this code is correct, "gcc -O2 -S t.c" complains:

  t.c: In function ‘equal_buffers’:
  t.c:12:10: warning: ‘strncmp’ specified bound 18446744073709551615 exceeds
maximum object size 9223372036854775807 [-Wstringop-overread]
     12 |   return strncmp (a, b, SIZE) == 0;
        |          ^~~~~~~~~~~~~~~~~~~~

This warning is irrelevant to strncmp since it doesn't matter that (size_t) -1
is greater than any object size; strncmp will stop at the first '\0' byte,
which is what is wanted when FIXED_SIZE_BUFFERS is not defined.

Although I can work around the bug by "#define SIZE 9223372036854775807" on
this platform, that's awkward and confusing.

I assume there are similar problems with strnlen etc.

I think I'll work around the problem by disabling -Wstringop-overread until the
bug is fixed somehow or a better workaround is suggested.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug middle-end/110884] strncmp false positive with -Wstringop-overread on coreutils
  2023-08-03  6:41 [Bug middle-end/110884] New: strncmp false positive with -Wstringop-overread on coreutils eggert at cs dot ucla.edu
@ 2023-08-03  6:51 ` pinskia at gcc dot gnu.org
  2023-08-03 11:45 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-08-03  6:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110884

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
You could use SSIZE_MAX here ...
That is what 9223372036854775807 is in this case ...

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug middle-end/110884] strncmp false positive with -Wstringop-overread on coreutils
  2023-08-03  6:41 [Bug middle-end/110884] New: strncmp false positive with -Wstringop-overread on coreutils eggert at cs dot ucla.edu
  2023-08-03  6:51 ` [Bug middle-end/110884] " pinskia at gcc dot gnu.org
@ 2023-08-03 11:45 ` rguenth at gcc dot gnu.org
  2023-08-06  2:49 ` eggert at cs dot ucla.edu
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-08-03 11:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110884

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |97048

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Note you are basically trying to use strcmp (a, b), so why not do that?

 _Bool
  equal_buffers (char *a, char *b)
  {
#ifdef FIXED_SIZE_BUFFERS
    return strncmp (a, b, SIZE) == 0;
#else
    return strcmp (a, b) == 0;
#endif
  }

?


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97048
[Bug 97048] [meta-bug] bogus/missing -Wstringop-overread warnings

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug middle-end/110884] strncmp false positive with -Wstringop-overread on coreutils
  2023-08-03  6:41 [Bug middle-end/110884] New: strncmp false positive with -Wstringop-overread on coreutils eggert at cs dot ucla.edu
  2023-08-03  6:51 ` [Bug middle-end/110884] " pinskia at gcc dot gnu.org
  2023-08-03 11:45 ` rguenth at gcc dot gnu.org
@ 2023-08-06  2:49 ` eggert at cs dot ucla.edu
  2023-08-06  2:59 ` pinskia at gcc dot gnu.org
  2023-08-06  6:54 ` eggert at cs dot ucla.edu
  4 siblings, 0 replies; 6+ messages in thread
From: eggert at cs dot ucla.edu @ 2023-08-06  2:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110884

--- Comment #3 from Paul Eggert <eggert at cs dot ucla.edu> ---
(In reply to Richard Biener from comment #2)
> you are basically trying to use strcmp (a, b), so why not do that?

strcmp would not work on Fedora 38, or on most current coreutils platforms.

In the full coreutils code, src/system.h contains this definition[1]:

  /* n==-1 means unbounded */
  #define STREQ_LEN(a, b, n) (strncmp (a, b, n) == 0)

and src/who.c contains this expression[2]:

  STREQ_LEN (ttyname_b, utmp_buf->ut_line, UT_LINE_SIZE)

where UT_LINE_SIZE is an enum that is a small positive number on Fedora and
most platforms (where the arguments are UT_LINE_SIZE-byte character arrays,
possibly not null-terminated), and UT_LINE_SIZE is -1 on rare platforms (where
the arguments are char * pointers to null-terminated strings of unbounded
length).

By the way, we tried to work around the GCC false positive this way:

  #define STREQ_LEN(a, b, n) \
    ((n) < 0 ? strcmp (a, b) == 0 : strncmp (a, b, n) == 0)

but this didn't pacify GCC. When N is a negative constant expression, GCC emits
a -Wstringop-overread false positive even on the strncmp call that cannot
possibly be executed.

GCC -Wstringop-overread is supposed to (quoting the GCC manual) "Warn for calls
to string manipulation functions such as ‘memchr’, or ‘strcpy’ that are
determined to read past the end of the source sequence." But the abovementioned
calls to strncpy do not read past the end of the source sequence. So this is a
situation where either GCC's code or its documentation is incorrect.

Getting back to the code that prompted this bug report: strncmp is a primitive
designed for an oddball data structure designed back in the 1970s for tiny
machines. This data structure is a fixed-size character array that either
contains a null-terminated string, or contains all non-null bytes. Nobody liked
this data structure back in the day; people put up with it only because the
machines were so small (and hey! I was there! I wrote code for those
machines!). Nowadays the only code that should use strncmp is code like
coreutils/src/who.c that is still stick with ancient data structures designed
for 1970s machines. People who maintain legacy strncmp-using code should know
what they're doing, and if you know what you're doing (as I hope is the case
with the GNU coreutils maintainers :-) it's OK to pass SIZE_MAX to strncmp.

[1]: https://git.savannah.gnu.org/cgit/coreutils.git/tree/src/system.h#n195
[2]: https://git.savannah.gnu.org/cgit/coreutils.git/tree/src/who.c#n612



(In reply to Andrew Pinski from comment #1)
> You could use SSIZE_MAX here ...
> That is what 9223372036854775807 is in this case ...

Although that would work on GNU/Linux, POSIX doesn't guarantee it to be
portable, since ssize_t be 32 bits on platforms with 64-bit size_t (and some
historical implementations did that; on these systems syscalls like 'read'
never returned a value greater than 2**31 - 1).

Using (PTRDIFF_MAX < SIZE_MAX ? PTRDIFF_MAX : SIZE_MAX) would work, but is (as
I wrote) awkward. Among other things, the coreutils code uses an enum to record
the size (or -1), and enums can't portably hold values as large as SIZE_MAX or
PTRDIFF_MAX. So we'd have to use a macro instead, and that would have its own
problems.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug middle-end/110884] strncmp false positive with -Wstringop-overread on coreutils
  2023-08-03  6:41 [Bug middle-end/110884] New: strncmp false positive with -Wstringop-overread on coreutils eggert at cs dot ucla.edu
                   ` (2 preceding siblings ...)
  2023-08-06  2:49 ` eggert at cs dot ucla.edu
@ 2023-08-06  2:59 ` pinskia at gcc dot gnu.org
  2023-08-06  6:54 ` eggert at cs dot ucla.edu
  4 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-08-06  2:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110884

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Paul Eggert from comment #3)
> 
> Using (PTRDIFF_MAX < SIZE_MAX ? PTRDIFF_MAX : SIZE_MAX) would work, but is
> (as I wrote) awkward. Among other things, the coreutils code uses an enum to
> record the size (or -1), and enums can't portably hold values as large as
> SIZE_MAX or PTRDIFF_MAX. So we'd have to use a macro instead, and that would
> have its own problems.

PTRDIFF_MAX is required to be less than SIZE_MAX and is the max size of an
array because otherwise a-b would be undefined ...

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug middle-end/110884] strncmp false positive with -Wstringop-overread on coreutils
  2023-08-03  6:41 [Bug middle-end/110884] New: strncmp false positive with -Wstringop-overread on coreutils eggert at cs dot ucla.edu
                   ` (3 preceding siblings ...)
  2023-08-06  2:59 ` pinskia at gcc dot gnu.org
@ 2023-08-06  6:54 ` eggert at cs dot ucla.edu
  4 siblings, 0 replies; 6+ messages in thread
From: eggert at cs dot ucla.edu @ 2023-08-06  6:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110884

--- Comment #5 from Paul Eggert <eggert at cs dot ucla.edu> ---
(In reply to Andrew Pinski from comment #4)
> PTRDIFF_MAX is required to be less than SIZE_MAX and is the max size of an
> array because otherwise a-b would be undefined ...

That is true for glibc, but it's not guaranteed by the C standard or by POSIX,
and coreutils tries to be portable to odd but conforming platforms. In theory
size_t can be 32 bits while ptrdiff_t is 64 bits. It's not much trouble to
write MIN (PTRDIFF_MAX, SIZE_MAX) in the few places where it matters.

C and POSIX also allow arrays with more than PTRDIFF_MAX elements. However,
coreutils takes pains to never create such an array, even on the non-glibc
platforms that allow them; this avoids the undefined behavior you mentioned.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-08-06  6:54 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-03  6:41 [Bug middle-end/110884] New: strncmp false positive with -Wstringop-overread on coreutils eggert at cs dot ucla.edu
2023-08-03  6:51 ` [Bug middle-end/110884] " pinskia at gcc dot gnu.org
2023-08-03 11:45 ` rguenth at gcc dot gnu.org
2023-08-06  2:49 ` eggert at cs dot ucla.edu
2023-08-06  2:59 ` pinskia at gcc dot gnu.org
2023-08-06  6:54 ` eggert at cs dot ucla.edu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).