public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c/109306] New: The strstr function might do undefined behavior (out of bounds mem access)
@ 2023-03-27 19:24 pmorf at apple dot com
  2023-03-27 19:27 ` [Bug other/109306] The strstr implementation in libiberty might have " pinskia at gcc dot gnu.org
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: pmorf at apple dot com @ 2023-03-27 19:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109306

            Bug ID: 109306
           Summary: The strstr function might do undefined behavior (out
                    of bounds mem access)
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pmorf at apple dot com
  Target Milestone: ---

Hi,

I think strstr

https://gcc.gnu.org/git/?p=gcc.git;a=blob_plain;f=libiberty/strstr.c;hb=HEAD

may run into undefined behavior by reading the process memory way past the end
of s1.
The implementation will still run memcmp even starting on the last character of
s1, for the full length of s2.
That also means returning null if len(s2) > len(s1) is not enough.

Something along those lines would possibly work :

    char* strstr (const char* s1, const char* s2)
    {
        const size_t s1Len = strlen (s1);
        const size_t s2Len = strlen (s2);

        if (s1Len < s2Len) return (0);

        const char* const endSearchPtr = s1 + s1Len - s2Len;
        while (s1 <= endSearchPtr && *s1)
        {
            if (!memcmp (s1, s2, s2Len))
                return s1;
            ++ s1;
        }
        return (0);
    }

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug other/109306] The strstr implementation in libiberty might have undefined behavior (out of bounds mem access)
  2023-03-27 19:24 [Bug c/109306] New: The strstr function might do undefined behavior (out of bounds mem access) pmorf at apple dot com
@ 2023-03-27 19:27 ` pinskia at gcc dot gnu.org
  2023-03-27 19:34 ` pmorf at apple dot com
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-03-27 19:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109306

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|The strstr function might   |The strstr implementation
                   |do undefined behavior (out  |in libiberty might have
                   |of bounds mem access)       |undefined behavior (out of
                   |                            |bounds mem access)
          Component|c                           |other

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I highly doubt strstr.c inside libiberty is used on any host in the last 10
years or so since strstr is C90/c++98 and GCC requires a C++11 compiler/library
as a host.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug other/109306] The strstr implementation in libiberty might have undefined behavior (out of bounds mem access)
  2023-03-27 19:24 [Bug c/109306] New: The strstr function might do undefined behavior (out of bounds mem access) pmorf at apple dot com
  2023-03-27 19:27 ` [Bug other/109306] The strstr implementation in libiberty might have " pinskia at gcc dot gnu.org
@ 2023-03-27 19:34 ` pmorf at apple dot com
  2023-03-27 19:37 ` pinskia at gcc dot gnu.org
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: pmorf at apple dot com @ 2023-03-27 19:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109306

--- Comment #2 from pmorf at apple dot com ---
Oh. Is there a different implementation I'll link against when telling gcc to
use the c++11 standard ?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug other/109306] The strstr implementation in libiberty might have undefined behavior (out of bounds mem access)
  2023-03-27 19:24 [Bug c/109306] New: The strstr function might do undefined behavior (out of bounds mem access) pmorf at apple dot com
  2023-03-27 19:27 ` [Bug other/109306] The strstr implementation in libiberty might have " pinskia at gcc dot gnu.org
  2023-03-27 19:34 ` pmorf at apple dot com
@ 2023-03-27 19:37 ` pinskia at gcc dot gnu.org
  2023-03-27 19:40 ` pmorf at apple dot com
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-03-27 19:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109306

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to pmorf from comment #2)
> Oh. Is there a different implementation I'll link against when telling gcc
> to use the c++11 standard ?

I am saying that implementation is only used if the host's libc does not have
strstr for compiling/linking GCC itself; it is not used otherwise.

Otherwise the strstr implementation is in glibc on Linux or Apple's libc on Mac
OS X or musl on some Linux distro. Or in Microsoft's libc on Windows.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug other/109306] The strstr implementation in libiberty might have undefined behavior (out of bounds mem access)
  2023-03-27 19:24 [Bug c/109306] New: The strstr function might do undefined behavior (out of bounds mem access) pmorf at apple dot com
                   ` (2 preceding siblings ...)
  2023-03-27 19:37 ` pinskia at gcc dot gnu.org
@ 2023-03-27 19:40 ` pmorf at apple dot com
  2023-03-27 19:46 ` jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: pmorf at apple dot com @ 2023-03-27 19:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109306

--- Comment #4 from pmorf at apple dot com ---
Ha got it. thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug other/109306] The strstr implementation in libiberty might have undefined behavior (out of bounds mem access)
  2023-03-27 19:24 [Bug c/109306] New: The strstr function might do undefined behavior (out of bounds mem access) pmorf at apple dot com
                   ` (3 preceding siblings ...)
  2023-03-27 19:40 ` pmorf at apple dot com
@ 2023-03-27 19:46 ` jakub at gcc dot gnu.org
  2023-03-27 19:48 ` jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-27 19:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109306

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Plus, not sure if doing strlen on the first argument is desirable, that often
is quite long string.
Perhaps we should just use there strncmp instead of memcmp.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug other/109306] The strstr implementation in libiberty might have undefined behavior (out of bounds mem access)
  2023-03-27 19:24 [Bug c/109306] New: The strstr function might do undefined behavior (out of bounds mem access) pmorf at apple dot com
                   ` (4 preceding siblings ...)
  2023-03-27 19:46 ` jakub at gcc dot gnu.org
@ 2023-03-27 19:48 ` jakub at gcc dot gnu.org
  2023-04-02 18:07 ` cvs-commit at gcc dot gnu.org
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-27 19:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109306

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Or we could take strstr from gnulib, which is much larger (and faster).

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug other/109306] The strstr implementation in libiberty might have undefined behavior (out of bounds mem access)
  2023-03-27 19:24 [Bug c/109306] New: The strstr function might do undefined behavior (out of bounds mem access) pmorf at apple dot com
                   ` (5 preceding siblings ...)
  2023-03-27 19:48 ` jakub at gcc dot gnu.org
@ 2023-04-02 18:07 ` cvs-commit at gcc dot gnu.org
  2023-04-02 18:14 ` jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-04-02 18:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109306

--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:1719fa40c4ee4def60a2ce2f27e17f8168cf28ba

commit r13-6975-g1719fa40c4ee4def60a2ce2f27e17f8168cf28ba
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Sun Apr 2 20:05:31 2023 +0200

    libiberty: Make strstr.c in libiberty ANSI compliant

    On Fri, Nov 13, 2020 at 11:53:43AM -0700, Jeff Law via Gcc-patches wrote:
    >
    > On 5/1/20 6:06 PM, Seija Kijin via Gcc-patches wrote:
    > > The original code in libiberty says "FIXME" and then says it has not
been
    > > validated to be ANSI compliant. However, this patch changes the
function to
    > > match implementations that ARE compliant, and such code is in the
public
    > > domain.
    > >
    > > I ran the test results, and there are no test failures.
    >
    > Thanks.  This seems to be the standard "simple" strstr implementation. 
    > There's significantly faster implementations available, but I doubt it's
    > worth the effort as the version in this file only gets used if there is
    > no system strstr.c.

    Except that PR109306 says the new version is non-compliant and
    is certainly slower than what we used to have.  The only problem I see
    on the old version (sure, it is not very fast version) is that for
    strstr ("abcd", "") it returned "abcd"+4 rather than "abcd" because
    strchr in that case changed p to point to the last character and then
    strncmp returned 0.

    The question reported in PR109306 is whether memcmp is required not to
    access characters beyond the first difference or not.
    For all of memcmp/strcmp/strncmp, C17 says:
    "The sign of a nonzero value returned by the comparison functions memcmp,
strcmp, and strncmp
    is determined by the sign of the difference between the values of the first
pair of characters (both
    interpreted as unsigned char) that differ in the objects being compared."
    but then in memcmp description says:
    "The memcmp function compares the first n characters of the object pointed
to by s1 to the first n
    characters of the object pointed to by s2."
    rather than something similar to strncmp wording:
    "The strncmp function compares not more than n characters (characters that
follow a null character
    are not compared) from the array pointed to by s1 to the array pointed to
by
    s2."
    So, while for strncmp it seems clearly well defined when there is zero
    terminator before reaching the n, for memcmp it is unclear if say
    int
    memcmp (const void *s1, const void *s2, size_t n)
    {
      int ret = 0;
      size_t i;
      const unsigned char *p1 = (const unsigned char *) s1;
      const unsigned char *p2 = (const unsigned char *) s2;

      for (i = n; i; i--)
        if (p1[i - 1] != p2[i - 1])
          ret = p1[i - 1] < p2[i - 1] ? -1 : 1;
      return ret;
    }
    wouldn't be valid implementation (one which always compares all characters
    and just returns non-zero from the first one that differs).

    So, shouldn't we just revert and handle the len == 0 case correctly?

    I think almost nothing really uses it, but still, the old version
    at least worked nicer with a fast strchr.
    Could as well strncmp (p + 1, s2 + 1, len - 1) if that is preferred
    because strchr already compared the first character.

    2023-04-02  Jakub Jelinek  <jakub@redhat.com>

            PR other/109306
            * strstr.c: Revert the 2020-11-13 changes.
            (strstr): Return s1 if len is 0.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug other/109306] The strstr implementation in libiberty might have undefined behavior (out of bounds mem access)
  2023-03-27 19:24 [Bug c/109306] New: The strstr function might do undefined behavior (out of bounds mem access) pmorf at apple dot com
                   ` (6 preceding siblings ...)
  2023-04-02 18:07 ` cvs-commit at gcc dot gnu.org
@ 2023-04-02 18:14 ` jakub at gcc dot gnu.org
  2023-04-18  7:16 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-04-02 18:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109306

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Should be fixed now.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug other/109306] The strstr implementation in libiberty might have undefined behavior (out of bounds mem access)
  2023-03-27 19:24 [Bug c/109306] New: The strstr function might do undefined behavior (out of bounds mem access) pmorf at apple dot com
                   ` (7 preceding siblings ...)
  2023-04-02 18:14 ` jakub at gcc dot gnu.org
@ 2023-04-18  7:16 ` cvs-commit at gcc dot gnu.org
  2023-05-02 20:16 ` cvs-commit at gcc dot gnu.org
  2023-05-03 15:23 ` cvs-commit at gcc dot gnu.org
  10 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-04-18  7:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109306

--- Comment #9 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-12 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:1c7d418661e385d7428842088b93016e2be43ce4

commit r12-9426-g1c7d418661e385d7428842088b93016e2be43ce4
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Sun Apr 2 20:05:31 2023 +0200

    libiberty: Make strstr.c in libiberty ANSI compliant

    On Fri, Nov 13, 2020 at 11:53:43AM -0700, Jeff Law via Gcc-patches wrote:
    >
    > On 5/1/20 6:06 PM, Seija Kijin via Gcc-patches wrote:
    > > The original code in libiberty says "FIXME" and then says it has not
been
    > > validated to be ANSI compliant. However, this patch changes the
function to
    > > match implementations that ARE compliant, and such code is in the
public
    > > domain.
    > >
    > > I ran the test results, and there are no test failures.
    >
    > Thanks.  This seems to be the standard "simple" strstr implementation. 
    > There's significantly faster implementations available, but I doubt it's
    > worth the effort as the version in this file only gets used if there is
    > no system strstr.c.

    Except that PR109306 says the new version is non-compliant and
    is certainly slower than what we used to have.  The only problem I see
    on the old version (sure, it is not very fast version) is that for
    strstr ("abcd", "") it returned "abcd"+4 rather than "abcd" because
    strchr in that case changed p to point to the last character and then
    strncmp returned 0.

    The question reported in PR109306 is whether memcmp is required not to
    access characters beyond the first difference or not.
    For all of memcmp/strcmp/strncmp, C17 says:
    "The sign of a nonzero value returned by the comparison functions memcmp,
strcmp, and strncmp
    is determined by the sign of the difference between the values of the first
pair of characters (both
    interpreted as unsigned char) that differ in the objects being compared."
    but then in memcmp description says:
    "The memcmp function compares the first n characters of the object pointed
to by s1 to the first n
    characters of the object pointed to by s2."
    rather than something similar to strncmp wording:
    "The strncmp function compares not more than n characters (characters that
follow a null character
    are not compared) from the array pointed to by s1 to the array pointed to
by
    s2."
    So, while for strncmp it seems clearly well defined when there is zero
    terminator before reaching the n, for memcmp it is unclear if say
    int
    memcmp (const void *s1, const void *s2, size_t n)
    {
      int ret = 0;
      size_t i;
      const unsigned char *p1 = (const unsigned char *) s1;
      const unsigned char *p2 = (const unsigned char *) s2;

      for (i = n; i; i--)
        if (p1[i - 1] != p2[i - 1])
          ret = p1[i - 1] < p2[i - 1] ? -1 : 1;
      return ret;
    }
    wouldn't be valid implementation (one which always compares all characters
    and just returns non-zero from the first one that differs).

    So, shouldn't we just revert and handle the len == 0 case correctly?

    I think almost nothing really uses it, but still, the old version
    at least worked nicer with a fast strchr.
    Could as well strncmp (p + 1, s2 + 1, len - 1) if that is preferred
    because strchr already compared the first character.

    2023-04-02  Jakub Jelinek  <jakub@redhat.com>

            PR other/109306
            * strstr.c: Revert the 2020-11-13 changes.
            (strstr): Return s1 if len is 0.

    (cherry picked from commit 1719fa40c4ee4def60a2ce2f27e17f8168cf28ba)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug other/109306] The strstr implementation in libiberty might have undefined behavior (out of bounds mem access)
  2023-03-27 19:24 [Bug c/109306] New: The strstr function might do undefined behavior (out of bounds mem access) pmorf at apple dot com
                   ` (8 preceding siblings ...)
  2023-04-18  7:16 ` cvs-commit at gcc dot gnu.org
@ 2023-05-02 20:16 ` cvs-commit at gcc dot gnu.org
  2023-05-03 15:23 ` cvs-commit at gcc dot gnu.org
  10 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-05-02 20:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109306

--- Comment #10 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:83fe692a373f6e6738f4bdc6661d1d58e4bf95c6

commit r11-10733-g83fe692a373f6e6738f4bdc6661d1d58e4bf95c6
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Sun Apr 2 20:05:31 2023 +0200

    libiberty: Make strstr.c in libiberty ANSI compliant

    On Fri, Nov 13, 2020 at 11:53:43AM -0700, Jeff Law via Gcc-patches wrote:
    >
    > On 5/1/20 6:06 PM, Seija Kijin via Gcc-patches wrote:
    > > The original code in libiberty says "FIXME" and then says it has not
been
    > > validated to be ANSI compliant. However, this patch changes the
function to
    > > match implementations that ARE compliant, and such code is in the
public
    > > domain.
    > >
    > > I ran the test results, and there are no test failures.
    >
    > Thanks.  This seems to be the standard "simple" strstr implementation. 
    > There's significantly faster implementations available, but I doubt it's
    > worth the effort as the version in this file only gets used if there is
    > no system strstr.c.

    Except that PR109306 says the new version is non-compliant and
    is certainly slower than what we used to have.  The only problem I see
    on the old version (sure, it is not very fast version) is that for
    strstr ("abcd", "") it returned "abcd"+4 rather than "abcd" because
    strchr in that case changed p to point to the last character and then
    strncmp returned 0.

    The question reported in PR109306 is whether memcmp is required not to
    access characters beyond the first difference or not.
    For all of memcmp/strcmp/strncmp, C17 says:
    "The sign of a nonzero value returned by the comparison functions memcmp,
strcmp, and strncmp
    is determined by the sign of the difference between the values of the first
pair of characters (both
    interpreted as unsigned char) that differ in the objects being compared."
    but then in memcmp description says:
    "The memcmp function compares the first n characters of the object pointed
to by s1 to the first n
    characters of the object pointed to by s2."
    rather than something similar to strncmp wording:
    "The strncmp function compares not more than n characters (characters that
follow a null character
    are not compared) from the array pointed to by s1 to the array pointed to
by
    s2."
    So, while for strncmp it seems clearly well defined when there is zero
    terminator before reaching the n, for memcmp it is unclear if say
    int
    memcmp (const void *s1, const void *s2, size_t n)
    {
      int ret = 0;
      size_t i;
      const unsigned char *p1 = (const unsigned char *) s1;
      const unsigned char *p2 = (const unsigned char *) s2;

      for (i = n; i; i--)
        if (p1[i - 1] != p2[i - 1])
          ret = p1[i - 1] < p2[i - 1] ? -1 : 1;
      return ret;
    }
    wouldn't be valid implementation (one which always compares all characters
    and just returns non-zero from the first one that differs).

    So, shouldn't we just revert and handle the len == 0 case correctly?

    I think almost nothing really uses it, but still, the old version
    at least worked nicer with a fast strchr.
    Could as well strncmp (p + 1, s2 + 1, len - 1) if that is preferred
    because strchr already compared the first character.

    2023-04-02  Jakub Jelinek  <jakub@redhat.com>

            PR other/109306
            * strstr.c: Revert the 2020-11-13 changes.
            (strstr): Return s1 if len is 0.

    (cherry picked from commit 1719fa40c4ee4def60a2ce2f27e17f8168cf28ba)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug other/109306] The strstr implementation in libiberty might have undefined behavior (out of bounds mem access)
  2023-03-27 19:24 [Bug c/109306] New: The strstr function might do undefined behavior (out of bounds mem access) pmorf at apple dot com
                   ` (9 preceding siblings ...)
  2023-05-02 20:16 ` cvs-commit at gcc dot gnu.org
@ 2023-05-03 15:23 ` cvs-commit at gcc dot gnu.org
  10 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-05-03 15:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109306

--- Comment #11 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:9ae5c50751ef4217b22cc24b67a44a55efef59b9

commit r10-11384-g9ae5c50751ef4217b22cc24b67a44a55efef59b9
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Sun Apr 2 20:05:31 2023 +0200

    libiberty: Make strstr.c in libiberty ANSI compliant

    On Fri, Nov 13, 2020 at 11:53:43AM -0700, Jeff Law via Gcc-patches wrote:
    >
    > On 5/1/20 6:06 PM, Seija Kijin via Gcc-patches wrote:
    > > The original code in libiberty says "FIXME" and then says it has not
been
    > > validated to be ANSI compliant. However, this patch changes the
function to
    > > match implementations that ARE compliant, and such code is in the
public
    > > domain.
    > >
    > > I ran the test results, and there are no test failures.
    >
    > Thanks.  This seems to be the standard "simple" strstr implementation. 
    > There's significantly faster implementations available, but I doubt it's
    > worth the effort as the version in this file only gets used if there is
    > no system strstr.c.

    Except that PR109306 says the new version is non-compliant and
    is certainly slower than what we used to have.  The only problem I see
    on the old version (sure, it is not very fast version) is that for
    strstr ("abcd", "") it returned "abcd"+4 rather than "abcd" because
    strchr in that case changed p to point to the last character and then
    strncmp returned 0.

    The question reported in PR109306 is whether memcmp is required not to
    access characters beyond the first difference or not.
    For all of memcmp/strcmp/strncmp, C17 says:
    "The sign of a nonzero value returned by the comparison functions memcmp,
strcmp, and strncmp
    is determined by the sign of the difference between the values of the first
pair of characters (both
    interpreted as unsigned char) that differ in the objects being compared."
    but then in memcmp description says:
    "The memcmp function compares the first n characters of the object pointed
to by s1 to the first n
    characters of the object pointed to by s2."
    rather than something similar to strncmp wording:
    "The strncmp function compares not more than n characters (characters that
follow a null character
    are not compared) from the array pointed to by s1 to the array pointed to
by
    s2."
    So, while for strncmp it seems clearly well defined when there is zero
    terminator before reaching the n, for memcmp it is unclear if say
    int
    memcmp (const void *s1, const void *s2, size_t n)
    {
      int ret = 0;
      size_t i;
      const unsigned char *p1 = (const unsigned char *) s1;
      const unsigned char *p2 = (const unsigned char *) s2;

      for (i = n; i; i--)
        if (p1[i - 1] != p2[i - 1])
          ret = p1[i - 1] < p2[i - 1] ? -1 : 1;
      return ret;
    }
    wouldn't be valid implementation (one which always compares all characters
    and just returns non-zero from the first one that differs).

    So, shouldn't we just revert and handle the len == 0 case correctly?

    I think almost nothing really uses it, but still, the old version
    at least worked nicer with a fast strchr.
    Could as well strncmp (p + 1, s2 + 1, len - 1) if that is preferred
    because strchr already compared the first character.

    2023-04-02  Jakub Jelinek  <jakub@redhat.com>

            PR other/109306
            * strstr.c (strstr): Return s1 if len is 0.

    (cherry picked from commit 1719fa40c4ee4def60a2ce2f27e17f8168cf28ba)

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2023-05-03 15:23 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-27 19:24 [Bug c/109306] New: The strstr function might do undefined behavior (out of bounds mem access) pmorf at apple dot com
2023-03-27 19:27 ` [Bug other/109306] The strstr implementation in libiberty might have " pinskia at gcc dot gnu.org
2023-03-27 19:34 ` pmorf at apple dot com
2023-03-27 19:37 ` pinskia at gcc dot gnu.org
2023-03-27 19:40 ` pmorf at apple dot com
2023-03-27 19:46 ` jakub at gcc dot gnu.org
2023-03-27 19:48 ` jakub at gcc dot gnu.org
2023-04-02 18:07 ` cvs-commit at gcc dot gnu.org
2023-04-02 18:14 ` jakub at gcc dot gnu.org
2023-04-18  7:16 ` cvs-commit at gcc dot gnu.org
2023-05-02 20:16 ` cvs-commit at gcc dot gnu.org
2023-05-03 15:23 ` cvs-commit at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).