public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jeff Law <jeffreyalaw@gmail.com>
To: Jakub Jelinek <jakub@redhat.com>, Jeff Law <law@redhat.com>
Cc: Seija Kijin <doremylover456@gmail.com>, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] libiberty: Make strstr.c in libiberty ANSI compliant
Date: Sun, 2 Apr 2023 11:54:30 -0600	[thread overview]
Message-ID: <044acb6b-7564-d03d-218d-7d40e3a6d4ad@gmail.com> (raw)
In-Reply-To: <ZCRSf2vTxWSIqlsG@tucnak>



On 3/29/23 09:00, Jakub Jelinek via Gcc-patches wrote:
> On Fri, Nov 13, 2020 at 11:53:43AM -0700, Jeff Law via Gcc-patches wrote:
>>
>> On 5/1/20 6:06 PM, Seija Kijin via Gcc-patches wrote:
>>> The original code in libiberty says "FIXME" and then says it has not been
>>> validated to be ANSI compliant. However, this patch changes the function to
>>> match implementations that ARE compliant, and such code is in the public
>>> domain.
>>>
>>> I ran the test results, and there are no test failures.
>>
>> Thanks.  This seems to be the standard "simple" strstr implementation.
>> There's significantly faster implementations available, but I doubt it's
>> worth the effort as the version in this file only gets used if there is
>> no system strstr.c.
> 
> Except that PR109306 says the new version is non-compliant and
> is certainly slower than what we used to have.  The only problem I see
> on the old version (sure, it is not very fast version) is that for
> strstr ("abcd", "") it returned "abcd"+4 rather than "abcd" because
> strchr in that case changed p to point to the last character and then
> strncmp returned 0.
> 
> The question reported in PR109306 is whether memcmp is required not to
> access characters beyond the first difference or not.
> For all of memcmp/strcmp/strncmp, C17 says:
> "The sign of a nonzero value returned by the comparison functions memcmp, strcmp, and strncmp
> is determined by the sign of the difference between the values of the first pair of characters (both
> interpreted as unsigned char) that differ in the objects being compared."
> but then in memcmp description says:
> "The memcmp function compares the first n characters of the object pointed to by s1 to the first n
> characters of the object pointed to by s2."
> rather than something similar to strncmp wording:
> "The strncmp function compares not more than n characters (characters that follow a null character
> are not compared) from the array pointed to by s1 to the array pointed to by
> s2."
> So, while for strncmp it seems clearly well defined when there is zero
> terminator before reaching the n, for memcmp it is unclear if say
> int
> memcmp (const void *s1, const void *s2, size_t n)
> {
>    int ret = 0;
>    size_t i;
>    const unsigned char *p1 = (const unsigned char *) s1;
>    const unsigned char *p2 = (const unsigned char *) s2;
> 
>    for (i = n; i; i--)
>      if (p1[i - 1] != p2[i - 1])
>        ret = p1[i - 1] < p2[i - 1] ? -1 : 1;
>    return ret;
> }
> wouldn't be valid implementation (one which always compares all characters
> and just returns non-zero from the first one that differs).
> 
> So, shouldn't we just revert and handle the len == 0 case correctly?
> 
> I think almost nothing really uses it, but still, the old version
> at least worked nicer with a fast strchr.
> Could as well strncmp (p + 1, s2 + 1, len - 1) if that is preferred
> because strchr already compared the first character.
> 
> 2023-03-29  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR other/109306
> 	* strstr.c: Revert the 2020-11-13 changes.
> 	(strstr): Return s1 if len is 0.
I really don't think the performance of this code matters.  If this 
fixes the compliance issue, then it's fine with me.
Jeff

      reply	other threads:[~2023-04-02 17:54 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-02  0:06 Seija Kijin
2020-11-13 18:53 ` Jeff Law
2023-03-29 15:00   ` Jakub Jelinek
2023-04-02 17:54     ` Jeff Law [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=044acb6b-7564-d03d-218d-7d40e3a6d4ad@gmail.com \
    --to=jeffreyalaw@gmail.com \
    --cc=doremylover456@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=law@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).