From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 48) id 81CE93954451; Tue, 2 May 2023 20:16:36 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 81CE93954451 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1683058596; bh=bb2q6+aOXPZP8iSD63/FubZ4PfqN7oK7RfkEypDjPp4=; h=From:To:Subject:Date:In-Reply-To:References:From; b=tjo7fPk+O1A3ZOU4OAUlNlHyFOcjwoFnucDZ1L3+oNUGkNDCCuSc15GJ7CrNu+u30 6057JyxhjulvSYvNW+hHmUc3KBirIJHPj9jRVj7h8cWyikF7pk6uyXamZ9on6tSDpM Kj5wB61tIN9uI8yMtfAyXLZf5bqAAqziMdqgrk+0= From: "cvs-commit at gcc dot gnu.org" To: gcc-bugs@gcc.gnu.org Subject: [Bug other/109306] The strstr implementation in libiberty might have undefined behavior (out of bounds mem access) Date: Tue, 02 May 2023 20:16:35 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: other X-Bugzilla-Version: 13.0 X-Bugzilla-Keywords: X-Bugzilla-Severity: normal X-Bugzilla-Who: cvs-commit at gcc dot gnu.org X-Bugzilla-Status: RESOLVED X-Bugzilla-Resolution: FIXED X-Bugzilla-Priority: P3 X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 List-Id: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D109306 --- Comment #10 from CVS Commits --- The releases/gcc-11 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:83fe692a373f6e6738f4bdc6661d1d58e4bf95c6 commit r11-10733-g83fe692a373f6e6738f4bdc6661d1d58e4bf95c6 Author: Jakub Jelinek Date: Sun Apr 2 20:05:31 2023 +0200 libiberty: Make strstr.c in libiberty ANSI compliant On Fri, Nov 13, 2020 at 11:53:43AM -0700, Jeff Law via Gcc-patches wrot= e: > > On 5/1/20 6:06 PM, Seija Kijin via Gcc-patches wrote: > > The original code in libiberty says "FIXME" and then says it has not been > > validated to be ANSI compliant. However, this patch changes the function to > > match implementations that ARE compliant, and such code is in the public > > domain. > > > > I ran the test results, and there are no test failures. > > Thanks.=C2=A0 This seems to be the standard "simple" strstr implement= ation.=C2=A0 > There's significantly faster implementations available, but I doubt i= t's > worth the effort as the version in this file only gets used if there = is > no system strstr.c. Except that PR109306 says the new version is non-compliant and is certainly slower than what we used to have. The only problem I see on the old version (sure, it is not very fast version) is that for strstr ("abcd", "") it returned "abcd"+4 rather than "abcd" because strchr in that case changed p to point to the last character and then strncmp returned 0. The question reported in PR109306 is whether memcmp is required not to access characters beyond the first difference or not. For all of memcmp/strcmp/strncmp, C17 says: "The sign of a nonzero value returned by the comparison functions memcm= p, strcmp, and strncmp is determined by the sign of the difference between the values of the f= irst pair of characters (both interpreted as unsigned char) that differ in the objects being compared= ." but then in memcmp description says: "The memcmp function compares the first n characters of the object poin= ted to by s1 to the first n characters of the object pointed to by s2." rather than something similar to strncmp wording: "The strncmp function compares not more than n characters (characters t= hat follow a null character are not compared) from the array pointed to by s1 to the array pointed = to by s2." So, while for strncmp it seems clearly well defined when there is zero terminator before reaching the n, for memcmp it is unclear if say int memcmp (const void *s1, const void *s2, size_t n) { int ret =3D 0; size_t i; const unsigned char *p1 =3D (const unsigned char *) s1; const unsigned char *p2 =3D (const unsigned char *) s2; for (i =3D n; i; i--) if (p1[i - 1] !=3D p2[i - 1]) ret =3D p1[i - 1] < p2[i - 1] ? -1 : 1; return ret; } wouldn't be valid implementation (one which always compares all charact= ers and just returns non-zero from the first one that differs). So, shouldn't we just revert and handle the len =3D=3D 0 case correctly? I think almost nothing really uses it, but still, the old version at least worked nicer with a fast strchr. Could as well strncmp (p + 1, s2 + 1, len - 1) if that is preferred because strchr already compared the first character. 2023-04-02 Jakub Jelinek PR other/109306 * strstr.c: Revert the 2020-11-13 changes. (strstr): Return s1 if len is 0. (cherry picked from commit 1719fa40c4ee4def60a2ce2f27e17f8168cf28ba)=