public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* [PATCH v2] string: Fix OOB read on generic strncmp
@ 2023-02-22 16:31 Adhemerval Zanella
  2023-02-22 17:21 ` Szabolcs Nagy
  0 siblings, 1 reply; 10+ messages in thread
From: Adhemerval Zanella @ 2023-02-22 16:31 UTC (permalink / raw)
  To: libc-alpha, Szabolcs Nagy

For unaligned case, reading ahead can only be done if parting reads
matches the aligned input.

Also extend the stratcliff tests to check such cases.

Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu,
and powerpc-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE
for 64 and 32 bits).
---
 string/stratcliff.c | 17 ++++++++++++++++-
 string/strncmp.c    | 13 ++++++++++++-
 2 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/string/stratcliff.c b/string/stratcliff.c
index 74d64cc03d..88ac787088 100644
--- a/string/stratcliff.c
+++ b/string/stratcliff.c
@@ -401,12 +401,27 @@ do_test (void)
 		result = 1;
 	      }
 
-	    if (STRNCMP (dest + nchars - outer, adr + middle, outer) <= 0)
+	    /* Also check for size larger than the string.  */
+	    if (STRNCMP (adr + middle, dest + nchars - outer, outer + 99) >= 0)
 	      {
 		printf ("%s 2 flunked for outer = %zu, middle = %zu, full\n",
+			STRINGIFY (STRNCMP), outer + 99, middle);
+		result = 1;
+	      }
+
+	    if (STRNCMP (dest + nchars - outer, adr + middle, outer) <= 0)
+	      {
+		printf ("%s 3 flunked for outer = %zu, middle = %zu, full\n",
 			STRINGIFY (STRNCMP), outer, middle);
 		result = 1;
 	      }
+
+	    if (STRNCMP (dest + nchars - outer, adr + middle, outer + 99) <= 0)
+	      {
+		printf ("%s 4 flunked for outer = %zu, middle = %zu, full\n",
+			STRINGIFY (STRNCMP), outer + 99, middle);
+		result = 1;
+	      }
 	  }
 
       /* strncpy/wcsncpy tests */
diff --git a/string/strncmp.c b/string/strncmp.c
index 4c8bf36bb9..751bf53d55 100644
--- a/string/strncmp.c
+++ b/string/strncmp.c
@@ -73,7 +73,11 @@ strncmp_unaligned_loop (const op_t *x1, const op_t *x2, op_t w1, uintptr_t ofs,
   uintptr_t sh_2 = sizeof(op_t) * CHAR_BIT - sh_1;
 
   op_t w2 = MERGE (w2a, sh_1, (op_t)-1, sh_2);
-  if (!has_zero (w2) && n > (sizeof (op_t) - ofs))
+
+  /* Reading ahead is wrong if w1 and w2 already differs.  */
+  op_t w1a = MERGE (w1, 0, (op_t)-1, sh_2);
+
+  if (!has_zero (w2) && w2 == w1a && n >= (sizeof (op_t) - ofs))
     {
       op_t w2b;
 
@@ -90,6 +94,13 @@ strncmp_unaligned_loop (const op_t *x1, const op_t *x2, op_t w1, uintptr_t ofs,
 	  if (has_zero (w2b) || n <= (sizeof (op_t) - ofs))
 	    break;
 	  w1 = *x1++;
+
+	  /* Reading ahead is wrong if w1 and w2 already differs.  */
+	  w2 = MERGE (w2b, sh_1, (op_t)-1, sh_2);
+	  w1a = MERGE (w1, 0, (op_t)-1, sh_2);
+	  if (w2 != w1a)
+	    return final_cmp (w1a, w2, n);
+
 	  w2a = w2b;
 	}
 
-- 
2.34.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] string: Fix OOB read on generic strncmp
  2023-02-22 16:31 [PATCH v2] string: Fix OOB read on generic strncmp Adhemerval Zanella
@ 2023-02-22 17:21 ` Szabolcs Nagy
  2023-02-23 18:15   ` Adhemerval Zanella Netto
  0 siblings, 1 reply; 10+ messages in thread
From: Szabolcs Nagy @ 2023-02-22 17:21 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: libc-alpha

The 02/22/2023 13:31, Adhemerval Zanella wrote:
> For unaligned case, reading ahead can only be done if parting reads
> matches the aligned input.
> 
> Also extend the stratcliff tests to check such cases.
> 
> Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu,
> and powerpc-linux-gnu by removing the arch-specific assembly
> implementation and disabling multi-arch (it covers both LE and BE
> for 64 and 32 bits).

thanks this looks good.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>


> ---
>  string/stratcliff.c | 17 ++++++++++++++++-
>  string/strncmp.c    | 13 ++++++++++++-
>  2 files changed, 28 insertions(+), 2 deletions(-)
> 
> diff --git a/string/stratcliff.c b/string/stratcliff.c
> index 74d64cc03d..88ac787088 100644
> --- a/string/stratcliff.c
> +++ b/string/stratcliff.c
> @@ -401,12 +401,27 @@ do_test (void)
>  		result = 1;
>  	      }
>  
> -	    if (STRNCMP (dest + nchars - outer, adr + middle, outer) <= 0)
> +	    /* Also check for size larger than the string.  */
> +	    if (STRNCMP (adr + middle, dest + nchars - outer, outer + 99) >= 0)
>  	      {
>  		printf ("%s 2 flunked for outer = %zu, middle = %zu, full\n",
> +			STRINGIFY (STRNCMP), outer + 99, middle);
> +		result = 1;
> +	      }
> +
> +	    if (STRNCMP (dest + nchars - outer, adr + middle, outer) <= 0)
> +	      {
> +		printf ("%s 3 flunked for outer = %zu, middle = %zu, full\n",
>  			STRINGIFY (STRNCMP), outer, middle);
>  		result = 1;
>  	      }
> +
> +	    if (STRNCMP (dest + nchars - outer, adr + middle, outer + 99) <= 0)
> +	      {
> +		printf ("%s 4 flunked for outer = %zu, middle = %zu, full\n",
> +			STRINGIFY (STRNCMP), outer + 99, middle);
> +		result = 1;
> +	      }
>  	  }
>  
>        /* strncpy/wcsncpy tests */
> diff --git a/string/strncmp.c b/string/strncmp.c
> index 4c8bf36bb9..751bf53d55 100644
> --- a/string/strncmp.c
> +++ b/string/strncmp.c
> @@ -73,7 +73,11 @@ strncmp_unaligned_loop (const op_t *x1, const op_t *x2, op_t w1, uintptr_t ofs,
>    uintptr_t sh_2 = sizeof(op_t) * CHAR_BIT - sh_1;
>  
>    op_t w2 = MERGE (w2a, sh_1, (op_t)-1, sh_2);
> -  if (!has_zero (w2) && n > (sizeof (op_t) - ofs))
> +
> +  /* Reading ahead is wrong if w1 and w2 already differs.  */
> +  op_t w1a = MERGE (w1, 0, (op_t)-1, sh_2);
> +
> +  if (!has_zero (w2) && w2 == w1a && n >= (sizeof (op_t) - ofs))
>      {
>        op_t w2b;
>  
> @@ -90,6 +94,13 @@ strncmp_unaligned_loop (const op_t *x1, const op_t *x2, op_t w1, uintptr_t ofs,
>  	  if (has_zero (w2b) || n <= (sizeof (op_t) - ofs))
>  	    break;
>  	  w1 = *x1++;
> +
> +	  /* Reading ahead is wrong if w1 and w2 already differs.  */
> +	  w2 = MERGE (w2b, sh_1, (op_t)-1, sh_2);
> +	  w1a = MERGE (w1, 0, (op_t)-1, sh_2);
> +	  if (w2 != w1a)
> +	    return final_cmp (w1a, w2, n);
> +
>  	  w2a = w2b;
>  	}
>  
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] string: Fix OOB read on generic strncmp
  2023-02-22 17:21 ` Szabolcs Nagy
@ 2023-02-23 18:15   ` Adhemerval Zanella Netto
  2023-02-24 10:19     ` Szabolcs Nagy
  0 siblings, 1 reply; 10+ messages in thread
From: Adhemerval Zanella Netto @ 2023-02-23 18:15 UTC (permalink / raw)
  To: Szabolcs Nagy, Florian Weimer, H.J. Lu, Noah Goldstein; +Cc: libc-alpha



On 22/02/23 14:21, Szabolcs Nagy wrote:
> The 02/22/2023 13:31, Adhemerval Zanella wrote:
>> For unaligned case, reading ahead can only be done if parting reads
>> matches the aligned input.
>>
>> Also extend the stratcliff tests to check such cases.
>>
>> Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu,
>> and powerpc-linux-gnu by removing the arch-specific assembly
>> implementation and disabling multi-arch (it covers both LE and BE
>> for 64 and 32 bits).
> 
> thanks this looks good.
> 
> Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
> 

So before I push the fix along with the testcase, I checked all strncmp
optimization and found out that some implementations also do not handle 
this correctly as expected:

sysdeps/x86_64/multiarch/strncmp-sse2.S				FAIL
sysdeps/x86_64/multiarch/strncmp-sse4_2.S			FAIL
sysdeps/x86_64/multiarch/strncmp-avx2.S				OK
sysdeps/x86_64/multiarch/strncmp-evex.S				?
sysdeps/x86_64/multiarch/strncmp-avx2-rtm.S			?

sysdeps/ia64/strncmp.S						?

sysdeps/sparc/sparc32/sparcv9/strncmp.S				OK
sysdeps/sparc/sparc64/strncmp.S					OK

sysdeps/aarch64/strncmp.S					OK

sysdeps/powerpc/powerpc32/power7/strncmp.S			FAIL
sysdeps/powerpc/powerpc32/405/strncmp.S				?	
sysdeps/powerpc/powerpc32/strncmp.S				FAIL
sysdeps/powerpc/powerpc32/power4/strncmp.S			FAIL
sysdeps/powerpc/powerpc64/power7/strncmp.S			FAIL
sysdeps/powerpc/powerpc64/power8/strncmp.S			OK
sysdeps/powerpc/powerpc64/strncmp.S				FAIL
sysdeps/powerpc/powerpc64/le/power9/strncmp.S			OK

sysdeps/alpha/strncmp.S						FAIL

sysdeps/i386/i686/multiarch/strncmp-sse4.S			OK
sysdeps/i386/i686/multiarch/strncmp-ssse3.S			FAIL

sysdeps/s390/strncmp-vx.S					OK	

(the ? are implementations that I can really test, even qemu static
thrown illegal instruction).

Noah has brought to my attention that he tried to add similar tests,
but they were rejected by strncmp string must be null-terminated [1].

The working drafts for C standard I have access (n1256.pdf for C99 and 
n3047.pdf for c2x) do not say possibly null-terminated array (as some
stackoverflow answer state [2]) they refer only as array. So I tend
to follow Florian understanding that strncmp inputs should be NULL
terminated.

So should we really consider this a OOB read on generic strncmp?

[1] https://sourceware.org/pipermail/libc-alpha/2022-January/135130.html
[2] https://stackoverflow.com/questions/41418766/is-it-legal-to-pass-a-non-null-terminated-string-to-strncmp-in-c

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] string: Fix OOB read on generic strncmp
  2023-02-23 18:15   ` Adhemerval Zanella Netto
@ 2023-02-24 10:19     ` Szabolcs Nagy
  2023-02-24 10:58       ` Florian Weimer
  0 siblings, 1 reply; 10+ messages in thread
From: Szabolcs Nagy @ 2023-02-24 10:19 UTC (permalink / raw)
  To: Adhemerval Zanella Netto
  Cc: Florian Weimer, H.J. Lu, Noah Goldstein, libc-alpha

The 02/23/2023 15:15, Adhemerval Zanella Netto wrote:
> Noah has brought to my attention that he tried to add similar tests,
> but they were rejected by strncmp string must be null-terminated [1].
> 
> The working drafts for C standard I have access (n1256.pdf for C99 and 
> n3047.pdf for c2x) do not say possibly null-terminated array (as some
> stackoverflow answer state [2]) they refer only as array. So I tend
> to follow Florian understanding that strncmp inputs should be NULL
> terminated.

c11 draft is n1570.pdf
https://open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf

i dont understand what extension Florian is talking about.
(i think that was about strcmp not strncmp)

c11 and c23 are clear that strncmp args may *not* be null-terminated
so i think we should be careful not to overread.

glibc itself has test code that relies on this: crypt/badsalttest

> 
> So should we really consider this a OOB read on generic strncmp?
> 
> [1] https://sourceware.org/pipermail/libc-alpha/2022-January/135130.html
> [2] https://stackoverflow.com/questions/41418766/is-it-legal-to-pass-a-non-null-terminated-string-to-strncmp-in-c


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] string: Fix OOB read on generic strncmp
  2023-02-24 10:19     ` Szabolcs Nagy
@ 2023-02-24 10:58       ` Florian Weimer
  0 siblings, 0 replies; 10+ messages in thread
From: Florian Weimer @ 2023-02-24 10:58 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: Adhemerval Zanella Netto, H.J. Lu, Noah Goldstein, libc-alpha

* Szabolcs Nagy:

> The 02/23/2023 15:15, Adhemerval Zanella Netto wrote:
>> Noah has brought to my attention that he tried to add similar tests,
>> but they were rejected by strncmp string must be null-terminated [1].
>> 
>> The working drafts for C standard I have access (n1256.pdf for C99 and 
>> n3047.pdf for c2x) do not say possibly null-terminated array (as some
>> stackoverflow answer state [2]) they refer only as array. So I tend
>> to follow Florian understanding that strncmp inputs should be NULL
>> terminated.
>
> c11 draft is n1570.pdf
> https://open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf
>
> i dont understand what extension Florian is talking about.
> (i think that was about strcmp not strncmp)
>
> c11 and c23 are clear that strncmp args may *not* be null-terminated
> so i think we should be careful not to overread.

It's common to use strncmp as a starts-with predicate, like this:

  strncmp (s, "prefix", 5)

This requires that reading stops at the first null byte.  C11 wording
makes this kind of usage inadvisable because it treats the inputs as
arrays, and this means that mean that implementations could read past
the null byte.  But that doesn't match current programming practice.

The strnlen function has the same problem if you want to use it to limit
readhead, e.g. in sscanf to fix bug 17577.  (POSIX also speaks of an
array argument.)  In stdio-common/Xprintf_buffer_puts_1.c, we already
use it to avoid a similar performance glitch.  It's not the first such
uses, there is already a similar call (with similar rationale) in
string/strcasestr.c, and for wcsnlen in
stdio-common/vfprintf-process-arg.c.

I think we should support all these as extensions.  The alternative
would be to add new functions that stop reading after the first null
byte (particularly for the strnlen optimization).

Unfortunately, the last time I brought this up, the manual was updated
to double down on the array interpretation.  I think this was the
result:

commit 2cc4b9cce51643ec299e97450ccde4deeecfb083
Author: Paul Eggert <eggert@cs.ucla.edu>
Date:   Fri Dec 4 08:27:14 2015 -0800

    Consistency about byte vs character in string.texi

I think that's just not the right direction for glibc.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] string: Fix OOB read on generic strncmp
  2023-02-24 12:43 ` Szabolcs Nagy
@ 2023-02-24 12:57   ` Adhemerval Zanella Netto
  0 siblings, 0 replies; 10+ messages in thread
From: Adhemerval Zanella Netto @ 2023-02-24 12:57 UTC (permalink / raw)
  To: Szabolcs Nagy, Wilco Dijkstra; +Cc: Florian Weimer, 'GNU C Library'



On 24/02/23 09:43, Szabolcs Nagy wrote:
> The 02/24/2023 12:24, Wilco Dijkstra wrote:
>>> It's common to use strncmp as a starts-with predicate, like this:
>>>
>>>  strncmp (s, "prefix", 5)
>>>
>>> This requires that reading stops at the first null byte.  C11 wording
>>> makes this kind of usage inadvisable because it treats the inputs as
>>> arrays, and this means that mean that implementations could read past
>>> the null byte.  But that doesn't match current programming practice.
>>
>> C11 says you can't read past the end of the array, but it doesn't say that
>> you *must* read past a NUL-byte. My interpretation is that this is a valid
>> strncmp implementation:
>>
>> int
>> simple_strncmp (const char *s1, const char *s2, size_t size)
>> {
>>    size_t len1 = strnlen (s1, size);
>>    size_t len2 = strnlen (s2, size);
>>    if (len1 < len2)
>>      len1++;
>>    else if (len1 > len2)
>>      len1 = len2 + 1;
>>    return memcmp (s1, s2, len1);
>> }
>>
>> Ie. both strings must be valid up until the given size or NUL-terminated if
>> smaller. This works on the example above even if the first string is only 1 byte.
> 
> note that the crypt/badsalttest test case that started
> this discussion relies on strncmp stopping at the first
> mismatch byte, not just at null byte:
> 
> https://sourceware.org/git/?p=glibc.git;a=blob;f=crypt/badsalttest.c;h=bc1e5c1442c731b597551919bec9174ecd2cce61;hb=HEAD#l60
> https://sourceware.org/git/?p=glibc.git;a=blob;f=crypt/crypt-entry.c;h=c24c0110a66a0b6a8e885b5a3d0e942da46b1325;hb=HEAD#l84
> 
> i think it's reasonable to support stopping the read at
> null byte but allow reading past the first mismatch.
> (memcmp allows reading past mismatches too).
> 
> i.e. the strncmp patch and testcase patch is not needed.
> 
> then the question is if non-null-termnated input is valid
> as crypt salt argument (if not the test should be fixed,
> if yes, it should not use strncmp).

I think the problem is using a non-NULL terminated string as crypt
input, thus the test is not really valid.  In fact, is not even valid
for glibc for all strncmp implementations (some do fail and read past
first mismatch depending of input alignment).

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] string: Fix OOB read on generic strncmp
  2023-02-24 12:24 Wilco Dijkstra
  2023-02-24 12:34 ` Adhemerval Zanella Netto
@ 2023-02-24 12:43 ` Szabolcs Nagy
  2023-02-24 12:57   ` Adhemerval Zanella Netto
  1 sibling, 1 reply; 10+ messages in thread
From: Szabolcs Nagy @ 2023-02-24 12:43 UTC (permalink / raw)
  To: Wilco Dijkstra
  Cc: Florian Weimer, Adhemerval Zanella, 'GNU C Library'

The 02/24/2023 12:24, Wilco Dijkstra wrote:
> > It's common to use strncmp as a starts-with predicate, like this:
> >
> >  strncmp (s, "prefix", 5)
> >
> > This requires that reading stops at the first null byte.  C11 wording
> > makes this kind of usage inadvisable because it treats the inputs as
> > arrays, and this means that mean that implementations could read past
> > the null byte.  But that doesn't match current programming practice.
> 
> C11 says you can't read past the end of the array, but it doesn't say that
> you *must* read past a NUL-byte. My interpretation is that this is a valid
> strncmp implementation:
> 
> int
> simple_strncmp (const char *s1, const char *s2, size_t size)
> {
>    size_t len1 = strnlen (s1, size);
>    size_t len2 = strnlen (s2, size);
>    if (len1 < len2)
>      len1++;
>    else if (len1 > len2)
>      len1 = len2 + 1;
>    return memcmp (s1, s2, len1);
> }
> 
> Ie. both strings must be valid up until the given size or NUL-terminated if
> smaller. This works on the example above even if the first string is only 1 byte.

note that the crypt/badsalttest test case that started
this discussion relies on strncmp stopping at the first
mismatch byte, not just at null byte:

https://sourceware.org/git/?p=glibc.git;a=blob;f=crypt/badsalttest.c;h=bc1e5c1442c731b597551919bec9174ecd2cce61;hb=HEAD#l60
https://sourceware.org/git/?p=glibc.git;a=blob;f=crypt/crypt-entry.c;h=c24c0110a66a0b6a8e885b5a3d0e942da46b1325;hb=HEAD#l84

i think it's reasonable to support stopping the read at
null byte but allow reading past the first mismatch.
(memcmp allows reading past mismatches too).

i.e. the strncmp patch and testcase patch is not needed.

then the question is if non-null-termnated input is valid
as crypt salt argument (if not the test should be fixed,
if yes, it should not use strncmp).

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] string: Fix OOB read on generic strncmp
  2023-02-24 12:24 Wilco Dijkstra
@ 2023-02-24 12:34 ` Adhemerval Zanella Netto
  2023-02-24 12:43 ` Szabolcs Nagy
  1 sibling, 0 replies; 10+ messages in thread
From: Adhemerval Zanella Netto @ 2023-02-24 12:34 UTC (permalink / raw)
  To: Wilco Dijkstra, Florian Weimer, Szabolcs Nagy, H.J. Lu, Noah Goldstein
  Cc: 'GNU C Library'



On 24/02/23 09:24, Wilco Dijkstra wrote:
> Hi,
> 
>> It's common to use strncmp as a starts-with predicate, like this:
>>
>>   strncmp (s, "prefix", 5)
>>
>> This requires that reading stops at the first null byte.  C11 wording
>> makes this kind of usage inadvisable because it treats the inputs as
>> arrays, and this means that mean that implementations could read past
>> the null byte.  But that doesn't match current programming practice.
> 
> C11 says you can't read past the end of the array, but it doesn't say that
> you *must* read past a NUL-byte. My interpretation is that this is a valid
> strncmp implementation:
> 
> int
> simple_strncmp (const char *s1, const char *s2, size_t size)
> {
>    size_t len1 = strnlen (s1, size);
>    size_t len2 = strnlen (s2, size);
>    if (len1 < len2)
>      len1++;
>    else if (len1 > len2)
>      len1 = len2 + 1;
>    return memcmp (s1, s2, len1);
> }
> 
> Ie. both strings must be valid up until the given size or NUL-terminated if
> smaller. This works on the example above even if the first string is only 1 byte.

That was my understanding as well when I wrote the generic strncmp, and
multiple strncmp implementation also follows this same idea. 

So another option is to keep the generic strncmp as is and just remove the 
crypt/badsalttest.c, since it passing invalid inputs on crypt (non null 
terminated strings).

> 
>> The strnlen function has the same problem if you want to use it to limit
>> readhead, e.g. in sscanf to fix bug 17577.  (POSIX also speaks of an
>> array argument.)  In stdio-common/Xprintf_buffer_puts_1.c, we already
>> use it to avoid a similar performance glitch.  It's not the first such
>> uses, there is already a similar call (with similar rationale) in
>> string/strcasestr.c, and for wcsnlen in
>> stdio-common/vfprintf-process-arg.c.
>>
>> I think we should support all these as extensions.  The alternative
>> would be to add new functions that stop reading after the first null
>> byte (particularly for the strnlen optimization).
> 
> Existing functions already do stop after the first NUL byte. Even if the C
> standard doesn't explicitly disallow it, it doesn't seem valid to read beyond
> it (a lot of code could fail if we did).
> 
> Cheers,
> Wilco

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] string: Fix OOB read on generic strncmp
@ 2023-02-24 12:24 Wilco Dijkstra
  2023-02-24 12:34 ` Adhemerval Zanella Netto
  2023-02-24 12:43 ` Szabolcs Nagy
  0 siblings, 2 replies; 10+ messages in thread
From: Wilco Dijkstra @ 2023-02-24 12:24 UTC (permalink / raw)
  To: Florian Weimer, Adhemerval Zanella, Szabolcs Nagy; +Cc: 'GNU C Library'

Hi,

> It's common to use strncmp as a starts-with predicate, like this:
>
>  strncmp (s, "prefix", 5)
>
> This requires that reading stops at the first null byte.  C11 wording
> makes this kind of usage inadvisable because it treats the inputs as
> arrays, and this means that mean that implementations could read past
> the null byte.  But that doesn't match current programming practice.

C11 says you can't read past the end of the array, but it doesn't say that
you *must* read past a NUL-byte. My interpretation is that this is a valid
strncmp implementation:

int
simple_strncmp (const char *s1, const char *s2, size_t size)
{
   size_t len1 = strnlen (s1, size);
   size_t len2 = strnlen (s2, size);
   if (len1 < len2)
     len1++;
   else if (len1 > len2)
     len1 = len2 + 1;
   return memcmp (s1, s2, len1);
}

Ie. both strings must be valid up until the given size or NUL-terminated if
smaller. This works on the example above even if the first string is only 1 byte.

> The strnlen function has the same problem if you want to use it to limit
> readhead, e.g. in sscanf to fix bug 17577.  (POSIX also speaks of an
> array argument.)  In stdio-common/Xprintf_buffer_puts_1.c, we already
> use it to avoid a similar performance glitch.  It's not the first such
> uses, there is already a similar call (with similar rationale) in
> string/strcasestr.c, and for wcsnlen in
> stdio-common/vfprintf-process-arg.c.
>
> I think we should support all these as extensions.  The alternative
> would be to add new functions that stop reading after the first null
> byte (particularly for the strnlen optimization).

Existing functions already do stop after the first NUL byte. Even if the C
standard doesn't explicitly disallow it, it doesn't seem valid to read beyond
it (a lot of code could fail if we did).

Cheers,
Wilco

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] string: Fix OOB read on generic strncmp
@ 2023-02-23 19:10 Wilco Dijkstra
  0 siblings, 0 replies; 10+ messages in thread
From: Wilco Dijkstra @ 2023-02-23 19:10 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: 'GNU C Library'

Hi Adhemerval,

What kind of readahead are you trying to test/fix here? It's not clear from the
patch... So with strcmp you can safely call strlen on either input, and similarly I
believe you can call strnlen on both inputs of strncmp without getting a crash.
That means it is possible to read ahead as long as you never read past a zero
byte or past the given size.

If you couldn't read even a single byte past the first mismatch it would be hard
to create efficient implementations, especially with wide vectors...

Cheers,
Wilco

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-02-24 12:57 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-22 16:31 [PATCH v2] string: Fix OOB read on generic strncmp Adhemerval Zanella
2023-02-22 17:21 ` Szabolcs Nagy
2023-02-23 18:15   ` Adhemerval Zanella Netto
2023-02-24 10:19     ` Szabolcs Nagy
2023-02-24 10:58       ` Florian Weimer
2023-02-23 19:10 Wilco Dijkstra
2023-02-24 12:24 Wilco Dijkstra
2023-02-24 12:34 ` Adhemerval Zanella Netto
2023-02-24 12:43 ` Szabolcs Nagy
2023-02-24 12:57   ` Adhemerval Zanella Netto

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).