public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] Fix strstr bug with huge needles (bug 23637)
@ 2018-09-12 13:02 Wilco Dijkstra
  2018-09-12 13:22 ` H.J. Lu
  2018-09-12 19:31 ` Paul Pluzhnikov
  0 siblings, 2 replies; 9+ messages in thread
From: Wilco Dijkstra @ 2018-09-12 13:02 UTC (permalink / raw)
  To: libc-alpha; +Cc: nd

As reported in https://sourceware.org/ml/libc-help/2018-09/msg00000.html , 
the generic strstr in GLIBC 2.28 fails to match huge needles.  The optimized
AVAILABLE macro reads ahead a large fixed amount to reduce the overhead of
repeatedly checking for the end of the string.  However if the needle length
is larger than this, two_way_long_needle may confuse this as meaning the end
of the string and return NULL.  This is fixed by adding the needle length to
the amount to read ahead.

Passes GLIBC regress on AArch64, new testcase now passes.

ChangeLog:
2018-06-30  Wilco Dijkstra  <wdijkstr@arm.com>

	[BZ #21286]
	* string/Makefile: Add bug-strstr1.
	* string/bug-strstr1.c: New test.
	* string/strcasestr.c (AVAILABLE): Fix readahead distance.
	* string/strstr.c (AVAILABLE): Likewise.
---

diff --git a/string/Makefile b/string/Makefile
index 680431f92185f9143710809b4105e12daa2daaa5..47f901784dcce397fc5f9887fd7e55b618c6ea80 100644
--- a/string/Makefile
+++ b/string/Makefile
@@ -59,7 +59,7 @@ tests		:= tester inl-tester noinl-tester testcopy test-ffs	\
 		   bug-envz1 tst-strxfrm2 tst-endian tst-svc2		\
 		   tst-strtok_r bug-strcoll2 tst-cmp tst-xbzero-opt	\
 		   test-endian-types test-endian-file-scope		\
-		   test-endian-sign-conversion
+		   test-endian-sign-conversion bug-strstr1
 
 # This test allocates a lot of memory and can run for a long time.
 xtests = tst-strcoll-overflow
diff --git a/string/bug-strstr1.c b/string/bug-strstr1.c
new file mode 100644
index 0000000000000000000000000000000000000000..02d02f9298fa87154ec1be07a2a318f4a5f72a05
--- /dev/null
+++ b/string/bug-strstr1.c
@@ -0,0 +1,32 @@
+/* See BZ #23637.  */
+#include <string.h>
+#include <stdio.h>
+
+#define N 1024
+
+int
+do_test (void)
+{
+  puts ("test strstr huge needle");
+  char h[N * 2 + 1];
+  char n[N + 1];
+  char *s;
+  int i;
+
+  for (i = 0; i < N; i++)
+    {
+      n[i] = 'x';
+      h[i] = ' ';
+      h[i + N] = 'x';
+    }
+  n[N] = '\0';
+  h[N * 2] = '\0';
+
+  /* Ensure we don't match at the first 'x'.  */
+  h[0] = 'x';
+
+  s = strstr (h, n);
+  return s == NULL || strcmp (s, n) != 0;
+}
+
+#include <support/test-driver.c>
diff --git a/string/strcasestr.c b/string/strcasestr.c
index 1f6b7b846f81d0d8fe77b390062d2d86be11b056..8aa76037dcc052f34ee4a8594e017ec822ce892b 100644
--- a/string/strcasestr.c
+++ b/string/strcasestr.c
@@ -37,8 +37,9 @@
 /* Two-Way algorithm.  */
 #define RETURN_TYPE char *
 #define AVAILABLE(h, h_l, j, n_l)			\
-  (((j) + (n_l) <= (h_l)) || ((h_l) += __strnlen ((void*)((h) + (h_l)), 512), \
-			      (j) + (n_l) <= (h_l)))
+  (((j) + (n_l) <= (h_l)) \
+   || ((h_l) += __strnlen ((void*)((h) + (h_l)), (n_l) + 512), \
+       (j) + (n_l) <= (h_l)))
 #define CHECK_EOL (1)
 #define RET0_IF_0(a) if (!a) goto ret0
 #define CANON_ELEMENT(c) TOLOWER (c)
diff --git a/string/strstr.c b/string/strstr.c
index 33acdc5442d3e9031298a56e4f52a19e2ffa7d84..f74d7189ed1319f6225525cc2d32380745de1523 100644
--- a/string/strstr.c
+++ b/string/strstr.c
@@ -33,8 +33,9 @@
 
 #define RETURN_TYPE char *
 #define AVAILABLE(h, h_l, j, n_l)			\
-  (((j) + (n_l) <= (h_l)) || ((h_l) += __strnlen ((void*)((h) + (h_l)), 512), \
-			      (j) + (n_l) <= (h_l)))
+  (((j) + (n_l) <= (h_l)) \
+   || ((h_l) += __strnlen ((void*)((h) + (h_l)), (n_l) + 512), \
+       (j) + (n_l) <= (h_l)))
 #define CHECK_EOL (1)
 #define RET0_IF_0(a) if (!a) goto ret0
 #define FASTSEARCH(S,C,N) (void*) strchr ((void*)(S), (C))

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Fix strstr bug with huge needles (bug 23637)
  2018-09-12 13:02 [PATCH] Fix strstr bug with huge needles (bug 23637) Wilco Dijkstra
@ 2018-09-12 13:22 ` H.J. Lu
  2018-09-12 19:31 ` Paul Pluzhnikov
  1 sibling, 0 replies; 9+ messages in thread
From: H.J. Lu @ 2018-09-12 13:22 UTC (permalink / raw)
  To: Wilco Dijkstra; +Cc: libc-alpha, nd

On Wed, Sep 12, 2018 at 6:01 AM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> As reported in https://sourceware.org/ml/libc-help/2018-09/msg00000.html ,
> the generic strstr in GLIBC 2.28 fails to match huge needles.  The optimized
> AVAILABLE macro reads ahead a large fixed amount to reduce the overhead of
> repeatedly checking for the end of the string.  However if the needle length
> is larger than this, two_way_long_needle may confuse this as meaning the end
> of the string and return NULL.  This is fixed by adding the needle length to
> the amount to read ahead.
>
> Passes GLIBC regress on AArch64, new testcase now passes.
>
> ChangeLog:
> 2018-06-30  Wilco Dijkstra  <wdijkstr@arm.com>
>
>         [BZ #21286]
>         * string/Makefile: Add bug-strstr1.
>         * string/bug-strstr1.c: New test.

Please add the new test to test-strstr.c so that all implementations
will be tested.


-- 
H.J.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Fix strstr bug with huge needles (bug 23637)
  2018-09-12 13:02 [PATCH] Fix strstr bug with huge needles (bug 23637) Wilco Dijkstra
  2018-09-12 13:22 ` H.J. Lu
@ 2018-09-12 19:31 ` Paul Pluzhnikov
  2018-09-13 16:56   ` [PATCH v2] " Wilco Dijkstra
  1 sibling, 1 reply; 9+ messages in thread
From: Paul Pluzhnikov @ 2018-09-12 19:31 UTC (permalink / raw)
  To: Wilco.Dijkstra; +Cc: GLIBC Devel, nd

On Wed, Sep 12, 2018 at 6:02 AM Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:

> ChangeLog:
> 2018-06-30  Wilco Dijkstra  <wdijkstr@arm.com>
>
>         [BZ #21286]

Wrong bugzilla reference. Should be [BZ #23637].

-- 
Paul Pluzhnikov

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2] Fix strstr bug with huge needles (bug 23637)
  2018-09-12 19:31 ` Paul Pluzhnikov
@ 2018-09-13 16:56   ` Wilco Dijkstra
  2018-09-17 10:37     ` Wilco Dijkstra
  0 siblings, 1 reply; 9+ messages in thread
From: Wilco Dijkstra @ 2018-09-13 16:56 UTC (permalink / raw)
  To: GLIBC Devel; +Cc: nd

As reported in https://sourceware.org/ml/libc-help/2018-09/msg00000.html , 
the generic strstr in GLIBC 2.28 fails to match huge needles.  The optimized
AVAILABLE macro reads ahead a large fixed amount to reduce the overhead of
repeatedly checking for the end of the string.  However if the needle length
is larger than this, two_way_long_needle may confuse this as meaning the end
of the string and return NULL.  This is fixed by adding the needle length to
the amount to read ahead.

Passes GLIBC regress on AArch64/x64, new tests pass.

ChangeLog:
2018-08-13  Wilco Dijkstra  <wdijkstr@arm.com>

	[BZ #23637]
	* string/test-strstr.c (pr23637): New function.
	(test_main): Add tests with longer needles.
	* string/strcasestr.c (AVAILABLE): Fix readahead distance.
	* string/strstr.c (AVAILABLE): Likewise.

--
diff --git a/string/strcasestr.c b/string/strcasestr.c
index 1f6b7b846f81d0d8fe77b390062d2d86be11b056..8aa76037dcc052f34ee4a8594e017ec822ce892b 100644
--- a/string/strcasestr.c
+++ b/string/strcasestr.c
@@ -37,8 +37,9 @@
 /* Two-Way algorithm.  */
 #define RETURN_TYPE char *
 #define AVAILABLE(h, h_l, j, n_l)			\
-  (((j) + (n_l) <= (h_l)) || ((h_l) += __strnlen ((void*)((h) + (h_l)), 512), \
-			      (j) + (n_l) <= (h_l)))
+  (((j) + (n_l) <= (h_l)) \
+   || ((h_l) += __strnlen ((void*)((h) + (h_l)), (n_l) + 512), \
+       (j) + (n_l) <= (h_l)))
 #define CHECK_EOL (1)
 #define RET0_IF_0(a) if (!a) goto ret0
 #define CANON_ELEMENT(c) TOLOWER (c)
diff --git a/string/strstr.c b/string/strstr.c
index 33acdc5442d3e9031298a56e4f52a19e2ffa7d84..f74d7189ed1319f6225525cc2d32380745de1523 100644
--- a/string/strstr.c
+++ b/string/strstr.c
@@ -33,8 +33,9 @@
 
 #define RETURN_TYPE char *
 #define AVAILABLE(h, h_l, j, n_l)			\
-  (((j) + (n_l) <= (h_l)) || ((h_l) += __strnlen ((void*)((h) + (h_l)), 512), \
-			      (j) + (n_l) <= (h_l)))
+  (((j) + (n_l) <= (h_l)) \
+   || ((h_l) += __strnlen ((void*)((h) + (h_l)), (n_l) + 512), \
+       (j) + (n_l) <= (h_l)))
 #define CHECK_EOL (1)
 #define RET0_IF_0(a) if (!a) goto ret0
 #define FASTSEARCH(S,C,N) (void*) strchr ((void*)(S), (C))
diff --git a/string/test-strstr.c b/string/test-strstr.c
index 8d99716ff39cc2c22ebebdacb5f30438f9b32ffc..5861b01b73e4c315c471d68300f2ba25e3533ac1 100644
--- a/string/test-strstr.c
+++ b/string/test-strstr.c
@@ -151,6 +151,32 @@ check2 (void)
     }
 }
 
+#define N 1024
+
+static void
+pr23637 (void)
+{
+  char *h = (char*) buf1;
+  char *n = (char*) buf2;
+
+  for (int i = 0; i < N; i++)
+    {
+      n[i] = 'x';
+      h[i] = ' ';
+      h[i + N] = 'x';
+    }
+
+  n[N] = '\0';
+  h[N * 2] = '\0';
+
+  /* Ensure we don't match at the first 'x'.  */
+  h[0] = 'x';
+
+  char *exp_result = stupid_strstr (h, n);
+  FOR_EACH_IMPL (impl, 0)
+    check_result (impl, h, n, exp_result);
+}
+
 static int
 test_main (void)
 {
@@ -158,6 +184,7 @@ test_main (void)
 
   check1 ();
   check2 ();
+  pr23637 ();
 
   printf ("%23s", "");
   FOR_EACH_IMPL (impl, 0)
@@ -202,6 +229,9 @@ test_main (void)
 	do_test (15, 9, hlen, klen, 1);
 	do_test (15, 15, hlen, klen, 0);
 	do_test (15, 15, hlen, klen, 1);
+
+	do_test (15, 15, hlen + klen * 4, klen * 4, 0);
+	do_test (15, 15, hlen + klen * 4, klen * 4, 1);
       }
 
   do_test (0, 0, page_size - 1, 16, 0);

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] Fix strstr bug with huge needles (bug 23637)
  2018-09-13 16:56   ` [PATCH v2] " Wilco Dijkstra
@ 2018-09-17 10:37     ` Wilco Dijkstra
  2018-09-19 15:32       ` Szabolcs Nagy
  0 siblings, 1 reply; 9+ messages in thread
From: Wilco Dijkstra @ 2018-09-17 10:37 UTC (permalink / raw)
  To: GLIBC Devel; +Cc: nd


ping
  

As reported in  https://sourceware.org/ml/libc-help/2018-09/msg00000.html , 
the generic strstr in GLIBC 2.28 fails to match huge needles.  The optimized
AVAILABLE macro reads ahead a large fixed amount to reduce the overhead of
repeatedly checking for the end of the string.  However if the needle length
is larger than this, two_way_long_needle may confuse this as meaning the end
of the string and return NULL.  This is fixed by adding the needle length to
the amount to read ahead.

Passes GLIBC regress on AArch64/x64, new tests pass.

ChangeLog:
2018-08-13  Wilco Dijkstra  <wdijkstr@arm.com>

        [BZ #23637]
        * string/test-strstr.c (pr23637): New function.
        (test_main): Add tests with longer needles.
        * string/strcasestr.c (AVAILABLE): Fix readahead distance.
        * string/strstr.c (AVAILABLE): Likewise.

--
diff --git a/string/strcasestr.c b/string/strcasestr.c
index 1f6b7b846f81d0d8fe77b390062d2d86be11b056..8aa76037dcc052f34ee4a8594e017ec822ce892b 100644
--- a/string/strcasestr.c
+++ b/string/strcasestr.c
@@ -37,8 +37,9 @@
 /* Two-Way algorithm.  */
 #define RETURN_TYPE char *
 #define AVAILABLE(h, h_l, j, n_l)                       \
-  (((j) + (n_l) <= (h_l)) || ((h_l) += __strnlen ((void*)((h) + (h_l)), 512), \
-                             (j) + (n_l) <= (h_l)))
+  (((j) + (n_l) <= (h_l)) \
+   || ((h_l) += __strnlen ((void*)((h) + (h_l)), (n_l) + 512), \
+       (j) + (n_l) <= (h_l)))
 #define CHECK_EOL (1)
 #define RET0_IF_0(a) if (!a) goto ret0
 #define CANON_ELEMENT(c) TOLOWER (c)
diff --git a/string/strstr.c b/string/strstr.c
index 33acdc5442d3e9031298a56e4f52a19e2ffa7d84..f74d7189ed1319f6225525cc2d32380745de1523 100644
--- a/string/strstr.c
+++ b/string/strstr.c
@@ -33,8 +33,9 @@
 
 #define RETURN_TYPE char *
 #define AVAILABLE(h, h_l, j, n_l)                       \
-  (((j) + (n_l) <= (h_l)) || ((h_l) += __strnlen ((void*)((h) + (h_l)), 512), \
-                             (j) + (n_l) <= (h_l)))
+  (((j) + (n_l) <= (h_l)) \
+   || ((h_l) += __strnlen ((void*)((h) + (h_l)), (n_l) + 512), \
+       (j) + (n_l) <= (h_l)))
 #define CHECK_EOL (1)
 #define RET0_IF_0(a) if (!a) goto ret0
 #define FASTSEARCH(S,C,N) (void*) strchr ((void*)(S), (C))
diff --git a/string/test-strstr.c b/string/test-strstr.c
index 8d99716ff39cc2c22ebebdacb5f30438f9b32ffc..5861b01b73e4c315c471d68300f2ba25e3533ac1 100644
--- a/string/test-strstr.c
+++ b/string/test-strstr.c
@@ -151,6 +151,32 @@ check2 (void)
     }
 }
 
+#define N 1024
+
+static void
+pr23637 (void)
+{
+  char *h = (char*) buf1;
+  char *n = (char*) buf2;
+
+  for (int i = 0; i < N; i++)
+    {
+      n[i] = 'x';
+      h[i] = ' ';
+      h[i + N] = 'x';
+    }
+
+  n[N] = '\0';
+  h[N * 2] = '\0';
+
+  /* Ensure we don't match at the first 'x'.  */
+  h[0] = 'x';
+
+  char *exp_result = stupid_strstr (h, n);
+  FOR_EACH_IMPL (impl, 0)
+    check_result (impl, h, n, exp_result);
+}
+
 static int
 test_main (void)
 {
@@ -158,6 +184,7 @@ test_main (void)
 
   check1 ();
   check2 ();
+  pr23637 ();
 
   printf ("%23s", "");
   FOR_EACH_IMPL (impl, 0)
@@ -202,6 +229,9 @@ test_main (void)
         do_test (15, 9, hlen, klen, 1);
         do_test (15, 15, hlen, klen, 0);
         do_test (15, 15, hlen, klen, 1);
+
+       do_test (15, 15, hlen + klen * 4, klen * 4, 0);
+       do_test (15, 15, hlen + klen * 4, klen * 4, 1);
       }
 
   do_test (0, 0, page_size - 1, 16, 0);
    

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] Fix strstr bug with huge needles (bug 23637)
  2018-09-17 10:37     ` Wilco Dijkstra
@ 2018-09-19 15:32       ` Szabolcs Nagy
  2018-09-19 15:59         ` Adhemerval Zanella
  0 siblings, 1 reply; 9+ messages in thread
From: Szabolcs Nagy @ 2018-09-19 15:32 UTC (permalink / raw)
  To: Wilco Dijkstra, GLIBC Devel; +Cc: nd

On 17/09/18 11:37, Wilco Dijkstra wrote:
> As reported in  https://sourceware.org/ml/libc-help/2018-09/msg00000.html ,
> the generic strstr in GLIBC 2.28 fails to match huge needles.  The optimized
> AVAILABLE macro reads ahead a large fixed amount to reduce the overhead of
> repeatedly checking for the end of the string.  However if the needle length
> is larger than this, two_way_long_needle may confuse this as meaning the end
> of the string and return NULL.  This is fixed by adding the needle length to
> the amount to read ahead.

i think this should be fixed and backported asap as it can
easily cause user visible misbehaviour.

> diff --git a/string/strstr.c b/string/strstr.c
> index 33acdc5442d3e9031298a56e4f52a19e2ffa7d84..f74d7189ed1319f6225525cc2d32380745de1523 100644
> --- a/string/strstr.c
> +++ b/string/strstr.c
> @@ -33,8 +33,9 @@
>   
>   #define RETURN_TYPE char *
>   #define AVAILABLE(h, h_l, j, n_l)                       \
> -  (((j) + (n_l) <= (h_l)) || ((h_l) += __strnlen ((void*)((h) + (h_l)), 512), \
> -                             (j) + (n_l) <= (h_l)))
> +  (((j) + (n_l) <= (h_l)) \
> +   || ((h_l) += __strnlen ((void*)((h) + (h_l)), (n_l) + 512), \
> +       (j) + (n_l) <= (h_l)))

so the only diff is the +n_l in the strnlen limit.
(same in strcasestr).

i think we can safely assume that n_l + 512 does not
overflow. (user object cannot be so close to SIZE_MAX,
but may be glibc should document its assumptions about
the maximum object size somewhere).

the fix looks ok to me.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] Fix strstr bug with huge needles (bug 23637)
  2018-09-19 15:32       ` Szabolcs Nagy
@ 2018-09-19 15:59         ` Adhemerval Zanella
  2018-09-19 16:25           ` Szabolcs Nagy
  0 siblings, 1 reply; 9+ messages in thread
From: Adhemerval Zanella @ 2018-09-19 15:59 UTC (permalink / raw)
  To: libc-alpha



On 19/09/2018 08:31, Szabolcs Nagy wrote:
> On 17/09/18 11:37, Wilco Dijkstra wrote:
>> As reported in  https://sourceware.org/ml/libc-help/2018-09/msg00000.html ,
>> the generic strstr in GLIBC 2.28 fails to match huge needles.  The optimized
>> AVAILABLE macro reads ahead a large fixed amount to reduce the overhead of
>> repeatedly checking for the end of the string.  However if the needle length
>> is larger than this, two_way_long_needle may confuse this as meaning the end
>> of the string and return NULL.  This is fixed by adding the needle length to
>> the amount to read ahead.
> 
> i think this should be fixed and backported asap as it can
> easily cause user visible misbehaviour.
> 
>> diff --git a/string/strstr.c b/string/strstr.c
>> index 33acdc5442d3e9031298a56e4f52a19e2ffa7d84..f74d7189ed1319f6225525cc2d32380745de1523 100644
>> --- a/string/strstr.c
>> +++ b/string/strstr.c
>> @@ -33,8 +33,9 @@
>>     #define RETURN_TYPE char *
>>   #define AVAILABLE(h, h_l, j, n_l)                       \
>> -  (((j) + (n_l) <= (h_l)) || ((h_l) += __strnlen ((void*)((h) + (h_l)), 512), \
>> -                             (j) + (n_l) <= (h_l)))
>> +  (((j) + (n_l) <= (h_l)) \
>> +   || ((h_l) += __strnlen ((void*)((h) + (h_l)), (n_l) + 512), \
>> +       (j) + (n_l) <= (h_l)))
> 
> so the only diff is the +n_l in the strnlen limit.
> (same in strcasestr).
> 
> i think we can safely assume that n_l + 512 does not
> overflow. (user object cannot be so close to SIZE_MAX,
> but may be glibc should document its assumptions about
> the maximum object size somewhere).
> 
> the fix looks ok to me.

Good catch, I would prefer to use a saturated math here instead and not
make such assumptions (even though it would incur in a small performance
hit for a usercase unlikely to happen).

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] Fix strstr bug with huge needles (bug 23637)
  2018-09-19 15:59         ` Adhemerval Zanella
@ 2018-09-19 16:25           ` Szabolcs Nagy
  2018-09-19 22:07             ` Adhemerval Zanella
  0 siblings, 1 reply; 9+ messages in thread
From: Szabolcs Nagy @ 2018-09-19 16:25 UTC (permalink / raw)
  To: Adhemerval Zanella, libc-alpha; +Cc: nd

On 19/09/18 16:59, Adhemerval Zanella wrote:
> On 19/09/2018 08:31, Szabolcs Nagy wrote:
>> On 17/09/18 11:37, Wilco Dijkstra wrote:
>>> As reported in  https://sourceware.org/ml/libc-help/2018-09/msg00000.html ,
>>> the generic strstr in GLIBC 2.28 fails to match huge needles.  The optimized
>>> AVAILABLE macro reads ahead a large fixed amount to reduce the overhead of
>>> repeatedly checking for the end of the string.  However if the needle length
>>> is larger than this, two_way_long_needle may confuse this as meaning the end
>>> of the string and return NULL.  This is fixed by adding the needle length to
>>> the amount to read ahead.
>>
>> i think this should be fixed and backported asap as it can
>> easily cause user visible misbehaviour.
>>
>>> diff --git a/string/strstr.c b/string/strstr.c
>>> index 33acdc5442d3e9031298a56e4f52a19e2ffa7d84..f74d7189ed1319f6225525cc2d32380745de1523 100644
>>> --- a/string/strstr.c
>>> +++ b/string/strstr.c
>>> @@ -33,8 +33,9 @@
>>>      #define RETURN_TYPE char *
>>>    #define AVAILABLE(h, h_l, j, n_l)                       \
>>> -  (((j) + (n_l) <= (h_l)) || ((h_l) += __strnlen ((void*)((h) + (h_l)), 512), \
>>> -                             (j) + (n_l) <= (h_l)))
>>> +  (((j) + (n_l) <= (h_l)) \
>>> +   || ((h_l) += __strnlen ((void*)((h) + (h_l)), (n_l) + 512), \
>>> +       (j) + (n_l) <= (h_l)))
>>
>> so the only diff is the +n_l in the strnlen limit.
>> (same in strcasestr).
>>
>> i think we can safely assume that n_l + 512 does not
>> overflow. (user object cannot be so close to SIZE_MAX,
>> but may be glibc should document its assumptions about
>> the maximum object size somewhere).
>>
>> the fix looks ok to me.
> 
> Good catch, I would prefer to use a saturated math here instead and not
> make such assumptions (even though it would incur in a small performance
> hit for a usercase unlikely to happen).
> 

note that even if all of the address space is available to
userspace, it has to contain glibc .text and .data as well
as a stack and the needle cannot cover all those.

there is existing n_l + 256 elsewhere in the code as well.

and needle and haystack have to overlap for needle to be that big
and it's not clear if in that case this code path can be hit.

if you still think it should be fixed then n_l | 512 works too,
but i think this is not worth changing, i mentioned the
problem because it's easier to review such code if glibc
has an explicit statement that size+1M < SIZE_MAX or similar.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] Fix strstr bug with huge needles (bug 23637)
  2018-09-19 16:25           ` Szabolcs Nagy
@ 2018-09-19 22:07             ` Adhemerval Zanella
  0 siblings, 0 replies; 9+ messages in thread
From: Adhemerval Zanella @ 2018-09-19 22:07 UTC (permalink / raw)
  To: Szabolcs Nagy, libc-alpha; +Cc: nd



On 19/09/2018 09:24, Szabolcs Nagy wrote:
> On 19/09/18 16:59, Adhemerval Zanella wrote:
>> On 19/09/2018 08:31, Szabolcs Nagy wrote:
>>> On 17/09/18 11:37, Wilco Dijkstra wrote:
>>>> As reported in  https://sourceware.org/ml/libc-help/2018-09/msg00000.html ,
>>>> the generic strstr in GLIBC 2.28 fails to match huge needles.  The optimized
>>>> AVAILABLE macro reads ahead a large fixed amount to reduce the overhead of
>>>> repeatedly checking for the end of the string.  However if the needle length
>>>> is larger than this, two_way_long_needle may confuse this as meaning the end
>>>> of the string and return NULL.  This is fixed by adding the needle length to
>>>> the amount to read ahead.
>>>
>>> i think this should be fixed and backported asap as it can
>>> easily cause user visible misbehaviour.
>>>
>>>> diff --git a/string/strstr.c b/string/strstr.c
>>>> index 33acdc5442d3e9031298a56e4f52a19e2ffa7d84..f74d7189ed1319f6225525cc2d32380745de1523 100644
>>>> --- a/string/strstr.c
>>>> +++ b/string/strstr.c
>>>> @@ -33,8 +33,9 @@
>>>>      #define RETURN_TYPE char *
>>>>    #define AVAILABLE(h, h_l, j, n_l)                       \
>>>> -  (((j) + (n_l) <= (h_l)) || ((h_l) += __strnlen ((void*)((h) + (h_l)), 512), \
>>>> -                             (j) + (n_l) <= (h_l)))
>>>> +  (((j) + (n_l) <= (h_l)) \
>>>> +   || ((h_l) += __strnlen ((void*)((h) + (h_l)), (n_l) + 512), \
>>>> +       (j) + (n_l) <= (h_l)))
>>>
>>> so the only diff is the +n_l in the strnlen limit.
>>> (same in strcasestr).
>>>
>>> i think we can safely assume that n_l + 512 does not
>>> overflow. (user object cannot be so close to SIZE_MAX,
>>> but may be glibc should document its assumptions about
>>> the maximum object size somewhere).
>>>
>>> the fix looks ok to me.
>>
>> Good catch, I would prefer to use a saturated math here instead and not
>> make such assumptions (even though it would incur in a small performance
>> hit for a usercase unlikely to happen)
> 
> note that even if all of the address space is available to
> userspace, it has to contain glibc .text and .data as well
> as a stack and the needle cannot cover all those.
> 
> there is existing n_l + 256 elsewhere in the code as well.
> 
> and needle and haystack have to overlap for needle to be that big
> and it's not clear if in that case this code path can be hit.
> 
> if you still think it should be fixed then n_l | 512 works too,
> but i think this is not worth changing, i mentioned the
> problem because it's easier to review such code if glibc
> has an explicit statement that size+1M < SIZE_MAX or similar.
> 

I agree that a possible overflow issue won't happen and I don't
have a strong opinion whether indeed changing the code to handle
would yield any gain. Your rationale is good enough to keep the
code as is.

As a side note, I think we should backport it to 2.28.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2018-09-19 22:07 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-12 13:02 [PATCH] Fix strstr bug with huge needles (bug 23637) Wilco Dijkstra
2018-09-12 13:22 ` H.J. Lu
2018-09-12 19:31 ` Paul Pluzhnikov
2018-09-13 16:56   ` [PATCH v2] " Wilco Dijkstra
2018-09-17 10:37     ` Wilco Dijkstra
2018-09-19 15:32       ` Szabolcs Nagy
2018-09-19 15:59         ` Adhemerval Zanella
2018-09-19 16:25           ` Szabolcs Nagy
2018-09-19 22:07             ` Adhemerval Zanella

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).