public inbox for libc-alpha@sourceware.org
* [PATCH v4 0/2] Improve wcsstr
@ 2024-03-19 13:15 Adhemerval Zanella
  2024-03-19 13:15 ` [PATCH v4 1/2] wcsmbs: Add test-wcsstr Adhemerval Zanella
  2024-03-19 13:15 ` [PATCH v4 2/2] wcsmbs: Ensure wcstr worst-case linear execution time (BZ 23865) Adhemerval Zanella
  0 siblings, 2 replies; 5+ messages in thread
From: Adhemerval Zanella @ 2024-03-19 13:15 UTC (permalink / raw)
  To: libc-alpha; +Cc: DJ Delorie

Unlike strstr, wcsstr still uses an O(m*n) algorithm, which might be
considered a security issue (although BZ 23865 was marked security-
since there is no actual application impact).
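
To make the O(m*n) concern concrete, here is a minimal sketch of a
brute-force wide-character search.  It is only an illustration and not
the code glibc shipped; naive_wcsstr and the cmp_count counter are names
made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical counter, for illustration only: number of character
   comparisons performed by the last call to naive_wcsstr.  */
static unsigned long long cmp_count;

/* Brute-force search: try every haystack position and compare the
   needle character by character.  Worst case is O(m*n) comparisons.  */
static const wchar_t *
naive_wcsstr (const wchar_t *haystack, const wchar_t *needle)
{
  assert (haystack != NULL && needle != NULL);
  cmp_count = 0;
  if (needle[0] == L'\0')
    return haystack;
  for (const wchar_t *h = haystack; *h != L'\0'; h++)
    {
      size_t i = 0;
      for (;;)
        {
          if (needle[i] == L'\0')
            return h;           /* Whole needle matched at H.  */
          cmp_count++;
          if (h[i] != needle[i])
            break;              /* Mismatch (or end of haystack).  */
          i++;
        }
    }
  return NULL;
}
```

Searching for L"aaab" in L"aaaaaaaa" fails only after 29 comparisons,
because almost every position matches the needle up to its last
character; the count grows with the product of both lengths.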

gnulib recently added a wrapper to fix it [1], using the str-two-way.h
implementation as its base.  This patch adds a similar implementation
but, unlike strstr, uses neither the "shift table" optimization nor the
self-adapting filtering check, because it would result in a shift table
that is too large (leaving them out also simplifies the implementation
a bit).  The patchset also adds proper tests for wcsstr, based on the
strstr ones.
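
As a rough illustration of why the shift table does not scale to wide
characters: the table in str-two-way.h has one size_t entry per possible
character value (1U << CHAR_BIT entries).  The sketch below computes the
equivalent size for a wider character type.  The helper names are
hypothetical, and the 4-byte width used in the note is an assumption that
holds for targets where wchar_t is 32 bits.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Number of entries in a table indexed directly by every possible value
   of a character type that is CHAR_SIZE bytes wide.  */
static unsigned long long
shift_table_entries (size_t char_size)
{
  unsigned int bits = (unsigned int) (char_size * CHAR_BIT);
  /* The shift below is only defined for widths narrower than the
     result type.  */
  assert (bits < sizeof (unsigned long long) * CHAR_BIT);
  return 1ULL << bits;
}

/* Size in bytes of such a table when each entry is a size_t.  */
static unsigned long long
shift_table_bytes (size_t char_size)
{
  return shift_table_entries (char_size) * sizeof (size_t);
}
```

With an 8-bit char and an 8-byte size_t, shift_table_bytes (sizeof (char))
is 2 KiB, while shift_table_bytes (4) is 32 GiB.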

With this fix, and with the removal of the powerpc strcasestr
optimization [2], it seems that only x86_64 still provides a
non-linear, O(m*n) implementation [3].  Noah already gave a +1, so it
would be good to have some confirmation that this implementation can
really show quadratic behaviour before proposing its removal.

[1] https://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commit;h=9411c5e467cf60f6295b9fed306029f341a0f24f
[2] https://sourceware.org/git/?p=glibc.git;a=commit;h=4a76fb1da8b7e7fa472741921f49ef32f81bc0a0
[3] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86_64/multiarch/strstr-avx512.c;h=3ac53accbdde0b400dfd19a2070fbb579aff4177;hb=4a76fb1da8b7e7fa472741921f49ef32f81bc0a0


--
Changes from v3:
* Fixed check-abi regression.

Changes from v2:
* Remove the test repetition.

Changes from v1:
* Add more tests from gnulib.
* Removed unused macros from wcsstr.
--

Adhemerval Zanella (2):
  wcsmbs: Add test-wcsstr
  wcsmbs: Ensure wcstr worst-case linear execution time (BZ 23865)

 string/test-strstr.c | 316 +++++++++++++++++++++++++++++++++++--------
 wcsmbs/Makefile      |   1 +
 wcsmbs/test-wcsstr.c |  20 +++
 wcsmbs/wcs-two-way.h | 312 ++++++++++++++++++++++++++++++++++++++++++
 wcsmbs/wcsstr.c      | 101 ++++----------
 5 files changed, 624 insertions(+), 126 deletions(-)
 create mode 100644 wcsmbs/test-wcsstr.c
 create mode 100644 wcsmbs/wcs-two-way.h

-- 
2.34.1


* [PATCH v4 0/2] Improve wcsstr
@ 2024-03-19 15:07 Wilco Dijkstra
  0 siblings, 0 replies; 5+ messages in thread
From: Wilco Dijkstra @ 2024-03-19 15:07 UTC (permalink / raw)
  To: Adhemerval Zanella, Noah Goldstein; +Cc: 'GNU C Library'

Hi Adhemerval,

> With this fix, and with the removal of the powerpc strcasestr
> optimization [2], it seems that only x86_64 still provides a
> non-linear, O(m*n) implementation [3].  Noah already gave a +1, so it
> would be good to have some confirmation that this implementation can
> really show quadratic behaviour before proposing its removal.

Yes, it is a simple brute-force algorithm that checks the whole needle
at each matching character pair (and does so 1 byte at a time after the
first 64 bytes of the needle).  It also never skips ahead, so it can end
up trying to match the whole needle at every haystack position.
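
An input with this property can be sketched as follows.  This is an
assumed construction, not necessarily the one used by the benchmark
below: the haystack is all 'a' and the needle is all 'a' except for a
final 'b', so every position looks like a match until the last
character.  The helper name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fill HAYSTACK with N copies of 'a' and NEEDLE with
   M - 1 copies of 'a' followed by a single 'b', both NUL-terminated.
   HAYSTACK must have room for N + 1 bytes and NEEDLE for M + 1 bytes.  */
static void
make_bruteforce_worst_case (char *haystack, size_t n, char *needle, size_t m)
{
  assert (m >= 1 && n >= m);
  memset (haystack, 'a', n);
  haystack[n] = '\0';
  memset (needle, 'a', m - 1);
  needle[m - 1] = 'b';
  needle[m] = '\0';
}
```

The search always fails, and a brute-force search does work proportional
to the needle length at nearly every haystack position.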

I added a quick test for this (every different implementation requires
a unique test for its worst-case), and I got:

  "ifuncs": ["basic_strstr", "twoway_strstr", "__strstr_avx512", "__strstr_sse2_unaligned", "__strstr_generic"],

    {
     "len_haystack": 65536,
     "len_needle": 1024,
     "align_haystack": 0,
     "align_needle": 0,
     "fail": 1,
     "desc": "Difficult bruteforce needle",
     "timings": [4.0948e+07, 15094.5, 3.20818e+07, 108558, 10839.2]
    },
    {
     "len_haystack": 1048576,
     "len_needle": 4096,
     "align_haystack": 0,
     "align_needle": 0,
     "fail": 1,
     "desc": "Difficult bruteforce needle",
     "timings": [2.69767e+09, 100797, 2.08535e+09, 495706, 82666.9]
    }

So a 4x larger needle and 16x larger haystack gives a clear 65x slowdown on
both basic_strstr and __strstr_avx512.
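
For reference, the expected scaling can be checked with a small sketch.
The function names are made up for this illustration; the inputs are the
lengths and timings quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Slowdown an O(m*n) search would show when the needle grows from M1 to
   M2 characters and the haystack from N1 to N2 characters.  */
static double
predicted_quadratic_slowdown (size_t m1, size_t n1, size_t m2, size_t n2)
{
  assert (m1 > 0 && n1 > 0);
  return ((double) m2 * (double) n2) / ((double) m1 * (double) n1);
}

/* Measured slowdown between two timings of the same implementation.  */
static double
measured_slowdown (double small_timing, double large_timing)
{
  assert (small_timing > 0.0);
  return large_timing / small_timing;
}
```

predicted_quadratic_slowdown (1024, 65536, 4096, 1048576) is 64, and
measured_slowdown gives about 65.9 for basic_strstr and 65.0 for
__strstr_avx512, against about 6.7 for twoway_strstr.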

Cheers,
Wilco


end of thread, other threads:[~2024-04-10 23:20 UTC | newest]

Thread overview: 5+ messages
2024-03-19 13:15 [PATCH v4 0/2] Improve wcsstr Adhemerval Zanella
2024-03-19 13:15 ` [PATCH v4 1/2] wcsmbs: Add test-wcsstr Adhemerval Zanella
2024-03-19 13:15 ` [PATCH v4 2/2] wcsmbs: Ensure wcstr worst-case linear execution time (BZ 23865) Adhemerval Zanella
2024-04-10 23:20   ` DJ Delorie
2024-03-19 15:07 [PATCH v4 0/2] Improve wcsstr Wilco Dijkstra
