public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* Fastest String Search Algorithm.
@ 2021-06-09 10:31 Wilco Dijkstra
  2021-06-09 11:42 ` Amit Choudhary
  0 siblings, 1 reply; 45+ messages in thread
From: Wilco Dijkstra @ 2021-06-09 10:31 UTC (permalink / raw)
  To: Amit Choudhary; +Cc: 'GNU C Library'

Hi Amit,

> If my algorithm is beating strstr() for long search patterns then I think
> that my algorithm should be included in glibc for long search patterns.

I tried it in bench-strstr.c, strstr is often 10 times faster on many inputs.
I only noticed a case with a 32KB needle and 64KB text where it beats strstr by
a factor 3. On complex needles it behaves like basic_strstr (ie. quadratic time).

The existing code that handles large needles isn't highly optimized, so it is
certainly possible to improve its performance. Large needles are rare - a size
like 4KB will never be used in practice. A key issue with large needles is quadratic
performance, and any new algorithm will need to address that somehow.

So it's not as simple as showing your algorithm is faster in one case, you have to
show it is faster on average for all large needles and avoid quadratic performance
on specially crafted worst-case inputs. As others have pointed out, this is not an
easy task...

Cheers,
Wilco

^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2021-06-13  4:06 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-09 10:31 Fastest String Search Algorithm Wilco Dijkstra
2021-06-09 11:42 ` Amit Choudhary
2021-06-09 12:05   ` Phil Blundell
2021-06-09 12:11     ` Adhemerval Zanella
2021-06-09 12:22       ` Amit Choudhary
2021-06-09 12:55         ` Adhemerval Zanella
2021-06-09 13:18           ` Amit Choudhary
2021-06-09 13:32             ` Amit Choudhary
2021-06-09 13:40             ` Adhemerval Zanella
2021-06-09 14:08               ` Amit Choudhary
2021-06-09 14:25                 ` Phil Blundell
2021-06-09 12:18     ` Amit Choudhary
2021-06-09 13:48       ` Phil Blundell
2021-06-09 21:44   ` Wilco Dijkstra
2021-06-10  2:42     ` Amit Choudhary
2021-06-10  3:23       ` Dan Raymond
2021-06-10  4:19         ` Amit Choudhary
2021-06-10  9:44       ` Wilco Dijkstra
2021-06-10 11:08         ` Amit Choudhary
2021-06-10 16:53           ` Amit Choudhary
2021-06-10 20:04             ` Amit Choudhary
2021-06-11  8:28               ` Amit Choudhary
2021-06-11 11:45           ` Wilco Dijkstra
2021-06-11 12:42             ` Amit Choudhary
2021-06-11 13:39               ` Amit Choudhary
2021-06-11 13:46               ` Wilco Dijkstra
2021-06-11 15:52                 ` Amit Choudhary
2021-06-11 19:14                   ` Amit Choudhary
2021-06-11 19:26                     ` Paul Eggert
2021-06-11 19:41                       ` Amit Choudhary
2021-06-11 20:00                         ` Dan Raymond
2021-06-12  4:00                           ` Amit Choudhary
2021-06-12  4:18                             ` Amit Choudhary
2021-06-12  7:16                               ` [Final testing with benchmark tests] " Amit Choudhary
2021-06-12  8:01                                 ` Paul Zimmermann
2021-06-12  8:41                                   ` Amit Choudhary
2021-06-12 13:29                                     ` Amit Choudhary
2021-06-12 10:56                                   ` Siddhesh Poyarekar
2021-06-12 13:47                                     ` Amit Choudhary
2021-06-13  3:32                                       ` Siddhesh Poyarekar
2021-06-13  4:06                                         ` Amit Choudhary
2021-06-11 13:57               ` Paul Zimmermann
2021-06-11 14:04                 ` Amit Choudhary
2021-06-11 14:16                   ` Paul Zimmermann
2021-06-11 14:27                     ` Amit Choudhary

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).