public inbox for glibc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug libc/5514] New: memmem is O(n^2), but should be O(n)
@ 2007-12-20 13:30 ebb9 at byu dot net
  2007-12-20 19:53 ` [Bug libc/5514] " roland at gnu dot org
                   ` (14 more replies)
  0 siblings, 15 replies; 21+ messages in thread
From: ebb9 at byu dot net @ 2007-12-20 13:30 UTC (permalink / raw)
  To: glibc-bugs

The worst-case complexity of memmem is currently quadratic, O(m*n) where m is
the length of the haystack and n the length of the needle.  The worst case
occurs when the distinguishing feature of the needle is at the end, the haystack
consists of repetitions of the needle's prefix, and the needle is either not
present or occurs late in the haystack.  Since each iteration examines the bulk
of the needle, but only advances one byte per iteration, you have quadratic
performance.  Using the Knuth-Morris-Pratt algorithm or Boyer-Moore algorithm
would guarantee worst-case complexity of O(m+n), at the expense of a memory
allocation of O(n).

Gnulib recently converted its memmem implementation to use KMP, along with an
allocation amortization that avoids KMP until after a bounded number of failed
naive iterations, and falling back to the naive algorithm when KMP cannot
allocate enough memory.

http://git.sv.gnu.org/gitweb/?p=gnulib.git;a=blob;f=lib/memmem.c;h=69d2edc;hb=fc068cf

It looks like strstr could also benefit from this algorithmic speedup.

-- 
           Summary: memmem is O(n^2), but should be O(n)
           Product: glibc
           Version: 2.4
            Status: NEW
          Severity: normal
          Priority: P2
         Component: libc
        AssignedTo: drepper at redhat dot com
        ReportedBy: ebb9 at byu dot net
                CC: glibc-bugs at sources dot redhat dot com


http://sourceware.org/bugzilla/show_bug.cgi?id=5514

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2014-05-28 19:45 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <bug-5514-131@http.sourceware.org/bugzilla/>
2010-10-05 17:06 ` [Bug libc/5514] memmem is O(n^2), but should be O(n) ppluzhnikov at google dot com
2010-10-05 17:12 ` eblake at redhat dot com
2010-10-05 17:14 ` ppluzhnikov at google dot com
2010-10-06 15:52 ` eblake at redhat dot com
2014-02-16 19:43 ` jackie.rosen at hushmail dot com
2014-05-28 19:45 ` schwab at sourceware dot org
2007-12-20 13:30 [Bug libc/5514] New: " ebb9 at byu dot net
2007-12-20 19:53 ` [Bug libc/5514] " roland at gnu dot org
2007-12-21 13:38 ` bruno at clisp dot org
2007-12-21 13:50 ` bruno at clisp dot org
2007-12-21 14:00 ` jakub at redhat dot com
2007-12-26 16:18 ` bruno at clisp dot org
2007-12-26 16:19 ` bruno at clisp dot org
2007-12-26 16:20 ` bruno at clisp dot org
2008-01-04 23:05 ` ebb9 at byu dot net
2008-01-05  0:10 ` ebb9 at byu dot net
2008-01-08 21:51 ` ebb9 at byu dot net
2008-03-29 16:44 ` ebb9 at byu dot net
2008-03-29 16:45 ` ebb9 at byu dot net
2008-04-07 18:28 ` drepper at redhat dot com
2008-04-21 12:07 ` ebb9 at byu dot net
2008-05-15  4:43 ` drepper at redhat dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).