public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
To: Amit Choudhary <amitchoudhary0523@gmail.com>
Cc: GNU C Library <libc-alpha@sourceware.org>
Subject: Re: Fastest String Search Algorithm.
Date: Fri, 11 Jun 2021 11:45:31 +0000	[thread overview]
Message-ID: <VE1PR08MB559979384BE6CA29580E7EE583349@VE1PR08MB5599.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <CAFf+5zirJdcEQXHx021uRW-aN=h5zOADuzRBuv3M5uh67_iQvg@mail.gmail.com>

Hi Amit,

> In my opinion, needle should be in the middle so as not to give any
> algorithm any unfair advantage - like Boyer-Moore which searches from the end.

There is no unfair advantage to any algorithm since they all start searching
from the start (note BM does not start at the end).

> I searched for the needle in the haystack using document editor's search
> and found that needle was located in the beginning.

No, it's always located at the end - it is even tested explicitly.

> The worst case scenario of my algorithm is O(mn) where m is the length of
> the text and n is the length of the pattern.

Exactly. And that is problematic if both m and n are large.

> And this worst case scenario will be hit in only one case:

There is not just one worst-case, the number is practically infinite.
A typical worst case is AAAAAAAAAAAAAAA with needle AAAAAAABA.

> But my theory is this: We should be fastest 100% if the time. If that is not possible,
> then we should try to get as close to 100% as possible.
> 
> This means that if for average case, some algorithm is faster 99% of the time but
> slower 1% of the time in the worst case scenario then we should go for this
> algorithm rather than fretting over worst case scenario.

For large needles and haystacks the worst case becomes more important because
the slowdown becomes larger with the length. It's bad even if there is a 0.001%
chance that your computer locks up for minutes.

Cheers,
Wilco

  parent reply	other threads:[~2021-06-11 11:45 UTC|newest]

Thread overview: 69+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-09 10:31 Wilco Dijkstra
2021-06-09 11:42 ` Amit Choudhary
2021-06-09 12:05   ` Phil Blundell
2021-06-09 12:11     ` Adhemerval Zanella
2021-06-09 12:22       ` Amit Choudhary
2021-06-09 12:55         ` Adhemerval Zanella
2021-06-09 13:18           ` Amit Choudhary
2021-06-09 13:32             ` Amit Choudhary
2021-06-09 13:40             ` Adhemerval Zanella
2021-06-09 14:08               ` Amit Choudhary
2021-06-09 14:25                 ` Phil Blundell
2021-06-09 12:18     ` Amit Choudhary
2021-06-09 13:48       ` Phil Blundell
2021-06-09 21:44   ` Wilco Dijkstra
2021-06-10  2:42     ` Amit Choudhary
2021-06-10  3:23       ` Dan Raymond
2021-06-10  4:19         ` Amit Choudhary
2021-06-10  9:44       ` Wilco Dijkstra
2021-06-10 11:08         ` Amit Choudhary
2021-06-10 16:53           ` Amit Choudhary
2021-06-10 20:04             ` Amit Choudhary
2021-06-11  8:28               ` Amit Choudhary
2021-06-11 11:45           ` Wilco Dijkstra [this message]
2021-06-11 12:42             ` Amit Choudhary
2021-06-11 13:39               ` Amit Choudhary
2021-06-11 13:46               ` Wilco Dijkstra
2021-06-11 15:52                 ` Amit Choudhary
2021-06-11 19:14                   ` Amit Choudhary
2021-06-11 19:26                     ` Paul Eggert
2021-06-11 19:41                       ` Amit Choudhary
2021-06-11 20:00                         ` Dan Raymond
2021-06-12  4:00                           ` Amit Choudhary
2021-06-12  4:18                             ` Amit Choudhary
2021-06-12  7:16                               ` [Final testing with benchmark tests] " Amit Choudhary
2021-06-12  8:01                                 ` Paul Zimmermann
2021-06-12  8:41                                   ` Amit Choudhary
2021-06-12 13:29                                     ` Amit Choudhary
2021-06-12 10:56                                   ` Siddhesh Poyarekar
2021-06-12 13:47                                     ` Amit Choudhary
2021-06-13  3:32                                       ` Siddhesh Poyarekar
2021-06-13  4:06                                         ` Amit Choudhary
2021-06-11 13:57               ` Paul Zimmermann
2021-06-11 14:04                 ` Amit Choudhary
2021-06-11 14:16                   ` Paul Zimmermann
2021-06-11 14:27                     ` Amit Choudhary
  -- strict thread matches above, loose matches on Subject: below --
2021-06-07 16:17 Wilco Dijkstra
2021-06-07 17:33 ` Florian Weimer
2021-06-08  4:52   ` Siddhesh Poyarekar
2021-06-07 14:46 Wilco Dijkstra
2021-06-07 15:21 ` Amit Choudhary
2021-06-07 16:02   ` Amit Choudhary
2021-06-07 16:04   ` Wilco Dijkstra
2021-06-07 10:53 Wilco Dijkstra
2021-06-07 12:33 ` Amit Choudhary
2021-06-08  6:33 ` Amit Choudhary
2021-06-08  7:25   ` Amit Choudhary
2021-06-06 14:15 Amit Choudhary
2021-06-06 14:30 ` H.J. Lu
2021-06-06 18:38   ` Amit Choudhary
2021-06-09  5:33   ` Guo, Wangyang
2021-06-09  6:18     ` Amit Choudhary
2021-06-09  6:50       ` Siddhesh Poyarekar
2021-06-09  7:38       ` Phil Blundell
2021-06-09  7:54         ` Amit Choudhary
2021-06-09  9:26           ` Phil Blundell
2021-06-06 17:58 ` Paul Eggert
2021-06-06 18:26   ` Amit Choudhary
2021-06-07 11:48     ` Amit Choudhary
2021-06-07 20:12       ` Paul Eggert

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=VE1PR08MB559979384BE6CA29580E7EE583349@VE1PR08MB5599.eurprd08.prod.outlook.com \
    --to=wilco.dijkstra@arm.com \
    --cc=amitchoudhary0523@gmail.com \
    --cc=libc-alpha@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).