From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lj1-x242.google.com (mail-lj1-x242.google.com [IPv6:2a00:1450:4864:20::242]) by sourceware.org (Postfix) with ESMTPS id BDB693857027 for ; Sat, 12 Jun 2021 08:42:05 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org BDB693857027 Received: by mail-lj1-x242.google.com with SMTP id l4so309937ljg.0 for ; Sat, 12 Jun 2021 01:42:05 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=x5wz8w5o+6fQzfmEoA8cXQGQcTGLQsP1oXWdl4tpi/s=; b=el1Ikssakuv/7ShLtRC3r8YF5i0Q8x8vcelTGg9S1VZfKXSpQQ7WoJ5M20SFFVy8bO rLlrkOEjb8xrsXX4M0X3iNATg0BUR+w1hQEgGLObmaADIHm3R8Z1rLsnvWx7A1RtBPdU 6qkvzN/FK0QqcLYY+WLudYEZbjIewp8GivnEsRhiQaW+ULolBIL1U4w62Ola+HC3fDLZ 0AcM8ewcg+LbR3nZEBB5qfYJnUCPK9G7dWdBwW+gW7YgkAyxzue98D7C29kqcgDe9yET LcrzzoMxJD9qKADwCM8zt6fUBJboNQW8OJv0BiS3cxDNlq3rYAo6WZJHO31rpPIjg80R gK2A== X-Gm-Message-State: AOAM531h3U3nXf/uzUTQmaBa8QqzJl1T/JMhjI+wZG3ze8V3eWMRTio9 VKm+jIE0a+n1EzfPyvjEZTP/9t8WOL9olnFAVaU= X-Google-Smtp-Source: ABdhPJwLUgqvBI8g7gZsJOwXIWZO4n0ESOnpge7/qBLgmUEAEL6WV0tClZbq/UJCkz+nJZLbn9RbTQ4TUZ7+S9+iOoA= X-Received: by 2002:a2e:6e06:: with SMTP id j6mr6028010ljc.3.1623487324578; Sat, 12 Jun 2021 01:42:04 -0700 (PDT) MIME-Version: 1.0 References: <39bd7d71-eebf-6b7c-b6e7-6816037a4492@cs.ucla.edu> In-Reply-To: From: Amit Choudhary Date: Sat, 12 Jun 2021 14:11:53 +0530 Message-ID: Subject: Re: [Final testing with benchmark tests] Fastest String Search Algorithm. To: Paul Zimmermann Cc: GNU C Library X-Spam-Status: No, score=-0.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FROM, HTML_MESSAGE, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org Content-Type: text/plain; charset="UTF-8" X-Content-Filtered-By: Mailman/MimeDel 2.1.29 X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Jun 2021 08:42:07 -0000 On Sat, Jun 12, 2021 at 1:31 PM Paul Zimmermann wrote: > Dear Amit, > > > I ran the benchmark tests for my algorithm and for strstr(). > > > > Total number of tests = 332. > > > > My algorithm is faster 182 times. > > > > strstr() is faster 150 times. > > and your algorithm can be up to 3531 times slower than strstr(): > > s: 3.75006e+08 106175 > > The largest strstr() time is 1.5e6, whereas the largest time of your > algorithm is 9.8e8, i.e., 656 times larger. > > Can you improve your algorithm so that the worst case time is similar to > that > of strstr()? > > Paul > I don't think my algorithm can be optimized more. Anyways, I had invented my algorithm thinking about normal average case scenarios and not for worst case scenarios. I was talking with Dr. Moore (inventor of Boyer-Moore algorithm) and he suggested me to demonstrate that my algorithm is better for average case by doing exhaustive testing across a wide variety of patterns and texts. He also told me that he had invented his algorithm for average case too and not thinking much about worst case scenario. ==================What Dr. Moore said======================== If your algorithm is worse case O(mn) then you should somehow argue (and demonstrate with exhausting testing across a wide variety of patterns and texts) that its ``average'' behavior is better! The point is that the naive algorithm is worst-case O(mn) and why should a person use yours rather than that simpler algorithm? ========================================================== Regards, Amit