public inbox for libc-alpha@sourceware.org
From: "H.J. Lu" <hjl.tools@gmail.com>
To: noah <goldstein.w.n@gmail.com>
Cc: GNU C Library <libc-alpha@sourceware.org>,
	"Carlos O'Donell" <carlos@systemhalted.org>
Subject: Re: [PATCH v2 2/2] x86: Add additional benchmarks for strchr
Date: Mon, 1 Feb 2021 09:10:36 -0800
Message-ID: <CAMe9rOop=no5vopFXMjGkXozwqWosaoYeq-3-aZ39hHXwY-fVA@mail.gmail.com>
In-Reply-To: <20210201003014.785099-2-goldstein.w.n@gmail.com>

On Sun, Jan 31, 2021 at 4:30 PM noah <goldstein.w.n@gmail.com> wrote:
>
> This patch adds additional benchmarks for string size of 4096 and
> several benchmarks for string size 256 with different alignments.
>
> Signed-off-by: noah <goldstein.w.n@gmail.com>
> ---
> Added 2 additional benchmark sizes:
>
> 4096: Just feels like a natural "large" size to test
>
> 256 with multiple alignments: This essentially is to test how
> expensive the initial work prior to the 4x loop is depending on
> different alignments.
>
> Results from bench-strchr: All times are in seconds and are the median of
> 100 runs.  Old is the current strchr-avx2.S implementation. New is this
> patch.
>
> Summary: New is definitely faster for medium to large sizes. Once the
> 4x loop is hit there is a 10%+ speedup and New always wins out. For
> smaller sizes there is more variance as to which is faster and the
> differences are small. Generally the New version seems to win
> out. This is likely because strings of size 0 - 31 are the fast path for
> New (no jmp).
>
> Benchmarking CPU:
> Icelake: Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz
>
> size, algn, Old T , New T  -------- Win  Dif
> 0   , 0   , 2.54  , 2.52   -------- New  -0.02
> 1   , 0   , 2.57  , 2.52   -------- New  -0.05
> 2   , 0   , 2.56  , 2.52   -------- New  -0.04
> 3   , 0   , 2.58  , 2.54   -------- New  -0.04
> 4   , 0   , 2.61  , 2.55   -------- New  -0.06
> 5   , 0   , 2.65  , 2.62   -------- New  -0.03
> 6   , 0   , 2.73  , 2.74   -------- Old  -0.01
> 7   , 0   , 2.75  , 2.74   -------- New  -0.01
> 8   , 0   , 2.62  , 2.6    -------- New  -0.02
> 9   , 0   , 2.73  , 2.75   -------- Old  -0.02
> 10  , 0   , 2.74  , 2.74   -------- Eq    N/A
> 11  , 0   , 2.76  , 2.72   -------- New  -0.04
> 12  , 0   , 2.74  , 2.72   -------- New  -0.02
> 13  , 0   , 2.75  , 2.72   -------- New  -0.03
> 14  , 0   , 2.74  , 2.73   -------- New  -0.01
> 15  , 0   , 2.74  , 2.73   -------- New  -0.01
> 16  , 0   , 2.74  , 2.73   -------- New  -0.01
> 17  , 0   , 2.74  , 2.74   -------- Eq    N/A
> 18  , 0   , 2.73  , 2.73   -------- Eq    N/A
> 19  , 0   , 2.73  , 2.73   -------- Eq    N/A
> 20  , 0   , 2.73  , 2.73   -------- Eq    N/A
> 21  , 0   , 2.73  , 2.72   -------- New  -0.01
> 22  , 0   , 2.71  , 2.74   -------- Old  -0.03
> 23  , 0   , 2.71  , 2.69   -------- New  -0.02
> 24  , 0   , 2.68  , 2.67   -------- New  -0.01
> 25  , 0   , 2.66  , 2.62   -------- New  -0.04
> 26  , 0   , 2.64  , 2.62   -------- New  -0.02
> 27  , 0   , 2.71  , 2.64   -------- New  -0.07
> 28  , 0   , 2.67  , 2.69   -------- Old  -0.02
> 29  , 0   , 2.72  , 2.72   -------- Eq    N/A
> 30  , 0   , 2.68  , 2.69   -------- Old  -0.01
> 31  , 0   , 2.68  , 2.68   -------- Eq    N/A
> 32  , 0   , 3.51  , 3.52   -------- Old  -0.01
> 32  , 1   , 3.52  , 3.51   -------- New  -0.01
> 64  , 0   , 3.97  , 3.93   -------- New  -0.04
> 64  , 2   , 3.95  , 3.9    -------- New  -0.05
> 64  , 1   , 4.0   , 3.93   -------- New  -0.07
> 64  , 3   , 3.97  , 3.88   -------- New  -0.09
> 64  , 4   , 3.95  , 3.89   -------- New  -0.06
> 64  , 5   , 3.94  , 3.9    -------- New  -0.04
> 64  , 6   , 3.97  , 3.9    -------- New  -0.07
> 64  , 7   , 3.97  , 3.91   -------- New  -0.06
> 96  , 0   , 4.74  , 4.52   -------- New  -0.22
> 128 , 0   , 5.29  , 5.19   -------- New  -0.1
> 128 , 2   , 5.29  , 5.15   -------- New  -0.14
> 128 , 3   , 5.31  , 5.22   -------- New  -0.09
> 256 , 0   , 11.19 , 9.81   -------- New  -1.38
> 256 , 3   , 11.19 , 9.84   -------- New  -1.35
> 256 , 4   , 11.2  , 9.88   -------- New  -1.32
> 256 , 16  , 11.21 , 9.79   -------- New  -1.42
> 256 , 32  , 11.39 , 10.34  -------- New  -1.05
> 256 , 48  , 11.88 , 10.56  -------- New  -1.32
> 256 , 64  , 11.82 , 10.83  -------- New  -0.99
> 256 , 80  , 11.85 , 10.86  -------- New  -0.99
> 256 , 96  , 9.56  , 8.76   -------- New  -0.8
> 256 , 112 , 9.55  , 8.9    -------- New  -0.65
> 512 , 0   , 15.76 , 13.72  -------- New  -2.04
> 512 , 4   , 15.72 , 13.74  -------- New  -1.98
> 512 , 5   , 15.73 , 13.74  -------- New  -1.99
> 1024, 0   , 24.85 , 21.33  -------- New  -3.52
> 1024, 5   , 24.86 , 21.27  -------- New  -3.59
> 1024, 6   , 24.87 , 21.32  -------- New  -3.55
> 2048, 0   , 45.75 , 36.7   -------- New  -9.05
> 2048, 6   , 43.91 , 35.42  -------- New  -8.49
> 2048, 7   , 44.43 , 36.37  -------- New  -8.06
> 4096, 0   , 96.94 , 81.34  -------- New  -15.6
> 4096, 7   , 97.01 , 81.32  -------- New  -15.69
>
>
>
>  benchtests/bench-strchr.c | 32 ++++++++++++++++++++++++++++++--
>  1 file changed, 30 insertions(+), 2 deletions(-)
>
> diff --git a/benchtests/bench-strchr.c b/benchtests/bench-strchr.c
> index bf493fe458..5fd98a5d43 100644
> --- a/benchtests/bench-strchr.c
> +++ b/benchtests/bench-strchr.c
> @@ -100,9 +100,13 @@ do_test (size_t align, size_t pos, size_t len, int seek_char, int max_char)
>    size_t i;
>    CHAR *result;
>    CHAR *buf = (CHAR *) buf1;
> -  align &= 15;
> +
> +  align &= 127;
>    if ((align + len) * sizeof (CHAR) >= page_size)
> -    return;
> +    {
> +      return;
> +    }
> +
>
>    for (i = 0; i < len; ++i)
>      {
> @@ -151,12 +155,24 @@ test_main (void)
>        do_test (i, 16 << i, 2048, SMALL_CHAR, MIDDLE_CHAR);
>      }
>
> +  for (i = 1; i < 8; ++i)
> +    {
> +      do_test (0, 16 << i, 4096, SMALL_CHAR, MIDDLE_CHAR);
> +      do_test (i, 16 << i, 4096, SMALL_CHAR, MIDDLE_CHAR);
> +    }
> +
>    for (i = 1; i < 8; ++i)
>      {
>        do_test (i, 64, 256, SMALL_CHAR, MIDDLE_CHAR);
>        do_test (i, 64, 256, SMALL_CHAR, BIG_CHAR);
>      }
>
> +  for (i = 0; i < 8; ++i)
> +    {
> +      do_test (16 * i, 256, 512, SMALL_CHAR, MIDDLE_CHAR);
> +      do_test (16 * i, 256, 512, SMALL_CHAR, BIG_CHAR);
> +    }
> +
>    for (i = 0; i < 32; ++i)
>      {
>        do_test (0, i, i + 1, SMALL_CHAR, MIDDLE_CHAR);
> @@ -169,12 +185,24 @@ test_main (void)
>        do_test (i, 16 << i, 2048, 0, MIDDLE_CHAR);
>      }
>
> +  for (i = 1; i < 8; ++i)
> +    {
> +      do_test (0, 16 << i, 4096, 0, MIDDLE_CHAR);
> +      do_test (i, 16 << i, 4096, 0, MIDDLE_CHAR);
> +    }
> +
>    for (i = 1; i < 8; ++i)
>      {
>        do_test (i, 64, 256, 0, MIDDLE_CHAR);
>        do_test (i, 64, 256, 0, BIG_CHAR);
>      }
>
> +  for (i = 0; i < 8; ++i)
> +    {
> +      do_test (16 * i, 256, 512, 0, MIDDLE_CHAR);
> +      do_test (16 * i, 256, 512, 0, BIG_CHAR);
> +    }
> +
>    for (i = 0; i < 32; ++i)
>      {
>        do_test (0, i, i + 1, 0, MIDDLE_CHAR);
> --
> 2.29.2

Please make similar changes in string/test-strchr.c.

Thanks.

-- 
H.J.
