public inbox for libc-alpha@sourceware.org
From: "H.J. Lu" <hjl.tools@gmail.com>
To: Noah Goldstein <goldstein.w.n@gmail.com>
Cc: libc-alpha@sourceware.org, carlos@systemhalted.org
Subject: Re: [PATCH v1 7/7] Bench: Improve benchtests for memchr, strchr, strnlen, strrchr
Date: Tue, 18 Oct 2022 14:00:56 -0700
Message-ID: <CAMe9rOrWmpKw3dy_A6CjsOwFEYBgUE+ajC_2OUJGQUMR=Pqd5g@mail.gmail.com>
In-Reply-To: <20221018024901.3381469-7-goldstein.w.n@gmail.com>

On Mon, Oct 17, 2022 at 7:49 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> 1. Add more complete coverage in the medium size range.
> 2. In strnlen remove the `1 << i` which was UB (`i` could go beyond
>    32/64)
> 3. Add timer for total benchmark runtime (useful for deciding about
>    tradeoff between coverage and runtime).
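
For context on item 2: in C, shifting by a count that is greater than or
equal to the width of the promoted left operand is undefined behavior, so
`1 << i` is only valid for `i` below the width of `int`. A minimal sketch of
a guarded shift follows; `checked_pow2` is a hypothetical helper for
illustration, not something in glibc or in this patch (the patch simply drops
the shift).

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of glibc: store (size_t) 1 << shift in
   *out and return true, or return false and leave *out untouched when
   the shift count is not smaller than the width of size_t.  */
static bool
checked_pow2 (size_t shift, size_t *out)
{
  assert (out != NULL);
  if (shift >= sizeof (size_t) * CHAR_BIT)
    return false;
  /* Widen the operand first so the shift happens in size_t, not int.  */
  *out = (size_t) 1 << shift;
  return true;
}
```

A caller would test the return value before using `*out`, instead of passing
`1 << i` straight to do_test.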

So the new timer only reports total runtime and won't be used for
performance comparison.  Would "time ./bench" be sufficient?
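
To make the difference concrete: an in-process timer covers only the region
between its two timestamps, whereas "time ./bench" covers the whole process,
including startup and setup. The sketch below uses clock_gettime as a
stand-in for the TIMING_NOW / TIMING_DIFF pair in the patch; `now_ns`,
`run_workload`, and `time_region_ns` are hypothetical names, and the loop is
only a placeholder for the do_test calls.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Stand-in for the benchtests timing macros: read a monotonic clock and
   return it in nanoseconds.  This is an assumption for illustration and
   does not use glibc's bench-timing.h.  */
static uint64_t
now_ns (void)
{
  struct timespec ts;
  int rc = clock_gettime (CLOCK_MONOTONIC, &ts);
  assert (rc == 0);
  (void) rc;
  return (uint64_t) ts.tv_sec * 1000000000ull + (uint64_t) ts.tv_nsec;
}

/* Hypothetical workload standing in for the do_test calls: sums the
   integers 0 .. iters - 1.  */
static unsigned long
run_workload (unsigned long iters)
{
  volatile unsigned long acc = 0;
  for (unsigned long n = 0; n < iters; ++n)
    acc += n;
  return acc;
}

/* Time only the workload; anything done before or after this call
   (setup, output) is excluded from the returned duration.  */
static uint64_t
time_region_ns (unsigned long iters, unsigned long *result)
{
  uint64_t start = now_ns ();
  *result = run_workload (iters);
  uint64_t stop = now_ns ();
  return stop - start;
}
```

Running the same binary under "time" would additionally count everything
outside `time_region_ns`, which is the part the question above is about.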

> ---
>  benchtests/bench-memchr.c    | 83 +++++++++++++++++++++++++-----------
>  benchtests/bench-rawmemchr.c | 36 ++++++++++++++--
>  benchtests/bench-strchr.c    | 42 +++++++++++++-----
>  benchtests/bench-strnlen.c   | 19 ++++++---
>  benchtests/bench-strrchr.c   | 33 +++++++++++++-
>  5 files changed, 166 insertions(+), 47 deletions(-)
>
> diff --git a/benchtests/bench-memchr.c b/benchtests/bench-memchr.c
> index 0facda2fa0..c4d758ae61 100644
> --- a/benchtests/bench-memchr.c
> +++ b/benchtests/bench-memchr.c
> @@ -126,9 +126,10 @@ do_test (json_ctx_t *json_ctx, size_t align, size_t pos, size_t len,
>  int
>  test_main (void)
>  {
> -  size_t i;
> +  size_t i, j, al, al_max;
>    int repeats;
>    json_ctx_t json_ctx;
> +  timing_t bench_start, bench_stop, bench_total_time;
>    test_init ();
>
>    json_init (&json_ctx, 0, stdout);
> @@ -147,35 +148,47 @@ test_main (void)
>
>    json_array_begin (&json_ctx, "results");
>
> +  TIMING_NOW (bench_start);
> +  al_max = 0;
> +#ifdef USE_AS_MEMRCHR
> +  al_max = getpagesize () / 2;
> +#endif
> +
>    for (repeats = 0; repeats < 2; ++repeats)
>      {
> -      for (i = 1; i < 8; ++i)
> +      for (al = 0; al <= al_max; al += getpagesize () / 2)
>         {
> -         do_test (&json_ctx, 0, 16 << i, 2048, 23, repeats);
> -         do_test (&json_ctx, i, 64, 256, 23, repeats);
> -         do_test (&json_ctx, 0, 16 << i, 2048, 0, repeats);
> -         do_test (&json_ctx, i, 64, 256, 0, repeats);
> -
> -         do_test (&json_ctx, getpagesize () - 15, 64, 256, 0, repeats);
> +         for (i = 1; i < 8; ++i)
> +           {
> +             do_test (&json_ctx, al, 16 << i, 2048, 23, repeats);
> +             do_test (&json_ctx, al + i, 64, 256, 23, repeats);
> +             do_test (&json_ctx, al, 16 << i, 2048, 0, repeats);
> +             do_test (&json_ctx, al + i, 64, 256, 0, repeats);
> +
> +             do_test (&json_ctx, al + getpagesize () - 15, 64, 256, 0,
> +                      repeats);
>  #ifdef USE_AS_MEMRCHR
> -         /* Also test the position close to the beginning for memrchr.  */
> -         do_test (&json_ctx, 0, i, 256, 23, repeats);
> -         do_test (&json_ctx, 0, i, 256, 0, repeats);
> -         do_test (&json_ctx, i, i, 256, 23, repeats);
> -         do_test (&json_ctx, i, i, 256, 0, repeats);
> +             /* Also test the position close to the beginning for memrchr.  */
> +             do_test (&json_ctx, al, i, 256, 23, repeats);
> +             do_test (&json_ctx, al, i, 256, 0, repeats);
> +             do_test (&json_ctx, al + i, i, 256, 23, repeats);
> +             do_test (&json_ctx, al + i, i, 256, 0, repeats);
>  #endif
> +           }
> +         for (i = 1; i < 8; ++i)
> +           {
> +             do_test (&json_ctx, al + i, i << 5, 192, 23, repeats);
> +             do_test (&json_ctx, al + i, i << 5, 192, 0, repeats);
> +             do_test (&json_ctx, al + i, i << 5, 256, 23, repeats);
> +             do_test (&json_ctx, al + i, i << 5, 256, 0, repeats);
> +             do_test (&json_ctx, al + i, i << 5, 512, 23, repeats);
> +             do_test (&json_ctx, al + i, i << 5, 512, 0, repeats);
> +
> +             do_test (&json_ctx, al + getpagesize () - 15, i << 5, 256, 23,
> +                      repeats);
> +           }
>         }
> -      for (i = 1; i < 8; ++i)
> -       {
> -         do_test (&json_ctx, i, i << 5, 192, 23, repeats);
> -         do_test (&json_ctx, i, i << 5, 192, 0, repeats);
> -         do_test (&json_ctx, i, i << 5, 256, 23, repeats);
> -         do_test (&json_ctx, i, i << 5, 256, 0, repeats);
> -         do_test (&json_ctx, i, i << 5, 512, 23, repeats);
> -         do_test (&json_ctx, i, i << 5, 512, 0, repeats);
> -
> -         do_test (&json_ctx, getpagesize () - 15, i << 5, 256, 23, repeats);
> -       }
> +
>        for (i = 1; i < 32; ++i)
>         {
>           do_test (&json_ctx, 0, i, i + 1, 23, repeats);
> @@ -207,11 +220,33 @@ test_main (void)
>           do_test (&json_ctx, 0, 2, i + 1, 0, repeats);
>  #endif
>         }
> +      for (al = 0; al <= al_max; al += getpagesize () / 2)
> +       {
> +         for (i = (16 / sizeof (CHAR)); i <= (8192 / sizeof (CHAR)); i += i)
> +           {
> +             for (j = 0; j <= (384 / sizeof (CHAR));
> +                  j += (32 / sizeof (CHAR)))
> +               {
> +                 do_test (&json_ctx, al, i + j, i, 23, repeats);
> +                 do_test (&json_ctx, al, i, i + j, 23, repeats);
> +                 if (j < i)
> +                   {
> +                     do_test (&json_ctx, al, i - j, i, 23, repeats);
> +                     do_test (&json_ctx, al, i, i - j, 23, repeats);
> +                   }
> +               }
> +           }
> +       }
> +
>  #ifndef USE_AS_MEMRCHR
>        break;
>  #endif
>      }
>
> +  TIMING_NOW (bench_stop);
> +  TIMING_DIFF (bench_total_time, bench_start, bench_stop);
> +  json_attr_double (&json_ctx, "benchtime", bench_total_time);
> +
>    json_array_end (&json_ctx);
>    json_attr_object_end (&json_ctx);
>    json_attr_object_end (&json_ctx);
> diff --git a/benchtests/bench-rawmemchr.c b/benchtests/bench-rawmemchr.c
> index b1803afc14..667ecd48f9 100644
> --- a/benchtests/bench-rawmemchr.c
> +++ b/benchtests/bench-rawmemchr.c
> @@ -70,7 +70,7 @@ do_test (json_ctx_t *json_ctx, size_t align, size_t pos, size_t len, int seek_ch
>    size_t i;
>    char *result;
>
> -  align &= 7;
> +  align &= getpagesize () - 1;
>    if (align + len >= page_size)
>      return;
>
> @@ -106,7 +106,7 @@ test_main (void)
>  {
>    json_ctx_t json_ctx;
>    size_t i;
> -
> +  timing_t bench_start, bench_stop, bench_total_time;
>    test_init ();
>
>    json_init (&json_ctx, 0, stdout);
> @@ -120,11 +120,12 @@ test_main (void)
>
>    json_array_begin (&json_ctx, "ifuncs");
>    FOR_EACH_IMPL (impl, 0)
> -      json_element_string (&json_ctx, impl->name);
> +    json_element_string (&json_ctx, impl->name);
>    json_array_end (&json_ctx);
>
>    json_array_begin (&json_ctx, "results");
>
> +  TIMING_NOW (bench_start);
>    for (i = 1; i < 7; ++i)
>      {
>        do_test (&json_ctx, 0, 16 << i, 2048, 23);
> @@ -137,6 +138,35 @@ test_main (void)
>        do_test (&json_ctx, 0, i, i + 1, 23);
>        do_test (&json_ctx, 0, i, i + 1, 0);
>      }
> +  for (; i < 256; i += 32)
> +    {
> +      do_test (&json_ctx, 0, i, i + 1, 23);
> +      do_test (&json_ctx, 0, i - 1, i, 23);
> +    }
> +  for (; i < 512; i += 64)
> +    {
> +      do_test (&json_ctx, 0, i, i + 1, 23);
> +      do_test (&json_ctx, 0, i - 1, i, 23);
> +    }
> +  for (; i < 1024; i += 128)
> +    {
> +      do_test (&json_ctx, 0, i, i + 1, 23);
> +      do_test (&json_ctx, 0, i - 1, i, 23);
> +    }
> +  for (; i < 2048; i += 256)
> +    {
> +      do_test (&json_ctx, 0, i, i + 1, 23);
> +      do_test (&json_ctx, 0, i - 1, i, 23);
> +    }
> +  for (; i < 4096; i += 512)
> +    {
> +      do_test (&json_ctx, 0, i, i + 1, 23);
> +      do_test (&json_ctx, 0, i - 1, i, 23);
> +    }
> +
> +  TIMING_NOW (bench_stop);
> +  TIMING_DIFF (bench_total_time, bench_start, bench_stop);
> +  json_attr_double (&json_ctx, "benchtime", bench_total_time);
>
>    json_array_end (&json_ctx);
>    json_attr_object_end (&json_ctx);
> diff --git a/benchtests/bench-strchr.c b/benchtests/bench-strchr.c
> index 54640bde7e..af325806ce 100644
> --- a/benchtests/bench-strchr.c
> +++ b/benchtests/bench-strchr.c
> @@ -287,8 +287,8 @@ int
>  test_main (void)
>  {
>    json_ctx_t json_ctx;
> -  size_t i;
> -
> +  size_t i, j;
> +  timing_t bench_start, bench_stop, bench_total_time;
>    test_init ();
>
>    json_init (&json_ctx, 0, stdout);
> @@ -307,6 +307,7 @@ test_main (void)
>
>    json_array_begin (&json_ctx, "results");
>
> +  TIMING_NOW (bench_start);
>    for (i = 1; i < 8; ++i)
>      {
>        do_test (&json_ctx, 0, 16 << i, 2048, SMALL_CHAR, MIDDLE_CHAR);
> @@ -367,15 +368,34 @@ test_main (void)
>        do_test (&json_ctx, 0, i, i + 1, 0, BIG_CHAR);
>      }
>
> -  DO_RAND_TEST(&json_ctx, 0, 15, 16, 0.0);
> -  DO_RAND_TEST(&json_ctx, 0, 15, 16, 0.1);
> -  DO_RAND_TEST(&json_ctx, 0, 15, 16, 0.25);
> -  DO_RAND_TEST(&json_ctx, 0, 15, 16, 0.33);
> -  DO_RAND_TEST(&json_ctx, 0, 15, 16, 0.5);
> -  DO_RAND_TEST(&json_ctx, 0, 15, 16, 0.66);
> -  DO_RAND_TEST(&json_ctx, 0, 15, 16, 0.75);
> -  DO_RAND_TEST(&json_ctx, 0, 15, 16, 0.9);
> -  DO_RAND_TEST(&json_ctx, 0, 15, 16, 1.0);
> +  for (i = 16 / sizeof (CHAR); i <= 8192 / sizeof (CHAR); i += i)
> +    {
> +      for (j = 32 / sizeof (CHAR); j <= 320 / sizeof (CHAR);
> +          j += 32 / sizeof (CHAR))
> +       {
> +         do_test (&json_ctx, 0, i, i + j, 0, MIDDLE_CHAR);
> +         do_test (&json_ctx, 0, i + j, i, 0, MIDDLE_CHAR);
> +         if (i > j)
> +           {
> +             do_test (&json_ctx, 0, i, i - j, 0, MIDDLE_CHAR);
> +             do_test (&json_ctx, 0, i - j, i, 0, MIDDLE_CHAR);
> +           }
> +       }
> +    }
> +
> +  DO_RAND_TEST (&json_ctx, 0, 15, 16, 0.0);
> +  DO_RAND_TEST (&json_ctx, 0, 15, 16, 0.1);
> +  DO_RAND_TEST (&json_ctx, 0, 15, 16, 0.25);
> +  DO_RAND_TEST (&json_ctx, 0, 15, 16, 0.33);
> +  DO_RAND_TEST (&json_ctx, 0, 15, 16, 0.5);
> +  DO_RAND_TEST (&json_ctx, 0, 15, 16, 0.66);
> +  DO_RAND_TEST (&json_ctx, 0, 15, 16, 0.75);
> +  DO_RAND_TEST (&json_ctx, 0, 15, 16, 0.9);
> +  DO_RAND_TEST (&json_ctx, 0, 15, 16, 1.0);
> +
> +  TIMING_NOW (bench_stop);
> +  TIMING_DIFF (bench_total_time, bench_start, bench_stop);
> +  json_attr_double (&json_ctx, "benchtime", bench_total_time);
>
>    json_array_end (&json_ctx);
>    json_attr_object_end (&json_ctx);
> diff --git a/benchtests/bench-strnlen.c b/benchtests/bench-strnlen.c
> index 13b46b3f57..c6281b6373 100644
> --- a/benchtests/bench-strnlen.c
> +++ b/benchtests/bench-strnlen.c
> @@ -117,7 +117,7 @@ test_main (void)
>  {
>    size_t i, j;
>    json_ctx_t json_ctx;
> -
> +  timing_t bench_start, bench_stop, bench_total_time;
>    test_init ();
>
>    json_init (&json_ctx, 0, stdout);
> @@ -136,6 +136,7 @@ test_main (void)
>
>    json_array_begin (&json_ctx, "results");
>
> +  TIMING_NOW (bench_start);
>    for (i = 0; i <= 1; ++i)
>      {
>        do_test (&json_ctx, i, 1, 128, MIDDLE_CHAR);
> @@ -195,23 +196,27 @@ test_main (void)
>      {
>        for (j = 0; j <= (704 / sizeof (CHAR)); j += (32 / sizeof (CHAR)))
>         {
> -         do_test (&json_ctx, 0, 1 << i, (i + j), BIG_CHAR);
>           do_test (&json_ctx, 0, i + j, i, BIG_CHAR);
> -
> -         do_test (&json_ctx, 64, 1 << i, (i + j), BIG_CHAR);
>           do_test (&json_ctx, 64, i + j, i, BIG_CHAR);
>
> +         do_test (&json_ctx, 0, i, i + j, BIG_CHAR);
> +         do_test (&json_ctx, 64, i, i + j, BIG_CHAR);
> +
>           if (j < i)
>             {
> -             do_test (&json_ctx, 0, 1 << i, i - j, BIG_CHAR);
>               do_test (&json_ctx, 0, i - j, i, BIG_CHAR);
> -
> -             do_test (&json_ctx, 64, 1 << i, i - j, BIG_CHAR);
>               do_test (&json_ctx, 64, i - j, i, BIG_CHAR);
> +
> +             do_test (&json_ctx, 0, i, i - j, BIG_CHAR);
> +             do_test (&json_ctx, 64, i, i - j, BIG_CHAR);
>             }
>         }
>      }
>
> +  TIMING_NOW (bench_stop);
> +  TIMING_DIFF (bench_total_time, bench_start, bench_stop);
> +  json_attr_double (&json_ctx, "benchtime", bench_total_time);
> +
>    json_array_end (&json_ctx);
>    json_attr_object_end (&json_ctx);
>    json_attr_object_end (&json_ctx);
> diff --git a/benchtests/bench-strrchr.c b/benchtests/bench-strrchr.c
> index 7cd2a15484..e6d8163047 100644
> --- a/benchtests/bench-strrchr.c
> +++ b/benchtests/bench-strrchr.c
> @@ -151,8 +151,9 @@ int
>  test_main (void)
>  {
>    json_ctx_t json_ctx;
> -  size_t i, j;
> +  size_t i, j, k;
>    int seek;
> +  timing_t bench_start, bench_stop, bench_total_time;
>
>    test_init ();
>    json_init (&json_ctx, 0, stdout);
> @@ -171,9 +172,10 @@ test_main (void)
>
>    json_array_begin (&json_ctx, "results");
>
> +  TIMING_NOW (bench_start);
>    for (seek = 0; seek <= 23; seek += 23)
>      {
> -      for (j = 1; j < 32; j += j)
> +      for (j = 1; j <= 256; j = (j * 4))
>         {
>           for (i = 1; i < 9; ++i)
>             {
> @@ -197,12 +199,39 @@ test_main (void)
>               do_test (&json_ctx, getpagesize () - i / 2 - 1, i, i + 1, seek,
>                        SMALL_CHAR, j);
>             }
> +
> +         for (i = (16 / sizeof (CHAR)); i <= (288 / sizeof (CHAR)); i += 32)
> +           {
> +             do_test (&json_ctx, 0, i - 16, i, seek, SMALL_CHAR, j);
> +             do_test (&json_ctx, 0, i, i + 16, seek, SMALL_CHAR, j);
> +           }
> +
> +         for (i = (16 / sizeof (CHAR)); i <= (2048 / sizeof (CHAR)); i += i)
> +           {
> +             for (k = 0; k <= (288 / sizeof (CHAR));
> +                  k += (48 / sizeof (CHAR)))
> +               {
> +                 do_test (&json_ctx, 0, k, i, seek, SMALL_CHAR, j);
> +                 do_test (&json_ctx, 0, i, i + k, seek, SMALL_CHAR, j);
> +
> +                 if (k < i)
> +                   {
> +                     do_test (&json_ctx, 0, i - k, i, seek, SMALL_CHAR, j);
> +                     do_test (&json_ctx, 0, k, i - k, seek, SMALL_CHAR, j);
> +                     do_test (&json_ctx, 0, i, i - k, seek, SMALL_CHAR, j);
> +                   }
> +               }
> +           }
> +
>           if (seek == 0)
>             {
>               break;
>             }
>         }
>      }
> +  TIMING_NOW (bench_stop);
> +  TIMING_DIFF (bench_total_time, bench_start, bench_stop);
> +  json_attr_double (&json_ctx, "benchtime", bench_total_time);
>
>    json_array_end (&json_ctx);
>    json_attr_object_end (&json_ctx);
> --
> 2.34.1
>
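
Since the timer is meant to inform the coverage/runtime tradeoff, it may help
to see how many cases the new medium-size sweep in bench-memchr.c issues. The
sketch below mirrors that loop nest (i doubling from 16 to 8192 bytes, j
stepping by 32 bytes up to 384) and counts the do_test calls per alignment.
`count_sweep_cases` is a hypothetical helper; it counts calls issued, not
cases that survive any bounds checks inside do_test.

```c
#include <stddef.h>

/* Hypothetical helper: count the do_test calls issued by one pass of the
   medium-size sweep for a given character size.  Assumes char_size is a
   small power of two such as sizeof (char) or sizeof (wchar_t); other
   values are rejected so the loops always make progress.  */
static size_t
count_sweep_cases (size_t char_size)
{
  size_t i, j, count = 0;

  if (char_size == 0 || char_size > 16)
    return 0;

  for (i = 16 / char_size; i <= 8192 / char_size; i += i)
    for (j = 0; j <= 384 / char_size; j += 32 / char_size)
      {
        /* (pos = i + j, len = i) and (pos = i, len = i + j).  */
        count += 2;
        /* (pos = i - j, len = i) and (pos = i, len = i - j).  */
        if (j < i)
          count += 2;
      }
  return count;
}
```

With a character size of 1 this comes to 422 calls per alignment value, per
pass of the outer `repeats` loop.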


-- 
H.J.
