Re: [PATCH v2 1/6] x86-64: Create microbenchmark infrastructure for libmvec

From: "H.J. Lu" <hjl.tools@gmail.com>
To: Sunil Pandey <skpgkp2@gmail.com>
Cc: Noah Goldstein <goldstein.w.n@gmail.com>,
	GNU C Library <libc-alpha@sourceware.org>
Subject: Re: [PATCH v2 1/6] x86-64: Create microbenchmark infrastructure for libmvec
Date: Sat, 13 Nov 2021 11:47:52 -0800
Message-ID: <CAMe9rOqVF4-ocCw6KYeiaAmM67FWKDEuYHiRgHO+uvchfFPyDg@mail.gmail.com>
In-Reply-To: <CAMAf5_dBK1msQ+tUcJiNE45n7ZzOR8C53y=E9iLK1NrVrSCFsw@mail.gmail.com>

On Fri, Nov 12, 2021 at 2:51 PM Sunil Pandey via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> On Fri, Nov 12, 2021 at 1:02 PM Noah Goldstein <goldstein.w.n@gmail.com>
> wrote:
>
> > On Fri, Nov 12, 2021 at 1:19 PM Sunil K Pandey via Libc-alpha
> > <libc-alpha@sourceware.org> wrote:
> > >
> > > Add python script to generate libmvec microbenchmark from the input
> > > values for each libmvec function using skeleton benchmark template.
> > >
> > > Creates double and float benchmarks with vector length 1, 2, 4, 8,
> > > and 16 for each libmvec function.  Vector length 1 corresponds to
> > > scalar version of function and is included for vector function perf
> > > comparison.
> > > ---
> > >  sysdeps/x86_64/fpu/Makeconfig               |  35 ++
> > >  sysdeps/x86_64/fpu/Makefile                 |  40 ++
> > >  sysdeps/x86_64/fpu/bench-libmvec-skeleton.c | 104 +++++
> > >  sysdeps/x86_64/fpu/scripts/bench_libmvec.py | 464 ++++++++++++++++++++
> > >  4 files changed, 643 insertions(+)
> > >  create mode 100644 sysdeps/x86_64/fpu/bench-libmvec-skeleton.c
> > >  create mode 100755 sysdeps/x86_64/fpu/scripts/bench_libmvec.py
> > >
> > > diff --git a/sysdeps/x86_64/fpu/Makeconfig b/sysdeps/x86_64/fpu/Makeconfig
> > > index 24aaee1a43..503e9b5ffa 100644
> > > --- a/sysdeps/x86_64/fpu/Makeconfig
> > > +++ b/sysdeps/x86_64/fpu/Makeconfig
> > > @@ -29,6 +29,23 @@ libmvec-funcs = \
> > >    sin \
> > >    sincos \
> > >
> > > +# Define libmvec function for benchtests directory.
> > > +libmvec-bench-funcs = \
> > > +
> > > +bench-libmvec-double = \
> > > +  $(addprefix double-vlen1-, $(libmvec-bench-funcs)) \
> > > +  $(addprefix double-vlen2-, $(libmvec-bench-funcs)) \
> > > +  $(addprefix double-vlen4-, $(libmvec-bench-funcs)) \
> > > +  $(addprefix double-vlen4-avx2-, $(libmvec-bench-funcs)) \
> > > +  $(addprefix double-vlen8-, $(libmvec-bench-funcs)) \
> > > +
> > > +bench-libmvec-float = \
> > > +  $(addsuffix f, $(addprefix float-vlen1-, $(libmvec-bench-funcs))) \
> > > +  $(addsuffix f, $(addprefix float-vlen4-, $(libmvec-bench-funcs))) \
> > > +  $(addsuffix f, $(addprefix float-vlen8-, $(libmvec-bench-funcs))) \
> > > +  $(addsuffix f, $(addprefix float-vlen8-avx2-, $(libmvec-bench-funcs))) \
> > > +  $(addsuffix f, $(addprefix float-vlen16-, $(libmvec-bench-funcs))) \
> > > +
> > >  # The base libmvec ABI tests.
> > >  libmvec-abi-func-tests = \
> > >    $(addprefix test-double-libmvec-,$(libmvec-funcs)) \
> > > @@ -83,5 +100,23 @@ $(common-objpfx)libmvec.mk: $(common-objpfx)config.make
> > >            echo "  \$$(float-vlen16-arch-ext-cflags)"; \
> > >            echo; \
> > >          done; \
> > > +        echo "endif"; \
> > > +        echo "ifeq (\$$(subdir),benchtests)"; \
> > > +        for t in $(libmvec-bench-funcs); do \
> > > +          echo "CFLAGS-bench-double-vlen4-$$t.c = \\"; \
> > > +          echo "  \$$(double-vlen4-arch-ext-cflags)"; \
> > > +          echo "CFLAGS-bench-double-vlen4-avx2-$$t.c = \\"; \
> > > +          echo "  \$$(double-vlen4-arch-ext2-cflags)"; \
> > > +          echo "CFLAGS-bench-double-vlen8-$$t.c = \\"; \
> > > +          echo "  \$$(double-vlen8-arch-ext-cflags)"; \
> > > +          echo; \
> > > +          echo "CFLAGS-bench-float-vlen8-$${t}f.c = \\"; \
> > > +          echo "  \$$(float-vlen8-arch-ext-cflags)"; \
> > > +          echo "CFLAGS-bench-float-vlen8-avx2-$${t}f.c = \\"; \
> > > +          echo "  \$$(float-vlen8-arch-ext2-cflags)"; \
> > > +          echo "CFLAGS-bench-float-vlen16-$${t}f.c = \\"; \
> > > +          echo "  \$$(float-vlen16-arch-ext-cflags)"; \
> > > +          echo; \
> > > +        done; \
> > >          echo "endif") > $@T
> > >         mv -f $@T $@
> > > diff --git a/sysdeps/x86_64/fpu/Makefile b/sysdeps/x86_64/fpu/Makefile
> > > index d172ae815d..9fb587cf8f 100644
> > > --- a/sysdeps/x86_64/fpu/Makefile
> > > +++ b/sysdeps/x86_64/fpu/Makefile
> > > @@ -72,3 +72,43 @@ ifeq ($(subdir)$(config-cflags-mprefer-vector-width),mathyes)
> > >  # performance of sin and cos by more than 40% on Skylake.
> > >  CFLAGS-branred.c = -mprefer-vector-width=128
> > >  endif
> > > +
> > > +ifeq ($(subdir),benchtests)
> > > +double-vlen4-arch-ext-cflags = -mavx
> > > +double-vlen4-arch-ext2-cflags = -mavx2
> > > +double-vlen8-arch-ext-cflags = -mavx512f
> > > +
> > > +float-vlen8-arch-ext-cflags = -mavx
> > > +float-vlen8-arch-ext2-cflags = -mavx2
> > > +float-vlen16-arch-ext-cflags = -mavx512f
> > > +
> > > +bench-libmvec := $(bench-libmvec-double) $(bench-libmvec-float)
> > > +
> > > +ifeq (${BENCHSET},)
> > > +bench += $(bench-libmvec)
> > > +endif
> > > +
> > > +ifeq (${STATIC-BENCHTESTS},yes)
> > > +libmvec-benchtests = $(common-objpfx)mathvec/libmvec.a $(common-objpfx)math/libm.a
> > > +else
> > > +libmvec-benchtests = $(libmvec) $(libm)
> > > +endif
> > > +
> > > +$(addprefix $(objpfx)bench-,$(bench-libmvec-double)): $(libmvec-benchtests)
> > > +$(addprefix $(objpfx)bench-,$(bench-libmvec-float)): $(libmvec-benchtests)
> > > +bench-libmvec-deps = $(..)sysdeps/x86_64/fpu/bench-libmvec-skeleton.c bench-timing.h Makefile
> > > +
> > > +$(objpfx)bench-float-%.c: $(bench-libmvec-deps)
> > > +       { if [ -n "$($*-INCLUDE)" ]; then \
> > > +         cat $($*-INCLUDE); \
> > > +       fi; \
> > > +       $(PYTHON) $(..)sysdeps/x86_64/fpu/scripts/bench_libmvec.py $(basename $(@F)); } > $@-tmp
> > > +       mv -f $@-tmp $@
> > > +
> > > +$(objpfx)bench-double-%.c: $(bench-libmvec-deps)
> > > +       { if [ -n "$($*-INCLUDE)" ]; then \
> > > +         cat $($*-INCLUDE); \
> > > +       fi; \
> > > +       $(PYTHON) $(..)sysdeps/x86_64/fpu/scripts/bench_libmvec.py $(basename $(@F)); } > $@-tmp
> > > +       mv -f $@-tmp $@
> > > +endif
> > > diff --git a/sysdeps/x86_64/fpu/bench-libmvec-skeleton.c b/sysdeps/x86_64/fpu/bench-libmvec-skeleton.c
> > > new file mode 100644
> > > index 0000000000..d56a0c4462
> > > --- /dev/null
> > > +++ b/sysdeps/x86_64/fpu/bench-libmvec-skeleton.c
> > > @@ -0,0 +1,104 @@
> > > +/* Skeleton for libmvec benchmark programs.
> > > +   Copyright (C) 2021 Free Software Foundation, Inc.
> > > +   This file is part of the GNU C Library.
> > > +
> > > +   The GNU C Library is free software; you can redistribute it and/or
> > > +   modify it under the terms of the GNU Lesser General Public
> > > +   License as published by the Free Software Foundation; either
> > > +   version 2.1 of the License, or (at your option) any later version.
> > > +
> > > +   The GNU C Library is distributed in the hope that it will be useful,
> > > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > +   Lesser General Public License for more details.
> > > +
> > > +   You should have received a copy of the GNU Lesser General Public
> > > +   License along with the GNU C Library; if not, see
> > > +   <https://www.gnu.org/licenses/>.  */
> > > +
> > > +#include <string.h>
> > > +#include <stdint.h>
> > > +#include <stdbool.h>
> > > +#include <stdio.h>
> > > +#include <time.h>
> > > +#include <inttypes.h>
> > > +#include <bench-timing.h>
> > > +#include <json-lib.h>
> > > +#include <bench-util.h>
> > > +
> > > +#include <bench-util.c>
> > > +#include <math-tests-arch.h>
> > > +#define D_ITERS 10000
> > > +
> > > +int
> > > +main (int argc, char **argv)
> > > +{
> > > +  unsigned long i, k;
> > > +  timing_t start, end;
> > > +  json_ctx_t json_ctx;
> > > +
> > > +#if defined REQUIRE_AVX
> > > +  if (!CPU_FEATURE_ACTIVE (AVX))
> > > +    {
> > > +      printf ("AVX not supported.\n");
> > > +      return 0;
> > > +    }
> > > +#elif defined REQUIRE_AVX2
> > > +  if (!CPU_FEATURE_ACTIVE (AVX2))
> > > +    {
> > > +      printf ("AVX2 not supported.\n");
> > > +      return 0;
> > > +    }
> > > +#elif defined REQUIRE_AVX512F
> > > +  if (!CPU_FEATURE_ACTIVE (AVX512F))
> > > +    {
> > > +      printf ("AVX512F not supported.\n");
> > > +      return 0;
> > > +    }
> > > +#endif
> > > +
> > > +  bench_start ();
> > > +
> > > +#ifdef BENCH_INIT
> > > +  BENCH_INIT ();
> > > +#endif
> > > +
> > > +  json_init (&json_ctx, 2, stdout);
> > > +
> > > +  /* Begin function.  */
> > > +  json_attr_object_begin (&json_ctx, FUNCNAME);
> > > +
> > > +  for (int v = 0; v < NUM_VARIANTS; v++)
> > > +    {
> > > +      double d_total_time = 0;
> > > +      uint64_t cur;
> >
> > Think these should also be type `timing_t`
> >
>
> I do not see a difference whether I use timing_t or uint64_t. In any case,
> the variable cur stores the difference between the start and end times,
> not a point in time.
>
>
> >
> > > +      for (k = 0; k < D_ITERS; k++)
> > > +       {
> > > +         TIMING_NOW (start);
> > > +         for (i = 0; i < NUM_SAMPLES (v); i++)
> >
> > What is the rationale for both `D_ITERS` and `NUM_SAMPLES (v)`? Why not
> > one loop that iterates for `D_ITERS * NUM_SAMPLES (v)`?
> >
>
> D_ITERS defines how many times the full data set of each variant is run.
> NUM_SAMPLES (v) is the number of data sets in variant v. Indices v and i
> select the i'th data set of variant v, which is passed to the vector
> function. Having two loops simplifies the logic.
>
>
> > > +           BENCH_FUNC (v, i);
> > > +         TIMING_NOW (end);
> > > +
> > > +         TIMING_DIFF (cur, start, end);
> > > +
> > > +         d_total_time += cur;
> >
> > Think this should be `TIMING_ACCUM(d_total_time, cur)`.
> >
>
> There is not much difference between using TIMING_ACCUM and simply adding
> cur to d_total_time.
>

Please use TIMING_ACCUM (d_total_time, cur) to be consistent with
TIMING_DIFF (cur, start, end).

Thanks.


-- 
H.J.
