public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jason Merrill <jason@redhat.com>
To: gcc-patches List <gcc-patches@gcc.gnu.org>
Cc: "libstdc++" <libstdc++@gcc.gnu.org>,
	libc-alpha@sourceware.org,
	 "Richard Earnshaw (lists)" <Richard.Earnshaw@arm.com>
Subject: Re: [PATCH] c++: implement C++17 hardware interference size
Date: Thu, 15 Jul 2021 22:41:17 -0400	[thread overview]
Message-ID: <CADzB+2ktqA=nzYS5zJAxH9TfFexP-OzZw8SoPOd907-qxVbWpg@mail.gmail.com> (raw)
In-Reply-To: <20210716023656.670004-1-jason@redhat.com>

Adding CCs that got lost in the initial mail.

On Thu, Jul 15, 2021 at 10:36 PM Jason Merrill <jason@redhat.com> wrote:

> The last missing piece of the C++17 standard library is the hardware
> intereference size constants.  Much of the delay in implementing these has
> been due to uncertainty about what the right values are, and even whether
> there is a single constant value that is suitable; the destructive
> interference size is intended to be used in structure layout, so program
> ABIs will depend on it.
>
> In principle, both of these values should be the same as the target's L1
> cache line size.  When compiling for a generic target that is intended to
> support a range of target CPUs with different cache line sizes, the
> constructive size should probably be the minimum size, and the destructive
> size the maximum, unless you are constrained by ABI compatibility with
> previous code.
>
> JF Bastien's implementation proposal is summarized at
> https://github.com/itanium-cxx-abi/cxx-abi/issues/74
>
> I implement this by adding new --params for the two sizes.  Targets need to
> override these values in targetm.target_option.override() to support the
> feature.
>
> 64 bytes still seems correct for the x86 family.
>
> I'm not sure why he said 64/64 for 32-bit ARM, since the Cortex A9 has a
> 32-byte cache line, and that seems to be the only ARM_PREFETCH_BENEFICIAL
> target, so I'd think 32/64 would make more sense.
>
> He proposed 64/128 for AArch64, but since the A64FX now has a 256B cache
> line, I've changed that to 64/256.  Does that seem right?
>
> Currently the patch does not adjust the values based on -march, as in JF's
> proposal.  I'll need more guidance from the ARM/AArch64 maintainers about
> how to go about that.  --param l1-cache-line-size is set based on -mtune,
> but I don't think we want -mtune to change these ABI-affecting values.  Are
> there -march values for which a smaller range than 64-256 makes sense?
>
> gcc/ChangeLog:
>
>         * params.opt: Add destructive-interference-size and
>         constructive-interference-size.
>         * doc/invoke.texi: Document them.
>         * config/aarch64/aarch64.c (aarch64_override_options_internal):
>         Set them.
>         * config/arm/arm.c (arm_option_override): Set them.
>         * config/i386/i386-options.c (ix86_option_override_internal):
>         Set them.
>
> gcc/c-family/ChangeLog:
>
>         * c.opt: Add -Winterference-size.
>         * c-cppbuiltin.c (cpp_atomic_builtins): Add __GCC_DESTRUCTIVE_SIZE
>         and __GCC_CONSTRUCTIVE_SIZE.
>
> gcc/cp/ChangeLog:
>
>         * decl.c (cxx_init_decl_processing): Check
>         --param *-interference-size values.
>
> libstdc++-v3/ChangeLog:
>
>         * include/std/version: Define __cpp_lib_hardware_interference_size.
>         * libsupc++/new: Define hardware interference size variables.
>
> gcc/testsuite/ChangeLog:
>
>         * g++.target/aarch64/interference.C: New test.
>         * g++.target/arm/interference.C: New test.
>         * g++.target/i386/interference.C: New test.
> ---
>  gcc/doc/invoke.texi                           | 22 ++++++++++++++++++
>  gcc/c-family/c.opt                            |  5 ++++
>  gcc/params.opt                                | 15 ++++++++++++
>  gcc/c-family/c-cppbuiltin.c                   | 12 ++++++++++
>  gcc/config/aarch64/aarch64.c                  |  9 ++++++++
>  gcc/config/arm/arm.c                          |  6 +++++
>  gcc/config/i386/i386-options.c                |  6 +++++
>  gcc/cp/decl.c                                 | 23 +++++++++++++++++++
>  .../g++.target/aarch64/interference.C         |  9 ++++++++
>  gcc/testsuite/g++.target/arm/interference.C   |  9 ++++++++
>  gcc/testsuite/g++.target/i386/interference.C  |  8 +++++++
>  libstdc++-v3/include/std/version              |  3 +++
>  libstdc++-v3/libsupc++/new                    | 10 ++++++--
>  13 files changed, 135 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/g++.target/aarch64/interference.C
>  create mode 100644 gcc/testsuite/g++.target/arm/interference.C
>  create mode 100644 gcc/testsuite/g++.target/i386/interference.C
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index ea8812425e9..f93cb7a20f7 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -13857,6 +13857,28 @@ prefetch hints can be issued for any constant
> stride.
>
>  This setting is only useful for strides that are known and constant.
>
> +@item destructive_interference_size
> +@item constructive_interference_size
> +The values for the C++17 variables
> +@code{std::hardware_destructive_interference_size} and
> +@code{std::hardware_constructive_interference_size}.  The destructive
> +interference size is the minimum recommended offset between two
> +independent concurrently-accessed objects; the constructive
> +interference size is the maximum recommended size of contiguous memory
> +accessed together.  Typically both will be the size of an L1 cache
> +line for the target, in bytes.  If the target can have a range of L1
> +cache line sizes, typically the constructive interference size will be
> +the small end of the range and the destructive size will be the large
> +end.
> +
> +These values, particularly the destructive size, are intended to be
> +used for layout, and thus have ABI impact.  The default values can
> +change with @samp{-march}, and users should be aware of this and
> +perhaps specify these values directly if they need to be ABI
> +compatible with code that was compiled with a different @samp{-march}.
> +Changing the default values for a generic target should be done
> +cautiously.
> +
>  @item loop-interchange-max-num-stmts
>  The maximum number of stmts in a loop to be interchanged.
>
> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> index 91929706aff..0398faf430a 100644
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -722,6 +722,11 @@ Winit-list-lifetime
>  C++ ObjC++ Var(warn_init_list) Warning Init(1)
>  Warn about uses of std::initializer_list that can result in dangling
> pointers.
>
> +Winterference-size
> +C++ ObjC++ Var(warn_interference_size) Warning Init(1)
> +Warn about nonsensical values of --param destructive-interference-size or
> +constructive-interference-size.
> +
>  Wimplicit
>  C ObjC Var(warn_implicit) Warning LangEnabledBy(C ObjC,Wall)
>  Warn about implicit declarations.
> diff --git a/gcc/params.opt b/gcc/params.opt
> index 92b003e38cb..a81a3ec82f1 100644
> --- a/gcc/params.opt
> +++ b/gcc/params.opt
> @@ -358,6 +358,21 @@ The maximum code size growth ratio when expanding
> into a jump table (in percent)
>  Common Joined UInteger Var(param_l1_cache_line_size) Init(32) Param
> Optimization
>  The size of L1 cache line.
>
> +-param=destructive-interference-size=
> +Common Joined UInteger Var(param_destruct_interfere_size) Init(0) Param
> Optimization
> +The minimum recommended offset between two concurrently-accessed objects
> to
> +avoid additional performance degradation due to contention introduced by
> the
> +implementation.  Typically the L1 cache line size, but can be larger to
> +accommodate a variety of target processors with different cache line
> sizes.
> +C++17 code might use this value in structure layout.
> +
> +-param=constructive-interference-size=
> +Common Joined UInteger Var(param_construct_interfere_size) Init(0) Param
> Optimization
> +The maximum recommended size of contiguous memory occupied by two objects
> +accessed with temporal locality by concurrent threads.  Typically the L1
> cache
> +line size, but can be smaller to accommodate a variety of target
> processors with
> +different cache line sizes.
> +
>  -param=l1-cache-size=
>  Common Joined UInteger Var(param_l1_cache_size) Init(64) Param
> Optimization
>  The size of L1 cache.
> diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
> index f79f939bd10..a7bf2544533 100644
> --- a/gcc/c-family/c-cppbuiltin.c
> +++ b/gcc/c-family/c-cppbuiltin.c
> @@ -741,6 +741,18 @@ cpp_atomic_builtins (cpp_reader *pfile)
>    builtin_define_with_int_value ("__GCC_ATOMIC_TEST_AND_SET_TRUEVAL",
>                                  targetm.atomic_test_and_set_trueval);
>
> +  /* Macros for C++17 hardware interference size constants.  Either both
> or
> +     neither should be set.  */
> +  gcc_assert (!param_destruct_interfere_size
> +             == !param_construct_interfere_size);
> +  if (param_destruct_interfere_size)
> +    {
> +      builtin_define_with_int_value ("__GCC_DESTRUCTIVE_SIZE",
> +                                    param_destruct_interfere_size);
> +      builtin_define_with_int_value ("__GCC_CONSTRUCTIVE_SIZE",
> +                                    param_construct_interfere_size);
> +    }
> +
>    /* ptr_type_node can't be used here since ptr_mode is only set when
>       toplev calls backend_init which is not done with -E  or pch.  */
>    psize = POINTER_SIZE_UNITS;
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index f5b25a7f704..b172fcdc93e 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -16297,6 +16297,15 @@ aarch64_override_options_internal (struct
> gcc_options *opts)
>      SET_OPTION_IF_UNSET (opts, &global_options_set,
>                          param_l1_cache_line_size,
>                          aarch64_tune_params.prefetch->l1_cache_line_size);
> +  /* For a generic AArch64 target, cover the current range of cache line
> +     sizes.  */
> +  SET_OPTION_IF_UNSET (opts, &global_options_set,
> +                      param_destruct_interfere_size,
> +                      256);
> +  SET_OPTION_IF_UNSET (opts, &global_options_set,
> +                      param_construct_interfere_size,
> +                      64);
> +
>    if (aarch64_tune_params.prefetch->l2_cache_size >= 0)
>      SET_OPTION_IF_UNSET (opts, &global_options_set,
>                          param_l2_cache_size,
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 6d781e23ee9..edfa2ad3426 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -3656,6 +3656,12 @@ arm_option_override (void)
>      SET_OPTION_IF_UNSET (&global_options, &global_options_set,
>                          param_l1_cache_line_size,
>                          current_tune->prefetch.l1_cache_line_size);
> +  /* For a generic ARM target, JF Bastien proposed using 64 for both.
> +     ??? Why not 32 for constructive?  */
> +  SET_OPTION_IF_UNSET (&global_options, &global_options_set,
> +                      param_destruct_interfere_size, 64);
> +  SET_OPTION_IF_UNSET (&global_options, &global_options_set,
> +                      param_construct_interfere_size, 64);
>    if (current_tune->prefetch.l1_cache_size >= 0)
>      SET_OPTION_IF_UNSET (&global_options, &global_options_set,
>                          param_l1_cache_size,
> diff --git a/gcc/config/i386/i386-options.c
> b/gcc/config/i386/i386-options.c
> index 7cba655595e..3b1b3c838f8 100644
> --- a/gcc/config/i386/i386-options.c
> +++ b/gcc/config/i386/i386-options.c
> @@ -2571,6 +2571,12 @@ ix86_option_override_internal (bool main_args_p,
>    SET_OPTION_IF_UNSET (opts, opts_set, param_l2_cache_size,
>                        ix86_tune_cost->l2_cache_size);
>
> +  /* 64B is the accepted value for these for all x86.  */
> +  SET_OPTION_IF_UNSET (&global_options, &global_options_set,
> +                      param_destruct_interfere_size, 64);
> +  SET_OPTION_IF_UNSET (&global_options, &global_options_set,
> +                      param_construct_interfere_size, 64);
> +
>    /* Enable sw prefetching at -O3 for CPUS that prefetching is helpful.
> */
>    if (opts->x_flag_prefetch_loop_arrays < 0
>        && HAVE_prefetch
> diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
> index 01d64a16125..880fb8948a9 100644
> --- a/gcc/cp/decl.c
> +++ b/gcc/cp/decl.c
> @@ -4732,6 +4732,29 @@ cxx_init_decl_processing (void)
>    /* Show we use EH for cleanups.  */
>    if (flag_exceptions)
>      using_eh_for_cleanups ();
> +
> +  /* Check that the hardware interference sizes are at least
> +     alignof(max_align_t), as required by the standard.  */
> +  if (param_destruct_interfere_size)
> +    {
> +      int max_align = max_align_t_align () / BITS_PER_UNIT;
> +      if (param_destruct_interfere_size < max_align)
> +       error ("%<--param destructive-interference-size=%d%> is less than "
> +              "%d", param_destruct_interfere_size, max_align);
> +      else if (param_destruct_interfere_size < param_l1_cache_line_size)
> +       warning (OPT_Winterference_size,
> +                "%<--param destructive-interference-size=%d%> "
> +                "is less than %<--param l1-cache-line-size=%d%>",
> +                param_destruct_interfere_size, param_l1_cache_line_size);
> +      if (param_construct_interfere_size < max_align)
> +       error ("%<--param constructive-interference-size=%d%> is less than
> "
> +              "%d", param_construct_interfere_size, max_align);
> +      else if (param_construct_interfere_size > param_l1_cache_line_size)
> +       warning (OPT_Winterference_size,
> +                "%<--param constructive-interference-size=%d%> "
> +                "is greater than %<--param l1-cache-line-size=%d%>",
> +                param_construct_interfere_size, param_l1_cache_line_size);
> +    }
>  }
>
>  /* Enter an abi node in global-module context.  returns a cookie to
> diff --git a/gcc/testsuite/g++.target/aarch64/interference.C
> b/gcc/testsuite/g++.target/aarch64/interference.C
> new file mode 100644
> index 00000000000..0fc01655223
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/aarch64/interference.C
> @@ -0,0 +1,9 @@
> +// Test C++17 hardware interference size constants
> +// { dg-do compile { target c++17 } }
> +
> +#include <new>
> +
> +// Most AArch64 CPUs have an L1 cache line size of 64, but some recent
> ones use
> +// 128 or even 256.
> +static_assert(std::hardware_destructive_interference_size == 256);
> +static_assert(std::hardware_constructive_interference_size == 64);
> diff --git a/gcc/testsuite/g++.target/arm/interference.C
> b/gcc/testsuite/g++.target/arm/interference.C
> new file mode 100644
> index 00000000000..34fe8a52bff
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/arm/interference.C
> @@ -0,0 +1,9 @@
> +// Test C++17 hardware interference size constants
> +// { dg-do compile { target c++17 } }
> +
> +#include <new>
> +
> +// Recent ARM CPUs have a cache line size of 64.  Older ones have
> +// a size of 32, but I guess they're old enough that we don't care?
> +static_assert(std::hardware_destructive_interference_size == 64);
> +static_assert(std::hardware_constructive_interference_size == 64);
> diff --git a/gcc/testsuite/g++.target/i386/interference.C
> b/gcc/testsuite/g++.target/i386/interference.C
> new file mode 100644
> index 00000000000..c7b910e3ada
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/i386/interference.C
> @@ -0,0 +1,8 @@
> +// Test C++17 hardware interference size constants
> +// { dg-do compile { target c++17 } }
> +
> +#include <new>
> +
> +// It is generally agreed that these are the right values for all x86.
> +static_assert(std::hardware_destructive_interference_size == 64);
> +static_assert(std::hardware_constructive_interference_size == 64);
> diff --git a/libstdc++-v3/include/std/version
> b/libstdc++-v3/include/std/version
> index 27bcd32cb60..d5e155db48b 100644
> --- a/libstdc++-v3/include/std/version
> +++ b/libstdc++-v3/include/std/version
> @@ -140,6 +140,9 @@
>  #define __cpp_lib_filesystem 201703
>  #define __cpp_lib_gcd 201606
>  #define __cpp_lib_gcd_lcm 201606
> +#ifdef __GCC_DESTRUCTIVE_SIZE
> +# define __cpp_lib_hardware_interference_size 201703L
> +#endif
>  #define __cpp_lib_hypot 201603
>  #define __cpp_lib_invoke 201411L
>  #define __cpp_lib_lcm 201606
> diff --git a/libstdc++-v3/libsupc++/new b/libstdc++-v3/libsupc++/new
> index 3349b13fd1b..7bc67a6cb02 100644
> --- a/libstdc++-v3/libsupc++/new
> +++ b/libstdc++-v3/libsupc++/new
> @@ -183,9 +183,9 @@ inline void operator delete[](void*, void*)
> _GLIBCXX_USE_NOEXCEPT { }
>  } // extern "C++"
>
>  #if __cplusplus >= 201703L
> -#ifdef _GLIBCXX_HAVE_BUILTIN_LAUNDER
>  namespace std
>  {
> +#ifdef _GLIBCXX_HAVE_BUILTIN_LAUNDER
>  #define __cpp_lib_launder 201606
>    /// Pointer optimization barrier [ptr.launder]
>    template<typename _Tp>
> @@ -205,8 +205,14 @@ namespace std
>    void launder(const void*) = delete;
>    void launder(volatile void*) = delete;
>    void launder(const volatile void*) = delete;
> -}
>  #endif // _GLIBCXX_HAVE_BUILTIN_LAUNDER
> +
> +#ifdef __GCC_DESTRUCTIVE_SIZE
> +# define __cpp_lib_hardware_interference_size 201703L
> +  inline constexpr size_t hardware_destructive_interference_size =
> __GCC_DESTRUCTIVE_SIZE;
> +  inline constexpr size_t hardware_constructive_interference_size =
> __GCC_CONSTRUCTIVE_SIZE;
> +#endif // __GCC_DESTRUCTIVE_SIZE
> +}
>  #endif // C++17
>
>  #if __cplusplus > 201703L
>
> base-commit: f6dde32b9d487dd6e343d0a1e1d1f60783f5e735
> prerequisite-patch-id: 62730bcaf1f07786fd756efb6f3bbd94d778c092
> prerequisite-patch-id: 36fa087f6b261576422f8f3b1638f76e5183c95a
> --
> 2.27.0
>
>

  reply	other threads:[~2021-07-16  2:41 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-16  2:36 Jason Merrill
2021-07-16  2:41 ` Jason Merrill [this message]
2021-07-16  2:48   ` Noah Goldstein
2021-07-16 11:17     ` Jonathan Wakely
2021-07-16 13:27       ` Richard Earnshaw
2021-07-16 13:26   ` Jonathan Wakely
2021-07-16 15:12   ` Matthias Kretz
2021-07-16 15:30     ` Jason Merrill
2021-07-16 16:54       ` Jonathan Wakely
2021-07-16 18:43         ` Jason Merrill
2021-07-16 19:26         ` Matthias Kretz
2021-07-16 19:58           ` Jonathan Wakely
2021-07-17  8:14             ` Matthias Kretz
2021-07-17 13:32               ` Jonathan Wakely
2021-07-17 13:54                 ` Matthias Kretz
2021-07-17 21:37                   ` Jason Merrill
2021-07-19  9:41                     ` Richard Earnshaw
2021-07-20 16:43                       ` Jason Merrill
2021-09-10 13:16                         ` [PATCH RFC] " Jason Merrill
2021-09-14  7:56                           ` Christophe LYON
2021-09-15 10:25                             ` Richard Earnshaw
2021-09-15 11:30                               ` Christophe Lyon
2021-09-15 12:31                             ` Martin Liška
2021-09-15 15:31                               ` [pushed] c++: don't warn about internal interference sizes Jason Merrill
2021-09-15 15:36                                 ` Jeff Law
2021-09-15 15:38                                   ` Jason Merrill
2021-09-15 16:15                                     ` Christophe Lyon
2021-09-15 15:35                               ` [PATCH RFC] c++: implement C++17 hardware interference size Jason Merrill
2021-07-20 18:05                 ` [PATCH] " Thomas Rodgers
2021-07-16 17:20     ` Noah Goldstein
2021-07-16 19:37       ` Matthias Kretz
2021-07-16 21:23         ` Noah Goldstein

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CADzB+2ktqA=nzYS5zJAxH9TfFexP-OzZw8SoPOd907-qxVbWpg@mail.gmail.com' \
    --to=jason@redhat.com \
    --cc=Richard.Earnshaw@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=libc-alpha@sourceware.org \
    --cc=libstdc++@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).