From: Jason Merrill <jason@redhat.com>
To: gcc-patches List <gcc-patches@gcc.gnu.org>
Cc: "libstdc++" <libstdc++@gcc.gnu.org>,
libc-alpha@sourceware.org,
"Richard Earnshaw (lists)" <Richard.Earnshaw@arm.com>
Subject: Re: [PATCH] c++: implement C++17 hardware interference size
Date: Thu, 15 Jul 2021 22:41:17 -0400 [thread overview]
Message-ID: <CADzB+2ktqA=nzYS5zJAxH9TfFexP-OzZw8SoPOd907-qxVbWpg@mail.gmail.com> (raw)
In-Reply-To: <20210716023656.670004-1-jason@redhat.com>
Adding CCs that got lost in the initial mail.
On Thu, Jul 15, 2021 at 10:36 PM Jason Merrill <jason@redhat.com> wrote:
> The last missing piece of the C++17 standard library is the hardware
> intereference size constants. Much of the delay in implementing these has
> been due to uncertainty about what the right values are, and even whether
> there is a single constant value that is suitable; the destructive
> interference size is intended to be used in structure layout, so program
> ABIs will depend on it.
>
> In principle, both of these values should be the same as the target's L1
> cache line size. When compiling for a generic target that is intended to
> support a range of target CPUs with different cache line sizes, the
> constructive size should probably be the minimum size, and the destructive
> size the maximum, unless you are constrained by ABI compatibility with
> previous code.
>
> JF Bastien's implementation proposal is summarized at
> https://github.com/itanium-cxx-abi/cxx-abi/issues/74
>
> I implement this by adding new --params for the two sizes. Targets need to
> override these values in targetm.target_option.override() to support the
> feature.
>
> 64 bytes still seems correct for the x86 family.
>
> I'm not sure why he said 64/64 for 32-bit ARM, since the Cortex A9 has a
> 32-byte cache line, and that seems to be the only ARM_PREFETCH_BENEFICIAL
> target, so I'd think 32/64 would make more sense.
>
> He proposed 64/128 for AArch64, but since the A64FX now has a 256B cache
> line, I've changed that to 64/256. Does that seem right?
>
> Currently the patch does not adjust the values based on -march, as in JF's
> proposal. I'll need more guidance from the ARM/AArch64 maintainers about
> how to go about that. --param l1-cache-line-size is set based on -mtune,
> but I don't think we want -mtune to change these ABI-affecting values. Are
> there -march values for which a smaller range than 64-256 makes sense?
>
> gcc/ChangeLog:
>
> * params.opt: Add destructive-interference-size and
> constructive-interference-size.
> * doc/invoke.texi: Document them.
> * config/aarch64/aarch64.c (aarch64_override_options_internal):
> Set them.
> * config/arm/arm.c (arm_option_override): Set them.
> * config/i386/i386-options.c (ix86_option_override_internal):
> Set them.
>
> gcc/c-family/ChangeLog:
>
> * c.opt: Add -Winterference-size.
> * c-cppbuiltin.c (cpp_atomic_builtins): Add __GCC_DESTRUCTIVE_SIZE
> and __GCC_CONSTRUCTIVE_SIZE.
>
> gcc/cp/ChangeLog:
>
> * decl.c (cxx_init_decl_processing): Check
> --param *-interference-size values.
>
> libstdc++-v3/ChangeLog:
>
> * include/std/version: Define __cpp_lib_hardware_interference_size.
> * libsupc++/new: Define hardware interference size variables.
>
> gcc/testsuite/ChangeLog:
>
> * g++.target/aarch64/interference.C: New test.
> * g++.target/arm/interference.C: New test.
> * g++.target/i386/interference.C: New test.
> ---
> gcc/doc/invoke.texi | 22 ++++++++++++++++++
> gcc/c-family/c.opt | 5 ++++
> gcc/params.opt | 15 ++++++++++++
> gcc/c-family/c-cppbuiltin.c | 12 ++++++++++
> gcc/config/aarch64/aarch64.c | 9 ++++++++
> gcc/config/arm/arm.c | 6 +++++
> gcc/config/i386/i386-options.c | 6 +++++
> gcc/cp/decl.c | 23 +++++++++++++++++++
> .../g++.target/aarch64/interference.C | 9 ++++++++
> gcc/testsuite/g++.target/arm/interference.C | 9 ++++++++
> gcc/testsuite/g++.target/i386/interference.C | 8 +++++++
> libstdc++-v3/include/std/version | 3 +++
> libstdc++-v3/libsupc++/new | 10 ++++++--
> 13 files changed, 135 insertions(+), 2 deletions(-)
> create mode 100644 gcc/testsuite/g++.target/aarch64/interference.C
> create mode 100644 gcc/testsuite/g++.target/arm/interference.C
> create mode 100644 gcc/testsuite/g++.target/i386/interference.C
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index ea8812425e9..f93cb7a20f7 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -13857,6 +13857,28 @@ prefetch hints can be issued for any constant
> stride.
>
> This setting is only useful for strides that are known and constant.
>
> +@item destructive_interference_size
> +@item constructive_interference_size
> +The values for the C++17 variables
> +@code{std::hardware_destructive_interference_size} and
> +@code{std::hardware_constructive_interference_size}. The destructive
> +interference size is the minimum recommended offset between two
> +independent concurrently-accessed objects; the constructive
> +interference size is the maximum recommended size of contiguous memory
> +accessed together. Typically both will be the size of an L1 cache
> +line for the target, in bytes. If the target can have a range of L1
> +cache line sizes, typically the constructive interference size will be
> +the small end of the range and the destructive size will be the large
> +end.
> +
> +These values, particularly the destructive size, are intended to be
> +used for layout, and thus have ABI impact. The default values can
> +change with @samp{-march}, and users should be aware of this and
> +perhaps specify these values directly if they need to be ABI
> +compatible with code that was compiled with a different @samp{-march}.
> +Changing the default values for a generic target should be done
> +cautiously.
> +
> @item loop-interchange-max-num-stmts
> The maximum number of stmts in a loop to be interchanged.
>
> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> index 91929706aff..0398faf430a 100644
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -722,6 +722,11 @@ Winit-list-lifetime
> C++ ObjC++ Var(warn_init_list) Warning Init(1)
> Warn about uses of std::initializer_list that can result in dangling
> pointers.
>
> +Winterference-size
> +C++ ObjC++ Var(warn_interference_size) Warning Init(1)
> +Warn about nonsensical values of --param destructive-interference-size or
> +constructive-interference-size.
> +
> Wimplicit
> C ObjC Var(warn_implicit) Warning LangEnabledBy(C ObjC,Wall)
> Warn about implicit declarations.
> diff --git a/gcc/params.opt b/gcc/params.opt
> index 92b003e38cb..a81a3ec82f1 100644
> --- a/gcc/params.opt
> +++ b/gcc/params.opt
> @@ -358,6 +358,21 @@ The maximum code size growth ratio when expanding
> into a jump table (in percent)
> Common Joined UInteger Var(param_l1_cache_line_size) Init(32) Param
> Optimization
> The size of L1 cache line.
>
> +-param=destructive-interference-size=
> +Common Joined UInteger Var(param_destruct_interfere_size) Init(0) Param
> Optimization
> +The minimum recommended offset between two concurrently-accessed objects
> to
> +avoid additional performance degradation due to contention introduced by
> the
> +implementation. Typically the L1 cache line size, but can be larger to
> +accommodate a variety of target processors with different cache line
> sizes.
> +C++17 code might use this value in structure layout.
> +
> +-param=constructive-interference-size=
> +Common Joined UInteger Var(param_construct_interfere_size) Init(0) Param
> Optimization
> +The maximum recommended size of contiguous memory occupied by two objects
> +accessed with temporal locality by concurrent threads. Typically the L1
> cache
> +line size, but can be smaller to accommodate a variety of target
> processors with
> +different cache line sizes.
> +
> -param=l1-cache-size=
> Common Joined UInteger Var(param_l1_cache_size) Init(64) Param
> Optimization
> The size of L1 cache.
> diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
> index f79f939bd10..a7bf2544533 100644
> --- a/gcc/c-family/c-cppbuiltin.c
> +++ b/gcc/c-family/c-cppbuiltin.c
> @@ -741,6 +741,18 @@ cpp_atomic_builtins (cpp_reader *pfile)
> builtin_define_with_int_value ("__GCC_ATOMIC_TEST_AND_SET_TRUEVAL",
> targetm.atomic_test_and_set_trueval);
>
> + /* Macros for C++17 hardware interference size constants. Either both
> or
> + neither should be set. */
> + gcc_assert (!param_destruct_interfere_size
> + == !param_construct_interfere_size);
> + if (param_destruct_interfere_size)
> + {
> + builtin_define_with_int_value ("__GCC_DESTRUCTIVE_SIZE",
> + param_destruct_interfere_size);
> + builtin_define_with_int_value ("__GCC_CONSTRUCTIVE_SIZE",
> + param_construct_interfere_size);
> + }
> +
> /* ptr_type_node can't be used here since ptr_mode is only set when
> toplev calls backend_init which is not done with -E or pch. */
> psize = POINTER_SIZE_UNITS;
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index f5b25a7f704..b172fcdc93e 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -16297,6 +16297,15 @@ aarch64_override_options_internal (struct
> gcc_options *opts)
> SET_OPTION_IF_UNSET (opts, &global_options_set,
> param_l1_cache_line_size,
> aarch64_tune_params.prefetch->l1_cache_line_size);
> + /* For a generic AArch64 target, cover the current range of cache line
> + sizes. */
> + SET_OPTION_IF_UNSET (opts, &global_options_set,
> + param_destruct_interfere_size,
> + 256);
> + SET_OPTION_IF_UNSET (opts, &global_options_set,
> + param_construct_interfere_size,
> + 64);
> +
> if (aarch64_tune_params.prefetch->l2_cache_size >= 0)
> SET_OPTION_IF_UNSET (opts, &global_options_set,
> param_l2_cache_size,
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 6d781e23ee9..edfa2ad3426 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -3656,6 +3656,12 @@ arm_option_override (void)
> SET_OPTION_IF_UNSET (&global_options, &global_options_set,
> param_l1_cache_line_size,
> current_tune->prefetch.l1_cache_line_size);
> + /* For a generic ARM target, JF Bastien proposed using 64 for both.
> + ??? Why not 32 for constructive? */
> + SET_OPTION_IF_UNSET (&global_options, &global_options_set,
> + param_destruct_interfere_size, 64);
> + SET_OPTION_IF_UNSET (&global_options, &global_options_set,
> + param_construct_interfere_size, 64);
> if (current_tune->prefetch.l1_cache_size >= 0)
> SET_OPTION_IF_UNSET (&global_options, &global_options_set,
> param_l1_cache_size,
> diff --git a/gcc/config/i386/i386-options.c
> b/gcc/config/i386/i386-options.c
> index 7cba655595e..3b1b3c838f8 100644
> --- a/gcc/config/i386/i386-options.c
> +++ b/gcc/config/i386/i386-options.c
> @@ -2571,6 +2571,12 @@ ix86_option_override_internal (bool main_args_p,
> SET_OPTION_IF_UNSET (opts, opts_set, param_l2_cache_size,
> ix86_tune_cost->l2_cache_size);
>
> + /* 64B is the accepted value for these for all x86. */
> + SET_OPTION_IF_UNSET (&global_options, &global_options_set,
> + param_destruct_interfere_size, 64);
> + SET_OPTION_IF_UNSET (&global_options, &global_options_set,
> + param_construct_interfere_size, 64);
> +
> /* Enable sw prefetching at -O3 for CPUS that prefetching is helpful.
> */
> if (opts->x_flag_prefetch_loop_arrays < 0
> && HAVE_prefetch
> diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
> index 01d64a16125..880fb8948a9 100644
> --- a/gcc/cp/decl.c
> +++ b/gcc/cp/decl.c
> @@ -4732,6 +4732,29 @@ cxx_init_decl_processing (void)
> /* Show we use EH for cleanups. */
> if (flag_exceptions)
> using_eh_for_cleanups ();
> +
> + /* Check that the hardware interference sizes are at least
> + alignof(max_align_t), as required by the standard. */
> + if (param_destruct_interfere_size)
> + {
> + int max_align = max_align_t_align () / BITS_PER_UNIT;
> + if (param_destruct_interfere_size < max_align)
> + error ("%<--param destructive-interference-size=%d%> is less than "
> + "%d", param_destruct_interfere_size, max_align);
> + else if (param_destruct_interfere_size < param_l1_cache_line_size)
> + warning (OPT_Winterference_size,
> + "%<--param destructive-interference-size=%d%> "
> + "is less than %<--param l1-cache-line-size=%d%>",
> + param_destruct_interfere_size, param_l1_cache_line_size);
> + if (param_construct_interfere_size < max_align)
> + error ("%<--param constructive-interference-size=%d%> is less than
> "
> + "%d", param_construct_interfere_size, max_align);
> + else if (param_construct_interfere_size > param_l1_cache_line_size)
> + warning (OPT_Winterference_size,
> + "%<--param constructive-interference-size=%d%> "
> + "is greater than %<--param l1-cache-line-size=%d%>",
> + param_construct_interfere_size, param_l1_cache_line_size);
> + }
> }
>
> /* Enter an abi node in global-module context. returns a cookie to
> diff --git a/gcc/testsuite/g++.target/aarch64/interference.C
> b/gcc/testsuite/g++.target/aarch64/interference.C
> new file mode 100644
> index 00000000000..0fc01655223
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/aarch64/interference.C
> @@ -0,0 +1,9 @@
> +// Test C++17 hardware interference size constants
> +// { dg-do compile { target c++17 } }
> +
> +#include <new>
> +
> +// Most AArch64 CPUs have an L1 cache line size of 64, but some recent
> ones use
> +// 128 or even 256.
> +static_assert(std::hardware_destructive_interference_size == 256);
> +static_assert(std::hardware_constructive_interference_size == 64);
> diff --git a/gcc/testsuite/g++.target/arm/interference.C
> b/gcc/testsuite/g++.target/arm/interference.C
> new file mode 100644
> index 00000000000..34fe8a52bff
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/arm/interference.C
> @@ -0,0 +1,9 @@
> +// Test C++17 hardware interference size constants
> +// { dg-do compile { target c++17 } }
> +
> +#include <new>
> +
> +// Recent ARM CPUs have a cache line size of 64. Older ones have
> +// a size of 32, but I guess they're old enough that we don't care?
> +static_assert(std::hardware_destructive_interference_size == 64);
> +static_assert(std::hardware_constructive_interference_size == 64);
> diff --git a/gcc/testsuite/g++.target/i386/interference.C
> b/gcc/testsuite/g++.target/i386/interference.C
> new file mode 100644
> index 00000000000..c7b910e3ada
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/i386/interference.C
> @@ -0,0 +1,8 @@
> +// Test C++17 hardware interference size constants
> +// { dg-do compile { target c++17 } }
> +
> +#include <new>
> +
> +// It is generally agreed that these are the right values for all x86.
> +static_assert(std::hardware_destructive_interference_size == 64);
> +static_assert(std::hardware_constructive_interference_size == 64);
> diff --git a/libstdc++-v3/include/std/version
> b/libstdc++-v3/include/std/version
> index 27bcd32cb60..d5e155db48b 100644
> --- a/libstdc++-v3/include/std/version
> +++ b/libstdc++-v3/include/std/version
> @@ -140,6 +140,9 @@
> #define __cpp_lib_filesystem 201703
> #define __cpp_lib_gcd 201606
> #define __cpp_lib_gcd_lcm 201606
> +#ifdef __GCC_DESTRUCTIVE_SIZE
> +# define __cpp_lib_hardware_interference_size 201703L
> +#endif
> #define __cpp_lib_hypot 201603
> #define __cpp_lib_invoke 201411L
> #define __cpp_lib_lcm 201606
> diff --git a/libstdc++-v3/libsupc++/new b/libstdc++-v3/libsupc++/new
> index 3349b13fd1b..7bc67a6cb02 100644
> --- a/libstdc++-v3/libsupc++/new
> +++ b/libstdc++-v3/libsupc++/new
> @@ -183,9 +183,9 @@ inline void operator delete[](void*, void*)
> _GLIBCXX_USE_NOEXCEPT { }
> } // extern "C++"
>
> #if __cplusplus >= 201703L
> -#ifdef _GLIBCXX_HAVE_BUILTIN_LAUNDER
> namespace std
> {
> +#ifdef _GLIBCXX_HAVE_BUILTIN_LAUNDER
> #define __cpp_lib_launder 201606
> /// Pointer optimization barrier [ptr.launder]
> template<typename _Tp>
> @@ -205,8 +205,14 @@ namespace std
> void launder(const void*) = delete;
> void launder(volatile void*) = delete;
> void launder(const volatile void*) = delete;
> -}
> #endif // _GLIBCXX_HAVE_BUILTIN_LAUNDER
> +
> +#ifdef __GCC_DESTRUCTIVE_SIZE
> +# define __cpp_lib_hardware_interference_size 201703L
> + inline constexpr size_t hardware_destructive_interference_size =
> __GCC_DESTRUCTIVE_SIZE;
> + inline constexpr size_t hardware_constructive_interference_size =
> __GCC_CONSTRUCTIVE_SIZE;
> +#endif // __GCC_DESTRUCTIVE_SIZE
> +}
> #endif // C++17
>
> #if __cplusplus > 201703L
>
> base-commit: f6dde32b9d487dd6e343d0a1e1d1f60783f5e735
> prerequisite-patch-id: 62730bcaf1f07786fd756efb6f3bbd94d778c092
> prerequisite-patch-id: 36fa087f6b261576422f8f3b1638f76e5183c95a
> --
> 2.27.0
>
>
next prev parent reply other threads:[~2021-07-16 2:41 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-07-16 2:36 Jason Merrill
2021-07-16 2:41 ` Jason Merrill [this message]
2021-07-16 2:48 ` Noah Goldstein
2021-07-16 11:17 ` Jonathan Wakely
2021-07-16 13:27 ` Richard Earnshaw
2021-07-16 13:26 ` Jonathan Wakely
2021-07-16 15:12 ` Matthias Kretz
2021-07-16 15:30 ` Jason Merrill
2021-07-16 16:54 ` Jonathan Wakely
2021-07-16 18:43 ` Jason Merrill
2021-07-16 19:26 ` Matthias Kretz
2021-07-16 19:58 ` Jonathan Wakely
2021-07-17 8:14 ` Matthias Kretz
2021-07-17 13:32 ` Jonathan Wakely
2021-07-17 13:54 ` Matthias Kretz
2021-07-17 21:37 ` Jason Merrill
2021-07-19 9:41 ` Richard Earnshaw
2021-07-20 16:43 ` Jason Merrill
2021-09-10 13:16 ` [PATCH RFC] " Jason Merrill
2021-09-14 7:56 ` Christophe LYON
2021-09-15 10:25 ` Richard Earnshaw
2021-09-15 11:30 ` Christophe Lyon
2021-09-15 12:31 ` Martin Liška
2021-09-15 15:31 ` [pushed] c++: don't warn about internal interference sizes Jason Merrill
2021-09-15 15:36 ` Jeff Law
2021-09-15 15:38 ` Jason Merrill
2021-09-15 16:15 ` Christophe Lyon
2021-09-15 15:35 ` [PATCH RFC] c++: implement C++17 hardware interference size Jason Merrill
2021-07-20 18:05 ` [PATCH] " Thomas Rodgers
2021-07-16 17:20 ` Noah Goldstein
2021-07-16 19:37 ` Matthias Kretz
2021-07-16 21:23 ` Noah Goldstein
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CADzB+2ktqA=nzYS5zJAxH9TfFexP-OzZw8SoPOd907-qxVbWpg@mail.gmail.com' \
--to=jason@redhat.com \
--cc=Richard.Earnshaw@arm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=libc-alpha@sourceware.org \
--cc=libstdc++@gcc.gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).