[gcc r14-5579] gcov: Improve -fprofile-update=atomic
From: Sebastian Huber @ 2023-11-18 11:45 UTC
To: gcc-cvs
https://gcc.gnu.org/g:20a3c74c347429c109bc7002285b735be83f6a0b
commit r14-5579-g20a3c74c347429c109bc7002285b735be83f6a0b
Author: Sebastian Huber <sebastian.huber@embedded-brains.de>
Date: Tue Nov 14 21:36:51 2023 +0100

gcov: Improve -fprofile-update=atomic

The code coverage support uses counters to determine which edges in the
control flow graph were executed.  If a counter overflows, then the code
coverage information is invalid.  Therefore, the counter type should be a
64-bit integer.

In multi-threaded applications, it is important that the counter increments
are atomic.  This is not the case by default.  The user can enable atomic
counter increments through the -fprofile-update=atomic and
-fprofile-update=prefer-atomic options.
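
To see why the default non-atomic updates corrupt counters in multi-threaded
programs, consider the following stand-alone C sketch (an illustration only,
not part of the patch; all names are invented for the example):

  #include <pthread.h>
  #include <stdint.h>
  #include <stdio.h>

  static uint64_t counter; /* stands in for a gcov edge counter */

  static void *
  worker (void *arg)
  {
    (void) arg;
    for (int i = 0; i < 1000000; i++)
      counter++; /* plain load/add/store: concurrent updates can be lost */
    return NULL;
  }

  int
  main (void)
  {
    pthread_t t1, t2;
    pthread_create (&t1, NULL, worker, NULL);
    pthread_create (&t2, NULL, worker, NULL);
    pthread_join (t1, NULL);
    pthread_join (t2, NULL);
    /* On multi-core hardware this usually prints less than 2000000.  */
    printf ("%llu\n", (unsigned long long) counter);
    return 0;
  }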

If the target supports 64-bit atomic operations, then everything is fine.  If
not and -fprofile-update=prefer-atomic was chosen by the user, then non-atomic
counter increments will be used.  However, if the target does not support the
required atomic operations and -fprofile-update=atomic was chosen by the user,
then a warning was issued and a forced fallback to non-atomic operations was
made.  This is probably not what a user wants.  There is still hardware on the
market which does not have atomic operations and is used for multi-threaded
applications.  A user who selects -fprofile-update=atomic wants consistent
code coverage data and not random data.

This patch removes the fallback to non-atomic operations for
-fprofile-update=atomic if the target platform supports libatomic.  To
mitigate potential performance issues, an optimization for systems which
only support 32-bit atomic operations is provided.  Here, the edge
counter increments are done like this:

  low = __atomic_add_fetch_4 (&counter.low, 1, MEMMODEL_RELAXED);
  high_inc = low == 0 ? 1 : 0;
  __atomic_add_fetch_4 (&counter.high, high_inc, MEMMODEL_RELAXED);

In gimple_gen_time_profiler() this split operation cannot be used, since the
updated counter value is also required.  Here, a library call is emitted.
This is not a performance issue since the update is only done if
counters[0] == 0.
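
For illustration, the split update can be expressed as the following
self-contained C sketch (a sketch only, not code from the patch; the struct
and function names are invented, the layout assumes a little-endian host, and
the generated code swaps the two halves on big-endian targets):

  #include <stdint.h>

  /* A 64-bit counter viewed as two 32-bit halves, low word first.  */
  struct split_counter
  {
    uint32_t low;
    uint32_t high;
  };

  /* Atomically increment the low word and carry into the high word when
     the low word wraps to zero.  Both operations use relaxed ordering,
     matching MEMMODEL_RELAXED in the patch.  A reader or signal handler
     running between the two operations can observe the counter in an
     inconsistent state, which is the caveat documented in invoke.texi.  */
  static inline void
  split_counter_increment (struct split_counter *c)
  {
    uint32_t low = __atomic_add_fetch (&c->low, 1, __ATOMIC_RELAXED);
    uint32_t high_inc = low == 0 ? 1 : 0;
    __atomic_add_fetch (&c->high, high_inc, __ATOMIC_RELAXED);
  }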

gcc/c-family/ChangeLog:

	* c-cppbuiltin.cc (c_cpp_builtins): Define
	__LIBGCC_HAVE_LIBATOMIC for libgcov.

gcc/ChangeLog:

	* doc/invoke.texi (-fprofile-update): Clarify default method.
	Document the atomic method behaviour.
	* tree-profile.cc (enum counter_update_method): New.
	(counter_update): Likewise.
	(gen_counter_update): Use counter_update_method.  Split the
	atomic counter update in two 32-bit atomic operations if
	necessary.
	(tree_profiling): Select counter_update_method.

libgcc/ChangeLog:

	* libgcov.h (GCOV_SUPPORTS_ATOMIC): Always define it.
	Also set it to 1 if __LIBGCC_HAVE_LIBATOMIC is defined.

Diff:
---
gcc/c-family/c-cppbuiltin.cc | 2 +
gcc/doc/invoke.texi | 19 ++++++++-
gcc/tree-profile.cc | 99 ++++++++++++++++++++++++++++++++++++++++----
libgcc/libgcov.h | 10 ++---
4 files changed, 114 insertions(+), 16 deletions(-)
diff --git a/gcc/c-family/c-cppbuiltin.cc b/gcc/c-family/c-cppbuiltin.cc
index 5a5309d0d09..8d2f07a96ce 100644
--- a/gcc/c-family/c-cppbuiltin.cc
+++ b/gcc/c-family/c-cppbuiltin.cc
@@ -1567,6 +1567,8 @@ c_cpp_builtins (cpp_reader *pfile)
       /* For libgcov.  */
       builtin_define_with_int_value ("__LIBGCC_VTABLE_USES_DESCRIPTORS__",
                                      TARGET_VTABLE_USES_DESCRIPTORS);
+      builtin_define_with_int_value ("__LIBGCC_HAVE_LIBATOMIC",
+                                     targetm.have_libatomic);
     }
 
   /* For use in assembly language.  */
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c0b571327fb..557d613a1e6 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -16982,7 +16982,24 @@ while the second one prevents profile corruption by emitting thread-safe code.
 Using @samp{prefer-atomic} would be transformed either to @samp{atomic},
 when supported by a target, or to @samp{single} otherwise.  The GCC driver
 automatically selects @samp{prefer-atomic} when @option{-pthread}
-is present in the command line.
+is present in the command line; otherwise the default method is @samp{single}.
+
+If @samp{atomic} is selected, then the profile information is updated using
+atomic operations on a best-effort basis.  Ideally, the profile information is
+updated through atomic operations in hardware.  If the target platform does not
+support the required atomic operations in hardware but @file{libatomic} is
+available, then the profile information is updated through calls to
+@file{libatomic}.  If the target platform supports neither the required atomic
+operations in hardware nor @file{libatomic}, then the profile information is
+not atomically updated and a warning is issued.  In this case, the obtained
+profiling information may be corrupt for multi-threaded applications.
+
+For performance reasons, if 64-bit counters are used for the profiling
+information and the target platform only supports 32-bit atomic operations in
+hardware, then the performance-critical profiling updates are done using two
+32-bit atomic operations for each counter update.  If a signal interrupts these
+two operations updating a counter, then the profiling information may be in an
+inconsistent state.
 
 @opindex fprofile-filter-files
 @item -fprofile-filter-files=@var{regex}
diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
index 24805ff905c..7d3cb1ef089 100644
--- a/gcc/tree-profile.cc
+++ b/gcc/tree-profile.cc
@@ -73,6 +73,41 @@ static GTY(()) tree ic_tuple_var;
 static GTY(()) tree ic_tuple_counters_field;
 static GTY(()) tree ic_tuple_callee_field;
 
+/* Types of counter update methods.
+
+   By default, the counter updates are done for a single-threaded system
+   (COUNTER_UPDATE_SINGLE_THREAD).
+
+   If the user selected atomic profile counter updates
+   (-fprofile-update=atomic), then the counter updates will be done atomically
+   on a best-effort basis.  One of three methods to do the counter updates is
+   selected according to the target capabilities.
+
+   Ideally, the counter updates are done through atomic operations in hardware
+   (COUNTER_UPDATE_ATOMIC_BUILTIN).
+
+   If the target supports only 32-bit atomic increments and gcov_type_node is a
+   64-bit integer type, then for the profile edge counters the increment is
+   performed through two separate 32-bit atomic increments
+   (COUNTER_UPDATE_ATOMIC_SPLIT or COUNTER_UPDATE_ATOMIC_PARTIAL).  If the
+   target supports libatomic (targetm.have_libatomic), then the other counter
+   updates are carried out by libatomic calls (COUNTER_UPDATE_ATOMIC_SPLIT).
+   If the target does not support libatomic, then the other counter updates are
+   not done atomically (COUNTER_UPDATE_ATOMIC_PARTIAL) and a warning is
+   issued.
+
+   If the target does not support atomic operations in hardware but it
+   supports libatomic, then all updates are carried out by libatomic calls
+   (COUNTER_UPDATE_ATOMIC_BUILTIN).  */
+enum counter_update_method {
+  COUNTER_UPDATE_SINGLE_THREAD,
+  COUNTER_UPDATE_ATOMIC_BUILTIN,
+  COUNTER_UPDATE_ATOMIC_SPLIT,
+  COUNTER_UPDATE_ATOMIC_PARTIAL
+};
+
+static counter_update_method counter_update = COUNTER_UPDATE_SINGLE_THREAD;
+
 /* Do initialization work for the edge profiler.  */
 
 /* Add code:
@@ -269,7 +304,8 @@ gen_counter_update (gimple_stmt_iterator *gsi, tree counter, tree result,
   tree one = build_int_cst (type, 1);
   tree relaxed = build_int_cst (integer_type_node, MEMMODEL_RELAXED);
 
-  if (flag_profile_update == PROFILE_UPDATE_ATOMIC)
+  if (counter_update == COUNTER_UPDATE_ATOMIC_BUILTIN ||
+      (result && counter_update == COUNTER_UPDATE_ATOMIC_SPLIT))
     {
       /* __atomic_fetch_add (&counter, 1, MEMMODEL_RELAXED); */
       tree f = builtin_decl_explicit (TYPE_PRECISION (type) > 32
@@ -278,6 +314,38 @@ gen_counter_update (gimple_stmt_iterator *gsi, tree counter, tree result,
       gcall *call = gimple_build_call (f, 3, addr, one, relaxed);
       gen_assign_counter_update (gsi, call, f, result, name);
     }
+  else if (!result && (counter_update == COUNTER_UPDATE_ATOMIC_SPLIT ||
+                       counter_update == COUNTER_UPDATE_ATOMIC_PARTIAL))
+    {
+      /* low = __atomic_add_fetch_4 (addr, 1, MEMMODEL_RELAXED);
+         high_inc = low == 0 ? 1 : 0;
+         __atomic_add_fetch_4 (addr_high, high_inc, MEMMODEL_RELAXED); */
+      tree zero32 = build_zero_cst (uint32_type_node);
+      tree one32 = build_one_cst (uint32_type_node);
+      tree addr_high = make_temp_ssa_name (TREE_TYPE (addr), NULL, name);
+      tree four = build_int_cst (size_type_node, 4);
+      gassign *assign1 = gimple_build_assign (addr_high, POINTER_PLUS_EXPR,
+                                              addr, four);
+      gsi_insert_after (gsi, assign1, GSI_NEW_STMT);
+      if (WORDS_BIG_ENDIAN)
+        std::swap (addr, addr_high);
+      tree f = builtin_decl_explicit (BUILT_IN_ATOMIC_ADD_FETCH_4);
+      gcall *call1 = gimple_build_call (f, 3, addr, one, relaxed);
+      tree low = make_temp_ssa_name (uint32_type_node, NULL, name);
+      gimple_call_set_lhs (call1, low);
+      gsi_insert_after (gsi, call1, GSI_NEW_STMT);
+      tree is_zero = make_temp_ssa_name (boolean_type_node, NULL, name);
+      gassign *assign2 = gimple_build_assign (is_zero, EQ_EXPR, low,
+                                              zero32);
+      gsi_insert_after (gsi, assign2, GSI_NEW_STMT);
+      tree high_inc = make_temp_ssa_name (uint32_type_node, NULL, name);
+      gassign *assign3 = gimple_build_assign (high_inc, COND_EXPR,
+                                              is_zero, one32, zero32);
+      gsi_insert_after (gsi, assign3, GSI_NEW_STMT);
+      gcall *call2 = gimple_build_call (f, 3, addr_high, high_inc,
+                                        relaxed);
+      gsi_insert_after (gsi, call2, GSI_NEW_STMT);
+    }
   else
     {
       tree tmp1 = make_temp_ssa_name (type, NULL, name);
@@ -689,15 +757,20 @@ tree_profiling (void)
   struct cgraph_node *node;
 
   /* Verify whether we can utilize atomic update operations.  */
-  bool can_support_atomic = false;
+  bool can_support_atomic = targetm.have_libatomic;
   unsigned HOST_WIDE_INT gcov_type_size
     = tree_to_uhwi (TYPE_SIZE_UNIT (get_gcov_type ()));
-  if (gcov_type_size == 4)
-    can_support_atomic
-      = HAVE_sync_compare_and_swapsi || HAVE_atomic_compare_and_swapsi;
-  else if (gcov_type_size == 8)
-    can_support_atomic
-      = HAVE_sync_compare_and_swapdi || HAVE_atomic_compare_and_swapdi;
+  bool have_atomic_4
+    = HAVE_sync_compare_and_swapsi || HAVE_atomic_compare_and_swapsi;
+  bool have_atomic_8
+    = HAVE_sync_compare_and_swapdi || HAVE_atomic_compare_and_swapdi;
+  if (!can_support_atomic)
+    {
+      if (gcov_type_size == 4)
+        can_support_atomic = have_atomic_4;
+      else if (gcov_type_size == 8)
+        can_support_atomic = have_atomic_8;
+    }
 
   if (flag_profile_update == PROFILE_UPDATE_ATOMIC
       && !can_support_atomic)
@@ -710,6 +783,16 @@ tree_profiling (void)
   flag_profile_update = can_support_atomic
     ? PROFILE_UPDATE_ATOMIC : PROFILE_UPDATE_SINGLE;
 
+  if (flag_profile_update == PROFILE_UPDATE_ATOMIC)
+    {
+      if (gcov_type_size == 8 && !have_atomic_8 && have_atomic_4)
+        counter_update = COUNTER_UPDATE_ATOMIC_SPLIT;
+      else
+        counter_update = COUNTER_UPDATE_ATOMIC_BUILTIN;
+    }
+  else if (gcov_type_size == 8 && have_atomic_4)
+    counter_update = COUNTER_UPDATE_ATOMIC_PARTIAL;
+
   /* This is a small-ipa pass that gets called only once, from
      cgraphunit.cc:ipa_passes().  */
   gcc_assert (symtab->state == IPA_SSA);
diff --git a/libgcc/libgcov.h b/libgcc/libgcov.h
index 763118ea5b5..d04c070d0cf 100644
--- a/libgcc/libgcov.h
+++ b/libgcc/libgcov.h
@@ -95,18 +95,14 @@ typedef unsigned gcov_type_unsigned __attribute__ ((mode (QI)));
 #define GCOV_LOCKED_WITH_LOCKING 0
 #endif
 
-#ifndef GCOV_SUPPORTS_ATOMIC
 /* Detect whether target can support atomic update of profilers.  */
-#if __SIZEOF_LONG_LONG__ == 4 && __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4
-#define GCOV_SUPPORTS_ATOMIC 1
-#else
-#if __SIZEOF_LONG_LONG__ == 8 && __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8
+#if (__SIZEOF_LONG_LONG__ == 4 && __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4) || \
+    (__SIZEOF_LONG_LONG__ == 8 && __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8) || \
+    defined (__LIBGCC_HAVE_LIBATOMIC)
 #define GCOV_SUPPORTS_ATOMIC 1
 #else
 #define GCOV_SUPPORTS_ATOMIC 0
 #endif
-#endif
-#endif
 
 /* In libgcov we need these functions to be extern, so prefix them with
    __gcov.  In libgcov they must also be hidden so that the instance in
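
With the change above, GCOV_SUPPORTS_ATOMIC is now always defined (to 0 or 1),
so libgcov code can test it unconditionally.  A minimal sketch of how such a
compile-time guard is typically used (a hypothetical example, not code from
libgcov; the function name is invented, and "long long" stands in for the
target-dependent gcov_type):

  static inline void
  example_counter_add (long long *counter)
  {
  #if GCOV_SUPPORTS_ATOMIC
    /* Lock-free or libatomic-backed atomic increment.  */
    __atomic_fetch_add (counter, 1, __ATOMIC_RELAXED);
  #else
    /* Plain increment; not safe under concurrent updates.  */
    *counter += 1;
  #endif
  }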