From: Richard Biener
Date: Tue, 13 Dec 2022 15:30:55 +0100
Subject: Re: [PATCH] gcov: Fix -fprofile-update=atomic
To: Sebastian Huber
Cc: gcc-patches@gcc.gnu.org
In-Reply-To: <20221209135609.55159-1-sebastian.huber@embedded-brains.de>

On Fri, Dec 9, 2022 at 2:56 PM Sebastian Huber wrote:
>
> The code coverage support uses counters to determine which edges in the
> control flow graph were executed. If a counter overflows, then the code
> coverage information is invalid. Therefore the counter type should be a
> 64-bit integer. In multithreaded applications, it is important that the
> counter increments are atomic. This is not the case by default. The user
> can enable atomic counter increments through the -fprofile-update=atomic
> and -fprofile-update=prefer-atomic options.
>
> If the hardware supports 64-bit atomic operations, then everything is
> fine. If not and -fprofile-update=prefer-atomic was chosen by the user,
> then non-atomic counter increments will be used. However, if the hardware
> does not support the required atomic operations and -fprofile-update=atomic
> was chosen by the user, then a warning was issued and a forced fall-back
> to non-atomic operations was made. This is probably not what a user wants.
> There is still hardware on the market which does not have atomic operations
> and is used for multithreaded applications. A user who selects
> -fprofile-update=atomic wants consistent code coverage data and not random
> data.
>
> This patch removes the fall-back to non-atomic operations for
> -fprofile-update=atomic. If atomic operations in hardware are not
> available, then a library call to libatomic is emitted. To mitigate
> potential performance issues, an optimization for systems which only
> support 32-bit atomic operations is provided. Here, the edge counter
> increments are done like this:
>
>   low = __atomic_add_fetch_4 (&counter.low, 1, MEMMODEL_RELAXED);
>   high_inc = low == 0 ? 1 : 0;
>   __atomic_add_fetch_4 (&counter.high, high_inc, MEMMODEL_RELAXED);
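Spelled out as a stand-alone C sketch, the scheme looks like this (the
two-word struct, its field names, and the function are illustrative
assumptions for a little-endian target, not the GIMPLE the patch emits):

  #include <stdint.h>

  /* A 64-bit counter split into two 32-bit halves, low word first.
     The patch swaps the two words on WORDS_BIG_ENDIAN targets.  */
  struct counter64
  {
    uint32_t low;
    uint32_t high;
  };

  static void
  counter64_add_one (struct counter64 *c)
  {
    /* Atomically bump the low word; a result of 0 means it wrapped.  */
    uint32_t low = __atomic_add_fetch (&c->low, 1, __ATOMIC_RELAXED);
    /* Atomically propagate the carry into the high word.  */
    uint32_t high_inc = low == 0 ? 1 : 0;
    __atomic_add_fetch (&c->high, high_inc, __ATOMIC_RELAXED);
  }

No increment is lost this way, though a concurrent reader can briefly
observe the low word already wrapped while the carry has not yet reached
the high word.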
You check for compare_and_swapsi and the old code checks
TYPE_PRECISION (gcov_type_node) > 32 to determine whether an 8-byte or a
4-byte gcov_type is used. But you do not seem to handle the case where
neither SImode nor DImode atomic operations are available? Can we instead
do

  if (gcov_type_size == 4)
    can_support_atomic4
      = HAVE_sync_compare_and_swapsi || HAVE_atomic_compare_and_swapsi;
  else if (gcov_type_size == 8)
    can_support_atomic8
      = HAVE_sync_compare_and_swapdi || HAVE_atomic_compare_and_swapdi;

  if (flag_profile_update == PROFILE_UPDATE_ATOMIC
      && !can_support_atomic4 && !can_support_atomic8)
    {
      warning (0, "target does not support atomic profile update, "
               "single mode is selected");
      flag_profile_update = PROFILE_UPDATE_SINGLE;
    }

and thus retain the diagnostic & fallback for this case? The existing code
also suggests there might be targets with HImode or TImode counters?

In another mail you mentioned that for gen_time_profiler this doesn't work,
but its code relies on flag_profile_update as well. So do we need to split
the flag somehow, or continue using the PROFILE_UPDATE_SINGLE fallback when
we are doing more than just edge profiling?

Thanks,
Richard.

> gcc/ChangeLog:
>
>         * tree-profile.cc (split_atomic_increment): New.
>         (gimple_gen_edge_profiler): Split the atomic edge counter increment in
>         two 32-bit atomic operations if necessary.
>         (tree_profiling): Remove profile update warning and fall-back. Set
>         split_atomic_increment if necessary.
> ---
>  gcc/tree-profile.cc | 81 +++++++++++++++++++++++++++++++++------------
>  1 file changed, 59 insertions(+), 22 deletions(-)
>
> diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
> index 2beb49241f2..1d326dde59a 100644
> --- a/gcc/tree-profile.cc
> +++ b/gcc/tree-profile.cc
> @@ -73,6 +73,17 @@ static GTY(()) tree ic_tuple_var;
>  static GTY(()) tree ic_tuple_counters_field;
>  static GTY(()) tree ic_tuple_callee_field;
>
> +/* If the user selected atomic profile counter updates
> +   (-fprofile-update=atomic), then the counter updates will be done atomically.
> +   Ideally, this is done through atomic operations in hardware.  If the
> +   hardware supports only 32-bit atomic increments and gcov_type_node is a
> +   64-bit integer type, then for the profile edge counters the increment is
> +   performed through two separate 32-bit atomic increments.  This case is
> +   indicated by the split_atomic_increment variable being true.  If the
> +   hardware does not support atomic operations at all, then a library call to
> +   libatomic is emitted.  */
> +static bool split_atomic_increment;
> +
>  /* Do initialization work for the edge profiler.  */
>
>  /* Add code:
> @@ -242,30 +253,59 @@ gimple_init_gcov_profiler (void)
>  void
>  gimple_gen_edge_profiler (int edgeno, edge e)
>  {
> -  tree one;
> -
> -  one = build_int_cst (gcov_type_node, 1);
> +  const char *name = "PROF_edge_counter";
> +  tree ref = tree_coverage_counter_ref (GCOV_COUNTER_ARCS, edgeno);
> +  tree one = build_int_cst (gcov_type_node, 1);
>
>    if (flag_profile_update == PROFILE_UPDATE_ATOMIC)
>      {
> -      /* __atomic_fetch_add (&counter, 1, MEMMODEL_RELAXED); */
> -      tree addr = tree_coverage_counter_addr (GCOV_COUNTER_ARCS, edgeno);
> -      tree f = builtin_decl_explicit (TYPE_PRECISION (gcov_type_node) > 32
> -                                      ? BUILT_IN_ATOMIC_FETCH_ADD_8:
> -                                      BUILT_IN_ATOMIC_FETCH_ADD_4);
> -      gcall *stmt = gimple_build_call (f, 3, addr, one,
> -                                       build_int_cst (integer_type_node,
> -                                                      MEMMODEL_RELAXED));
> -      gsi_insert_on_edge (e, stmt);
> +      tree addr = build_fold_addr_expr (ref);
> +      tree relaxed = build_int_cst (integer_type_node, MEMMODEL_RELAXED);
> +      if (!split_atomic_increment)
> +        {
> +          /* __atomic_fetch_add (&counter, 1, MEMMODEL_RELAXED); */
> +          tree f = builtin_decl_explicit (TYPE_PRECISION (gcov_type_node) > 32
> +                                          ? BUILT_IN_ATOMIC_FETCH_ADD_8:
> +                                          BUILT_IN_ATOMIC_FETCH_ADD_4);
> +          gcall *stmt = gimple_build_call (f, 3, addr, one, relaxed);
> +          gsi_insert_on_edge (e, stmt);
> +        }
> +      else
> +        {
> +          /* low = __atomic_add_fetch_4 (addr, 1, MEMMODEL_RELAXED);
> +             high_inc = low == 0 ? 1 : 0;
> +             __atomic_add_fetch_4 (addr_high, high_inc, MEMMODEL_RELAXED); */
> +          tree zero32 = build_zero_cst (uint32_type_node);
> +          tree one32 = build_one_cst (uint32_type_node);
> +          tree addr_high = make_temp_ssa_name (TREE_TYPE (addr), NULL, name);
> +          gimple *stmt = gimple_build_assign (addr_high, POINTER_PLUS_EXPR,
> +                                              addr,
> +                                              build_int_cst (size_type_node,
> +                                                             4));
> +          gsi_insert_on_edge (e, stmt);
> +          if (WORDS_BIG_ENDIAN)
> +            std::swap (addr, addr_high);
> +          tree f = builtin_decl_explicit (BUILT_IN_ATOMIC_ADD_FETCH_4);
> +          stmt = gimple_build_call (f, 3, addr, one, relaxed);
> +          tree low = make_temp_ssa_name (uint32_type_node, NULL, name);
> +          gimple_call_set_lhs (stmt, low);
> +          gsi_insert_on_edge (e, stmt);
> +          tree is_zero = make_temp_ssa_name (boolean_type_node, NULL, name);
> +          stmt = gimple_build_assign (is_zero, EQ_EXPR, low, zero32);
> +          gsi_insert_on_edge (e, stmt);
> +          tree high_inc = make_temp_ssa_name (uint32_type_node, NULL, name);
> +          stmt = gimple_build_assign (high_inc, COND_EXPR, is_zero, one32,
> +                                      zero32);
> +          gsi_insert_on_edge (e, stmt);
> +          stmt = gimple_build_call (f, 3, addr_high, high_inc, relaxed);
> +          gsi_insert_on_edge (e, stmt);
> +        }
>      }
>    else
>      {
> -      tree ref = tree_coverage_counter_ref (GCOV_COUNTER_ARCS, edgeno);
> -      tree gcov_type_tmp_var = make_temp_ssa_name (gcov_type_node,
> -                                                   NULL, "PROF_edge_counter");
> +      tree gcov_type_tmp_var = make_temp_ssa_name (gcov_type_node, NULL, name);
>        gassign *stmt1 = gimple_build_assign (gcov_type_tmp_var, ref);
> -      gcov_type_tmp_var = make_temp_ssa_name (gcov_type_node,
> -                                              NULL, "PROF_edge_counter");
> +      gcov_type_tmp_var = make_temp_ssa_name (gcov_type_node, NULL, name);
>        gassign *stmt2 = gimple_build_assign (gcov_type_tmp_var, PLUS_EXPR,
>                                              gimple_assign_lhs (stmt1), one);
>        gassign *stmt3 = gimple_build_assign (unshare_expr (ref),
> @@ -710,11 +750,8 @@ tree_profiling (void)
>
>    if (flag_profile_update == PROFILE_UPDATE_ATOMIC
>        && !can_support_atomic)
> -    {
> -      warning (0, "target does not support atomic profile update, "
> -               "single mode is selected");
> -      flag_profile_update = PROFILE_UPDATE_SINGLE;
> -    }
> +    split_atomic_increment = HAVE_sync_compare_and_swapsi
> +                             || HAVE_atomic_compare_and_swapsi;
>    else if (flag_profile_update == PROFILE_UPDATE_PREFER_ATOMIC)
>      flag_profile_update = can_support_atomic
>                            ? PROFILE_UPDATE_ATOMIC : PROFILE_UPDATE_SINGLE;
> --
> 2.35.3
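For anyone who wants to exercise the new behavior, a minimal invocation
would look something like the following (the file name is a placeholder,
and whether -latomic must be given explicitly depends on the target and on
how libatomic is set up there):

  gcc --coverage -fprofile-update=atomic -pthread test.c -latomic

On a target with only 32-bit hardware atomics this should emit the split
increment sequence discussed above; on a target without any hardware
atomics it should emit libatomic calls instead of silently degrading to
non-atomic counter updates.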