From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lf1-x134.google.com (mail-lf1-x134.google.com [IPv6:2a00:1450:4864:20::134]) by sourceware.org (Postfix) with ESMTPS id 89B773857736 for ; Tue, 17 Oct 2023 09:20:57 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 89B773857736 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 89B773857736 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::134 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1697534460; cv=none; b=oCnjbe+ls2PwGa9LBopNUd1rf1+1qa6UHpl0ikp5H3ClGVP3mmFiRaufPhVyeLvJhk60JU8m5w77KbBjIipXB8ADHkzoTjMlNY9I3K90eF5R7FTcd2Nrkr2a1lUkP9X2c0cQYIyPQYr9uR5a5w8nmca7leu419i95JivBXF6B38= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1697534460; c=relaxed/simple; bh=VWJe0/qNRnVWhP+PlwWV8KzpdJnOhgkeeOsbiLZZKaE=; h=DKIM-Signature:MIME-Version:From:Date:Message-ID:Subject:To; b=P3jD9i0a/Jv8JbgjLBj/cgEAxkEnOkR88jOZsiCH+Ql8QY+DEURHR3fC4twGIknYhOD/X+GxyS3ixJ+G1hH85mvXAAJxDzBjCUXgMWLEfxneRbqNfKjBzQ0cmntjfsxyhrrJVoDSM4nxDZdGs07XoBE9/rPF8hzRwynz7+BJx5E= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-lf1-x134.google.com with SMTP id 2adb3069b0e04-5079eed8bfbso4665800e87.1 for ; Tue, 17 Oct 2023 02:20:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697534456; x=1698139256; darn=gcc.gnu.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=B5KMav7sKukGKnTEZ78GJCChXnUipXicbTg/DBa7eRU=; b=Ea7X0MYOQeDX02qfU8rbzJzpbFbXKbsciTDJRjI/ZST/WhS4PvZDKRFeypbi4Cd6DK cVFy+cMCzs1YcuaLks8ACW8WBrY+RWZ8wzhJyMMRBxgoxwM2Y1y6uiPgTgduAh6/9vnm ZpjYHyHcyTApRsHm/8wKjFfNWvmkrrrK5f9Dmtzg1a5H2xJXNkN0Gweni5Wp95xcrfDr 8Gedn1GDMHox7uHdNuAoREpmYh9faKku0E/Gd/UIVGz0MtzKf2iHPxrOfSdcDNS5cOHQ jBBWiyNa2+NTLV4KKzoglK3tyI61Taj7+4FQZqqmOO2q5uzacexYbYtUHWeYjPc8p93/ d8Yw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697534456; x=1698139256; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=B5KMav7sKukGKnTEZ78GJCChXnUipXicbTg/DBa7eRU=; b=fDtVH2rgpJgfX8HC+s2kKz7WEC47zUM4orqqHKi+93Vp1UbTm5QKRiqC/kF5ovLNYW +kzKmuZFqSgN+Zrla2UwezEf6V+JgjEzXJPLNa05/fVqeuGQbLW7gmwy/XjOlsaWKU6S KpDi8xaIjaF25PehHV2pQSgLsskuVPPNgFiCmYmJpeeL/PygFcmmA3N+ytvZ/kgIWdfv rs+fIdZkBFm7nqRZ7oNgK8hKGaUOwYPGJBOKkNRc5mGDqJT2ljAAmE67NSNi3FpIm6vj /clN8725VbBVNvTFTpFkAItFaYtIwJGZcwzfuJNcYh4FwpMeNa1r10qBofEOovnCqo/z 912A== X-Gm-Message-State: AOJu0YyWCyGhiSXAby8rJ4DjImPXmJ3Equhnepy8czDui++qM5Jgo9KA OyybQlkwhvQvnWWZiJ4/YsGnIQgejqTtWCKhtgMS1+YveOk= X-Google-Smtp-Source: AGHT+IH1aSDeNofY8jiWsH/KkbXHRJ9fiMK9fKIkZ3QGn5c0eiJf5YvB8Qh0rHhk0mScfT6RCYEyDW+32SCOkpEk65E= X-Received: by 2002:a19:5002:0:b0:4fe:ecd:4950 with SMTP id e2-20020a195002000000b004fe0ecd4950mr1363499lfb.1.1697534455485; Tue, 17 Oct 2023 02:20:55 -0700 (PDT) MIME-Version: 1.0 References: <79f04438-7473-2b01-d26a-9357ad9318af@linux.ibm.com> In-Reply-To: From: Richard Biener Date: Tue, 17 Oct 2023 11:17:58 +0200 Message-ID: Subject: Re: [PATCH v8] tree-ssa-sink: Improve code sinking pass To: Ajit Agarwal Cc: gcc-patches , Jeff Law , Segher Boessenkool , Peter Bergner Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-7.3 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Tue, Oct 17, 2023 at 10:53=E2=80=AFAM Ajit Agarwal wrote: > > Hello Richard: > > On 17/10/23 2:03 pm, Richard Biener wrote: > > On Thu, Oct 12, 2023 at 10:42=E2=80=AFAM Ajit Agarwal wrote: > >> > >> This patch improves code sinking pass to sink statements before call t= o reduce > >> register pressure. > >> Review comments are incorporated. Synced and modified with latest trun= k sources. > >> > >> For example : > >> > >> void bar(); > >> int j; > >> void foo(int a, int b, int c, int d, int e, int f) > >> { > >> int l; > >> l =3D a + b + c + d +e + f; > >> if (a !=3D 5) > >> { > >> bar(); > >> j =3D l; > >> } > >> } > >> > >> Code Sinking does the following: > >> > >> void bar(); > >> int j; > >> void foo(int a, int b, int c, int d, int e, int f) > >> { > >> int l; > >> > >> if (a !=3D 5) > >> { > >> l =3D a + b + c + d +e + f; > >> bar(); > >> j =3D l; > >> } > >> } > >> > >> Bootstrapped regtested on powerpc64-linux-gnu. > >> > >> Thanks & Regards > >> Ajit > >> > >> tree-ssa-sink: Improve code sinking pass > >> > >> Currently, code sinking will sink code after function calls. This inc= reases > >> register pressure for callee-saved registers. The following patch imp= roves > >> code sinking by placing the sunk code before calls in the use block or= in > >> the immediate dominator of the use blocks. > > > > The patch no longer does what the description above says. > Why you think so. Please let me know. You talk about calls above but the patch doesn't do anything about calls. = You also don't do anything about register pressure, rather the effect of your changes are to move some stmts by a smaller "distance", whatever effect that has. > > > > More comments below. > > > >> 2023-10-12 Ajit Kumar Agarwal > >> > >> gcc/ChangeLog: > >> > >> PR tree-optimization/81953 > >> * tree-ssa-sink.cc (statement_sink_location): Move statements = before > >> calls. > >> (select_best_block): Add heuristics to select the best blocks = in the > >> immediate post dominator. > >> > >> gcc/testsuite/ChangeLog: > >> > >> PR tree-optimization/81953 > >> * gcc.dg/tree-ssa/ssa-sink-20.c: New test. > >> * gcc.dg/tree-ssa/ssa-sink-21.c: New test. > >> --- > >> gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c | 15 ++++++++ > >> gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-22.c | 19 ++++++++++ > >> gcc/tree-ssa-sink.cc | 39 ++++++++++++--------= - > >> 3 files changed, 56 insertions(+), 17 deletions(-) > >> create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c > >> create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-22.c > >> > >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c b/gcc/testsui= te/gcc.dg/tree-ssa/ssa-sink-21.c > >> new file mode 100644 > >> index 00000000000..d3b79ca5803 > >> --- /dev/null > >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c > >> @@ -0,0 +1,15 @@ > >> +/* { dg-do compile } */ > >> +/* { dg-options "-O2 -fdump-tree-sink-stats" } */ > >> +void bar(); > >> +int j; > >> +void foo(int a, int b, int c, int d, int e, int f) > >> +{ > >> + int l; > >> + l =3D a + b + c + d +e + f; > >> + if (a !=3D 5) > >> + { > >> + bar(); > >> + j =3D l; > >> + } > >> +} > >> +/* { dg-final { scan-tree-dump {l_12\s+=3D\s+_4\s+\+\s+f_11\(D\);\n\s= +bar\s+\(\)} sink1 } } */ > >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-22.c b/gcc/testsui= te/gcc.dg/tree-ssa/ssa-sink-22.c > >> new file mode 100644 > >> index 00000000000..84e7938c54f > >> --- /dev/null > >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-22.c > >> @@ -0,0 +1,19 @@ > >> +/* { dg-do compile } */ > >> +/* { dg-options "-O2 -fdump-tree-sink-stats" } */ > >> +void bar(); > >> +int j, x; > >> +void foo(int a, int b, int c, int d, int e, int f) > >> +{ > >> + int l; > >> + l =3D a + b + c + d +e + f; > >> + if (a !=3D 5) > >> + { > >> + bar(); > >> + if (b !=3D 3) > >> + x =3D 3; > >> + else > >> + x =3D 5; > >> + j =3D l; > >> + } > >> +} > >> +/* { dg-final { scan-tree-dump {l_13\s+=3D\s+_4\s+\+\s+f_12\(D\);\n\s= +bar\s+\(\)} sink1 } } */ > >> diff --git a/gcc/tree-ssa-sink.cc b/gcc/tree-ssa-sink.cc > >> index a360c5cdd6e..95298bc8402 100644 > >> --- a/gcc/tree-ssa-sink.cc > >> +++ b/gcc/tree-ssa-sink.cc > >> @@ -174,7 +174,8 @@ nearest_common_dominator_of_uses (def_operand_p de= f_p, bool *debug_stmts) > >> > >> /* Given EARLY_BB and LATE_BB, two blocks in a path through the domin= ator > >> tree, return the best basic block between them (inclusive) to plac= e > >> - statements. > >> + statements. The best basic block should be an immediate dominator = of > >> + best basic block if the use stmt is after the call. > >> > >> We want the most control dependent block in the shallowest loop ne= st. > >> > >> @@ -196,6 +197,16 @@ select_best_block (basic_block early_bb, > >> basic_block best_bb =3D late_bb; > >> basic_block temp_bb =3D late_bb; > >> int threshold; > >> + /* Get the sinking threshold. If the statement to be moved has mem= ory > >> + operands, then increase the threshold by 7% as those are even mo= re > >> + profitable to avoid, clamping at 100%. */ > >> + threshold =3D param_sink_frequency_threshold; > >> + if (gimple_vuse (stmt) || gimple_vdef (stmt)) > >> + { > >> + threshold +=3D 7; > >> + if (threshold > 100) > >> + threshold =3D 100; > >> + } > >> > >> while (temp_bb !=3D early_bb) > >> { > >> @@ -204,6 +215,14 @@ select_best_block (basic_block early_bb, > >> if (bb_loop_depth (temp_bb) < bb_loop_depth (best_bb)) > >> best_bb =3D temp_bb; > >> > >> + /* if we have temp_bb post dominated by use block block then im= mediate > >> + * dominator would be our best block. */ > >> + if (!gimple_vuse (stmt) > >> + && bb_loop_depth (temp_bb) =3D=3D bb_loop_depth (early_bb) > >> + && !(temp_bb->count * 100 >=3D early_bb->count * threshold) > >> + && dominated_by_p (CDI_DOMINATORS, late_bb, temp_bb)) > > > > this isn't a post-dominance check, in fact this always returns true. T= his > > also overrides the best found loop depth which probably means finding > > both inside the same loop doesn't work. > > I can remove dominated check. You would like me to do in different loop t= han doing inside the same > loop. Please let me know. > > > > What's the intent of the change? > > The purpose of this change is to assign best_bb the immediate dominator i= f both early_bb and temp_bb have same loop depth. So why is the change then not simply - if (bb_loop_depth (temp_bb) < bb_loop_depth (best_bb)) + if (bb_loop_depth (temp_bb) <=3D bb_loop_depth (best_bb)) best_bb =3D temp_bb; ? Not that I think this is desirable. We want to sink to the least executed place which doesn't map 1:1 to loop depth but control flow forks. The heuristic using basic-block counts is prone to profile errors (but otherwise should cover t= he general idea of the existing code). > Thanks & Regards > Ajit > > > >> + best_bb =3D temp_bb; > >> + > >> /* Walk up the dominator tree, hopefully we'll find a shallower > >> loop nest. */ > >> temp_bb =3D get_immediate_dominator (CDI_DOMINATORS, temp_bb); > >> @@ -233,17 +252,6 @@ select_best_block (basic_block early_bb, > >> && !dominated_by_p (CDI_DOMINATORS, best_bb->loop_father->latch= , best_bb)) > >> return early_bb; > >> > >> - /* Get the sinking threshold. If the statement to be moved has mem= ory > >> - operands, then increase the threshold by 7% as those are even mo= re > >> - profitable to avoid, clamping at 100%. */ > >> - threshold =3D param_sink_frequency_threshold; > >> - if (gimple_vuse (stmt) || gimple_vdef (stmt)) > >> - { > >> - threshold +=3D 7; > >> - if (threshold > 100) > >> - threshold =3D 100; > >> - } > >> - > >> /* If BEST_BB is at the same nesting level, then require it to have > >> significantly lower execution frequency to avoid gratuitous move= ment. */ > >> if (bb_loop_depth (best_bb) =3D=3D bb_loop_depth (early_bb) > >> @@ -430,6 +438,7 @@ statement_sink_location (gimple *stmt, basic_block= frombb, > >> continue; > >> break; > >> } > >> + > >> use =3D USE_STMT (one_use); > >> > >> if (gimple_code (use) !=3D GIMPLE_PHI) > >> @@ -439,10 +448,7 @@ statement_sink_location (gimple *stmt, basic_bloc= k frombb, > >> if (sinkbb =3D=3D frombb) > >> return false; > >> > >> - if (sinkbb =3D=3D gimple_bb (use)) > >> - *togsi =3D gsi_for_stmt (use); > >> - else > >> - *togsi =3D gsi_after_labels (sinkbb); > >> + *togsi =3D gsi_after_labels (sinkbb); > >> > >> return true; > >> } > >> @@ -825,7 +831,6 @@ pass_sink_code::execute (function *fun) > >> mark_dfs_back_edges (fun); > >> memset (&sink_stats, 0, sizeof (sink_stats)); > >> calculate_dominance_info (CDI_DOMINATORS); > >> - > >> virtual_operand_live vop_live; > >> > >> int *rpo =3D XNEWVEC (int, n_basic_blocks_for_fn (cfun)); > >> -- > >> 2.39.3 > >>