From: "Li, Pan2" <pan2.li@intel.com>
To: Richard Biener <rguenther@suse.de>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: RE: [PATCH] tree-optimization/114589 - remove profile based sink heuristics
Date: Fri, 14 Jun 2024 13:23:44 +0000 [thread overview]
Message-ID: <MW5PR11MB590847A9779658BB3B6A1ABFA9C22@MW5PR11MB5908.namprd11.prod.outlook.com> (raw)
In-Reply-To: <1727s277-781s-2574-2qo8-99n97951s107@fhfr.qr>
> It definitely looks like a latent issue being triggered. Either in LRA
> or in how the target presents itself.
Thanks Richard, will have a try and keep you posted.
Pan
-----Original Message-----
From: Richard Biener <rguenther@suse.de>
Sent: Friday, June 14, 2024 9:11 PM
To: Li, Pan2 <pan2.li@intel.com>
Cc: gcc-patches@gcc.gnu.org
Subject: RE: [PATCH] tree-optimization/114589 - remove profile based sink heuristics
On Fri, 14 Jun 2024, Li, Pan2 wrote:
> Hi Richard,
>
> Here is one PR related to this patch (by git bisect), details as below.
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115458
>
> I am still trying to narrow down which change caused this failure or any hints here?
It definitely looks like a latent issue being triggered. Either in LRA
or in how the target presents itself.
Richard.
> Pan
>
> -----Original Message-----
> From: Richard Biener <rguenther@suse.de>
> Sent: Wednesday, May 15, 2024 5:39 PM
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH] tree-optimization/114589 - remove profile based sink heuristics
>
> The following removes the profile based heuristic limiting sinking
> and instead uses post-dominators to avoid sinking to places that
> are executed under the same conditions as the earlier location which
> the profile based heuristic should have guaranteed as well.
>
> To avoid regressing this moves the empty-latch check to cover all
> sink cases.
>
> It also stream-lines the resulting select_best_block a bit but avoids
> adjusting heuristics more with this change. gfortran.dg/streamio_9.f90
> starts execute failing with this on x86_64 with -m32 because the
> (float)i * 9.9999...e-7 compute is sunk across a STOP causing it
> to be no longer spilled and thus the compare failing due to excess
> precision. The patch adds -ffloat-store to avoid this, following
> other similar testcases.
>
> This change doesn't fix the testcase in the PR on itself.
>
> Bootstrapped on x86_64-unknown-linux-gnu, re-testing in progress.
>
> PR tree-optimization/114589
> * tree-ssa-sink.cc (select_best_block): Remove profile-based
> heuristics. Instead reject sink locations that sink
> to post-dominators. Move empty latch check here from
> statement_sink_location. Also consider early_bb for the
> loop depth check.
> (statement_sink_location): Remove superfluous check. Remove
> empty latch check.
> (pass_sink_code::execute): Compute/release post-dominators.
>
> * gfortran.dg/streamio_9.f90: Use -ffloat-store to avoid
> excess precision when not spilling.
> ---
> gcc/testsuite/gfortran.dg/streamio_9.f90 | 1 +
> gcc/tree-ssa-sink.cc | 62 ++++++++----------------
> 2 files changed, 20 insertions(+), 43 deletions(-)
>
> diff --git a/gcc/testsuite/gfortran.dg/streamio_9.f90 b/gcc/testsuite/gfortran.dg/streamio_9.f90
> index b6bddb973f8..f29ded6ba54 100644
> --- a/gcc/testsuite/gfortran.dg/streamio_9.f90
> +++ b/gcc/testsuite/gfortran.dg/streamio_9.f90
> @@ -1,4 +1,5 @@
> ! { dg-do run }
> +! { dg-options "-ffloat-store" }
> ! PR29053 Stream IO test 9.
> ! Contributed by Jerry DeLisle <jvdelisle@gcc.gnu.org>.
> ! Test case derived from that given in PR by Steve Kargl.
> diff --git a/gcc/tree-ssa-sink.cc b/gcc/tree-ssa-sink.cc
> index 2f90acb7ef4..2188b7523c7 100644
> --- a/gcc/tree-ssa-sink.cc
> +++ b/gcc/tree-ssa-sink.cc
> @@ -178,15 +178,7 @@ nearest_common_dominator_of_uses (def_operand_p def_p, bool *debug_stmts)
>
> We want the most control dependent block in the shallowest loop nest.
>
> - If the resulting block is in a shallower loop nest, then use it. Else
> - only use the resulting block if it has significantly lower execution
> - frequency than EARLY_BB to avoid gratuitous statement movement. We
> - consider statements with VOPS more desirable to move.
> -
> - This pass would obviously benefit from PDO as it utilizes block
> - frequencies. It would also benefit from recomputing frequencies
> - if profile data is not available since frequencies often get out
> - of sync with reality. */
> + If the resulting block is in a shallower loop nest, then use it. */
>
> static basic_block
> select_best_block (basic_block early_bb,
> @@ -195,18 +187,17 @@ select_best_block (basic_block early_bb,
> {
> basic_block best_bb = late_bb;
> basic_block temp_bb = late_bb;
> - int threshold;
>
> while (temp_bb != early_bb)
> {
> + /* Walk up the dominator tree, hopefully we'll find a shallower
> + loop nest. */
> + temp_bb = get_immediate_dominator (CDI_DOMINATORS, temp_bb);
> +
> /* If we've moved into a lower loop nest, then that becomes
> our best block. */
> if (bb_loop_depth (temp_bb) < bb_loop_depth (best_bb))
> best_bb = temp_bb;
> -
> - /* Walk up the dominator tree, hopefully we'll find a shallower
> - loop nest. */
> - temp_bb = get_immediate_dominator (CDI_DOMINATORS, temp_bb);
> }
>
> /* Placing a statement before a setjmp-like function would be invalid
> @@ -221,6 +212,16 @@ select_best_block (basic_block early_bb,
> if (bb_loop_depth (best_bb) < bb_loop_depth (early_bb))
> return best_bb;
>
> + /* Do not move stmts to post-dominating places on the same loop depth. */
> + if (dominated_by_p (CDI_POST_DOMINATORS, early_bb, best_bb))
> + return early_bb;
> +
> + /* If the latch block is empty, don't make it non-empty by sinking
> + something into it. */
> + if (best_bb == early_bb->loop_father->latch
> + && empty_block_p (best_bb))
> + return early_bb;
> +
> /* Avoid turning an unconditional read into a conditional one when we
> still might want to perform vectorization. */
> if (best_bb->loop_father == early_bb->loop_father
> @@ -233,28 +234,7 @@ select_best_block (basic_block early_bb,
> && !dominated_by_p (CDI_DOMINATORS, best_bb->loop_father->latch, best_bb))
> return early_bb;
>
> - /* Get the sinking threshold. If the statement to be moved has memory
> - operands, then increase the threshold by 7% as those are even more
> - profitable to avoid, clamping at 100%. */
> - threshold = param_sink_frequency_threshold;
> - if (gimple_vuse (stmt) || gimple_vdef (stmt))
> - {
> - threshold += 7;
> - if (threshold > 100)
> - threshold = 100;
> - }
> -
> - /* If BEST_BB is at the same nesting level, then require it to have
> - significantly lower execution frequency to avoid gratuitous movement. */
> - if (bb_loop_depth (best_bb) == bb_loop_depth (early_bb)
> - /* If result of comparsion is unknown, prefer EARLY_BB.
> - Thus use !(...>=..) rather than (...<...) */
> - && !(best_bb->count * 100 >= early_bb->count * threshold))
> - return best_bb;
> -
> - /* No better block found, so return EARLY_BB, which happens to be the
> - statement's original block. */
> - return early_bb;
> + return best_bb;
> }
>
> /* Given a statement (STMT) and the basic block it is currently in (FROMBB),
> @@ -452,13 +432,7 @@ statement_sink_location (gimple *stmt, basic_block frombb,
> return false;
>
> sinkbb = select_best_block (frombb, sinkbb, stmt);
> - if (!sinkbb || sinkbb == frombb)
> - return false;
> -
> - /* If the latch block is empty, don't make it non-empty by sinking
> - something into it. */
> - if (sinkbb == frombb->loop_father->latch
> - && empty_block_p (sinkbb))
> + if (sinkbb == frombb)
> return false;
>
> *togsi = gsi_after_labels (sinkbb);
> @@ -822,6 +796,7 @@ pass_sink_code::execute (function *fun)
> mark_dfs_back_edges (fun);
> memset (&sink_stats, 0, sizeof (sink_stats));
> calculate_dominance_info (CDI_DOMINATORS);
> + calculate_dominance_info (CDI_POST_DOMINATORS);
>
> virtual_operand_live vop_live;
>
> @@ -833,6 +808,7 @@ pass_sink_code::execute (function *fun)
>
> statistics_counter_event (fun, "Sunk statements", sink_stats.sunk);
> statistics_counter_event (fun, "Commoned stores", sink_stats.commoned);
> + free_dominance_info (CDI_POST_DOMINATORS);
> remove_fake_exit_edges ();
> loop_optimizer_finalize ();
>
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
prev parent reply other threads:[~2024-06-14 13:23 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-15 9:38 Richard Biener
2024-05-18 3:27 ` Hans-Peter Nilsson
2024-06-14 13:08 ` Li, Pan2
2024-06-14 13:10 ` Richard Biener
2024-06-14 13:23 ` Li, Pan2 [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=MW5PR11MB590847A9779658BB3B6A1ABFA9C22@MW5PR11MB5908.namprd11.prod.outlook.com \
--to=pan2.li@intel.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=rguenther@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).