public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/102879] [12 Regression] Dead Code Elimination Regression at -O3
Date: Thu, 10 Mar 2022 09:42:43 +0000	[thread overview]
Message-ID: <bug-102879-4-Z4FSEmw6LV@http.gcc.gnu.org/bugzilla/> (raw)
In-Reply-To: <bug-102879-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102879

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org
           Priority|P3                          |P1

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
There's an interesting missing value-numbering optimization here,
call_may_clobber_ref_p_1 considers the call to foo () possibly clobbering 'c'
even though 'c' does not escape the TU.

Since 'foo' is external there's no IPA reference or modref data but we
do know that !may_be_aliased (base) so we could amend

  /* If the reference is based on a decl that is not aliased the call
     cannot possibly clobber it.  */
  if (DECL_P (base)
      && !may_be_aliased (base)
      /* But local non-readonly statics can be modified through recursion
         or the call may implement a threading barrier which we must
         treat as may-def.  */
      && (TREE_READONLY (base)
          || !is_global_var (base)))
    return false;

to constrain the "But local ..." (note nested functions make 'local'
difficult to express so we use !is_global_var).  Of course the
threading barrier issue would still exist, but then the call itself
isn't clobbering it just serves as a barrier for code motion - I'm not
sure what kind of transforms we have to forbid.

Now, we _do_ have to ensure that foo () cannot access 'c' which it for
example might do if there's a

bar() { c = 3 };
void (*hook)() = bar;

and foo calls the exported *hook.  In the end we have

c/1 (c) @0x7ffff7ff3180
  Type: variable definition analyzed
  Visibility: semantic_interposition prevailing_def_ironly
  References:
  Referring: main/4 (write) main/4 (read)
  Availability: available
  Varpool flags: used-by-single-function

(semantic_interposition!?), used-by-single-function might be the "trick"
to use here.  Maybe we can also compute a non-recursive flag on
main/4 to say that control flow cannot possibly be (indirectly) recursive.

For the threading issue we might need a flag like
not-called-by-address-taken-functions (including not address taken itself) on
functions which should
practically rule out being a thread.

Anyway, the testcase in GCC 11 relies on cunrolli unrolling the inner loop
and cunroll unrolling the outer loop while GCC 12 no longer unrolls the
outer loop because

size: 18-3, last_iteration: 17-3
  Loop size: 18
  Estimated size after unrolling: 19
Not unrolling loop 1: contains call and code would grow.

while GCC 11 has

size: 17-3, last_iteration: 16-3
  Loop size: 17
  Estimated size after unrolling: 18
Making edge 14->9 impossible by redistributing probability to other edges.
Making edge 4->5 impossible by redistributing probability to other edges.
t.c:8:21: optimized: loop with 1 iterations completely unrolled (header
execution count 134197598)
Exit condition of peeled iterations was eliminated.
Last iteration exit edge was proved true.
Forced exit to be taken: if (0 != 0)

The difference is get_loop_hot_path () which on trunk gets presented with
a loop body where some extra path duplication has occured, duplicating the
store
to d and directing the path to foo () where the respective edge has 66%
probability vs. 33% on trunk and on the GCC 11 branch the situation
is reversed with 67% for the skip over the call.

On trunk threadfull1 duplicates the path with the store to 'd' and that
is also what wrecks the edge probabilities.

I think that's what we definitely need to fix here - the profile wreckage
done by threadfull1.

  parent reply	other threads:[~2022-03-10  9:42 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-21 12:51 [Bug tree-optimization/102879] New: " theodort at inf dot ethz.ch
2021-10-21 13:19 ` [Bug tree-optimization/102879] " marxin at gcc dot gnu.org
2021-10-21 14:09 ` aldyh at gcc dot gnu.org
2021-10-21 14:42 ` amacleod at redhat dot com
2021-10-21 20:01 ` pinskia at gcc dot gnu.org
2022-03-10  9:42 ` rguenth at gcc dot gnu.org [this message]
2022-03-10 10:06 ` rguenth at gcc dot gnu.org
2022-04-25 13:12 ` rguenth at gcc dot gnu.org
2022-05-06  8:31 ` [Bug tree-optimization/102879] [12/13 " jakub at gcc dot gnu.org
2022-07-26 13:22 ` rguenth at gcc dot gnu.org
2023-05-08 12:22 ` [Bug tree-optimization/102879] [12/13/14 " rguenth at gcc dot gnu.org

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-102879-4-Z4FSEmw6LV@http.gcc.gnu.org/bugzilla/ \
    --to=gcc-bugzilla@gcc.gnu.org \
    --cc=gcc-bugs@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).