public inbox for gcc-bugs@sourceware.org
From: "cvs-commit at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/109154] [13/14 regression] jump threading de-optimizes nested floating point comparisons
Date: Fri, 14 Jul 2023 10:22:55 +0000
Message-ID: <bug-109154-4-0RqBs4wgMA@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-109154-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154

--- Comment #67 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Tamar Christina <tnfchris@gcc.gnu.org>:

https://gcc.gnu.org/g:9ed4fcfe47f28b36c73d74109898514ef4da00fb

commit r14-2517-g9ed4fcfe47f28b36c73d74109898514ef4da00fb
Author: Tamar Christina <tamar.christina@arm.com>
Date:   Fri Jul 14 11:21:46 2023 +0100

    ifcvt: Sort PHI arguments not only occurrences but also complexity [PR109154]

    This patch builds on the previous patch by fixing another issue with the
    way ifcvt currently picks which branches to test.

    The issue with the current implementation is that while it sorts by the
    number of occurrences of each argument, it doesn't consider the complexity
    of the arguments.

    As an example:

      <bb 15> [local count: 528603100]:
      ...
      if (distbb_75 >= 0.0)
        goto <bb 17>; [59.00%]
      else
        goto <bb 16>; [41.00%]

      <bb 16> [local count: 216727269]:
      ...
      goto <bb 19>; [100.00%]

      <bb 17> [local count: 311875831]:
      ...
      if (distbb_75 < iftmp.0_98)
        goto <bb 18>; [20.00%]
      else
        goto <bb 19>; [80.00%]

      <bb 18> [local count: 62375167]:
      ...

      <bb 19> [local count: 528603100]:
      # prephitmp_175 = PHI <_173(18), 0.0(17), _174(16)>

    All tree arguments to the PHI have the same number of occurrences, namely 1;
    however, it makes a big difference which comparison we test first.

    Sorting only on occurrences, we pick the compares coming from BB 18 and
    BB 17.  This means we end up generating 4 comparisons, while 2 would have
    been enough.
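
    As a rough illustration of the difference, the CFG above can be modelled
    in scalar C++.  This is only a sketch: the function names, the compare
    counter and the stand-in values v18 and v16 (for _173 and _174) are
    hypothetical and do not come from the patch.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical scalar model of the CFG above; all names are illustrative.
static int compares = 0;  // counts evaluated floating-point comparisons

static bool ge(float a, float b) { ++compares; return a >= b; }
static bool lt(float a, float b) { ++compares; return a < b; }

// Branchy original: BB 15 tests distbb >= 0.0, BB 17 tests distbb < iftmp.
float branchy(float distbb, float iftmp, float v18, float v16) {
  if (distbb >= 0.0f)
    return distbb < iftmp ? v18 : 0.0f;
  return v16;
}

// If-converted form that tests the edges from BB 18 and BB 17 first.
// Each predicate is the full path condition, so both compares appear twice.
float select_18_17(float distbb, float iftmp, float v18, float v16) {
  bool p18 = ge(distbb, 0.0f) & lt(distbb, iftmp);
  bool p17 = ge(distbb, 0.0f) & !lt(distbb, iftmp);
  return p18 ? v18 : (p17 ? 0.0f : v16);
}

// If-converted form that tests BB 16 first, then BB 18.
float select_16_18(float distbb, float iftmp, float v18, float v16) {
  bool p16 = !ge(distbb, 0.0f);
  bool p18 = lt(distbb, iftmp);  // BB 16 is already ruled out by the outer select
  return p16 ? v16 : (p18 ? v18 : 0.0f);
}

// Both orderings agree with the branchy code (NaN inputs included),
// but they evaluate 4 and 2 comparisons respectively.
void check_select_orderings() {
  const float inputs[] = {-1.0f, 0.0f, 0.5f, 3.0f, std::nanf("")};
  for (float d : inputs)
    for (float t : inputs) {
      float want = branchy(d, t, 1.0f, 2.0f);
      compares = 0;
      assert(select_18_17(d, t, 1.0f, 2.0f) == want && compares == 4);
      compares = 0;
      assert(select_16_18(d, t, 1.0f, 2.0f) == want && compares == 2);
    }
}
```

    The non-short-circuit & is deliberate: it mimics if-converted code, where
    every predicate is computed unconditionally.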

    By keeping track of the "complexity" of the COND in each BB (i.e. the number
    of comparisons needed to traverse from the start [BB 15] to the end [BB 19])
    and using a key tuple of <occurrences, complexity>, we end up selecting the
    compares from BB 16 and BB 18 first.  BB 16 requires only 1 compare, and
    BB 18, once BB 16 has been tested, also requires only one additional compare.
    This change, paired with the previous patch, results in the optimal
    2 compares.
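
    The key-tuple ordering can be sketched as follows.  This is not the code
    from tree-if-conv.cc: the PhiArg record, its field names, the ascending
    sort direction and the use of a stable sort are all assumptions made for
    illustration.  The complexity values are one reading of the example above
    (1 compare to reach BB 19 via BB 16, 2 compares via BB 17 or BB 18).

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <vector>

// Hypothetical record for one PHI argument; the field names are
// illustrative and are not taken from tree-if-conv.cc.
struct PhiArg {
  int src_bb;            // predecessor block the argument arrives from
  unsigned occurrences;  // how many PHI slots carry this value
  unsigned complexity;   // compares needed to get from the start to this edge
};

// Order by the <occurrences, complexity> key tuple, smallest first.
// A stable sort keeps the original PHI order for equal keys (an
// assumption of this sketch).
void sort_phi_args(std::vector<PhiArg> &args) {
  std::stable_sort(args.begin(), args.end(),
                   [](const PhiArg &a, const PhiArg &b) {
                     return std::tie(a.occurrences, a.complexity) <
                            std::tie(b.occurrences, b.complexity);
                   });
}

// PHI <_173(18), 0.0(17), _174(16)> from the example above.
void check_sort_order() {
  std::vector<PhiArg> args = {{18, 1, 2}, {17, 1, 2}, {16, 1, 1}};
  sort_phi_args(args);
  assert(args[0].src_bb == 16);  // cheapest condition is tested first
  assert(args[1].src_bb == 18);
  assert(args[2].src_bb == 17);  // last argument needs no test of its own
}
```

    With occurrences tied at 1, the complexity component decides the order,
    so the argument from BB 16 is tested first and the one from BB 17 falls
    through as the final "else" value.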

    For deep nesting, i.e. for

    ...
      _79 = vr_15 > 20;
      _80 = _68 & _79;
      _82 = vr_15 <= 20;
      _83 = _68 & _82;
      _84 = vr_15 < -20;
      _85 = _73 & _84;
      _87 = vr_15 >= -20;
      _88 = _73 & _87;
      _ifc__111 = _55 ? 10 : 12;
      _ifc__112 = _70 ? 7 : _ifc__111;
      _ifc__113 = _85 ? 8 : _ifc__112;
      _ifc__114 = _88 ? 9 : _ifc__113;
      _ifc__115 = _45 ? 1 : _ifc__114;
      _ifc__116 = _63 ? 3 : _ifc__115;
      _ifc__117 = _65 ? 4 : _ifc__116;
      _ifc__118 = _83 ? 6 : _ifc__117;
      _ifc__119 = _60 ? 2 : _ifc__118;
      _ifc__120 = _43 ? 13 : _ifc__119;
      _ifc__121 = _75 ? 11 : _ifc__120;
      vw_1 = _80 ? 5 : _ifc__121;

    most of the comparisons are still needed because the conditions in the
    chain do not negate each other.  One pair does, though: _88 is
    _73 & vr_15 >= -20 and _85 is _73 & vr_15 < -20.  Clearly, given that _73
    needs to be true in both branches, the only additional test needed is on
    vr_15, where one test is the negation of the other.  So we don't need to
    do the comparison of _73 twice.
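
    A small scalar sketch of that observation, assuming vr_15 is an integer.
    Here p stands in for _73, v for vr_15 and rest for the remainder of the
    select chain; the function names are hypothetical.

```cpp
#include <cassert>

// Flat form, as in the dump: both predicates re-test the shared guard p.
int flat_select(bool p, int v, int rest) {
  bool lo = p & (v < -20);   // like _85
  bool hi = p & (v >= -20);  // like _88
  int t = lo ? 8 : rest;     // like _ifc__113
  return hi ? 9 : t;         // like _ifc__114
}

// Equivalent form: test p once, then do a single compare on v, because
// v >= -20 is exactly the negation of v < -20 for integers.
int shared_guard_select(bool p, int v, int rest) {
  return p ? (v < -20 ? 8 : 9) : rest;
}

// The two forms agree on both sides of the -20 boundary, for either p.
void check_shared_guard() {
  const int values[] = {-100, -21, -20, -19, 0, 20, 21};
  for (int v : values) {
    assert(flat_select(true, v, 7) == shared_guard_select(true, v, 7));
    assert(flat_select(false, v, 7) == shared_guard_select(false, v, 7));
  }
}
```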

    The changes in the patch reduce the overall number of compares by one, but
    have a bigger effect on the dependency chain.

    Previously we would generate a chain of 5 instructions:

            cmple   p7.s, p4/z, z29.s, z30.s
            cmpne   p7.s, p7/z, z29.s, #0
            cmple   p6.s, p7/z, z31.s, z30.s
            cmpge   p6.s, p6/z, z27.s, z25.s
            cmplt   p15.s, p6/z, z28.s, z21.s

    as the longest chain.  With this patch we generate 3:

            cmple   p7.s, p3/z, z27.s, z30.s
            cmpne   p7.s, p7/z, z27.s, #0
            cmpgt   p7.s, p7/z, z31.s, z30.s

    and I don't think (x <= y) && (x != 0) && (z > y) can be reduced further.

    gcc/ChangeLog:

            PR tree-optimization/109154
            * tree-if-conv.cc (INCLUDE_ALGORITHM): Include.
            (struct bb_predicate): Add no_predicate_stmts.
            (set_bb_predicate): Increase predicate count.
            (set_bb_predicate_gimplified_stmts): Conditionally initialize
            no_predicate_stmts.
            (get_bb_num_predicate_stmts): New.
            (init_bb_predicate): Initialize no_predicate_stmts.
            (release_bb_predicate): Cleanup no_predicate_stmts.
            (insert_gimplified_predicates): Preserve no_predicate_stmts.

    gcc/testsuite/ChangeLog:

            PR tree-optimization/109154
            * gcc.dg/vect/vect-ifcvt-20.c: New test.
