[Bug tree-optimization/106293] [13/14 Regression] 456.hmmer at -Ofast -march=native regressed by 19% on zen2 and zen3 in July 2022

public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed

From: "cvs-commit at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/106293] [13/14 Regression] 456.hmmer at -Ofast -march=native regressed by 19% on zen2 and zen3 in July 2022
Date: Fri, 28 Jul 2023 07:51:51 +0000	[thread overview]
Message-ID: <bug-106293-4-vDHd9Yl3e5@http.gcc.gnu.org/bugzilla/> (raw)
In-Reply-To: <bug-106293-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106293

--- Comment #18 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jan Hubicka <hubicka@gcc.gnu.org>:

https://gcc.gnu.org/g:b24acae8f4d315a5b071ffc2574ce91c7a0800ca

commit r14-2850-gb24acae8f4d315a5b071ffc2574ce91c7a0800ca
Author: Jan Hubicka <jh@suse.cz>
Date:   Fri Jul 28 09:48:34 2023 +0200

    loop-split improvements, part 2

    this patch fixes profile update in the first case of loop splitting.
    The pass still gives up on very basic testcases:

    __attribute__ ((noinline,noipa))
    void test1 (int n)
    {
      if (n <= 0 || n > 100000)
        return;
      for (int i = 0; i <= n; i++)
        {
          if (i < n)
            do_something ();
          if (a[i])
            do_something2();
        }
    }

    Here I needed to do the conditoinal that enforces sane value range of n.
    The reason is that it gives up on:
          !number_of_iterations_exit (loop1, exit1, &niter, false, true)
    and without the conditonal we get assumption that n>=0 and not INT_MAX.
    I think from overflow we shold derive that INT_MAX test is not needed and
since
    the loop does nothing for n<0 it is also just an paranoia.

    I am not sure how to fix this though :(.  In general the pass does not
really
    need to compute iteration count.  It only needs to know what direction the
IVs
    go so it can detect tests that fires in first part of iteration space.

    Rich, any idea what the correct test should be?

    In testcase:
      for (int i = 0; i < 200; i++)
        if (i < 150)
          do_something ();
        else
          do_something2 ();
    the old code did wrong update of the exit condition probabilities.
    We know that first loop iterates 150 times and the second loop 50 times
    and we get it by simply scaling loop body by the probability of inner test.

    With the patch we now get:

      <bb 2> [count: 1000]:

      <bb 3> [count: 150000]:    <- loop 1 correctly iterates 149 times
      # i_10 = PHI <i_7(8), 0(2)>
      do_something ();
      i_7 = i_10 + 1;
      if (i_7 <= 149)
        goto <bb 8>; [99.33%]
      else
        goto <bb 17>; [0.67%]

      <bb 8> [count: 149000]:
      goto <bb 3>; [100.00%]

      <bb 16> [count: 1000]:
      # i_15 = PHI <i_18(17)>

      <bb 9> [count: 49975]:    <- loop 2 should iterate 50 times but
                                   we are slightly wrong
      # i_3 = PHI <i_15(16), i_14(13)>
      do_something2 ();
      i_14 = i_3 + 1;
      if (i_14 != 200)
        goto <bb 13>; [98.00%]
      else
        goto <bb 7>; [2.00%]

      <bb 13> [count: 48975]:
      goto <bb 9>; [100.00%]

      <bb 17> [count: 1000]:   <- this test is always true becuase it is
                                  reached form bb 3
      # i_18 = PHI <i_7(3)>
      if (i_18 != 200)
        goto <bb 16>; [99.95%]
      else
        goto <bb 7>; [0.05%]

      <bb 7> [count: 1000]:
      return;

    The reason why we are slightly wrong is the condtion in bb17 that
    is always true but the pass does not konw it.

    Rich any idea how to do that?  I think connect_loops should work out
    the cas where the loop exit conditon is never satisfied at the time
    the splitted condition fails for first time.

    Before patch on hmmer we get a lot of mismatches:
    Profile report here claims:
    dump id |static mismat|dynamic mismatch                                    
|
            |in count     |in count                  |time                     
|
    lsplit  |      5    +5|   8151850567  +8151850567| 531506481006      
+57.9%|
    ldist   |      9    +4|  15345493501  +7193642934| 606848841056      
+14.2%|
    ifcvt   |     10    +1|  15487514871   +142021370| 689469797790      
+13.6%|
    vect    |     35   +25|  17558425961  +2070911090| 517375405715      
-25.0%|
    cunroll |     42    +7|  16898736178   -659689783| 452445796198       
-4.9%|
    loopdone|     33    -9|   2678017188 -14220718990| 330969127663            
|
    tracer  |     34    +1|   2678018710        +1522| 330613415364       
+0.0%|
    fre     |     33    -1|   2676980249     -1038461| 330465677073       
-0.0%|
    expand  |     28    -5|   2497468467  
-179511782|--------------------------|

    With patch

    lsplit  |      0      |            0             | 328723360744       
-2.3%|
    ldist   |      0      |            0             | 396193562452      
+20.6%|
    ifcvt   |      1    +1|     71010686    +71010686| 478743508522      
+20.8%|
    vect    |     14   +13|    697518955   +626508269| 299398068323      
-37.5%|
    cunroll |     13    -1|    489349408   -208169547| 257777839725      
-10.5%|
    loopdone|     11    -2|    402558559    -86790849| 201010712702            
|
    tracer  |     13    +2|    402977200      +418641| 200651036623       
+0.0%|
    fre     |     13      |    402622146      -355054| 200344398654       
-0.2%|
    expand  |     11    -2|    333608636   
-69013510|--------------------------|

    So no mismatches for lsplit and ldist and also lsplit thinks it improves
speed by
    2.3% rather than regressig it by 57%.

    Update is still not perfect since we do not work out that the second loop
    never iterates.

    Ifcft wrecks profile by desing since it insert conditonals with both arms
100%
    that will be eliminated later after vect.  It is not clear to me what
happens
    in vect though.

    Bootstrapped/regtested x86_64-linux, comitted.

    gcc/ChangeLog:

            PR middle-end/106923
            * tree-ssa-loop-split.cc (connect_loops): Change probability
            of the test preconditioning second loop to very_likely.
            (fix_loop_bb_probability): Handle correctly case where
            on of the arms of the conditional is empty.
            (split_loop): Fold the test guarding first condition to
            see if it is constant true; Set correct entry block
            probabilities of the split loops; determine correct loop
            eixt probabilities.

    gcc/testsuite/ChangeLog:

            PR middle-end/106293
            * gcc.dg/tree-prof/loop-split-1.c: New test.
            * gcc.dg/tree-prof/loop-split-2.c: New test.
            * gcc.dg/tree-prof/loop-split-3.c: New test.

next prev parent reply	other threads:[~2023-07-28  7:51 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-14  9:08 [Bug tree-optimization/106293] New: " jamborm at gcc dot gnu.org
2022-07-14  9:22 ` [Bug tree-optimization/106293] [13 Regression] " rguenth at gcc dot gnu.org
2022-07-14 12:10 ` rguenth at gcc dot gnu.org
2022-07-14 12:22 ` rguenth at gcc dot gnu.org
2022-07-25  9:44 ` luoxhu at gcc dot gnu.org
2022-07-25  9:46 ` luoxhu at gcc dot gnu.org
2023-01-10 12:12 ` yann at ywg dot ch
2023-01-10 12:45 ` rguenth at gcc dot gnu.org
2023-01-10 15:53 ` cvs-commit at gcc dot gnu.org
2023-01-10 15:54 ` rguenth at gcc dot gnu.org
2023-01-11  7:04 ` cvs-commit at gcc dot gnu.org
2023-04-17 15:11 ` [Bug tree-optimization/106293] [13/14 " jakub at gcc dot gnu.org
2023-04-17 16:15 ` jamborm at gcc dot gnu.org
2023-04-26  6:56 ` rguenth at gcc dot gnu.org
2023-07-27  9:23 ` rguenth at gcc dot gnu.org
2023-07-27 18:01 ` hubicka at gcc dot gnu.org
2023-07-27 21:38 ` hubicka at gcc dot gnu.org
2023-07-28  7:22 ` rguenther at suse dot de
2023-07-28  8:01   ` Jan Hubicka
2023-07-28  7:51 ` cvs-commit at gcc dot gnu.org [this message]
2023-07-28  8:01 ` hubicka at ucw dot cz
2023-07-28 12:09 ` rguenther at suse dot de
2023-07-31  7:44 ` hubicka at gcc dot gnu.org
2023-07-31 15:39 ` jamborm at gcc dot gnu.org
2023-08-01 10:40 ` hubicka at gcc dot gnu.org
2023-08-02  8:48 ` hubicka at gcc dot gnu.org
2023-08-02  9:42 ` rguenth at gcc dot gnu.org
2023-08-04 10:09 ` [Bug tree-optimization/106293] [13 regression] " hubicka at gcc dot gnu.org
2023-08-07  8:56 ` cvs-commit at gcc dot gnu.org
2023-08-10 16:01 ` hubicka at gcc dot gnu.org
2024-05-21  9:11 ` jakub at gcc dot gnu.org

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-106293-4-vDHd9Yl3e5@http.gcc.gnu.org/bugzilla/ \
    --to=gcc-bugzilla@gcc.gnu.org \
    --cc=gcc-bugs@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).