public inbox for gcc-bugs@sourceware.org
From: "cvs-commit at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/106293] [13 regression] 456.hmmer at -Ofast -march=native regressed by 19% on zen2 and zen3 in July 2022
Date: Mon, 07 Aug 2023 08:56:32 +0000
Message-ID: <bug-106293-4-jEPBuHHiM4@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-106293-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106293

--- Comment #27 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jan Hubicka <hubicka@gcc.gnu.org>:

https://gcc.gnu.org/g:73c14db6d1a8c1267137b94c41f2e2c9410dcbb1

commit r14-3015-g73c14db6d1a8c1267137b94c41f2e2c9410dcbb1
Author: Jan Hubicka <jh@suse.cz>
Date:   Mon Aug 7 10:55:58 2023 +0200

    Fix profile update after versioning ifconverted loop

    If a loop is if-converted and later versioned by the vectorizer, the
    vectorizer will reuse the scalar loop produced by ifconvert.  Curiously
    enough, it does not seem to do so for versions produced by loop
    distribution, even though for loop distribution this matters (since both
    ldist versions survive to final code) while after ifcvt it does not (since
    we remove the non-vectorized path).

    This patch fixes the associated profile update.  Here it is necessary to
    scale both arms of the conditional according to the runtime checks
    inserted.  We got the loop body partly right, but not the preheader block
    and the block after the exit.  The first is particularly bad since it
    changes the loop iteration estimates.
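
A minimal sketch of why the preheader count matters, written in Python rather
than GCC's own C++.  Everything in it is an illustrative assumption: the
function names, the block counts, the 4/5 probability that the runtime checks
pass, and the simplification that the iteration estimate is body count divided
by entry count.  It is not the code from tree-vect-loop-manip.cc, and it does
not model the reduction in iterations caused by the vectorization factor.

```python
from fractions import Fraction


def estimated_iterations(body_count, entry_count):
    # Assumed simplification: iterations ~ body executions per loop entry.
    return Fraction(body_count) / Fraction(entry_count)


def version_loop(counts, prob_checks_pass):
    """Split one loop's profile between the two arms of a versioning conditional.

    counts maps a block name ('preheader', 'body', 'exit') to its execution
    count.  Every block of an arm is scaled by that arm's probability.
    """
    p = Fraction(prob_checks_pass)
    vectorized_arm = {name: count * p for name, count in counts.items()}
    scalar_arm = {name: count * (1 - p) for name, count in counts.items()}
    return vectorized_arm, scalar_arm


# Hypothetical profile: 1000 loop entries, 100 iterations each.
original = {"preheader": 1000, "body": 100000, "exit": 1000}
vec, scalar = version_loop(original, Fraction(4, 5))

# Consistent update: preheader and body scaled by the same factor.
consistent = estimated_iterations(vec["body"], vec["preheader"])

# Inconsistent update: body scaled, preheader left at its original count.
inconsistent = estimated_iterations(vec["body"], original["preheader"])

print("consistent estimate:  ", consistent)
print("inconsistent estimate:", inconsistent)
```

With all blocks of an arm scaled together, the estimate for that arm stays at
the original 100 iterations and the two arms still sum to the original counts.
Leaving the preheader unscaled lowers the estimate even though the loop itself
did not change.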

    So we now turn 4 original loops:
      loop 1: iterations by profile: 473.497707 (reliable) entry count:84821 (precise, freq 0.9979)
      loop 2: iterations by profile: 100.000000 (reliable) entry count:39848881 (precise, freq 468.8104)
      loop 3: iterations by profile: 100.000000 (reliable) entry count:39848881 (precise, freq 468.8104)
      loop 4: iterations by profile: 100.999596 (reliable) entry count:84167 (precise, freq 0.9902)

    Into following loops
      iterations by profile: 5.312499 (unreliable, maybe flat) entry count:12742188 (guessed, freq 149.9081)
         vectorized and split loop 1, peeled
      iterations by profile: 0.009496 (unreliable, maybe flat) entry count:374798 (guessed, freq 4.4094)
         split loop 1 (last iteration), peeled
      iterations by profile: 100.000008 (unreliable) entry count:3945039 (guessed, freq 46.4122)
         scalar version of loop 1
      iterations by profile: 100.000007 (unreliable) entry count:7101070 (guessed, freq 83.5420)
         redundant scalar version of loop 1 which we could eliminate if vectorizer understood ldist
      iterations by profile: 100.000000 (unreliable) entry count:35505353 (guessed, freq 417.7100)
         unvectorized loop 2
      iterations by profile: 5.312500 (unreliable) entry count:25563855 (guessed, freq 300.7512)
         vectorized loop 2, not peeled (hits max-peel-insns)
      iterations by profile: 100.000007 (unreliable) entry count:7101070 (guessed, freq 83.5420)
         unvectorized loop 3
      iterations by profile: 5.312500 (unreliable) entry count:25563855 (guessed, freq 300.7512)
         vectorized loop 3, not peeled (hits max-peel-insns)
      iterations by profile: 473.497707 (reliable) entry count:84821 (precise, freq 0.9979)
         loop 1
      iterations by profile: 100.999596 (reliable) entry count:84167 (precise, freq 0.9902)
         loop 4

    With this change we are at 0 profile errors on the hmmer benchmark:

    Pass dump id |dynamic mismatch          |overall                              |
                 |in count                  |size            |time                |
    172t ch_vect |            0             |      996       | 385812023346       |
    173t ifcvt   |     71010686    +71010686|     1021  +2.5%| 468361969416 +21.4%|
    174t vect    |    210830784   +139820098|     1497 +46.6%| 216073467874 -53.9%|
    175t dce     |    210830784             |     1387  -7.3%| 205273170281  -5.0%|
    176t pcom    |    210830784             |     1387       | 201722634966  -1.7%|
    177t cunroll |            0   -210830784|     1443  +4.0%| 180441501289 -10.5%|
    182t ivopts  |            0             |     1385  -4.0%| 136412345683 -24.4%|
    183t lim     |            0             |     1389  +0.3%| 135093950836  -1.0%|
    192t reassoc |            0             |     1381  -0.6%| 134778347700  -0.2%|
    193t slsr    |            0             |     1380  -0.1%| 134738100330  -0.0%|
    195t tracer  |            0             |     1521 +10.2%| 134738179146  +0.0%|
    196t fre     |      2680654     +2680654|     1489  -2.1%| 134659672725  -0.1%|
    198t dom     |      5361308     +2680654|     1473  -1.1%| 134449553658  -0.2%|
    201t vrp     |      5361308             |     1474  +0.1%| 134489004050  +0.0%|
    202t ccp     |      5361308             |     1472  -0.1%| 134440752274  -0.0%|
    204t dse     |      5361308             |     1444  -1.9%| 133802300525  -0.5%|
    206t forwprop|      5361308             |     1433  -0.8%| 133542828370  -0.2%|
    207t sink    |      5361308             |     1431  -0.1%| 133542658728  -0.0%|
    211t store-me|      5361308             |     1430  -0.1%| 133542573728  -0.0%|
    212t cddce   |      5361308             |     1428  -0.1%| 133541776728  -0.0%|
    258r expand  |      5361308             |----------------|--------------------|
    260r into_cfg|      5361308             |     9334  -0.8%| 885820707913  -0.6%|
    261r jump    |      5361308             |     9330  -0.0%| 885820367913  -0.0%|
    265r fwprop1 |      5361308             |     9206  -1.3%| 876756504385  -1.0%|
    267r rtl pre |      5361308             |     9210  +0.0%| 876914305953  +0.0%|
    269r cprop   |      5361308             |     9202  -0.1%| 876756165101  -0.0%|
    271r cse_loca|      5361308             |     9198  -0.0%| 876727760821  -0.0%|
    272r ce1     |      5361308             |     9126  -0.8%| 875726815885  -0.1%|
    276r loop2_in|      5361308             |     9167  +0.4%| 873573110570  -0.2%|
    282r cprop   |      5361308             |     9095  -0.8%| 871937317262  -0.2%|
    284r cse2    |      5361308             |     9091  -0.0%| 871936977978  -0.0%|
    285r dse1    |      5361308             |     9067  -0.3%| 871437031602  -0.1%|
    290r combine |      5361308             |     9071  +0.0%| 869206278202  -0.3%|
    292r stv     |      5361308             |    17157 +89.1%| 2111071925708+142.9%|
    295r bbpart  |      5361308             |    17161  +0.0%| 2111071925708       |
    296r outof_cf|      5361308             |    17233  +0.4%| 2111655121000  +0.0%|
    297r split1  |      5361308             |    17245  +0.1%| 2111656138852  +0.0%|
    306r ira     |      5361308             |    19189 +11.3%| 2136098398308  +1.2%|
    307r reload  |      5361308             |    12101 -36.9%| 981091222830 -54.1%|
    309r postrelo|      5361308             |    12019  -0.7%| 978750345475  -0.2%|
    310r gcse2   |      5361308             |    12027  +0.1%| 978329108320  -0.0%|
    311r split2  |      5361308             |    12023  -0.0%| 978507631352  +0.0%|
    312r ree     |      5361308             |    12027  +0.0%| 978505414244  -0.0%|
    313r cmpelim |      5361308             |    11979  -0.4%| 977531601988  -0.1%|
    314r pro_and_|      5361308             |    12091  +0.9%| 977541801988  +0.0%|
    315r dse2    |      5361308             |    12091       | 977541801988       |
    316r csa     |      5361308             |    12087  -0.0%| 977541461988  -0.0%|
    317r jump2   |      5361308             |    12039  -0.4%| 977683176572  +0.0%|
    318r compgoto|      5361308             |    12039       | 977683176572       |
    320r peephole|      5361308             |    12047  +0.1%| 977362727612  -0.0%|
    321r ce3     |      5361308             |    12047       | 977362727612       |
    323r cprop_ha|      5361308             |    11907  -1.2%| 968751076676  -0.9%|
    324r rtl_dce |      5361308             |    11903  -0.0%| 968593274820  -0.0%|
    325r bbro    |      5361308             |    11883  -0.2%| 967964046644  -0.1%|
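
The percentage next to each size and time value appears to be the relative
change from the previous pass.  The short Python sketch below checks that
reading against the first three rows of the table.  The helper name
relative_change and the one-decimal rounding are assumptions made for
illustration, not the code that produced the table.

```python
def relative_change(previous, current):
    """Format the change from previous to current as a signed percentage."""
    return f"{(current - previous) / previous * 100:+.1f}%"


# (pass name, size, time) copied from the first three rows of the table.
rows = [
    ("ch_vect", 996, 385812023346),
    ("ifcvt", 1021, 468361969416),
    ("vect", 1497, 216073467874),
]

# Compare each row with the one before it.
for (_, prev_size, prev_time), (name, size, time) in zip(rows, rows[1:]):
    print(name, relative_change(prev_size, size), relative_change(prev_time, time))
```

For these rows the computed values match the table: +2.5% and +21.4% for
ifcvt, +46.6% and -53.9% for vect.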

    Bootstrapped/regtested x86_64-linux, plan to commit it tomorrow if there
    are no complaints.

    gcc/ChangeLog:

            PR tree-optimization/106293
            * tree-vect-loop-manip.cc (vect_loop_versioning): Fix profile update.
            * tree-vect-loop.cc (vect_transform_loop): Likewise.

    gcc/testsuite/ChangeLog:

            PR tree-optimization/106293
            * gcc.dg/vect/vect-cond-11.c: Check profile consistency.
            * gcc.dg/vect/vect-widen-mult-extern-1.c: Check profile consistency.

