From: Richard Biener <rguenther@suse.de>
To: Filip Kastl <fkastl@suse.cz>
Cc: gcc-patches@gcc.gnu.org, Jan Hubicka <hubicka@ucw.cz>
Subject: Re: [RFC] gimple ssa: SCCP - A new PHI optimization pass
Date: Fri, 1 Sep 2023 10:53:46 +0000 (UTC) [thread overview]
Message-ID: <nycvar.YFH.7.77.849.2309011041540.22006@jbgna.fhfr.qr> (raw)
In-Reply-To: <ZPG4sp57Scr7Ask5@localhost.localdomain>
On Fri, 1 Sep 2023, Filip Kastl wrote:
> > That's interesting. Your placement at
> >
> > NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */);
> > NEXT_PASS (pass_phiopt, true /* early_p */);
> > + NEXT_PASS (pass_sccp);
> >
> > and
> >
> > NEXT_PASS (pass_tsan);
> > NEXT_PASS (pass_dse, true /* use DR analysis */);
> > NEXT_PASS (pass_dce);
> > + NEXT_PASS (pass_sccp);
> >
> > isn't immediately after the "best" existing pass we have to
> > remove dead PHIs which is pass_cd_dce. phiopt might leave
> > dead PHIs around and the second instance runs long after the
> > last CD-DCE.
> >
> > So I wonder if your pass just detects unnecessary PHIs we'd have
> > removed by other means and what survives until RTL expansion is
> > what we should count?
> >
> > Can you adjust your original early placement to right after
> > the cd-dce pass and for the late placement turn the dce pass
> > before it into cd-dce and re-do your measurements?
>
> So I did this
>
> NEXT_PASS (pass_dse);
> NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */);
> NEXT_PASS (pass_sccp);
> NEXT_PASS (pass_phiopt, true /* early_p */);
> NEXT_PASS (pass_tail_recursion);
>
> and this
>
> NEXT_PASS (pass_dse, true /* use DR analysis */);
> NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */);
> NEXT_PASS (pass_sccp);
> /* Pass group that runs when 1) enabled, 2) there are loops
>
> and got these results:
>
> 500.perlbench_r
> Started with (1) 30318
> Ended with (1) 26219
> Removed PHI % (1) 13.52002110957187149600
> Started with (2) 39043
> Ended with (2) 38941
> Removed PHI % (2) .26125041620777092000
>
> 502.gcc_r
> Started with (1) 148361
> Ended with (1) 140464
> Removed PHI % (1) 5.32282742769326170700
> Started with (2) 216209
> Ended with (2) 215367
> Removed PHI % (2) .38943799749316633500
>
> 505.mcf_r
> Started with (1) 342
> Ended with (1) 304
> Removed PHI % (1) 11.11111111111111111200
> Started with (2) 437
> Ended with (2) 433
> Removed PHI % (2) .91533180778032036700
>
> 523.xalancbmk_r
> Started with (1) 62995
> Ended with (1) 58289
> Removed PHI % (1) 7.47043416144138423700
> Started with (2) 134026
> Ended with (2) 133193
> Removed PHI % (2) .62152119737961291100
>
> 531.deepsjeng_r
> Started with (1) 1402
> Ended with (1) 1264
> Removed PHI % (1) 9.84308131241084165500
> Started with (2) 1928
> Ended with (2) 1920
> Removed PHI % (2) .41493775933609958600
>
> 541.leela_r
> Started with (1) 3398
> Ended with (1) 3060
> Removed PHI % (1) 9.94702766333137139500
> Started with (2) 4473
> Ended with (2) 4453
> Removed PHI % (2) .44712720769058797300
>
> 557.xz_r
> Started with (1) 47
> Ended with (1) 44
> Removed PHI % (1) 6.38297872340425532000
> Started with (2) 43
> Ended with (2) 43
> Removed PHI % (2) 0
>
> These measurements don't differ very much from the previous. It seems to me
> that phiopt does output some redundant PHIs but the vast majority of the
> eliminated PHIs are generated in earlier passes and cd_dce isn't able to get
> rid of them.
>
> A noteworthy information might be that most of the eliminated PHIs are actually
> trivial PHIs. I consider a PHI to be trivial if it only references itself or
> one other SSA name.
Ah. The early pass numbers are certainly intresting - can you elaborate
on the last bit? We have for example loop-closed PHI nodes like
_1 = PHI <_2>
and there are non-trivial degenerate PHIs like
_1 = PHI <_2, _2>
those are generally removed by value-numbering (FRE, DOM and PRE) and SSA
propagation (CCP and copyprop), they are not "dead" so CD-DCE doesn't
remove them.
But we do have passes removing these kind of PHIs.
The issue with the early pass is likely that we have
NEXT_PASS (pass_fre, true /* may_iterate */);
^^
would elimimate these kind of PHIs
NEXT_PASS (pass_early_vrp);
^^
rewrites into loop-closed SSA, adding many such PHIs
NEXT_PASS (pass_merge_phi);
NEXT_PASS (pass_dse);
NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */);
and until here there's no pass eliding the LC SSA PHIs.
You could add a pass_copy_prop after early_vrp, the later sccp
pass shouldn't run into this issue I think so it must be other
passes adding such kind of PHIs.
Maybe you can count single-argument PHIs, degenerate multi-arg PHIs
and "other" PHIs separately as you remove them?
> Here is a comparison of the newest measurements (sccp after cd_dce) with the
> previous ones (sccp after phiopt and dce):
>
> 500.perlbench_r
>
> Started with (1-PREV) 30287
> Started with (1-NEW) 30318
>
> Ended with (1-PREV) 26188
> Ended with (1-NEW) 26219
>
> Removed PHI % (1-PREV) 13.53385941162875161000
> Removed PHI % (1-NEW) 13.52002110957187149600
>
> Started with (2-PREV) 38005
> Started with (2-NEW) 39043
>
> Ended with (2-PREV) 37897
> Ended with (2-NEW) 38941
>
> Removed PHI % (2-PREV) .28417313511380081600
> Removed PHI % (2-NEW) .26125041620777092000
>
> 502.gcc_r
>
> Started with (1-PREV) 148187
> Started with (1-NEW) 148361
>
> Ended with (1-PREV) 140292
> Ended with (1-NEW) 140464
>
> Removed PHI % (1-PREV) 5.32772780338356266100
> Removed PHI % (1-NEW) 5.32282742769326170700
>
> Started with (2-PREV) 211479
> Started with (2-NEW) 216209
>
> Ended with (2-PREV) 210635
> Ended with (2-NEW) 215367
>
> Removed PHI % (2-PREV) .39909399987705635100
> Removed PHI % (2-NEW) .38943799749316633500
>
>
> Filip K
>
>
> P.S. I made a small mistake and didn't compute the benchmark speedup
> percentages right in the previous email. Here are the corrected results. The
> correct percentages are a little bit smaller but very similar. There is still a
> ~2% speedup with 505.mcf_r and 541.leela_r.
>
> 500.perlbench_r
> Without SCCP: 244.151807s
> With SCCP: 242.448438s
> -0.6976679881791663%
>
> 502.gcc_r
> Without SCCP: 211.029606s
> With SCCP: 211.614523s
> +0.27717295742853737%
>
> 505.mcf_r
> Without SCCP: 298.782621s
> With SCCP: 291.671468s
> -2.380042378703145%
>
> 523.xalancbmk_r
> Without SCCP: 189.940639s
> With SCCP: 189.876261s
> -0.03389374719330334%
>
> 531.deepsjeng_r
> Without SCCP: 250.63648s
> With SCCP: 250.988624s
> +0.14049989849840747%
>
> 541.leela_r
> Without SCCP: 346.066278s
> With SCCP: 339.692987s
> -1.8416388435281157%
>
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
prev parent reply other threads:[~2023-09-01 10:53 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-24 15:07 Filip Kastl
2023-08-24 15:47 ` Richard Biener
2023-08-24 15:54 ` Jakub Jelinek
2023-08-31 11:26 ` Filip Kastl
2023-08-31 11:44 ` Jakub Jelinek
2023-08-31 12:13 ` Richard Biener
2023-08-31 21:44 ` Andrew Pinski
2023-09-01 6:34 ` Richard Biener
2023-09-01 10:10 ` Filip Kastl
2023-09-01 10:53 ` Richard Biener [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=nycvar.YFH.7.77.849.2309011041540.22006@jbgna.fhfr.qr \
--to=rguenther@suse.de \
--cc=fkastl@suse.cz \
--cc=gcc-patches@gcc.gnu.org \
--cc=hubicka@ucw.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).